Anomaly Detection via Topological Feature Map (ADTM)

Integrated System Health Management (ISHM) is mission-critical for NASA spacecraft and will become increasingly so as missions travel farther from Earth. As we plan for human exploration of Mars, factors such as the long communication lag between Earth and Mars become very important, which makes autonomy a key piece of the design puzzle. A mix of different types of autonomy is needed to support the wide variety of manned and unmanned missions. Full autonomy will be appropriate for unmanned missions, whereas systems designed for human-machine collaboration may be desirable for manned missions.

We have developed an artificial intelligence (AI) solution that significantly expands NASA’s real-time and offline ISHM capabilities for future deep-space exploration efforts. Our system, Anomaly Detection via Topological Feature Map (ADTM), uses a combination of case-based reasoning (CBR), neural-network-based clustering, and supervised classification techniques to predict, detect, and explain anomalies, and to guide users in implementing effective mitigations. It also uses these techniques to provide remaining useful life estimates. ADTM uses a Self-Organizing Map (SOM)-based architecture to produce high-resolution clusters of nominal system behavior. ADTM combines this data-driven approach with case-based reasoning to aid in the localization and diagnosis of anomalies once they are detected. This approach provides the critical ability to handle previously unknown anomalies and faults. We have also developed an active learning approach in which the SOMs and the case base can be updated based on user feedback. A unique feature of ADTM is the use of human physiology measurements to inform the analysis of system health. Humans thus act as additional sensors that can provide leading indicators of system faults. ADTM includes tools that allow users to visualize the status of the system at various levels of granularity, configure and receive alerts about current or predicted future faults, and navigate the case base to trace root causes. We have demonstrated the effectiveness of ADTM on several NASA systems, including the xEMU Portable Life Support System (PLSS), the Mars Transit Vehicle, a Graywater Recycling system maintained by NASA Ames, a Rotating Uninterrupted Power Supply system installed at NASA Ames, and the wastewater processing subsystem on the International Space Station. We are currently expanding ADTM to include a hybrid model-based and data-driven fault analysis capability that will significantly enhance its effectiveness.
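To make the SOM-based idea concrete, the following is a minimal sketch of how a self-organizing map trained only on nominal data can flag anomalies. It is not ADTM's implementation, which is not described here. It assumes a common generic recipe: train a small SOM on nominal sensor vectors, score a new sample by its distance to the best-matching unit (the quantization error), and flag samples whose score exceeds a quantile of the nominal scores. All function names, the 4x4 grid, the decay schedules, and the 0.99 quantile are illustrative assumptions.

```python
import math
import random


def _dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def best_matching_unit(weights, x):
    """Return the grid coordinate of the node whose weight vector is closest to x."""
    return min(weights, key=lambda node: _dist(weights[node], x))


def train_som(data, rows=4, cols=4, epochs=20, lr0=0.5, seed=0):
    """Train a small SOM on nominal data (hypothetical hyperparameters)."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Initialize each node from a randomly chosen training sample.
    weights = {(r, c): list(rng.choice(data))
               for r in range(rows) for c in range(cols)}
    sigma0 = max(rows, cols) / 2.0
    total, step = epochs * len(data), 0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x in order:
            frac = step / total
            lr = lr0 * (1.0 - frac)                      # linearly decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 0.5 * frac   # neighborhood shrinks toward 0.5
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-d2 / (2.0 * sigma * sigma))  # Gaussian neighborhood
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights


def quantization_error(weights, x):
    """Anomaly score: distance from x to its best-matching unit."""
    return _dist(weights[best_matching_unit(weights, x)], x)


def fit_threshold(weights, nominal, quantile=0.99):
    """Pick the alert threshold as a quantile of the nominal scores."""
    errs = sorted(quantization_error(weights, x) for x in nominal)
    return errs[min(len(errs) - 1, int(quantile * len(errs)))]


# Synthetic stand-in for nominal telemetry: two operating modes in 2-D.
rng = random.Random(42)
nominal = [(rng.gauss(c, 0.3), rng.gauss(c, 0.3))
           for c in (0.0, 5.0) for _ in range(100)]
som = train_som(nominal)
threshold = fit_threshold(som, nominal)
sample = (10.0, -10.0)  # far from both nominal modes
print("anomalous" if quantization_error(som, sample) > threshold else "nominal")
```

Because the map is fit only to nominal behavior, this kind of detector does not need labeled fault examples, which is consistent with the stated goal of handling previously unknown anomalies. In a fuller system, the best-matching unit of a flagged sample could also serve as a key for retrieving similar past cases from a case base, although that step is not sketched here.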