Data & Analytics:
anomaly detection in time series

The automated detection of anomalies in data sets makes it possible to respond to changes more quickly and effectively

Individual data points or trends in a data set that deviate from the expected patterns are considered anomalies. They provide valuable insights into potential trends, changing consumer behavior, or possible sources of error in the applications used. The insights obtained in this way are primarily used to make any necessary strategy adjustments or to modify business processes.

Data created at regular intervals

A univariate time series is a series of data points created at regular intervals. Such series are often used to monitor key performance indicators or industrial processes. In the age of the Internet of Things (IoT) and connected real-time data sources, numerous applications produce important data that changes over time. The analysis of such time series provides valuable insights for these applications.

The specific characteristics of time series mean that anomaly detection is often associated with certain challenges:

  • Time series data is not necessarily independent and identically distributed.
  • In a time series, the value of a variable observed at a given time may be influenced by its value in the past.
  • A time series may include trend, cyclical, seasonal, and/or irregular components.
  • Random effects (referred to as noise) also occur in time series and are sometimes difficult to distinguish from anomalies.
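As a quick illustration of these characteristics, a univariate series combining trend, seasonal, and noise components can be generated in a few lines (all parameters here are hypothetical):

```python
import math
import random

random.seed(0)

# Hypothetical synthetic series combining the components named above:
# a linear trend, a seasonal cycle, and random noise.
n = 120                                                    # e.g. ten years of monthly values
trend    = [0.5 * t for t in range(n)]                     # slow upward drift
seasonal = [10 * math.sin(2 * math.pi * t / 12) for t in range(n)]  # yearly cycle
noise    = [random.gauss(0, 2) for _ in range(n)]          # random effects
series   = [trend[t] + seasonal[t] + noise[t] for t in range(n)]
```

Because the noise overlaps with the seasonal swings, a single unusual value in `series` is hard to spot by eye, which is exactly the difficulty described above.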

Anomaly detection to identify deviations

This involves finding patterns that deviate from normal behavior. A distinction is made between three types of anomalies:

  • Point anomalies: A certain value deviates from the observed pattern.
  • Collective anomalies: A series of data points deviates from the rest of the data.
  • Contextual anomalies: One or more data points are anomalous in a specific context, but not otherwise.

Depending on whether the available data is labeled (each data point is marked as normal or abnormal) or unlabeled, there are three different methods of implementing anomaly detection. These can be applied either in the time domain or in another domain (e.g. the frequency domain).

  • Supervised learning: On the basis of labeled input data, the system aims to find a hypothesis that can predict as accurately as possible whether new data points represent an anomaly or not.
  • Semi-supervised learning: The system learns what is normal based on input data that describes normal behavior only; deviations from this learned behavior are flagged as abnormal.
  • Unsupervised learning: This method uses unlabeled data and assumes that normal instances are the most common patterns. Consequently, anomalies are defined as data points or series that deviate from these patterns.
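The semi-supervised case can be sketched in a few lines: a model is fitted on normal data only, and new observations are flagged when they deviate strongly. The 3-sigma rule and the sample values below are illustrative assumptions, not from the source:

```python
import statistics

# Training data contains normal behavior only (semi-supervised setting).
# Sample values are hypothetical.
normal_data = [10.1, 9.8, 10.3, 9.9, 10.0, 10.2, 9.7, 10.1]
mu = statistics.mean(normal_data)
sigma = statistics.stdev(normal_data)

def is_anomaly(x, k=3.0):
    # Flag observations that deviate more than k standard deviations
    # from the learned normal behavior (assumed 3-sigma rule).
    return abs(x - mu) > k * sigma
```

Here `is_anomaly(10.0)` stays within the learned range while a value such as `is_anomaly(15.0)` is flagged.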

Pros and cons of the most common analysis methods

In the case of time series, anomaly detection has to take into account that the data form an ordered sequence. Typical examples of anomalies in time series from a business perspective include unexpected increases and decreases, changes in trends, and shifts in level.

Statistics-based: One of the many methods involves decomposing the time series into trend, seasonal, and remainder components and then applying the mean absolute deviation to the remainder to ensure reliable anomaly detection. Another method is based on Robust Principal Component Analysis (RPCA), which separates the data into a low-rank representation, noise, and anomalies through repeated singular value decomposition (SVD), applying thresholds to the singular values and errors in each iteration.
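A minimal sketch of the decomposition idea, using a centered moving average as a simplified stand-in for a full trend/seasonal decomposition (the window size and threshold factor k are assumptions):

```python
import statistics

def detect_by_mad(series, window=5, k=3.0):
    # Remove the local trend with a centered moving average, leaving
    # a remainder (residual) for each interior point.
    half = window // 2
    residuals = {}
    for t in range(half, len(series) - half):
        local_mean = sum(series[t - half:t + half + 1]) / window
        residuals[t] = series[t] - local_mean
    # Flag points whose residual exceeds k times the mean absolute
    # deviation of all residuals.
    mad = statistics.mean(abs(r) for r in residuals.values())
    return [t for t, r in residuals.items() if abs(r) > k * mad]
```

On a flat series with a single spike, only the spike index is returned; neighboring points whose windows contain the spike stay below the threshold.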

Prediction-based: This includes methods such as the moving average, autoregressive moving average models (ARMA) and their extensions (ARIMA), exponential smoothing, Kalman filters, etc. These are used to build a prediction model for the signal. Anomaly detection is then carried out by comparing the predicted and original signal on the basis of statistical tests.
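A minimal prediction-based sketch using simple exponential smoothing as the forecast model; anomalies are points where the predicted and original signal diverge strongly (the smoothing factor alpha and the residual threshold are assumed values):

```python
def detect_by_forecast(series, alpha=0.5, threshold=3.0):
    anomalies = []
    forecast = series[0]
    for t in range(1, len(series)):
        # Compare the original signal with the one-step-ahead prediction.
        if abs(series[t] - forecast) > threshold:
            anomalies.append(t)
        # Exponential smoothing update of the forecast.
        forecast = alpha * series[t] + (1 - alpha) * forecast
    return anomalies
```

In practice the fixed threshold would be replaced by a statistical test on the residuals, as the text describes.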

Hidden Markov model-based (HMM): These methods model the system as a Markov model, a finite automaton whose internal states cannot be observed directly. It is assumed that the normal time series is produced by such a hidden process, which assigns probabilities to the observed data sequences. Anomalies are then those observations that are highly improbable under the model.
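As a toy illustration of the underlying Markov idea, the sketch below uses a visible (non-hidden) Markov chain: the series is discretized into states, transition probabilities are estimated from the data, and observations reached via highly improbable transitions are flagged. The state encoding and probability cutoff are assumptions:

```python
from collections import Counter

def to_states(series):
    # Discretize the series into movement states (assumed encoding).
    return ["up" if b > a else "down_or_flat" for a, b in zip(series, series[1:])]

def detect_by_markov(series, cutoff=0.1):
    states = to_states(series)
    # Estimate transition probabilities from observed state pairs.
    pair_counts = Counter(zip(states, states[1:]))
    from_counts = Counter(states[:-1])
    anomalies = []
    for t, (s0, s1) in enumerate(zip(states, states[1:])):
        p = pair_counts[(s0, s1)] / from_counts[s0]
        if p < cutoff:
            anomalies.append(t + 2)  # index of the observation ending the transition
    return anomalies
```

In a strictly alternating series, a repeated value breaks the dominant up/down pattern and is flagged as an improbable transition.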

Decision-based: Current methods include, for example, long short-term memory networks (LSTM), a kind of recurrent neural network. Classification and regression trees (CART) are also used to perform binary classification (normal and abnormal); extreme gradient boosting (XGBoost) is one of the most popular algorithms for training CART ensembles. Both methods can also be applied as prediction-based methods.
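A hypothetical minimal example of the tree-based idea: a decision stump (a depth-1 CART) trained by minimizing Gini impurity for binary classification into normal (0) and abnormal (1). Libraries such as XGBoost build large ensembles of such trees:

```python
def train_stump(values, labels):
    # Find the single split threshold that minimizes weighted Gini
    # impurity; labels are 0 (normal) and 1 (abnormal).
    def gini(group):
        if not group:
            return 0.0
        p = sum(group) / len(group)
        return 2.0 * p * (1.0 - p)

    best_t, best_cost = None, float("inf")
    for t in sorted(set(values)):
        left = [l for v, l in zip(values, labels) if v <= t]
        right = [l for v, l in zip(values, labels) if v > t]
        cost = (len(left) * gini(left) + len(right) * gini(right)) / len(values)
        if cost < best_cost:
            best_t, best_cost = t, cost
    return best_t
```

A new point is then classified as abnormal when its value falls on the anomalous side of the learned threshold; real CART implementations repeat this split recursively over many features.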

Use cases and examples of time series analysis

There are various key performance indicators (KPIs) in e-commerce that are suitable for time series analysis. These include:

  • Transaction value
  • Number of transactions
  • Average order value
  • Active users
  • Page views
  • Return on marketing investment (ROMI)

The diagram shows an example of anomaly detection in a consumer goods retailer’s sales time series. The time series data (marked in blue) includes both trend and seasonal components. Point anomalies are highlighted in red. The increases and decreases recorded in the graph are used for further analysis to determine their causes and controlling factors.