Overview on extracted features
tsfresh calculates a comprehensive number of features. All feature calculators are contained in the submodule:
tsfresh.feature_extraction.feature_calculators |
This module contains the feature calculators that take time series as input and calculate the values of the feature. |
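As a rough illustration of what "take time series as input and calculate the values of the feature" means, the sketch below re-implements two of the simplest calculators in plain Python, working only from their one-line descriptions in the list on this page. The `_sketch` function names are hypothetical and this is not tsfresh's own code; the real functions live in `tsfresh.feature_extraction.feature_calculators`.

```python
from typing import Sequence


def abs_energy_sketch(x: Sequence[float]) -> float:
    """Sum over the squared values of the series (cf. abs_energy)."""
    return float(sum(v * v for v in x))


def absolute_sum_of_changes_sketch(x: Sequence[float]) -> float:
    """Sum of absolute consecutive changes (cf. absolute_sum_of_changes)."""
    # Pair each value with its successor and add up the absolute differences.
    return float(sum(abs(b - a) for a, b in zip(x, x[1:])))


series = [1.0, 3.0, 2.0, 5.0]
energy = abs_energy_sketch(series)                # 1 + 9 + 4 + 25
changes = absolute_sum_of_changes_sketch(series)  # |2| + |-1| + |3|
print(energy, changes)
```

Each function maps a whole series to a single number, which is the general shape of the entries listed on this page.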
The following list contains all the feature calculations supported in the current version of tsfresh:
abs_energy (x) |
Returns the absolute energy of the time series which is the sum over the squared values |
absolute_maximum (x) |
Calculates the highest absolute value of the time series x. |
absolute_sum_of_changes (x) |
Returns the sum over the absolute value of consecutive changes in the series x |
agg_autocorrelation (x, param) |
Descriptive statistics on the autocorrelation of the time series. |
agg_linear_trend (x, param) |
Calculates a linear least-squares regression for values of the time series that were aggregated over chunks versus the sequence from 0 up to the number of chunks minus one. |
approximate_entropy (x, m, r) |
Implements a vectorized Approximate entropy algorithm. |
ar_coefficient (x, param) |
This feature calculator fits the unconditional maximum likelihood of an autoregressive AR(k) process. |
augmented_dickey_fuller (x, param) |
Does the time series have a unit root? |
autocorrelation (x, lag) |
Calculates the autocorrelation of the specified lag, according to the formula [1] |
benford_correlation (x) |
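Returns the correlation between the first-digit distribution of x and the Newcomb-Benford's Law distribution; useful for anomaly detection applications [1][2]. |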
binned_entropy (x, max_bins) |
First bins the values of x into max_bins equidistant bins. |
c3 (x, lag) |
Uses c3 statistics to measure nonlinearity in the time series |
change_quantiles (x, ql, qh, isabs, f_agg) |
First fixes a corridor given by the quantiles ql and qh of the distribution of x. |
cid_ce (x, normalize) |
This feature calculator is an estimate for the complexity of a time series [1] (a more complex time series has more peaks, valleys etc.). |
count_above (x, t) |
Returns the percentage of values in x that are higher than t |
count_above_mean (x) |
Returns the number of values in x that are higher than the mean of x |
count_below (x, t) |
Returns the percentage of values in x that are lower than t |
count_below_mean (x) |
Returns the number of values in x that are lower than the mean of x |
cwt_coefficients (x, param) |
Calculates a continuous wavelet transform for the Ricker wavelet, also known as the “Mexican hat wavelet”. |
energy_ratio_by_chunks (x, param) |
Calculates the sum of squares of chunk i out of N chunks expressed as a ratio with the sum of squares over the whole series. |
fft_aggregated (x, param) |
Returns the spectral centroid (mean), variance, skew, and kurtosis of the absolute fourier transform spectrum. |
fft_coefficient (x, param) |
Calculates the Fourier coefficients of the one-dimensional discrete Fourier transform for real input using the fast Fourier transform (FFT) algorithm. |
first_location_of_maximum (x) |
Returns the first location of the maximum value of x. |
first_location_of_minimum (x) |
Returns the first location of the minimal value of x. |
fourier_entropy (x, bins) |
Calculate the binned entropy of the power spectral density of the time series (using the welch method). |
friedrich_coefficients (x, param) |
Coefficients of the polynomial h(x), which has been fitted to the deterministic dynamics of a Langevin model. |
has_duplicate (x) |
Checks if any value in x occurs more than once |
has_duplicate_max (x) |
Checks if the maximum value of x is observed more than once |
has_duplicate_min (x) |
Checks if the minimal value of x is observed more than once |
index_mass_quantile (x, param) |
Calculates the relative index i of time series x where q% of the mass of x lies left of i. |
kurtosis (x) |
Returns the kurtosis of x (calculated with the adjusted Fisher-Pearson standardized moment coefficient G2). |
large_standard_deviation (x, r) |
Does time series have large standard deviation? |
last_location_of_maximum (x) |
Returns the relative last location of the maximum value of x. |
last_location_of_minimum (x) |
Returns the last location of the minimal value of x. |
lempel_ziv_complexity (x, bins) |
Calculate a complexity estimate based on the Lempel-Ziv compression algorithm. |
length (x) |
Returns the length of x |
linear_trend (x, param) |
Calculate a linear least-squares regression for the values of the time series versus the sequence from 0 to length of the time series minus one. |
linear_trend_timewise (x, param) |
Calculate a linear least-squares regression for the values of the time series versus the sequence from 0 to length of the time series minus one, using the index of the time series (which must be of a datetime dtype) to fit the model. |
longest_strike_above_mean (x) |
Returns the length of the longest consecutive subsequence in x that is bigger than the mean of x |
longest_strike_below_mean (x) |
Returns the length of the longest consecutive subsequence in x that is smaller than the mean of x |
matrix_profile (x, param) |
Calculates the 1-D Matrix Profile[1] and returns Tukey’s Five Number Set plus the mean of that Matrix Profile. |
max_langevin_fixed_point (x, r, m) |
Largest fixed point of the dynamics, argmax_x {h(x) = 0}, estimated from the polynomial h(x). |
maximum (x) |
Calculates the highest value of the time series x. |
mean (x) |
Returns the mean of x |
mean_abs_change (x) |
Average over the absolute values of first differences. |
mean_change (x) |
Average over time series differences. |
mean_n_absolute_max (x, number_of_maxima) |
Calculates the arithmetic mean of the n absolute maximum values of the time series. |
mean_second_derivative_central (x) |
Returns the mean value of a central approximation of the second derivative |
median (x) |
Returns the median of x |
minimum (x) |
Calculates the lowest value of the time series x. |
number_crossing_m (x, m) |
Calculates the number of crossings of x on m. |
number_cwt_peaks (x, n) |
Number of different peaks in x. |
number_peaks (x, n) |
Calculates the number of peaks of at least support n in the time series x. |
partial_autocorrelation (x, param) |
Calculates the value of the partial autocorrelation function at the given lag. |
percentage_of_reoccurring_datapoints_to_all_datapoints (x) |
Returns the percentage of non-unique data points. |
percentage_of_reoccurring_values_to_all_values (x) |
Returns the percentage of values that are present in the time series more than once. |
permutation_entropy (x, tau, dimension) |
Calculate the permutation entropy. |
quantile (x, q) |
Calculates the q quantile of x. |
query_similarity_count (x, param) |
This feature calculator accepts an input query subsequence parameter, compares the query (under z-normalized Euclidean distance) to all subsequences within the time series, and returns a count of the number of times the query was found in the time series (within some predefined maximum distance threshold). |
range_count (x, min, max) |
Count observed values within the interval [min, max). |
ratio_beyond_r_sigma (x, r) |
Ratio of values that are more than r * std(x) (so r times sigma) away from the mean of x. |
ratio_value_number_to_time_series_length (x) |
Returns a factor which is 1 if all values in the time series occur only once, and below one if this is not the case. |
root_mean_square (x) |
Returns the root mean square (rms) of the time series. |
sample_entropy (x) |
Calculate and return sample entropy of x. |
set_property (key, value) |
This method returns a decorator that sets the property key of the function to value |
skewness (x) |
Returns the sample skewness of x (calculated with the adjusted Fisher-Pearson standardized moment coefficient G1). |
spkt_welch_density (x, param) |
This feature calculator estimates the cross power spectral density of the time series x at different frequencies. |
standard_deviation (x) |
Returns the standard deviation of x |
sum_of_reoccurring_data_points (x) |
Returns the sum of all data points that are present in the time series more than once. |
sum_of_reoccurring_values (x) |
Returns the sum of all values that are present in the time series more than once. |
sum_values (x) |
Calculates the sum over the time series values |
symmetry_looking (x, param) |
Boolean variable denoting if the distribution of x looks symmetric. |
time_reversal_asymmetry_statistic (x, lag) |
Returns the time reversal asymmetry statistic. |
value_count (x, value) |
Count occurrences of value in time series x. |
variance (x) |
Returns the variance of x |
variance_larger_than_standard_deviation (x) |
Is variance higher than the standard deviation? |
variation_coefficient (x) |
Returns the variation coefficient (standard deviation / mean, which gives the relative variation around the mean) of x. |
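Several entries in the list above are defined relative to the mean or the standard deviation of x. The sketch below spells out three of them (`count_above_mean`, `longest_strike_above_mean` and `ratio_beyond_r_sigma`) in plain Python, following only the one-line descriptions. The `_sketch` names are hypothetical, and the use of the population standard deviation is an assumption of this example rather than something the list states.

```python
from statistics import mean, pstdev
from typing import Sequence


def count_above_mean_sketch(x: Sequence[float]) -> int:
    """Number of values in x that are higher than the mean of x."""
    m = mean(x)
    return sum(1 for v in x if v > m)


def longest_strike_above_mean_sketch(x: Sequence[float]) -> int:
    """Length of the longest consecutive run of values above the mean of x."""
    m = mean(x)
    longest = current = 0
    for v in x:
        # Extend the current run while above the mean, otherwise reset it.
        current = current + 1 if v > m else 0
        longest = max(longest, current)
    return longest


def ratio_beyond_r_sigma_sketch(x: Sequence[float], r: float) -> float:
    """Share of values more than r * std(x) away from the mean of x."""
    m = mean(x)
    sigma = pstdev(x)  # assumption: population standard deviation
    return sum(1 for v in x if abs(v - m) > r * sigma) / len(x)


series = [1.0, 2.0, 6.0, 7.0, 8.0, 0.0]  # mean is 4.0
n_above = count_above_mean_sketch(series)
strike = longest_strike_above_mean_sketch(series)
ratio = ratio_beyond_r_sigma_sketch(series, r=1.0)
print(n_above, strike, ratio)
```

For this series the values 6, 7 and 8 lie above the mean and are adjacent, so the count and the longest strike coincide; only 8 and 0 lie more than one standard deviation from the mean.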