Feature Engineering
This section covers the feature extraction techniques that the VitalDSP library provides for physiological signals such as ECG and PPG. The methods derive features that describe morphological, autonomic, and synchronization characteristics of the signals, together with their clinical interpretation and applications.
Overview
The feature engineering module provides specialized tools for extracting clinically relevant features from physiological signals. These features are designed to provide insights into cardiovascular health, autonomic nervous system function, and disease progression.
- Key Capabilities
Morphological Features: Waveform shape and structure analysis with clinical significance
Autonomic Features: Heart rate variability and autonomic nervous system indicators
Synchronization Features: Multi-signal correlation and timing analysis
Light Source Features: PPG-specific features for different light wavelengths
Volume Features: Blood volume and pressure-related characteristics
Clinical Interpretation: Built-in clinical significance assessment and health indicators
- Clinical Applications
Cardiovascular Health Assessment: Evaluation of heart function and vascular compliance
Stress and Infection Detection: Early identification of physiological stress and infection severity
Disease Progression Monitoring: Tracking of chronic conditions and treatment response
Sleep and Respiratory Health: Analysis of breathing patterns and sleep quality
Mental Health Assessment: Evaluation of stress, anxiety, and cognitive load
- Feature Categories
Time-Domain Features: Statistical and temporal characteristics
Frequency-Domain Features: Spectral and power characteristics
Nonlinear Features: Complexity and entropy measures
Morphological Features: Waveform shape and structure
Cross-Signal Features: Multi-signal relationships and synchronization
Clinical Feature Interpretation
The following sections provide detailed clinical interpretation guidelines for extracted features, based on extensive research and clinical validation. These guidelines help healthcare professionals understand the clinical significance of extracted features.
ECG Morphological Features
ECG morphological features provide crucial information about cardiac health and disease progression:
- P-Wave Features
- P-Wave Duration:
Normal Range: 80-110 ms
Clinical Significance: Prolonged duration suggests atrial dilation, often associated with heart failure or infections affecting the heart
Changes in amplitude may indicate pericarditis (inflammation of the pericardium)
- PR Interval Features
- PR Interval Duration:
Normal Range: 120-200 ms
Clinical Significance: Prolonged PR interval may suggest electrolyte imbalances or autonomic dysfunction, often seen in sepsis
Correlates with atrioventricular (AV) nodal conduction
- QRS Complex Features
- QRS Duration:
Normal Range: 80-120 ms
Clinical Significance: Widened QRS complexes suggest conduction delays, often caused by myocardial ischemia, bundle branch blocks, or ventricular hypertrophy
Correlates with ventricular conduction and conditions like bundle branch blocks
- ST Segment Features
- ST Segment Duration:
Normal Range: 80-120 ms
Clinical Significance: ST elevation can indicate myocarditis, pericarditis, or acute myocardial infarction
ST depression suggests ischemia, which can occur during sepsis, shock, or cardiac complications
- QT Interval Features
- QT Interval Duration:
Normal Range: 350-450 ms (corrected for heart rate)
Clinical Significance: Prolonged QT interval indicates risk of life-threatening arrhythmias such as torsades de pointes
Can be triggered by electrolyte imbalances, medications, or infection-induced stress
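The "corrected for heart rate" qualifier on the QT range implies a rate-correction step. A minimal sketch using Bazett's formula (QTc = QT / sqrt(RR), both in seconds) is shown here; it is one common convention and not necessarily the correction VitalDSP applies, and `bazett_qtc` is a hypothetical helper name.

```python
import math

def bazett_qtc(qt_s: float, rr_s: float) -> float:
    """Rate-corrected QT interval in seconds (Bazett): QTc = QT / sqrt(RR)."""
    if rr_s <= 0:
        raise ValueError("RR interval must be positive")
    return qt_s / math.sqrt(rr_s)

# At 60 bpm (RR = 1 s) the corrected and measured QT coincide.
qtc_rest = bazett_qtc(0.40, 1.0)
# At a faster rate (RR = 0.64 s) a 360 ms QT corrects to a longer value.
qtc_fast = bazett_qtc(0.36, 0.64)
# Compare against the 350-450 ms range quoted above (values converted to ms).
in_range = 350.0 <= qtc_fast * 1000.0 <= 450.0
```

Note that the extractor methods documented later return durations in seconds, while the ranges above are quoted in milliseconds.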
PPG Morphological Features
PPG morphological features provide insights into cardiovascular and respiratory health:
- Systolic and Diastolic Features
- Systolic Duration:
Normal Ratio: 0.6-0.8 (Systolic:Diastolic)
Clinical Significance: Longer systolic durations indicate reduced arterial compliance
Alterations may reflect arterial stiffness, hypertension, or atherosclerosis
- Amplitude Features
- Systolic Amplitude:
Clinical Significance: Decrease in systolic amplitude suggests poor perfusion
Patients with systemic infections (sepsis) may show reduced systolic amplitude due to decreased cardiac output
- Pulse Wave Features
- Pulse Wave Transit Time (PWTT):
Normal Range: 100-300 ms
Clinical Significance: Shorter PWTT indicates increased arterial stiffness
Related to hypertension, atherosclerosis, or cardiovascular stress aggravated by infection
- Respiratory Features
- Respiratory Sinus Arrhythmia (RSA):
Normal Range: 5-20% variation during respiration
Clinical Significance: Reduced RSA indicates poor autonomic control, often associated with stress or chronic disease
Patients with respiratory infections may exhibit reduced RSA
Heart Rate Variability Features
HRV features provide insights into autonomic nervous system function and health status:
- Time-Domain HRV Features
- SDNN (Standard Deviation of NN Intervals):
Normal Range: 20-50 ms (1 min), 50-150 ms (5 min)
Clinical Significance: Decreasing SDNN indicates reduced HRV, reflecting stress, infection, or autonomic dysfunction
Low SDNN is associated with increased mortality in sepsis, cardiac dysfunction, and ARDS
- RMSSD (Root Mean Square of Successive Differences):
Normal Range: 20-50 ms (1 min), 30-60 ms (5 min)
Clinical Significance: Lower RMSSD suggests parasympathetic withdrawal, common in infections and sepsis
Indicates parasympathetic dysfunction and increased sympathetic dominance
- pNN50 (Proportion of NN Intervals differing by more than 50 ms):
Normal Range: 10-40% (1 min), 15-45% (5 min)
Clinical Significance: Decrease indicates early autonomic nervous system imbalance
Common in chronic diseases or infections
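The three time-domain measures above follow directly from their definitions. The sketch below computes them with NumPy from a list of NN intervals in milliseconds; `time_domain_hrv` is a hypothetical helper for illustration, not a VitalDSP function.

```python
import numpy as np

def time_domain_hrv(nn_ms):
    """Return SDNN (ms), RMSSD (ms) and pNN50 (%) for NN intervals in ms."""
    nn = np.asarray(nn_ms, dtype=float)
    diffs = np.diff(nn)  # successive differences between NN intervals
    return {
        "sdnn": float(np.std(nn, ddof=1)),
        "rmssd": float(np.sqrt(np.mean(diffs ** 2))),
        "pnn50": float(100.0 * np.mean(np.abs(diffs) > 50.0)),
    }

nn_intervals = [800, 810, 790, 860, 800, 805]
hrv = time_domain_hrv(nn_intervals)
print(hrv)
```

In this toy series two of the five successive differences exceed 50 ms, so pNN50 is 40%.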
- Frequency-Domain HRV Features
- LF Power (Low Frequency):
Normal Range: 300-1200 ms²
Clinical Significance: Increased LF power can indicate elevated stress or infection levels
In sepsis or systemic infections, sympathetic activation may increase LF power
- HF Power (High Frequency):
Normal Range: 200-1000 ms²
Clinical Significance: Reduced HF power suggests stress, fatigue, or infection
In chronic or acute illness, HF power may drop due to reduced parasympathetic influence
- LF/HF Ratio:
Normal Range: 0.5-2.0
Clinical Significance: Higher ratio indicates sympathetic dominance (stress, acute infection)
In infectious diseases or sepsis, higher LF/HF ratio indicates autonomic imbalance
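One way to obtain LF and HF power is to resample the NN series onto an even time grid and integrate a periodogram over the conventional bands (LF 0.04-0.15 Hz, HF 0.15-0.40 Hz). The sketch below assumes those band edges and a 4 Hz resampling rate; `lf_hf_ratio` and `synthetic_nn` are hypothetical helpers, and the estimator VitalDSP uses may differ.

```python
import numpy as np

def lf_hf_ratio(nn_ms, fs_resample=4.0):
    """Return (LF power, HF power, LF/HF) for NN intervals given in ms."""
    nn = np.asarray(nn_ms, dtype=float)
    beat_times = np.cumsum(nn) / 1000.0  # beat times in seconds
    grid = np.arange(beat_times[0], beat_times[-1], 1.0 / fs_resample)
    tachogram = np.interp(grid, beat_times, nn)  # evenly sampled, in ms
    tachogram = tachogram - tachogram.mean()
    window = np.hanning(len(tachogram))
    spectrum = np.abs(np.fft.rfft(tachogram * window)) ** 2
    # One-sided periodogram in ms^2/Hz
    psd = 2.0 * spectrum / (fs_resample * np.sum(window ** 2))
    freqs = np.fft.rfftfreq(len(tachogram), d=1.0 / fs_resample)
    df = freqs[1] - freqs[0]
    lf = psd[(freqs >= 0.04) & (freqs < 0.15)].sum() * df
    hf = psd[(freqs >= 0.15) & (freqs < 0.40)].sum() * df
    return lf, hf, (lf / hf if hf > 0 else float("inf"))

def synthetic_nn(mod_hz, n_beats=300):
    """NN series (ms) whose length oscillates at mod_hz, for demonstration."""
    t, out = 0.0, []
    for _ in range(n_beats):
        nn = 800.0 + 50.0 * np.sin(2 * np.pi * mod_hz * t)
        out.append(nn)
        t += nn / 1000.0
    return out

ratio_lf_dominant = lf_hf_ratio(synthetic_nn(0.10))[2]  # modulation in LF band
ratio_hf_dominant = lf_hf_ratio(synthetic_nn(0.25))[2]  # modulation in HF band
```

A series modulated at 0.10 Hz yields a ratio above 1, and one modulated at 0.25 Hz yields a ratio below 1.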
Morphological Features
Morphology Features
Techniques to extract morphological features from physiological waveforms, including the detection of peaks, troughs, and various segments in ECG and PPG signals with clinical interpretation.
Feature Engineering Module for Physiological Signal Processing
This module provides comprehensive capabilities for physiological signal processing including ECG, PPG, EEG, and other vital signs.
Author: vitalDSP Team Date: 2025-01-27 Version: 1.0.0
Key Features:
- Object-oriented design with comprehensive classes
- Multiple processing methods and functions
- NumPy integration for numerical computations
- SciPy integration for advanced signal processing
- Configurable parameters and settings
Examples:
- Basic usage:
>>> import numpy as np
>>> from vitalDSP.feature_engineering.morphology_features import PhysiologicalFeatureExtractor
>>> signal = np.random.randn(1000)
>>> extractor = PhysiologicalFeatureExtractor(signal, fs=1000)
>>> features = extractor.extract_features(signal_type="ECG")
>>> print(f'Extracted features: {features}')
- class vitalDSP.feature_engineering.morphology_features.PhysiologicalFeatureExtractor(signal, fs=1000)[source]
Bases: object
A class to extract various physiological features from ECG and PPG signals, such as durations, areas, amplitude variability, slope ratios, and dicrotic notch locations.
Features for PPG:
- Systolic/diastolic duration, area, amplitude variability
- Signal skewness, slope, peak trends, and dicrotic notch locations
Features for ECG:
- QRS duration, area, T-wave area
- Amplitude variability, QRS-T ratios, and QRS slope
- Signal skewness, peak trends
- preprocess_signal(preprocess_config)
Preprocess the signal by applying bandpass filtering and noise reduction.
- extract_features(signal_type='ECG', preprocess_config=None)[source]
Extract all features (morphology, volume, amplitude variability, dicrotic notch) for ECG or PPG signals.
Examples
>>> import numpy as np
>>> from vitalDSP.feature_engineering.morphology_features import PhysiologicalFeatureExtractor
>>> from vitalDSP.preprocess.preprocess_operations import PreprocessConfig
>>>
>>> # Example 1: ECG feature extraction
>>> ecg_signal = np.random.randn(1000)  # Simulated ECG signal
>>> extractor = PhysiologicalFeatureExtractor(ecg_signal, fs=256)
>>> ecg_features = extractor.extract_features(signal_type="ECG")
>>> print(f"ECG features extracted: {len(ecg_features)}")
>>> print(f"QRS duration: {ecg_features.get('qrs_duration', 'N/A')}")
>>>
>>> # Example 2: PPG feature extraction with preprocessing
>>> ppg_signal = np.random.randn(2000)  # Simulated PPG signal
>>> extractor_ppg = PhysiologicalFeatureExtractor(ppg_signal, fs=128)
>>> config = PreprocessConfig(
...     filter_type="bandpass",
...     lowcut=0.5,
...     highcut=8.0,
...     noise_reduction_method="wavelet"
... )
>>> ppg_features = extractor_ppg.extract_features(signal_type="PPG", preprocess_config=config)
>>> print(f"PPG features extracted: {len(ppg_features)}")
>>> print(f"Systolic duration: {ppg_features.get('systolic_duration', 'N/A')}")
>>>
>>> # Example 3: EEG feature extraction
>>> eeg_signal = np.random.randn(1500)  # Simulated EEG signal
>>> extractor_eeg = PhysiologicalFeatureExtractor(eeg_signal, fs=512)
>>> eeg_features = extractor_eeg.extract_features(signal_type="EEG")
>>> print(f"EEG features extracted: {len(eeg_features)}")
- compute_amplitude_variability(peaks)[source]
Compute the variability of the amplitudes at the given peak locations.
- Parameters:
peaks (numpy.ndarray) – The set of peaks.
- Returns:
variability – The amplitude variability (standard deviation of the peak amplitudes).
- Return type:
float
Examples
>>> signal = np.random.randn(1000)
>>> peaks = np.array([100, 200, 300])
>>> extractor = PhysiologicalFeatureExtractor(signal)
>>> variability = extractor.compute_amplitude_variability(peaks)
>>> print(variability)
- compute_peak_trend(peaks)[source]
Compute the trend slope of peak amplitudes over time.
- Parameters:
peaks (numpy.ndarray) – The set of peaks.
- Returns:
trend_slope – The slope of the peak amplitude trend over time.
- Return type:
float
Examples
>>> signal = np.random.randn(1000)
>>> peaks = np.array([100, 200, 300])
>>> extractor = PhysiologicalFeatureExtractor(signal)
>>> trend_slope = extractor.compute_peak_trend(peaks)
>>> print(trend_slope)
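The two peak-based measures described above (standard deviation of peak amplitudes and slope of the peak amplitude trend) can be sketched in a few lines of NumPy. This is an illustration under the assumption that the trend is a least-squares line fitted against the peak sample index; the function names are hypothetical and the library's internals may differ.

```python
import numpy as np

def amplitude_variability(signal, peaks):
    """Standard deviation of the signal amplitude at the peak indices."""
    return float(np.std(np.asarray(signal, dtype=float)[peaks]))

def peak_trend_slope(signal, peaks):
    """Least-squares slope of peak amplitude versus sample index."""
    amplitudes = np.asarray(signal, dtype=float)[peaks]
    slope, _intercept = np.polyfit(peaks, amplitudes, 1)
    return float(slope)

# A steadily rising ramp: amplitude grows by 1/999 per sample.
signal = np.linspace(0.0, 1.0, 1000)
peaks = np.array([100, 200, 300])
variability = amplitude_variability(signal, peaks)
slope = peak_trend_slope(signal, peaks)
```

Dividing the slope by the sampling period would express the trend per second instead of per sample.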
- detect_troughs(peaks)[source]
Detect troughs (valleys) in the signal based on the given peaks.
- Parameters:
peaks (numpy.ndarray) – The indices of detected peaks.
- Returns:
troughs – The indices of detected troughs.
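- Return type:
numpy.ndarray
Examples
>>> signal = np.random.randn(1000)
>>> peaks = np.array([100, 200, 300])
>>> extractor = PhysiologicalFeatureExtractor(signal)
>>> troughs = extractor.detect_troughs(peaks)
>>> print(troughs)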
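A simple way to derive troughs from known peaks is to take the minimum of the signal between each pair of consecutive peaks. The sketch below illustrates that idea only; `troughs_between_peaks` is a hypothetical helper and may not match how `detect_troughs` is implemented.

```python
import numpy as np

def troughs_between_peaks(signal, peaks):
    """Index of the lowest sample between each pair of consecutive peaks."""
    signal = np.asarray(signal, dtype=float)
    troughs = [
        start + int(np.argmin(signal[start:stop]))
        for start, stop in zip(peaks[:-1], peaks[1:])
    ]
    return np.array(troughs, dtype=int)

# Cosine with a 100-sample period: peaks at 0, 100, 200, ... and valleys midway.
t = np.arange(1000)
signal = np.cos(2 * np.pi * t / 100)
peaks = np.array([100, 200, 300])
troughs = troughs_between_peaks(signal, peaks)
print(troughs)
```

With N peaks this approach returns N - 1 troughs, one per inter-peak interval.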
- extract_features(signal_type='ECG', preprocess_config=None, peak_config=None, options=None)[source]
Extract all physiological features from the signal for either ECG or PPG.
- Parameters:
signal_type (str, optional) – The type of signal. Options: ‘ECG’, ‘PPG’, ‘EEG’. Default is “ECG”.
preprocess_config (PreprocessConfig, optional) – The configuration object for signal preprocessing. If None, default settings are used.
peak_config (dict, optional) – Configuration for peak detection parameters. If None, default settings are used.
- Returns:
features – A dictionary containing the extracted features, such as durations, areas, amplitude variability, slopes, skewness, peak trends, and dicrotic notch locations.
- Return type:
dict
Examples
>>> from vitalDSP.preprocess.preprocess_operations import PreprocessConfig
>>> signal = np.random.randn(1000)
>>> preprocess_config = PreprocessConfig()
>>> extractor = PhysiologicalFeatureExtractor(signal, fs=1000)
>>> features = extractor.extract_features(signal_type="ECG", preprocess_config=preprocess_config)
>>> print(features)
- get_preprocess_signal(preprocess_config)[source]
Preprocess the signal by applying bandpass filtering and noise reduction.
- Parameters:
preprocess_config (PreprocessConfig) – Configuration for both signal filtering and artifact removal.
- Returns:
clean_signal – The preprocessed signal, cleaned of noise and artifacts.
- Return type:
numpy.ndarray
Examples
>>> signal = np.sin(np.linspace(0, 10, 1000)) + np.random.normal(0, 0.1, 1000)
>>> preprocess_config = PreprocessConfig()
>>> extractor = PhysiologicalFeatureExtractor(signal, fs=1000)
>>> preprocessed_signal = extractor.get_preprocess_signal(preprocess_config)
>>> print(preprocessed_signal)
Autonomic Features
ECG Autonomic Features
Extract autonomic nervous system indicators from ECG signals, including heart rate variability and autonomic balance measures.
ECG Autonomic Features Module for Physiological Signal Processing
This module provides comprehensive ECG feature extraction capabilities focusing on autonomic nervous system analysis. It implements advanced algorithms for detecting ECG waveform components, computing intervals, and identifying arrhythmias for cardiovascular health assessment.
Key Features:
- P-wave analysis (duration, amplitude)
- PR Interval computation (P-wave to QRS onset)
- QRS Complex analysis (width, amplitude)
- ST Segment analysis (elevation, depression)
- QT Interval computation (QRS onset to T-wave end)
- Arrhythmia detection (AFib, VTach, Bradycardia)
- Waveform morphology analysis
- Comprehensive ECG feature extraction
Examples:
- Basic ECG feature extraction:
>>> import numpy as np
>>> from vitalDSP.feature_engineering.ecg_autonomic_features import ECGExtractor
>>> ecg_signal = np.random.rand(1000)  # Replace with actual ECG signal
>>> fs = 250  # Sampling frequency in Hz
>>> extractor = ECGExtractor(ecg_signal, fs)
>>> p_wave_duration = extractor.compute_p_wave_duration()
>>> pr_interval = extractor.compute_pr_interval()
>>> qrs_duration = extractor.compute_qrs_duration()
>>> print(f"P-wave Duration: {p_wave_duration}, PR Interval: {pr_interval}")
- Advanced ECG analysis:
>>> qt_interval = extractor.compute_qt_interval()
>>> st_interval = extractor.compute_st_interval()
>>> arrhythmias = extractor.detect_arrhythmias()
>>> print(f"QT Interval: {qt_interval}, ST Interval: {st_interval}")
>>> print(f"Arrhythmias detected: {arrhythmias}")
- Comprehensive feature extraction:
>>> all_features = extractor.extract_all_features()
>>> print(f"Extracted {len(all_features)} ECG features")
- class vitalDSP.feature_engineering.ecg_autonomic_features.ECGExtractor(ecg_signal, sampling_frequency)[source]
Bases: object
A class to extract ECG features including:
- P-wave analysis (duration, amplitude)
- PR Interval (P-wave to QRS onset)
- QRS Complex (width, amplitude)
- ST Segment (elevation, depression)
- QT Interval (QRS onset to T-wave end)
- Detection of Arrhythmias (AFib, VTach, Bradycardia)
Example usage:
ecg_signal = np.random.rand(1000)  # Replace with actual ECG signal
fs = 250  # Sampling frequency in Hz
extractor = ECGExtractor(ecg_signal, fs)
p_wave_duration = extractor.compute_p_wave_duration()
pr_interval = extractor.compute_pr_interval()
qrs_duration = extractor.compute_qrs_duration()
qt_interval = extractor.compute_qt_interval()
st_interval = extractor.compute_st_interval()
arrhythmias = extractor.detect_arrhythmias()
print(f"P-wave Duration: {p_wave_duration}, PR Interval: {pr_interval}, QRS Duration: {qrs_duration}")
- compute_p_wave_duration(r_peaks=None)[source]
Computes the P-wave duration by finding the onset and offset around each detected P-peak.
- Returns:
Mean duration of P-waves in seconds.
- Return type:
float
- compute_pr_interval(r_peaks=None)[source]
Computes the PR interval from P-wave onset to QRS onset (Q-valley).
- Returns:
Mean PR interval in seconds.
- Return type:
float
- compute_qrs_duration(r_peaks=None)[source]
Computes the QRS duration using WaveformMorphology.
- Returns:
The mean duration of QRS complexes in seconds.
- Return type:
float
- compute_qt_interval()[source]
Computes the QT interval (from QRS onset to T-wave end).
- Returns:
QT interval in seconds.
- Return type:
float
- compute_s_wave(r_peaks=None)[source]
Detects the S-wave based on the R-peaks using WaveformMorphology.
- Returns:
Indices of detected S-wave points.
- Return type:
np.array
- compute_st_interval()[source]
Computes the ST segment duration (from S-wave to T-wave peak).
- Returns:
Mean ST segment duration in seconds.
- Return type:
float
- detect_arrhythmias(r_peaks=None)[source]
Detects basic arrhythmias such as: - Atrial Fibrillation (AFib) - Ventricular Tachycardia (VTach) - Bradycardia (slow heart rate)
- Returns:
Dictionary containing the detected arrhythmias.
- Return type:
dict
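Rate-based screening of this kind can be illustrated from R-peak indices alone. The sketch below flags bradycardia (mean rate below 60 bpm), tachycardia (above 100 bpm), and an irregular rhythm when the coefficient of variation of the RR intervals exceeds a threshold. The 0.15 irregularity threshold and the `screen_rhythm` helper are assumptions made for illustration; this is not the library's detection logic and is not a diagnostic rule.

```python
import numpy as np

def screen_rhythm(r_peaks, fs, irregularity_cv=0.15):
    """Flag simple rate and regularity findings from R-peak sample indices."""
    rr_s = np.diff(np.asarray(r_peaks)) / fs  # RR intervals in seconds
    heart_rate = 60.0 / rr_s.mean()           # mean heart rate in bpm
    cv = rr_s.std() / rr_s.mean()             # relative RR variability
    return {
        "heart_rate_bpm": float(heart_rate),
        "bradycardia": bool(heart_rate < 60.0),
        "tachycardia": bool(heart_rate > 100.0),
        "irregular_rhythm": bool(cv > irregularity_cv),  # assumed threshold
    }

# Perfectly regular beats every 300 samples at 250 Hz: RR = 1.2 s, 50 bpm.
regular_slow = screen_rhythm(np.arange(0, 3000, 300), fs=250)
print(regular_slow)
```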
- detect_r_peaks()[source]
Detects R-peaks from the ECG signal using WaveformMorphology.
- Returns:
Array of indices where R-peaks are detected.
- Return type:
np.array
PPG Autonomic Features
Extract autonomic nervous system indicators from PPG signals, focusing on pulse rate variability and vascular tone measures.
Examples:
- Basic usage:
>>> import numpy as np
>>> from vitalDSP.feature_engineering.ppg_autonomic_features import PPGAutonomicFeatures
>>> signal = np.random.randn(1000)
>>> features = PPGAutonomicFeatures(signal, 100)
>>> rrv = features.compute_rrv()
>>> print(f'RRV: {rrv}')
- class vitalDSP.feature_engineering.ppg_autonomic_features.PPGAutonomicFeatures(ppg_signal, sampling_frequency)[source]
Bases: object
A class to compute respiratory and autonomic features from PPG signals.
Features included:
- Respiratory Rate Variability (RRV)
- Respiratory Sinus Arrhythmia (RSA)
- Autonomic Nervous System Balance (Fractal Dimension, DFA)
Example usage:
import numpy as np
from vitalDSP.feature_engineering.ppg_autonomic_features import PPGAutonomicFeatures
# Simulated PPG signal data
ppg_signal = np.random.rand(1000)  # Replace with actual PPG signal
fs = 100  # Sampling frequency in Hz
features = PPGAutonomicFeatures(ppg_signal, fs)
rrv = features.compute_rrv()
rsa = features.compute_rsa()
fractal = features.compute_fractal_dimension()
dfa_value = features.compute_dfa()
print(f"RRV: {rrv}, RSA: {rsa}, Fractal Dimension: {fractal}, DFA: {dfa_value}")
- compute_dfa(min_scale=4, max_scale=None, num_scales=20)[source]
Computes the Detrended Fluctuation Analysis (DFA) of the PPG signal using proper multi-scale analysis.
DFA measures the fractal scaling properties by computing the fluctuation function F(n) at multiple window sizes n, then fitting log(F) vs log(n).
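The procedure described above (integrate the signal, detrend it in windows of size n, compute F(n), then fit log F against log n) can be sketched as follows. This is a first-order DFA written for illustration; it reuses the parameter names from the signature above, but the default for `max_scale` (a quarter of the signal length) is an assumption and the library's implementation may differ in detail.

```python
import numpy as np

def dfa_alpha(signal, min_scale=4, max_scale=None, num_scales=20):
    """Scaling exponent alpha from first-order detrended fluctuation analysis."""
    x = np.asarray(signal, dtype=float)
    profile = np.cumsum(x - x.mean())  # integrated, mean-removed signal
    n = len(profile)
    max_scale = max_scale or n // 4    # assumed default upper scale
    scales = np.unique(
        np.logspace(np.log10(min_scale), np.log10(max_scale), num_scales).astype(int)
    )
    fluctuations = []
    for s in scales:
        n_win = n // s
        windows = profile[: n_win * s].reshape(n_win, s).T  # shape (s, n_win)
        idx = np.arange(s)
        coeffs = np.polyfit(idx, windows, 1)                # linear fit per window
        trend = np.outer(idx, coeffs[0]) + coeffs[1]
        fluctuations.append(np.sqrt(np.mean((windows - trend) ** 2)))
    alpha, _ = np.polyfit(np.log(scales), np.log(fluctuations), 1)
    return float(alpha)

rng = np.random.default_rng(0)
noise = rng.standard_normal(4000)
alpha_noise = dfa_alpha(noise)            # uncorrelated noise: alpha near 0.5
alpha_walk = dfa_alpha(np.cumsum(noise))  # random walk: alpha near 1.5
```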
- compute_fractal_dimension(k_max=10)[source]
Computes the fractal dimension of the PPG signal using the Higuchi method.
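For reference, the Higuchi method estimates the fractal dimension as the slope of log L(k) against log(1/k), where L(k) is the mean normalized curve length of the signal subsampled at step k. The sketch below is a straightforward NumPy rendering of that definition; `higuchi_fd` is a hypothetical name and the code is not taken from VitalDSP.

```python
import numpy as np

def higuchi_fd(signal, k_max=10):
    """Higuchi fractal dimension: slope of log L(k) versus log(1/k)."""
    x = np.asarray(signal, dtype=float)
    n = len(x)
    ks = np.arange(1, k_max + 1)
    lengths = []
    for k in ks:
        per_offset = []
        for m in range(k):
            sub = x[m::k]                 # samples m, m+k, m+2k, ...
            n_steps = len(sub) - 1
            if n_steps < 1:
                continue
            norm = (n - 1) / (n_steps * k)  # Higuchi normalization factor
            per_offset.append(np.abs(np.diff(sub)).sum() * norm / k)
        lengths.append(np.mean(per_offset))
    slope, _ = np.polyfit(np.log(1.0 / ks), np.log(lengths), 1)
    return float(slope)

fd_line = higuchi_fd(np.arange(500, dtype=float))  # smooth line: close to 1
fd_noise = higuchi_fd(np.random.default_rng(0).standard_normal(2000))  # close to 2
```

A straight line gives a dimension of 1, while white noise approaches 2, which brackets the range expected for physiological signals.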
- compute_rrv()[source]
Computes Respiratory Rate Variability (RRV) from the PPG signal.
- Returns:
Respiratory rate variability value.
- Return type:
float
Synchronization Features
ECG-PPG Synchronization Features
Analyze the synchronization and timing relationships between ECG and PPG signals for comprehensive cardiovascular assessment.
Backward-compatibility shim. The module has been renamed to ecg_ppg_synchronization_features (fixing the typo).
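A typical ECG-PPG timing feature is the delay from each ECG R-peak to the following PPG pulse. The sketch below pairs each R-peak with the first PPG peak after it and reports the delay in milliseconds. It assumes peak indices are already available, uses the PPG peak rather than the pulse foot as the reference point, and `pulse_transit_times` is a hypothetical helper, not the module's API.

```python
import numpy as np

def pulse_transit_times(r_peaks, ppg_peaks, fs, max_delay_s=0.5):
    """Delay (ms) from each R-peak to the next PPG peak within max_delay_s."""
    r_peaks = np.asarray(r_peaks)
    ppg_peaks = np.sort(np.asarray(ppg_peaks))
    # Index of the first PPG peak strictly after each R-peak
    idx = np.searchsorted(ppg_peaks, r_peaks, side="right")
    valid = idx < len(ppg_peaks)
    delays_s = (ppg_peaks[idx[valid]] - r_peaks[valid]) / fs
    return delays_s[delays_s <= max_delay_s] * 1000.0

r_peaks = np.array([100, 350, 600])
ppg_peaks = np.array([150, 400, 650])
ptt_ms = pulse_transit_times(r_peaks, ppg_peaks, fs=250)
print(ptt_ms)
```

Here every pulse arrives 50 samples after its R-peak, which at 250 Hz is 200 ms.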
Light Source Features
PPG Light Features
Extract features specific to different light wavelengths in PPG signals, useful for multi-wavelength PPG analysis.
Examples:
- Basic usage:
>>> import numpy as np
>>> from vitalDSP.feature_engineering.ppg_light_features import PPGLightFeatureExtractor
>>> ir_signal = np.random.randn(1000)
>>> red_signal = np.random.randn(1000)
>>> extractor = PPGLightFeatureExtractor(ir_signal, red_signal, sampling_freq=100)
>>> spo2, times_spo2 = extractor.calculate_spo2()
>>> print(f'SpO2: {spo2}')
- class vitalDSP.feature_engineering.ppg_light_features.PPGLightFeatureExtractor(ir_signal, red_signal=None, sampling_freq=100)[source]
Bases: object
A class to extract physiological features from PPG signals based on raw data from infrared (IR) and red light sources. This includes SpO2, Perfusion Index (PI), Respiratory Rate (RR), and Photoplethysmogram Ratio (PPR).
Parameters:
- ir_signal : np.array
The infrared light PPG signal.
- red_signal : np.array
The red light PPG signal (optional for features like PI and RR).
- sampling_freq : int
The sampling frequency of the signals in Hz.
Example usage:
ppg_extractor = PPGLightFeatureExtractor(ir_signal, red_signal, sampling_freq)
spo2, times_spo2 = ppg_extractor.calculate_spo2()
pi, times_pi = ppg_extractor.calculate_perfusion_index()
rr, times_rr = ppg_extractor.calculate_respiratory_rate()
ppr, times_ppr = ppg_extractor.calculate_ppr()
- calculate_perfusion_index(window_seconds=1)[source]
Calculate the Perfusion Index (PI) from the infrared (IR) PPG signal.
Parameters:
- window_seconds : int, optional
The window length in seconds to calculate PI (default is 1 second).
Returns:
- pi_values : np.array
Calculated perfusion index values for each window.
- timestamps : np.array
Time (in seconds) for each PI value.
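Perfusion index is conventionally the pulsatile (AC) component of the signal expressed as a percentage of the non-pulsatile (DC) component. The sketch below computes it per window, taking AC as the peak-to-peak amplitude and DC as the window mean; those two choices and the `perfusion_index` helper are assumptions for illustration rather than the library's exact method.

```python
import numpy as np

def perfusion_index(ir_signal, fs, window_seconds=1):
    """Per-window PI (%) = 100 * AC / DC, with window start times in seconds."""
    x = np.asarray(ir_signal, dtype=float)
    win = int(window_seconds * fs)
    n_win = len(x) // win
    frames = x[: n_win * win].reshape(n_win, win)
    ac = frames.max(axis=1) - frames.min(axis=1)  # peak-to-peak amplitude
    dc = frames.mean(axis=1)                      # baseline level
    timestamps = np.arange(n_win) * window_seconds
    return 100.0 * ac / dc, timestamps

fs = 100
t = np.arange(5 * fs) / fs
ir = 1000.0 + 10.0 * np.sin(2 * np.pi * 1.0 * t)  # 1 Hz pulse on a DC level
pi_values, pi_times = perfusion_index(ir, fs)
print(pi_values)
```

A 20-unit peak-to-peak pulse on a baseline of 1000 gives a PI of 2% in every window.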
- calculate_ppr(window_seconds=1)[source]
Calculate the Photoplethysmogram Ratio (PPR) between infrared (IR) and red light PPG signals.
Parameters:
- window_seconds : int, optional
The window length in seconds to calculate PPR (default is 1 second).
Returns:
- ppr_values : np.array
Calculated PPR values for each window.
- timestamps : np.array
Time (in seconds) for each PPR value.
- calculate_respiratory_rate(window_seconds=60)[source]
Calculate the Respiratory Rate (RR) from a PPG signal by isolating the respiratory modulation via bandpass filtering in the respiratory frequency band (0.1-0.5 Hz / 6-30 breaths per minute).
Parameters:
- window_seconds : int, optional
The window length in seconds to calculate RR (default is 60 seconds).
Returns:
- rr_values : np.array
Calculated respiratory rate (in breaths per minute) for each window.
- timestamps : np.array
Time (in seconds) for each RR value.
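To illustrate how the respiratory band isolates breathing, the sketch below estimates the rate for a single window as the strongest spectral peak between 0.1 and 0.5 Hz. Note that this swaps the bandpass filter described above for a simpler FFT peak search; `respiratory_rate_bpm` is a hypothetical helper and is not the library's implementation.

```python
import numpy as np

def respiratory_rate_bpm(ppg, fs, band=(0.1, 0.5)):
    """Breaths per minute from the dominant frequency inside the band."""
    x = np.asarray(ppg, dtype=float)
    x = x - x.mean()
    spectrum = np.abs(np.fft.rfft(x * np.hanning(len(x))))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    peak_hz = freqs[in_band][np.argmax(spectrum[in_band])]
    return 60.0 * float(peak_hz)

fs = 50
t = np.arange(60 * fs) / fs
# Cardiac pulse at 1.2 Hz plus a weaker respiratory modulation at 0.25 Hz
ppg = np.sin(2 * np.pi * 1.2 * t) + 0.3 * np.sin(2 * np.pi * 0.25 * t)
rate = respiratory_rate_bpm(ppg, fs)
print(rate)
```

The 60-second default window matters here: frequency resolution is the reciprocal of the window length, so a 60 s window resolves the rate to about 1 breath per minute.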
- calculate_spo2(window_seconds=1)[source]
Calculate SpO2 based on infrared (IR) and red light PPG signals.
Parameters:
- window_seconds : int, optional
The window length in seconds to calculate SpO2 (default is 1 second).
Returns:
- spo2_values : np.array
Calculated SpO2 values for each window of the signal.
- timestamps : np.array
Time (in seconds) for each SpO2 value.
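Two-wavelength SpO2 estimation is commonly based on the "ratio of ratios" R = (AC_red / DC_red) / (AC_ir / DC_ir), mapped to a saturation value through an empirical calibration. The sketch below uses the widely quoted linear approximation SpO2 = 110 - 25 R. The coefficients are device-specific in practice, so both they and the helper names are assumptions; the source does not state which calibration `calculate_spo2` uses.

```python
import numpy as np

def ratio_of_ratios(ir, red):
    """R = (AC_red / DC_red) / (AC_ir / DC_ir), with AC taken as peak-to-peak."""
    ir = np.asarray(ir, dtype=float)
    red = np.asarray(red, dtype=float)
    ac_ir, dc_ir = ir.max() - ir.min(), ir.mean()
    ac_red, dc_red = red.max() - red.min(), red.mean()
    return (ac_red / dc_red) / (ac_ir / dc_ir)

def spo2_estimate(ir, red, a=110.0, b=25.0):
    """Empirical linear calibration (assumed coefficients), clipped to 0-100%."""
    return float(np.clip(a - b * ratio_of_ratios(ir, red), 0.0, 100.0))

t = np.arange(500) / 100.0
pulse = np.sin(2 * np.pi * 1.0 * t)
ir = 1000.0 + 20.0 * pulse   # stronger pulsatile component in infrared
red = 1000.0 + 10.0 * pulse  # weaker pulsatile component in red
spo2 = spo2_estimate(ir, red)
print(spo2)
```

With these synthetic signals R is 0.5, which the assumed calibration maps to 97.5%.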
Usage Examples
Basic Feature Extraction
from vitalDSP.feature_engineering.morphology_features import PhysiologicalFeatureExtractor
from vitalDSP.feature_engineering.ecg_autonomic_features import ECGExtractor
# Extract morphological features
extractor = PhysiologicalFeatureExtractor(ecg_signal, fs=sampling_rate)
morph_features = extractor.extract_features(signal_type="ECG")
# Extract autonomic features
autonomic = ECGExtractor(ecg_signal, sampling_rate)
pr_interval = autonomic.compute_pr_interval()
arrhythmias = autonomic.detect_arrhythmias()
Multi-Signal Analysis
from vitalDSP.feature_engineering.ecg_ppg_synchronization_features import ECGPPSynchronizationFeatures
# Analyze ECG-PPG synchronization
sync = ECGPPSynchronizationFeatures(ecg_signal, ppg_signal, sampling_rate)
sync_features = sync.extract_synchronization_features()
PPG Light Analysis
from vitalDSP.feature_engineering.ppg_light_features import PPGLightFeatureExtractor
# Analyze multi-wavelength PPG (the infrared signal is the first argument)
light = PPGLightFeatureExtractor(ir_ppg, red_ppg, sampling_rate)
spo2, times_spo2 = light.calculate_spo2()
ppr, times_ppr = light.calculate_ppr()