Audio-visual Human Emotion Recognition Using Hierarchical Approach
Abstract
This paper presents automatic human emotion recognition from audio-visual data. Both hierarchical and flat approaches were implemented to obtain higher classification performance. The hierarchical approach was based on Mahalanobis distance. The Interactive Emotional Dyadic Motion Capture (IEMOCAP) database was acquired, and six emotions were used for the analysis: anger, excitement, frustration, sadness, happiness, and the neutral state. The method consisted of feature extraction, normalization, and several feature selection and classification techniques. For the flat approach, the best accuracy of 95.60% was obtained with a Support Vector Machine (SVM) classifier and Info Gain feature selection. For the hierarchical approach, the best accuracy of 97.53% was achieved with a Random Forest classifier and Correlation-based Feature Selection (CFS).
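The abstract does not state how Mahalanobis distance enters the hierarchy, so the following is only an illustrative sketch under one plausible assumption: that the distance between class centroids measures how separable two emotions are, so that close classes can be grouped at an upper level and split at a lower one. The function names, the pooled-covariance estimate, and the synthetic two-dimensional features are all hypothetical and are not taken from the paper; only the emotion labels are borrowed from it.

```python
import numpy as np

def mahalanobis_between_classes(X_a, X_b):
    """Mahalanobis distance between two class centroids (hypothetical helper).

    Uses the pooled covariance of both sample sets; this choice is an
    assumption, not the paper's stated procedure.
    """
    mu_a, mu_b = X_a.mean(axis=0), X_b.mean(axis=0)
    n_a, n_b = len(X_a), len(X_b)
    cov = ((n_a - 1) * np.cov(X_a, rowvar=False)
           + (n_b - 1) * np.cov(X_b, rowvar=False)) / (n_a + n_b - 2)
    diff = mu_a - mu_b
    # The pseudo-inverse guards against a singular covariance matrix.
    return float(np.sqrt(diff @ np.linalg.pinv(cov) @ diff))

def pairwise_class_distances(X, y):
    """Symmetric matrix of centroid distances for every pair of classes."""
    y = np.asarray(y)
    labels = sorted(set(y.tolist()))
    D = np.zeros((len(labels), len(labels)))
    for i, a in enumerate(labels):
        for j, b in enumerate(labels):
            if i < j:
                D[i, j] = D[j, i] = mahalanobis_between_classes(
                    X[y == a], X[y == b])
    return labels, D

# Synthetic stand-in features (NOT IEMOCAP data): two nearby classes
# and one distant class, 200 samples each, 2 features.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 1.0, size=(200, 2)),   # "anger"
    rng.normal(0.5, 1.0, size=(200, 2)),   # "frustration"
    rng.normal(5.0, 1.0, size=(200, 2)),   # "sadness"
])
y = ["anger"] * 200 + ["frustration"] * 200 + ["sadness"] * 200

labels, D = pairwise_class_distances(X, y)
print(labels)
print(np.round(D, 2))
```

In this toy setup the two nearby classes yield a much smaller distance than either does to the distant class, which is the kind of signal a hierarchy could use to decide which emotions to merge at the first stage before a second-stage classifier separates them.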