EFHQ Dataset

Abstract

Existing facial datasets offer plentiful images at near-frontal views but lack images with extreme head poses, which degrades the performance of deep learning models on profile or pitched faces. This work aims to address this gap by introducing a novel dataset named Extreme Pose Face High-Quality Dataset (EFHQ), which includes up to 450k high-quality images of faces at extreme poses. To produce such a massive dataset, we use a novel and meticulous dataset processing pipeline to curate two publicly available datasets, VFHQ and CelebV-HQ, which contain many high-resolution face videos captured in various settings. Our dataset can complement existing datasets on various face-related tasks, such as facial synthesis with 2D/3D-aware GANs, diffusion-based text-to-image face generation, and face reenactment. Specifically, training with EFHQ helps models generalize well across diverse poses and significantly improves performance in scenarios involving extreme views, as confirmed by extensive experiments. Additionally, we use EFHQ to define a challenging cross-view face verification benchmark, in which the performance of SOTA face recognition models drops 5-37% compared to frontal-to-frontal scenarios; the benchmark aims to stimulate studies on face recognition under severe pose conditions in the wild.
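To make the idea of a cross-view verification benchmark concrete, here is a minimal sketch of how verification accuracy could be reported separately per view pairing. It is not the paper's protocol: the function names, the pair layout, the view labels, the cosine-similarity decision rule, and the 0.5 threshold are all assumptions chosen for illustration, and the toy embeddings are made up and do not reflect any real model or result.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity between two non-zero embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def accuracy_by_view(pairs, threshold=0.5):
    """Verification accuracy grouped by view pairing.

    pairs: iterable of (emb_a, emb_b, same_identity, view_type).
    A pair is predicted "same identity" when its cosine similarity
    reaches the (hypothetical) threshold.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for emb_a, emb_b, same_identity, view_type in pairs:
        predicted_same = cosine(emb_a, emb_b) >= threshold
        hits[view_type] += int(predicted_same == same_identity)
        totals[view_type] += 1
    return {view: hits[view] / totals[view] for view in totals}


# Toy, hand-written 2-D "embeddings" (assumed data, for illustration only).
toy_pairs = [
    ([1.0, 0.0], [0.9, 0.1], True, "frontal-frontal"),
    ([1.0, 0.0], [0.0, 1.0], False, "frontal-frontal"),
    ([1.0, 0.0], [0.3, 0.9], True, "frontal-profile"),
    ([1.0, 0.0], [0.0, 1.0], False, "frontal-profile"),
]

print(accuracy_by_view(toy_pairs))
```

Reporting accuracy per view pairing, rather than one pooled number, is what lets a gap between frontal-to-frontal and cross-view scenarios show up at all.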

2D GAN-based Face Generation

Comparison of profile-view samples generated by StyleGAN2-ADA trained on FFHQ+LPFF (left) and FFHQ+EFHQ (right), with truncation ψ=0.7.

Samples generated by StyleGAN2-ADA trained on FFHQ+EFHQ, with truncation ψ=0.7.
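The truncation value ψ in these captions refers to the truncation trick commonly used with StyleGAN-family generators: a sampled latent code is pulled toward the average latent, trading diversity for fidelity. The sketch below shows only the interpolation itself, w' = w_avg + ψ (w − w_avg), on plain Python lists; the function name and the example vectors are illustrative assumptions, not code from the EFHQ or StyleGAN2-ADA repositories.

```python
def truncate(w, w_avg, psi):
    """Interpolate a latent code w toward the average latent w_avg.

    psi = 1.0 leaves w unchanged; psi = 0.0 collapses it to w_avg.
    """
    if len(w) != len(w_avg):
        raise ValueError("w and w_avg must have the same length")
    return [avg + psi * (value - avg) for value, avg in zip(w, w_avg)]


# Hypothetical 3-D latents, for illustration only.
w_avg = [0.0, 1.0, -1.0]
w = [2.0, 3.0, -5.0]

print(truncate(w, w_avg, 0.7))
```

With ψ=0.7, each coordinate keeps 70% of its offset from the average latent, so samples stay closer to the "typical" face than untruncated ones.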

Face Reenactment

Same-Identity Reenactment Comparison between TPS trained on VoxCeleb1 and VoxCeleb1+EFHQ.

Same-Identity Reenactment Comparison between LIA trained on VoxCeleb1 and VoxCeleb1+EFHQ.

Cross-Identity Reenactment Comparison between TPS trained on VoxCeleb1 and VoxCeleb1+EFHQ.

Cross-Identity Reenactment Comparison between LIA trained on VoxCeleb1 and VoxCeleb1+EFHQ.

BibTeX

@inproceedings{dao2024efhq,
  title={EFHQ: Multi-purpose ExtremePose-Face-HQ dataset},
  author={Trung Tuan Dao and Duc Hong Vu and Cuong Pham and Anh Tran},
  year={2024},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
}

[CVPR 2024] EFHQ: Multi-purpose ExtremePose-Face-HQ dataset

3D-aware GAN-based Face Generation

Comparison of multiview samples generated by EG3D models trained on different datasets, with truncation ψ=0.8.
Top: FFHQ, Middle: FFHQ+LPFF, Bottom: FFHQ+EFHQ.
