Pham Xuan Trung

Korea Advanced Institute of Science and Technology (KAIST)

Personal Information

🎓 Greetings! I completed my integrated M.S.-Ph.D. degree in Electrical Engineering at KAIST, South Korea (2018-2025), under the guidance of Professor Chang D. Yoo. My research interests span Computer Vision, Deep Learning, Machine Learning, Generative AI, and Image/Video/Audio Processing, including Self-supervised Learning, Generative Models, Diffusion Models, Multimodal Learning, Speech Processing, and Natural Language Processing. After finishing my doctoral degree, I continued at KAIST as a postdoctoral researcher.

πŸ™ In the past, I graduated in 2014 from Hanoi University of Science and Technology (HUST, a top-tier university in Vietnam), with the School of Electronics and Telecommunications ranked 10/526 students (top 1.9%). After that, I worked for VNPT Technology Corporation (employees > 1000+) in Hanoi (Vietnam) for 3 years until 2018, mainly doing research and deploying 2G, 3G & 4G mobile communication network projects with various vendors such as Alcatel-Lucent, Nokia Siemens, and SAMSUNG.

Email  /  CV  /  Google Scholar  /  LinkedIn


International Conferences and Journals

I take immense pride in my contributions to the academic community, having disseminated my findings through publications in top-tier venues: ICML (1), ICLR (2), CVPR (4), NeurIPS (1), ECCV (1), Advanced Materials (1, IF: 32), Nano Energy (1, IF: 19), IEEE TCSVT (1, IF: 8.4), IEEE TBC (1, IF: 4.5), IEEE Access (3, IF: 3.9), ICASSP (1).

Reviewers & Program Committee Members

I have served as a reviewer and program committee member for various prestigious conferences and journals:

  1. ICML 2024, ICML 2025 [The International Conference on Machine Learning]
  2. ICLR 2024, ICLR 2025 [The International Conference on Learning Representations]
  3. NeurIPS 2023, 2024, 2025 [The Conference on Neural Information Processing Systems]
  4. CVPR 2023, 2024, 2025 [The Conference on Computer Vision and Pattern Recognition]
  5. AAAI 2024, AAAI 2025 [The Association for the Advancement of Artificial Intelligence]
  6. ICCV 2023, 2025 [The International Conference on Computer Vision]
  7. ECCV 2024 [The European Conference on Computer Vision]
  8. ACCV 2024 [The Asian Conference on Computer Vision]
  9. ICASSP 2024, ICASSP 2025 [The International Conference on Acoustics, Speech, and Signal Processing]
  10. AISTATS 2025 [International Conference on Artificial Intelligence and Statistics]
  11. IJCNN 2025 [International Joint Conference on Neural Networks]
  12. ACM Multimedia 2025 [ACM International Conference on Multimedia]
  13. WACV 2026 [Winter Conference on Applications of Computer Vision]
  14. Neural Networks (NN) 2023, Impact Factor: 8.67 [Certificate]
  15. IEEE Transactions on Multimedia (TMM) 2023, Impact Factor: 7.39
  16. Computer Vision and Image Understanding (CVIU) 2024, Impact Factor: 4.3 [Certificate]
  17. ISPRS Journal of Photogrammetry and Remote Sensing (ISPRS) 2024, Impact Factor: 11.83 [Certificate]
  18. Expert Systems With Applications (ESWA) 2024, 2025, Impact Factor: 8.5 [Certificate]
  19. Digital Signal Processing (DSP) 2025, Impact Factor: 3.4 [Certificate]
  20. Transactions on Machine Learning Research (TMLR) 2025
  21. Engineering Applications of Artificial Intelligence (EAAI) 2025, Impact Factor: 7.5 [Certificate]
  22. Information Processing and Management (IPM) 2025, Impact Factor: 7.4 [Certificate]
  23. IEEE Transactions on Circuits and Systems for Video Technology (TCSVT) 2025, Impact Factor: 8.4 [Certificate]

Awards

  • πŸ… Won Jang Youngsil Fellow Program, funded by KAIST. It is the most prestigious scholarship offered by KAIST (in South Korea) to support world-class researchers, 2025 [Certificate]
  • πŸ… Won Award of the Top 100 Best Korea National Researches, 2023 [Certificate]
  • πŸ… Annual Encourage Scholarship by Hanoi University of Science and Technology (HUST) for excellent students with outstanding performance for every semester, 2009 – 2014
  • πŸ… Won Trang Nguyen Flower award for the best student among thousands of students in Giao Thuy B High School, 2009
  • πŸ… Gold Medal: Won First Prize in Mathematics Contest for High School Students in Grade 12, 2009

Breaking News

  • [2025] Sep. 01: Dr. Trung X. Pham has started his research as a Postdoctoral Researcher at KAIST.
  • [2025] June 25: Dr. Trung X. Pham has accepted an invitation to serve as a Program Committee (PC) member for WACV 2026. He will contribute to the paper review and selection process for this leading conference in computer vision research.
  • >> Spotlight: [2025] May 29: 🎓 Trung X. Pham has successfully defended his Ph.D. degree in the School of Electrical Engineering at KAIST. This marks the completion of his doctoral journey and the beginning of his postdoctoral research career.

  • [2025] May 15: A research paper on zero-shot object customization has been released on arXiv.
  • [2025] May 01: Trung X. Pham has accepted an invitation to serve as a Program Committee (PC) member for NeurIPS 2025. He will contribute to the paper review and selection process for this leading conference in AI and neural information processing systems.
  • >> Spotlight: [2025] April 09: 🏅 Trung X. Pham (Ph.D.) has been selected for the 2025 KAIST Jang Young Sil Fellowship Program (Postdoctoral Track). This prestigious program is KAIST's most competitive fellowship, designed to support the development of world-class researchers. It attracts top talent from around the world.

  • [2025] Feb 22: One paper has been accepted to CVPR 2025.
  • [2025] Jan 10: One paper has been accepted to ICLR 2025.
  • [2024] August 08: One paper has been accepted to IEEE Transactions on Broadcasting (TBC) 2024.
  • [2024] May 02: One paper has been accepted to ICML 2024.
  • [2024] March 25: One paper has been accepted to IEEE Transactions on Circuits and Systems for Video Technology (TCSVT) 2024.

Recent Publications

* denotes equal contribution. My first publication dates from 2018.

[2025] E-MD3C: Taming Masked Diffusion Transformers for Efficient Zero-Shot Object Customization

Trung X. Pham, Zhang Kang, Hong Ji Woo, Xuran Zheng, Chang D. Yoo

arXiv, February 2025 [OpenReview] [Code]


A novel, state-of-the-art framework for vision-inspired generation built on denoising masked diffusion transformers and optimized for parameter count, memory consumption, and inference speed. It enables efficient zero-shot object customization without relying on the giant pre-trained diffusion models used by existing works.


[2025] MDSGen: Fast and Efficient Masked Diffusion Temporal-Aware Transformers for Open-Domain Sound Generation

Trung X. Pham*, Tri Ton*, and Chang D. Yoo

International Conference on Learning Representations (ICLR 2025), held in Singapore [OpenReview] [Code]


A novel, state-of-the-art framework for vision-guided open-domain sound generation built on denoising masked diffusion transformers and optimized for parameter count, memory consumption, and inference speed. It enables efficient generation without relying on pre-trained diffusion models.


[2025] ITA-MDT: Image-Timestep-Adaptive Masked Diffusion Transformer Framework for Image-Based Virtual Try-On

Ji Woo Hong, Tri Ton, Trung X. Pham, Gwanhyeong Koo, Sunjae Yoon, and Chang D. Yoo

The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2025), acceptance rate 22.1%, held in the United States of America [OpenReview] [Code]


A novel, state-of-the-art framework for virtual try-on using masked diffusion models.


[2024] Cross-view Masked Diffusion Transformers for Person Image Synthesis

Trung X. Pham*, Zhang Kang*, and Chang D. Yoo

International Conference on Machine Learning (ICML 2024), held in Vienna, Austria [OpenReview] [Code]


A state-of-the-art framework for pose-guided human image synthesis using the cutting-edge technique of masked diffusion transformers.


[2024] ACDMSR: Accelerated conditional diffusion models for single image super-resolution

Axi Niu, Trung X. Pham, Kang Zhang, Jinqiu Sun, Yu Zhu, Qingsen Yan, In So Kweon, Yanning Zhang

IEEE Transactions on Broadcasting (TBC 2024), IF: 5.19 [Links IEEE]


A novel framework for speeding up image super-resolution.


[2024] Learning from multi-perception features for real-word image super-resolution

Axi Niu, Kang Zhang, Trung X. Pham, Pei Wang, Jinqiu Sun, In So Kweon, Yanning Zhang

IEEE Transactions on Circuits and Systems for Video Technology (TCSVT 2024), IF: 8.4 [Links IEEE]


MPF-Net for Single Image Super-Resolution.


[2023] Self-supervised visual representation learning via residual momentum

Trung X. Pham, Axi Niu, Kang Zhang, Tee Joshua Tian Jin, Ji Woo Hong, Chang D Yoo

IEEE Access 2023, acceptance rate 30% [Links IEEE]


Introduction of residual momentum that significantly improves self-supervised learning frameworks.


[2023] DimCL: Dimensional Contrastive Learning for Improving Self-Supervised Learning

Thanh Nguyen*, Trung X. Pham*, Chaoning Zhang, Tung M Luu, Thang Vu, Chang D Yoo

IEEE Access 2023, acceptance rate 30% [Links IEEE]


Introduction of a new regularization that significantly improves self-supervised learning frameworks.


[2023] Cdpmsr: Conditional diffusion probabilistic models for single image super-resolution

Axi Niu, Kang Zhang, Trung X. Pham, Jinqiu Sun, Yu Zhu, In So Kweon, Yanning Zhang

IEEE International Conference on Image Processing (ICIP) 2023, acceptance rate 47% [Links]


A conditional diffusion model for image super-resolution with a post-processing technique.


[2022] Deep learning-based noise robust flexible piezoelectric acoustic sensors for speech processing

Young Hoon Jung*, Trung X. Pham*, Dias Issa, Hee Seung Wang, Jae Hee Lee, Mingi Chung, Bo-Yeon Lee, Gwangsu Kim, Chang D Yoo, Keon Jae Lee

Nano Energy (IF: 19.0) 2022 [Links]

>> Top 100 best Korea national researches 2023 [Certificate]


An excellent combination of deep learning and flexible piezoelectric acoustic sensor for >99% accuracy of speaker recognition.


[2022] How does simsiam avoid collapse without negative samples? a unified understanding with self-supervised contrastive learning

Chaoning Zhang, Kang Zhang, Chenshuang Zhang, Trung X. Pham, Chang D Yoo, In So Kweon

International Conference on Learning Representations (ICLR 2022), acceptance rate 32.9% [OpenReview]


A deep analysis of contrastive learning frameworks to clarify the collapse issue.


[2022] Dual temperature helps contrastive learning without many negative samples: Towards understanding and simplifying moco

Chaoning Zhang*, Kang Zhang*, Trung X. Pham*, Axi Niu, Zhinan Qiao, Chang D Yoo, In So Kweon

The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2022), acceptance rate 25.3% [Links]


A new loss function is proposed to reduce the large number of negative samples required by contrastive learning frameworks.


[2022] On the pros and cons of momentum encoder in self-supervised visual representation learning

Trung X. Pham, Chaoning Zhang, Axi Niu, Kang Zhang, Chang D Yoo

arXiv 2022 [Links]


A deep investigation of the pros and cons of EMA-based contrastive learning frameworks.


[2022] Lad: A hybrid deep learning system for benign paroxysmal positional vertigo disorders diagnostic

Trung X. Pham*, Jin Woong Choi*, Rusty John Lloyd Mina, Thanh Xuan Nguyen, Sultan Rizky Madjid, Chang D Yoo

IEEE Access 2022, acceptance rate 30% [Links IEEE] [Code]


Using deep learning to diagnose BPPV disorders in hospital patients, with data from Chungnam National University Hospital.


[2021] Self-supervised Learning with Local Attention-Aware Feature

Trung X. Pham*, Rusty John Lloyd Mina, Dias Issa, Chang D. Yoo

arXiv 2021 [Links]


Learning representation of the data without any labels.


[2021] Robust MAML: Prioritization task buffer with adaptive learning process for model-agnostic meta-learning

Thanh Nguyen, Tung Luu, Trung X. Pham, Sanzhar Rakhimkul, Chang D Yoo

IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2021, acceptance rate 48.4% [Links IEEE]


Learning a meta-model with an adaptive learning-rate scheme.


[2020] Learning augmentation network via influence functions

Donghoon Lee, Hyunsin Park, Trung X. Pham, Chang D Yoo

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2020), acceptance rate 22% [Links IEEE]


Learning data augmentation with a differentiable neural network guided by influence functions.


[2020] Modality shifting attention network for multi-modal video question answering

Junyeong Kim, Minuk Ma, Trung X. Pham, Kyungsu Kim, Chang D Yoo

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2020), acceptance rate 22% [Links IEEE]


Multi-modal video question answering via a modality shifting attention network.


[2019] Cascade rpn: Delving into high-quality region proposal network with adaptive convolution

Thang Vu, Hyunjun Jang, Trung X. Pham, Chang D Yoo

Advances in Neural Information Processing Systems (NeurIPS 2019), acceptance rate 21.6% [Links]


>> Spotlight (top 2.4%)

Significantly improves object detection in 2D images through a cascade region proposal network with adaptive convolution.


[2020] Flexible piezoelectric acoustic sensors and machine learning for speech processing

Young Hoon Jung, Seong Kwang Hong, Hee Seung Wang, Jae Hyun Han, Trung X. Pham, Hyunsin Park, Junyeong Kim, Sunghun Kang, Chang D Yoo, Keon Jae Lee

Advanced Materials 2020 (IF: 32) [Links]


Combining AI/machine learning with flexible acoustic sensors for Speech Processing.


[2018] Fast and efficient image quality enhancement via desubpixel convolutional neural networks

Thang Vu, Cao Van Nguyen, Trung X. Pham, Tung M Luu, Chang D Yoo

Proceedings of the European Conference on Computer Vision (ECCV 2018), acceptance rate 31.8% [Links]


An efficient framework for image super-resolution.


[2019] Short Convolutional Neural Network and MFCCs for Accurate Speaker Recognition Systems

Trung X. Pham and Chang D Yoo

The 34th International Technical Conference on Circuits/Systems, Computers and Communications (ITC-CSCC 2019) [Links]


A lightweight, accurate, and efficient deep neural network for speaker recognition systems.


Collaborations

I am open to collaborating with researchers on various topics in Deep Learning, Machine Learning, and AI, including but not limited to Computer Vision, Generative AI, Video/Image/Audio Processing, and Natural Language Processing (NLP). Feel free to contact me at: trungpx@kaist.ac.kr or phamxuantrungbk@gmail.com