I am currently a second-year Doctor Candidate at Harbin Institute of Technology (HIT), majoring in Computer Science and Technology under the guidance of Prof. Wanxiang Che. Before this, I graduated from Tsinghua University (THU) in 2024 with a Master’s degree in computer science, supervised by Prof. Yang Liu and Prof. Weidong Liu. I also closely worked with Prof. Peng Li and Shuo Wang when I was in Tsinghua University. Several years ago, I graduated from the National University of Defense Technology (NUDT) in 2015 with a Bachelor’s degree in Engineering. Afterward, I worked in a Chinese government department for a while before resigning (in 2020).

My current research interests are efficient Large Language Models (LLMs), LLM-based agents, and multilingual processing. I am very keen to connect with other friends in the research community to exchange and discuss ideas.

🔥 News

2024.09: 🎉🎉 OneBit is accepted by NeurIPS 2024.
2024.02: 🎉🎉 We are the first to attempt 1-bit quantization of LLMs, achieving 90% model compression while retaining 83% of the performance (on LLaMA series). This work is featured by AK(@_akhaliq).
2023.09: 🎉🎉 We explore playing Werewolf Game using LLMs. Some strategic and social behaviors emerged, such as trust, confrontation, etc.

📝 Publications

OneBit: Towards Extremely Low-bit Large Language Models
- Yuzhuang Xu, Xu Han, Zonghan Yang, Shuo Wang, Qingfu Zhu, Zhiyuan Liu, Weidong Liu, Wanxiang Che
- In Advances in Neural Information Processing Systems (NeurIPS 2024)
- Keywords: Onebit, Extreme Quantization, 1-bit Quantization
Exploring Large Language Models for Communication Games: An Empirical Study on Werewolf
- Yuzhuang Xu, Shuo Wang, Peng Li, Fuwen Luo, Xiaolong Wang, Weidong Liu, Yang Liu
- ArXiv Technical Report
- Keywords: Werewolf, LLM for Games, Social Behaviors
CAMERA: Multi-Matrix Joint Compression for MoE Models via Micro-Expert Redundancy Analysis
- Yuzhuang Xu, Xu Han, Yuanchi Zhang, Yixuan Wang, Yijun Liu, Shiyu Ji, Qingfu Zhu, Wanxiang Che
- In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI 2026)
- Keywords: Micro-expert, Pruning, Quantization, MoE
CRVQ: Channel-relaxed Vector Quantization for Extreme Compression of LLMs
- Yuzhuang Xu, Shiyu Ji, Qingfu Zhu, Wanxiang Che
- In Transactions of the Association for Computational Linguistics (TACL 2025), EMNLP 2025
- Keywords: Extreme Compression, Codebook, Hardware-friendly
Pluggable Neural Machine Translation Models via Memory-augmented Adapters
- Yuzhuang Xu, Shuo Wang, Peng Li, Xuebo Liu, Xiaolong Wang, Weidong Liu, Yang Liu
- In Proceedings of the Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
- Keywords: Machine Translation, Plugin, Style Translation
A Survey on Large Language Models with Multilingualism: Recent Advances and New Frontiers
- Kaiyu Huang, Fengran Mo, Hongliang Li, You Li, Yuanchi Zhang, Weijian Yi, Yulong Mao, Jinchen Liu, Yuzhuang Xu, Jinan Xu, Jian-Yun Nie, Yang Liu
- ArXiv Preprint
- Keywords: Survey, Multilingualism, Machine Translation
Delta-CoMe: Training-Free Delta-Compression with Mixed-Precision for Large Language Models
- Bowen Ping, Shuo Wang, Hanqing Wang, Xu Han, Yuzhuang Xu, Yukun Yan, Yun Chen, Baobao Chang, Zhiyuan Liu, Maosong Sun
- In Advances in Neural Information Processing Systems (NeurIPS 2024)
- Keywords: Delta-compression, Training-free, Mixed-precision
Judge Q: Trainable Queries for Optimized Information Retention in KV Cache Eviction
- Yijun Liu, Yixuan Wang, Yuzhuang Xu, Shiyu Ji, Yang Xu, Qingfu Zhu, Wanxiang Che
- In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI 2026)
- Keywords: KV Eviction, Trainable Query, KV Compression
ActiView: Evaluating Active Perception Ability for Multimodal Large Language Models
- Ziyue Wang, Chi Chen, Fuwen Luo, Yurui Dong, Yuanchi Zhang, Yuzhuang Xu, Xiaolong Wang, Peng Li, Yang Liu
- In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2025)
- Keywords: Active Perception, Visual Question Answering, Benchmark
UltraLink: An Open-Source Knowledge-Enhanced Multilingual Supervised Fine-tuning Dataset
- Haoyu Wang, Shuo Wang, Yukun Yan, Xujia Wang, Zhiyu Yang, Yuzhuang Xu, Zhenghao Liu, Ning Ding, Xu Han, Zhiyuan Liu, Maosong Sun
- In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2024)
- Keywords: Multilingual, Dataset, Knowledge-Enhanced
Lookahead Q-Cache: Achieving More Consistent KV Cache Eviction via Pseudo Query
- Yixuan Wang, Shiyu Ji, Yijun Liu, Yuzhuang Xu, Yang Xu, Qingfu Zhu, Wanxiang Che
- In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2025)
- Keywords: KV Eviction, Pseudo Query, Lookahead
Perspective Transition of Large Language Models for Solving Subjective Tasks
- Xiaolong Wang, Yuanchi Zhang, Ziyue Wang, Yuzhuang Xu, Fuwen Luo, Yile Wang, Peng Li, Yang Liu
- In Findings of the Association for Computational Linguistics: ACL 2025
- Keywords: Perspective Transition, Subjective Tasks, Multi-Agent
Think Before You Accept: Semantic Reflective Verification for Faster Speculative Decoding
- Yixuan Wang, Yijun Liu, Yuzhuang Xu, Yang Xu, Qingfu Zhu, Wanxiang Che
- ArXiv Preprint
- Keywords: Speculative Decoding, Semantic Reflective, Relaxed Verification

📖 Educations

2024.09 - Future, Harbin Institute of Technology, Harbin, China, Doctor student, Computer science and technology.
2021.09 - 2024.06, Tsinghua University, Beijing, China, Master, Computer science and technology.
2011.09 - 2015.06, National University of Defense Technology, Changsha, China, Undergraduate, Command and Automation.

💬 Invited Talks

2025.10, Compression of LLMs, HUAWEI, Shanghai, China.
2024.07, Hotpot Paper Oral (OneBit), CCL 2024, Taiyuan, China.
2024.03, Exploration and Innovation in Extreme Quantization Methods for Large Language Models, Jiqizhixin, Online.
2024.03, The Era of LLM-based Agents: Ability, Methodology and Future, Swarma Club, Online.

💻 Internships

2025.09 - 2026.08, ModelBest, Beijing, China.
2023.08 - 2024.08, National Laboratory, Beijing, China.
2022.01 - 2022.07, Chinese Academy of Sciences, Institute of Software, Beijing, China.

💻 Services