Xuandong Zhao

E-mail: csxuandongzhao at gmail
Address: Goleta, CA 93106, USA

GitHub  /  LinkedIn  /  Twitter  /  Google Scholar

About


I am currently a Postdoctoral Researcher at UC Berkeley, affiliated with RDI and BAIR and working with Prof. Dawn Song. I earned my PhD in Computer Science from UC Santa Barbara, where I was advised by Prof. Yu-Xiang Wang and Prof. Lei Li. Prior to that, I graduated with a Bachelor's degree in Computer Science from Zhejiang University. I have also interned at leading tech companies, including Alibaba, Microsoft, and Google.

My current research interests lie in Machine Learning, Natural Language Processing, and AI Safety, with a particular focus on Responsible Generative AI. I am always open to collaborations. If you share similar interests or see potential synergies, please feel free to reach out via email!

Selected Research


SoK: Watermarking for AI-Generated Content
Xuandong Zhao, Sam Gunn, Miranda Christ, Jaiden Fairoze, Andres Fabrega, Nicholas Carlini, Sanjam Garg, Sanghyun Hong, Milad Nasr, Florian Tramer, Somesh Jha, Lei Li, Yu-Xiang Wang, Dawn Song
arXiv, 2024
[Paper]

An Undetectable Watermark for Generative Image Models
Sam Gunn*, Xuandong Zhao*, Dawn Song
NeurIPS 2024 Safe Generative AI Workshop
[Paper] [Code]

Permute-and-Flip: An Optimally Robust and Watermarkable Decoder for LLMs
Xuandong Zhao, Lei Li, Yu-Xiang Wang
NeurIPS 2024 Safe Generative AI Workshop
[Paper] [Code] [Slides]

Invisible Image Watermarks Are Provably Removable Using Generative AI
Xuandong Zhao*, Kexun Zhang*, Zihao Su, Saastha Vasan, Ilya Grishchenko, Christopher Kruegel, Giovanni Vigna, Yu-Xiang Wang, Lei Li
Proceedings of NeurIPS 2024
[Paper] [Code] [Video] [Media]

Weak-to-Strong Jailbreaking on Large Language Models
Xuandong Zhao*, Xianjun Yang*, Tianyu Pang, Chao Du, Lei Li, Yu-Xiang Wang, William Yang Wang
ICML 2024 Next Generation of AI Safety Workshop
[Paper] [Code]

Provable Robust Watermarking for AI-Generated Text
Xuandong Zhao, Prabhanjan Ananth, Lei Li, Yu-Xiang Wang
Proceedings of ICLR 2024
[Paper] [Code] [Video] [Demo]

Protecting Language Generation Models via Invisible Watermarking
Xuandong Zhao, Yu-Xiang Wang, Lei Li
Proceedings of ICML 2023
[Paper] [Code]

Pre-trained Language Models Can be Fully Zero-Shot Learners
Xuandong Zhao, Siqi Ouyang, Zhiguo Yu, Ming Wu, Lei Li
Proceedings of ACL 2023, Oral
[Paper] [Code] [Video] [Slides]

Provably Confidential Language Modelling
Xuandong Zhao, Lei Li, Yu-Xiang Wang
Proceedings of NAACL 2022, Oral
[Paper] [Code] [Video]

All Research


CodeHalu: Investigating Code Hallucinations in LLMs via Execution-based Verification
Yuchen Tian*, Weixiang Yan*, Qian Yang, Xuandong Zhao, Qian Chen, Wen Wang, Ziyang Luo, Lei Ma, Dawn Song
Proceedings of AAAI 2025 [Paper] [Code]

Empowering Responsible Use of Large Language Models
Xuandong Zhao
PhD Dissertation, 2024 [Paper]
PrivAgent: Agentic-based Red-teaming for LLM Privacy Leakage
Yuzhou Nie, Zhun Wang, Ye Yu, Xian Wu, Xuandong Zhao, Wenbo Guo, Dawn Song
arXiv, 2024 [Paper] [Code]
SoK: Watermarking for AI-Generated Content
Xuandong Zhao, Sam Gunn, Miranda Christ, Jaiden Fairoze, Andres Fabrega, Nicholas Carlini, Sanjam Garg, Sanghyun Hong, Milad Nasr, Florian Tramer, Somesh Jha, Lei Li, Yu-Xiang Wang, Dawn Song
arXiv, 2024 [Paper]
An Undetectable Watermark for Generative Image Models
Sam Gunn*, Xuandong Zhao*, Dawn Song
NeurIPS 2024 Safe Generative AI Workshop [Paper] [Code]
Permute-and-Flip: An Optimally Robust and Watermarkable Decoder for LLMs
Xuandong Zhao, Lei Li, Yu-Xiang Wang
NeurIPS 2024 Safe Generative AI Workshop [Paper] [Code] [Slides]
Efficiently Identifying Watermarked Segments in Mixed-Source Texts
Xuandong Zhao*, Chenwen Liao*, Yu-Xiang Wang, Lei Li
NeurIPS 2024 Safe Generative AI Workshop [Paper]
A Practical Examination of AI-Generated Text Detectors for Large Language Models
Brian Tufts, Xuandong Zhao, Lei Li
NeurIPS 2024 Safe Generative AI Workshop [Paper]
Multimodal Situational Safety
Kaiwen Zhou*, Chengzhi Liu*, Xuandong Zhao, Anderson Compalas, Dawn Song, Xin Eric Wang
NeurIPS 2024 RBFM Workshop, Oral [Paper] [Code] [Website] [Dataset]
ClinicalLab: Aligning Agents for Multi-Departmental Clinical Diagnostics in the Real World
Weixiang Yan, Haitian Liu, Tengxiao Wu, Qian Chen, Wen Wang, Haoyuan Chai, Jiayi Wang, Weishan Zhao, Yixin Zhang, Renjun Zhang, Li Zhu, Xuandong Zhao
arXiv, 2024 [Paper] [Code]
Evaluating Durability: Benchmark Insights into Image and Text Watermarking
Jielin Qiu*, William Han*, Xuandong Zhao, Shangbang Long, Christos Faloutsos, Lei Li
Journal of DMLR 2024 [Paper] [Code] [Website]
Watermarking for Large Language Models
Xuandong Zhao, Yu-Xiang Wang, Lei Li
Tutorials of NeurIPS 2024, Tutorials of ACL 2024 [Paper] [Website] [Video]
Bileve: Securing Text Provenance in Large Language Models Against Spoofing with Bi-level Signature
Tong Zhou, Xuandong Zhao, Xiaolin Xu, Shaolei Ren
Proceedings of NeurIPS 2024 [Paper]
Invisible Image Watermarks Are Provably Removable Using Generative AI
Xuandong Zhao*, Kexun Zhang*, Zihao Su, Saastha Vasan, Ilya Grishchenko, Christopher Kruegel, Giovanni Vigna, Yu-Xiang Wang, Lei Li
Proceedings of NeurIPS 2024 [Paper] [Code] [Video] [Media]
Erasing the Invisible: A Stress-Test Challenge for Image Watermarks
Mucong Ding*, Tahseen Rabbani*, Bang An*, Souradip Chakraborty, Chenghao Deng, Mehrdad Saberi, Yuxin Wen, Xuandong Zhao, Mo Zhou, Anirudh Satheesh, Mary-Anne Hartley, Lei Li, Yu-Xiang Wang, Vishal M. Patel, Soheil Feizi, Tom Goldstein, Furong Huang
Competitions of NeurIPS 2024 [Paper] [Website]
MarkLLM: An Open-Source Toolkit for LLM Watermarking
Leyi Pan, Aiwei Liu, Zhiwei He, Zitian Gao, Xuandong Zhao, Yijian Lu, Binglin Zhou, Shuliang Liu, Xuming Hu, Lijie Wen, Irwin King, Philip S. Yu
System Demonstrations of EMNLP 2024 [Paper] [Code] [Colab]
A Survey on Detection of LLMs-Generated Content
Xianjun Yang, Liangming Pan, Xuandong Zhao, Haifeng Chen, Linda Petzold, William Yang Wang, Wei Cheng
Findings of EMNLP 2024 [Paper] [Code]
Mapping the Increasing Use of LLMs in Scientific Papers
Weixin Liang*, Yaohui Zhang*, Zhengxuan Wu*, Haley Lepp, Wenlong Ji, Xuandong Zhao, Hancheng Cao, Sheng Liu, Siyu He, Zhi Huang, Diyi Yang, Christopher Potts, Christopher D Manning, James Y. Zou
Proceedings of COLM 2024 [Paper] [Code]
Weak-to-Strong Jailbreaking on Large Language Models
Xuandong Zhao*, Xianjun Yang*, Tianyu Pang, Chao Du, Lei Li, Yu-Xiang Wang, William Yang Wang
ICML 2024 Next Generation of AI Safety Workshop [Paper] [Code]
Monitoring AI-Modified Content at Scale: A Case Study on the Impact of ChatGPT on AI Conference Peer Reviews
Weixin Liang*, Zachary Izzo*, Yaohui Zhang*, Haley Lepp, Hancheng Cao, Xuandong Zhao, Lingjiao Chen, Haotian Ye, Sheng Liu, Zhi Huang, Daniel A. McFarland, James Y. Zou
Proceedings of ICML 2024, Oral; Best Presentation Runner-up Award at ICSSI 2024 [Paper] [Code]
DE-COP: Detecting Copyrighted Content in Language Models Training Data
André Vicente Duarte, Xuandong Zhao, Arlindo L. Oliveira, Lei Li
Proceedings of ICML 2024; Best Scientific Paper Award at Portuguese Responsible AI Forum [Paper] [Code]
GumbelSoft: Diversified Language Model Watermarking via the GumbelMax-trick
Jiayi Fu, Xuandong Zhao, Ruihan Yang, Yuansen Zhang, Jiangjie Chen, Yanghua Xiao
Proceedings of ACL 2024 [Paper] [Code]
Pride and Prejudice: LLM Amplifies Self-Bias in Self-Refinement
Wenda Xu, Guanglei Zhu, Xuandong Zhao, Liangming Pan, Lei Li, William Yang Wang
Proceedings of ACL 2024, Oral [Paper] [Code]
Provable Robust Watermarking for AI-Generated Text
Xuandong Zhao, Prabhanjan Ananth, Lei Li, Yu-Xiang Wang
Proceedings of ICLR 2024 [Paper] [Code] [Video] [Demo]

Private Prediction Strikes Back! Private Kernelized Nearest Neighbors with Individual Renyi Filter
Yuqing Zhu, Xuandong Zhao, Chuan Guo, Yu-Xiang Wang
Proceedings of UAI 2023, Spotlight [Paper] [Code]
Protecting Language Generation Models via Invisible Watermarking
Xuandong Zhao, Yu-Xiang Wang, Lei Li
Proceedings of ICML 2023 [Paper] [Code]
Pre-trained Language Models Can be Fully Zero-Shot Learners
Xuandong Zhao, Siqi Ouyang, Zhiguo Yu, Ming Wu, Lei Li
Proceedings of ACL 2023, Oral [Paper] [Code] [Video] [Slides]

Distillation-Resistant Watermarking for Model Protection in NLP
Xuandong Zhao, Lei Li, Yu-Xiang Wang
Findings of EMNLP 2022 [Paper] [Code] [Video] [Blog]
Provably Confidential Language Modelling
Xuandong Zhao, Lei Li, Yu-Xiang Wang
Proceedings of NAACL 2022, Oral [Paper] [Code] [Video]
Compressing Sentence Representation for Semantic Retrieval via Homomorphic Projective Distillation
Xuandong Zhao, Zhiguo Yu, Ming Wu, Lei Li
Findings of ACL 2022 [Paper] [Code] [Video] [Poster]

An Optimal Reduction of TV-Denoising to Adaptive Online Learning
Dheeraj Baby, Xuandong Zhao, Yu-Xiang Wang
Proceedings of AISTATS 2021 [Paper] [Code]
A Multi-Semantic Metapath Model for Large Scale Heterogeneous Network Representation Learning
Xuandong Zhao, Jinbao Xue, Jin Yu, Xi Li, Hongxia Yang
arXiv, 2020 [Paper] [Code]
Predicting Alzheimer's Disease by Hierarchical Graph Convolution from Positron Emission Tomography Imaging
Jiaming Guo*, Wei Qiu*, Xiang Li*, Xuandong Zhao, Ning Guo, Quanzheng Li
Proceedings of Big Data 2019 [Paper] [Code]
Multi-size Computer-aided Diagnosis of Positron Emission Tomography Images Using Graph Convolutional Networks
Xuandong Zhao*, Xiang Li*, Ning Guo, Zhiling Zhou, Xiaxia Meng, Quanzheng Li
Proceedings of ISBI 2019 [Paper] [Code]

Education


UC Santa Barbara, USA

Ph.D. in Computer Science • Sept. 2019 - June 2024

Zhejiang University, China

B.E. in Computer Science • Sept. 2015 - June 2019, GPA: 3.96/4.00

Selected Honors & Awards


AdvML Rising Star Award, 2024

Chancellor's Fellowship, UC Santa Barbara, 2019, 2021, 2023

He Zhijun Scholarship (Highest honor in ZJU CS department), 2019

Alibaba-Zhejiang News Scholarship, 2018

National Scholarship (Top 0.2% Nationwide), 2016

First Prize in Chinese Physics Olympiad (CPhO; Top 0.1% in Shanxi Province, China), 2014


Selected Recent Talks



Last update: Dec. 2024