Yichuan Wang

Bio

I am a second-year PhD student fortunate to be advised by Joseph E. Gonzalez and Matei Zaharia at UC Berkeley in SkyLab. I also collaborate closely with Sewon Min. I received my B.E. in Computer Science and Technology from the ACM Honors Class at Shanghai Jiao Tong University (SJTU). My research focuses on machine learning systems, particularly systems for LLM training and serving, and on developing tailored abstractions for unique ML computational patterns to achieve better performance. Currently, I am most interested in LLM training/serving/scheduling, agent systems, vector search, retrieval, large-scale RAG, and personalized AI applications. If you share similar research interests, I'd love to connect and chat!

During my undergraduate studies, I collaborated with Prof. Jinyang Li at NYU and Dr. Minjie Wang on projects related to scaling up GNN training. At SJTU, I was supervised by Prof. Quan Chen.

I enjoy connecting with researchers and practitioners in the field. Feel free to reach out via email, WeChat (15858459091), or Twitter. For more details about my background, you can view my CV and Statement of Purpose from my graduate school applications.

In my spare time, I contribute to open-source projects like SGLang. More recently, I have dedicated significant time to maintaining LEANN, evolving it from a research prototype into an open-source community project with over 20 active contributors and 40k+ downloads.

Work Email: yichuan_wang@berkeley.edu

Personal Email: yichuanmistygrass@gmail.com

News

[12/12/2025] Happy to release DS SERVE, a framework for efficient and scalable neural retrieval and the largest open vector store over pre-trained data. Check out our live demo! Fun fact: Jinjian is the first undergraduate I mentored at Berkeley—it was a wonderful collaboration!

[08/10/2025] We released LEANN, a fully local RAG system with 97% storage savings. It is one of the first open-source engines to bring semantic search to Claude Code, all with zero cloud cost and full privacy. Drop me an email if you'd like to contribute; we're on a mission to build an engine that understands every piece of data on your PC.

[06/20/2025] I will attend SIGMOD in Berlin this year. Let's chat about vector databases, RAG, and LLMs, and hang out!

[08/29/2024] After being rejected once, DiskGNN has finally been accepted to SIGMOD'25. See you in Berlin!

[08/13/2024] Starting my PhD journey at SkyLab—excited.

[05/08/2024] We put DiskGNN on arXiv. If you want to train a super-large-scale GNN while balancing speed and accuracy, you should try it! We will integrate it into DGL as soon as possible.

Education

UC Berkeley

Starting from Sept. 2024

PhD in EECS

Shanghai Jiao Tong University

Sept. 2020 -- June 2024

B.Eng. in Computer Science in the ACM Honors Class, advised by Prof. Quan Chen and Prof. Yong Yu

New York University Courant Institute

July 2023 -- Dec. 2023

Research assistant, advised by Prof. Jinyang Li

Publications

1. LEANN: A Low-Storage Vector Index pdf poster repo

Yichuan Wang, Zhifei Li, Shu Liu, Yongji Wu, Ziming Mao, Yilong Zhao, Xiao Yan, Zhiying Xu, Yang Zhou, Ion Stoica, Sewon Min, Matei Zaharia, Joseph E. Gonzalez

Preprint (Short version in VecDB@ICML2025)

2. DS SERVE: A Framework for Efficient and Scalable Neural Retrieval pdf website poster repo

Jinjian Liu*, Yichuan Wang*, Xinxi Lyu, Rulin Shao, Joseph E. Gonzalez, Matei Zaharia, Sewon Min

*indicates equal contribution

Accepted by AAAI 2026 Demo

3. Locality-aware Fair Scheduling in LLM Serving pdf poster

Shiyi Cao*, Yichuan Wang*, Ziming Mao, Pin-Lun Hsu, Liangsheng Yin, Tian Xia, Dacheng Li, Shu Liu, Yineng Zhang, Yang Zhou, Ying Sheng, Joseph E. Gonzalez, Ion Stoica

*indicates equal contribution

Preprint

4. DiskGNN: Bridging I/O Efficiency and Model Accuracy for Out-of-Core GNN Training pdf poster slides

Renjie Liu*, Yichuan Wang*, Xiao Yan, Zhenkun Cai, Minjie Wang, Haitian Jiang, Bo Tang, Jinyang Li

*indicates equal contribution

Accepted by SIGMOD 2025 (Oral)

5. Optimizing Dynamic Neural Networks with Brainstorm pdf

Weihao Cui, Zhenhua Han, Lingji Ouyang, Yichuan Wang, Ningxin Zheng, Lingxiao Ma, Yuqing Yang, Fan Yang, Jilong Xue, Lili Qiu, Lidong Zhou, Quan Chen, Haisheng Tan, Minyi Guo

Accepted by OSDI 2023

6. Forming Scalable, Convergent GNN Layers that Minimize a Sampling-Based Energy pdf

Haitian Jiang, Renjie Liu, Zengfeng Huang, Yichuan Wang, Xiao Yan, Zhenkun Cai, Minjie Wang, David Wipf

Accepted by ICLR 2025

7. The Danger of Overthinking: Examining the Reasoning-Action Dilemma in Agentic Tasks pdf

Alejandro Cuadron, Dacheng Li, Wenjie Ma, Xingyao Wang, Yichuan Wang, Siyuan Zhuang, Shu Liu, Luis Gaspar Schroeder, Tian Xia, Huanzhi Mao, Nicholas Thumiger, Aditya Desai, Ion Stoica, Ana Klimovic, Graham Neubig, Joseph E. Gonzalez

Preprint

8. Autellix: An Efficient Serving Engine for LLM Agents as General Programs pdf

Michael Luo, Xiaoxiang Shi, Colin Cai, Tianjun Zhang, Justin Wong, Yichuan Wang, Chi Wang, Yanping Huang, Zhifeng Chen, Joseph E. Gonzalez, Ion Stoica

Accepted by NSDI 2026

Services

[04/2025] SIGMOD'25 Artifact Evaluation Committee
[03/2025] MLSys'25 Artifact Evaluation Committee
[02/2025] EuroSys'25 Artifact Evaluation Committee
[07/2024] SIGCOMM'24 Artifact Evaluation Committee
[04/2024] OSDI'24 Artifact Evaluation Committee
[04/2024] USENIX ATC'24 Artifact Evaluation Committee

Invited Talks

[10/2025] Lightspeed interview - LEANN: Towards Lightweight Vector Search and RAG Everything on PC
[09/2025] ByteDance Data Team - LEANN: Towards Lightweight Vector Search and RAG Everything on PC
[07/2025] SIGMOD 2025 - DiskGNN: Bridging I/O Efficiency and Model Accuracy for Out-of-Core GNN Training
[01/2025] LMSYS - SGLang-FLPM [Video]

Misc/Hobbies

I like playing basketball! (Unfortunately, there don't seem to be any good basketball courts near Berkeley.) So now I mostly play badminton instead, which I still enjoy. Besides that, I've been an NBA fan for 15 years, and I like Chris Paul! In recent years, I've also become very interested in the Premier League (I love Manchester City, come on Blue Moon!). If you share the same interests, we should chat or hang out!

Blog Posts

Here are my thoughts and comments on various research papers and topics in systems and ML, along with practical lessons learned from my development experience:

Research Vision

Rethinking the Search Stack for the AI Era - Why we need to move beyond traditional search APIs for LLM agents and build native retrieval stacks.

Paper Comments

My Thoughts on RAGCache paper - My thoughts on RAGCache paper arxiv
My Thoughts on Exploring Orak: A Unified Benchmark for LLM Agents in Games paper - My thoughts on the Orak paper, a comprehensive evaluation of LLM agents on game platforms arxiv
My Thoughts on Qwen3 Embedding paper - My thoughts on Qwen3 Embedding paper, advancing text embedding through foundation models
My Thoughts on ReasonIR paper - My thoughts on ReasonIR paper, reasoning-based information retrieval

Development Experience & Lessons Learned

Mastering DiskANN: Practical Lessons from Building Large-Scale Vector Search Systems - Practical recipes for configuring DiskANN, from optimal compression rates to avoiding hidden metric pitfalls, based on experience building LEANN and DS SERVE.
Lessons Learned in Development - LEANN Project - Practical insights from building RAG systems, including chunk overlap strategies, data format optimization, and embedding model comparisons.

See all blog posts →