Weijia Shi

Pronouns: she/her
My name in Chinese: 施 惟佳
Email: swj0419@uw.edu

👋 Hi!

I am a PhD student in Computer Science at the University of Washington, advised by Prof. Luke Zettlemoyer and Prof. Noah A. Smith. I have been a visiting researcher at Meta AI, working with Scott Yih. Prior to UW, I graduated from UCLA with a B.S. in Computer Science and a minor in Math. I am happy to mentor undergraduate or master's students interested in research.

🌋 Research Interests

My main research focuses on natural language processing and machine learning. I am particularly interested in retrieval-augmented LMs and trustworthy AI. My goal is to build LMs that are able to communicate with external knowledge and personal data securely and robustly.
What’s NEW
☑️ Office hours: Starting November 2023, I will be holding office hours (1-2 hours a week) dedicated to offering mentorship and advice to undergraduate/master's students. If you want to chat about research and grad school applications, please fill out the form.
☑️ Honored to be selected as a 2023 Machine Learning Rising Star
☑️ Two workshops accepted to *CL conferences. Stay tuned!
The 3rd Workshop on Knowledge Augmented Methods for NLP (ACL 2024)
Workshop on Customizable NLP (EMNLP 2024)

📜 Selected Publications

Please see my Google Scholar or Semantic Scholar profiles for the full list.
(*: equal contribution)
In-Context Pretraining: Language Modeling Beyond Document Boundaries
Weijia Shi, Sewon Min, Maria Lomeli, Chunting Zhou, Margaret Li, Xi Victoria Lin, Noah A. Smith, Luke Zettlemoyer, Scott Yih, Mike Lewis
arXiv preprint. 2023. [paper]
Detecting Pretraining Data from Large Language Models
Weijia Shi*, Anirudh Ajith*, Mengzhou Xia, Yangsibo Huang, Daogao Liu, Terra Blevins, Danqi Chen, Luke Zettlemoyer
arXiv preprint. 2023. [paper] [website] [code]
REPLUG: Retrieval-Augmented Black-Box Language Models
Weijia Shi, Sewon Min, Michihiro Yasunaga, Minjoon Seo, Rich James, Mike Lewis, Luke Zettlemoyer, Wen-tau Yih
arXiv preprint. 2023. [paper]
One Embedder, Any Task: Instruction-Finetuned Text Embeddings
Hongjin Su*, Weijia Shi*, Jungo Kasai, Yizhong Wang, Yushi Hu, Mari Ostendorf, Scott Wen-tau Yih, Noah A. Smith, Luke Zettlemoyer, Tao Yu
ACL, 2023. [paper] [website] [model (🌟 1M downloads on HuggingFace)]
Toward Human Readable Prompt Tuning: Kubrick’s The Shining is a good movie, and a good prompt too?
Weijia Shi*, Xiaochuang Han*, Hila Gonen, Ari Holtzman, Yulia Tsvetkov, Luke Zettlemoyer
EMNLP, 2023. [paper]
Fine-Grained Human Feedback Gives Better Rewards for Language Model Training
Zeqiu Wu*, Yushi Hu*, Weijia Shi, Nouha Dziri, Alane Suhr, Prithviraj Ammanabrolu, Noah A. Smith, Mari Ostendorf, Hannaneh Hajishirzi
NeurIPS (Spotlight), 2023. [paper] [website]
kNN-Prompt: Nearest Neighbor Zero-Shot Inference
Weijia Shi, Julian Michael, Suchin Gururangan, Luke Zettlemoyer
EMNLP, 2022. [paper] [code]

🔬 Research Experience

University of Washington, 09/2020–Present
Ph.D. student, supervised by Luke Zettlemoyer and Noah A. Smith
Meta AI, 06/2022–Present
Visiting Researcher, supervised by Scott Yih
University of Pennsylvania, 05/2019–09/2019
Research Intern, supervised by Dan Roth
UCLA, 04/2018–06/2020
Research Assistant, supervised by Kai-Wei Chang and Adnan Darwiche