Vishaal Udandarao

I am a final-year ELLIS PhD student, jointly supervised by Matthias Bethge at the University of Tübingen and Samuel Albanie at the University of Cambridge / Google DeepMind. My research is funded by a Google PhD Fellowship in Machine Intelligence.

I am interested in data curation methods for improving multimodal pretraining: during my PhD, I have primarily worked on characterizing and constructing better pretraining datasets for multimodal models.

I previously interned at Google DeepMind and Apple, working on data curation for multimodal models.

I'm currently on the industry job market! Please reach out if you have any open positions in pretraining.

Email  /  CV  /  Google Scholar  /  Twitter  /  GitHub

Selected Publications

For an updated list, please see my Google Scholar.

Data-Centric Lessons To Improve Speech-Language Pretraining
Vishaal Udandarao, Zhiyun Lu, Xuankai Chang, Yongqiang Wang, Violet Z. Yao, Albin Madapally Jose, Fartash Faghri, Josh Gardner, Chung-Cheng Chiu
ICLR, 2026
pdf

We study three questions fundamental to data curation and sampling for speech-language pretraining; our controlled experiments yield a performant 3.8B SpeechLM that outperforms SpeechLMs 3x its size.

A Practitioner’s Guide to Continual Multimodal Pretraining
Vishaal Udandarao*, Karsten Roth*, Sebastian Dziadzio, Ameya Prabhu, Mehdi Cherti, Oriol Vinyals, Olivier Henaff, Samuel Albanie, Matthias Bethge*, Zeynep Akata*
NeurIPS, 2024
pdf / code

We provide practical insights into how to continually pretrain contrastive multimodal models under compute and data constraints.

No "Zero-Shot" Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance
Vishaal Udandarao*, Ameya Prabhu*, Adhiraj Ghosh, Yash Sharma, Philip Torr, Adel Bibi, Samuel Albanie, Matthias Bethge
NeurIPS, 2024
pdf / code

Our work shows that the impressive empirical performance of multimodal models like CLIP and Stable Diffusion is largely attributable to the presence of test concepts in their vast pretraining datasets, so their reported performance does not constitute "zero-shot" generalization. On the contrary, these models require exponentially more data on a concept for linear improvements in performance on tasks involving that concept, highlighting extreme sample inefficiency.
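As a schematic worked equation of that log-linear trend (the notation below is mine for illustration, not the paper's symbols):

```latex
% Schematic log-linear trend between concept frequency and performance.
% Illustrative notation: perf, f, alpha, beta are not the paper's symbols.
\[
  \mathrm{perf}(c) \;\approx\; \alpha \log f(c) + \beta ,
\]
% where f(c) is the frequency of concept c in the pretraining data.
% A fixed additive gain \Delta in perf(c) then requires multiplying
% f(c) by e^{\Delta / \alpha}, i.e. exponentially more data.
```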

Efficient Model Evaluation in an Era of Rapid Progress
Ameya Prabhu*, Vishaal Udandarao*, Philip Torr, Matthias Bethge, Adel Bibi, Samuel Albanie
NeurIPS, 2024
pdf / code

Our work introduces lifelong benchmarks, enabling effective comparisons of models while reducing overfitting to the biases of any particular dataset. We construct large-scale lifelong classification benchmarks totalling over 1.5M samples, and introduce Sort&Search, an evaluation method that reduces inference compute costs by 1000x.
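A toy sketch of the binary-search idea behind Sort&Search, assuming for illustration a single difficulty ordering on which a new model's correctness is monotone (function names and details are mine, not the paper's implementation):

```python
import numpy as np

def sort_by_difficulty(acc_matrix: np.ndarray) -> np.ndarray:
    """Rank samples easiest-first from past models' binary correctness.

    acc_matrix: (num_models, num_samples) array of 0/1 outcomes.
    """
    difficulty = acc_matrix.mean(axis=0)   # fraction of past models correct
    return np.argsort(-difficulty)         # easiest samples first

def fit_threshold(evaluate, order) -> int:
    """Binary-search the index where a new model flips from right to wrong.

    evaluate(sample_idx) -> bool runs the new model on one sample;
    only O(log n) samples are evaluated instead of all n.
    """
    lo, hi = 0, len(order)
    while lo < hi:
        mid = (lo + hi) // 2
        if evaluate(order[mid]):   # correct: threshold lies further right
            lo = mid + 1
        else:                      # wrong: threshold is at or before mid
            hi = mid
    return lo  # predict correct on order[:lo], wrong on order[lo:]
```

Under the monotonicity assumption, per-sample predictions for all remaining samples then come for free from the fitted threshold.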

SuS-X: Training-Free Name-Only Transfer of Vision-Language Models
Vishaal Udandarao, Ankush Gupta, Samuel Albanie
ICCV, 2023
pdf / code

We enhance CLIP's downstream classification performance by (1) curating a support set, either by generating synthetic samples (Stable Diffusion) or retrieving natural ones (LAION-5B), and (2) identifying and fixing a miscalibration issue with intra-modal distances in CLIP's embedding space.
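A minimal sketch of the synthetic route for support-set curation, using the Hugging Face diffusers library (the checkpoint, prompt template, and class names below are illustrative choices, not necessarily the paper's exact setup):

```python
import torch
from diffusers import StableDiffusionPipeline

# Illustrative checkpoint; any Stable Diffusion weights work here.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

class_names = ["tench", "goldfish", "great white shark"]  # example labels

# Generate a few support images per class name: no labelled data needed,
# only the class names themselves ("name-only transfer").
support_set = {
    name: [pipe(f"a photo of a {name}").images[0] for _ in range(4)]
    for name in class_names
}
```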

Teaching
Deep Learning (CSE641)
Worked as a Teaching Assistant for the Deep Learning course offered by Dr. Saket Anand in Spring 2020.
Machine Learning (CSE543)
Worked as a Teaching Assistant for the Machine Learning course offered by Dr. Jainendra Shukla in Fall 2019.
Introduction to Engineering Design (DES130)
Worked as a Teaching Assistant for the Introduction to Engineering Design course offered by Dr. Aman Parnami in Spring 2019.
Linear Algebra (MTH100)
Worked as a Teaching Assistant for the Linear Algebra course offered by Dr. Samaresh Chatterjee in Fall 2018.
Misc

Apart from my academic interests, I am a huge football fan and actively support FC Barcelona, Paris Saint-Germain, and Inter Miami CF. As you've probably guessed, Lionel Messi is my favourite player to ever touch a football. I also love watching Formula 1 and look up to Lewis Hamilton. I used to write stuff, but that was a long, long time ago. I also dabble with the guitar and the keyboard at times. Check out my SoundCloud profile!


Website template taken from here.