Tuan Le

📍 Berlin, Germany

I work on generative models for molecular design at Pfizer. I like thinking about problems through the lens of probability, statistics, and geometry, and combining that with domain knowledge in chemistry and biophysics to build systems that can generate and design molecules across different settings: from small molecule drug discovery to protein sequence design and other molecular inverse problems.

Background: Ph.D. in Computer Science from Freie Universität Berlin under Frank Noé, with a focus on machine learning for molecular design. My Ph.D. research was shaped through close collaboration with Djork-Arné Clevert. I’ve continued this collaboration at Bayer and Pfizer, working with teams in computational chemistry and machine learning.

Research Interests

Generative Models for Molecular Design: I’m interested in how deep generative models, particularly diffusion and flow matching, can be applied to inverse design problems across different molecular domains. This includes small molecule drug discovery (starting from a desired drug property or protein target, how do we generate molecular candidates worth testing?), protein and sequence design, and other biomolecular inverse problems.

Incorporating Domain Knowledge: Throughout my work, I’ve found that understanding the underlying mathematics, physics, and chemistry is essential. Whether designing small molecules or sequences, respecting physical structure, incorporating 3D geometry, conditioning on relevant information, and using domain-motivated features lead to models that generate more realistic and useful molecules.

From Methods to Applications: Research is exciting in its own right, but I also find a lot of value in the step after: turning methods into tools that practitioners can actually use. A model that performs well on a benchmark is only part of the story; getting it into the hands of domain experts in a usable, reliable form matters just as much to me.

Selected Publications

  1. Coupled fragment-based generative modeling with stochastic interpolants
    Tuan Le*, Yanfei Guan, Djork-Arné Clevert, and Kristof T. Schütt
    Digital Discovery Mar 2026
  2. Diffusion Generative Modeling on Lie Group Representations
    Marco Bertolini*, Tuan Le*, and Djork-Arné Clevert
    In The Thirty-ninth Annual Conference on Neural Information Processing Systems Dec 2025
  3. Equivariant diffusion for structure-based de novo ligand generation with latent-conditioning
    Tuan Le*, Julian Cremer*, Djork-Arné Clevert, and Kristof T. Schütt
    Journal of Cheminformatics May 2025
  4. PILOT: equivariant diffusion for pocket-conditioned de novo ligand generation with multi-objective guidance via importance sampling
    Julian Cremer*, Tuan Le*, Frank Noé, Djork-Arné Clevert, and 1 more author
    Chem. Sci. May 2024
  5. Navigating the Design Space of Equivariant Diffusion-Based Generative Models for De Novo 3D Molecule Generation
    Tuan Le*, Julian Cremer*, Frank Noé, Djork-Arné Clevert, and 1 more author
    In The Twelfth International Conference on Learning Representations May 2024
  6. Representation Learning on Biomolecular Structures using Equivariant Graph Attention
    Tuan Le*, Frank Noé, and Djork-Arné Clevert
    In Learning on Graphs Conference May 2022
  7. Parameterized Hypercomplex Graph Neural Networks for Graph Classification
    Tuan Le*, Marco Bertolini, Frank Noé, and Djork-Arné Clevert
    In Artificial Neural Networks and Machine Learning – ICANN 2021 May 2021
  8. Neuraldecipher – reverse-engineering extended-connectivity fingerprints (ECFPs) to their molecular structures
    Tuan Le*, Robin Winter, Frank Noé, and Djork-Arné Clevert
    Chem. Sci. May 2020