Tom Monnier

Research Scientist at Meta

I am a Research Scientist at Meta working on computer vision and 3D modeling. I did my PhD in the amazing Imagine lab at ENPC under the guidance of Mathieu Aubry. During my PhD, I was fortunate to work with Jean Ponce (Inria), Matthew Fisher (Adobe Research), and Alyosha Efros and Angjoo Kanazawa (UC Berkeley). Before that, I completed my engineering degree (equivalent to an M.Sc.) at Mines Paris.

My research mainly focuses on learning from images without annotations, through self-supervised and unsupervised methods (see representative papers). I am always looking for PhD interns; feel free to reach out! scholar | twitter



Differentiable Blocks World: Qualitative 3D Decomposition by Rendering Primitives
Tom Monnier, Jake Austin, Angjoo Kanazawa, Alexei A. Efros, Mathieu Aubry
NeurIPS 2023
paper | webpage | code | slides | bibtex

We compute a primitive-based 3D reconstruction from multiple views by optimizing textured superquadric meshes with learnable transparency.

MACARONS: Mapping And Coverage Anticipation with RGB Online Self-supervision
Antoine Guédon, Tom Monnier, Pascal Monasse, Vincent Lepetit
CVPR 2023
paper | webpage | code | video | slides | bibtex

We introduce MACARONS, a method that learns in a self-supervised fashion to explore new environments and reconstruct them in 3D using RGB images only.

The Learnable Typewriter: A Generative Approach to Text Line Analysis
Ioannis Siglidis, Nicolas Gonthier, Julien Gaubil, Tom Monnier, Mathieu Aubry
arXiv 2023
paper | webpage | code | bibtex

We build upon sprite-based image decomposition approaches to design a generative method for character analysis and recognition in text lines.

Towards Unsupervised Visual Reasoning: Do Off-The-Shelf Features Know How to Reason?
Monika Wysoczanska, Tom Monnier, Tomasz Trzcinski, David Picard
NeurIPS Workshops 2022
paper | bibtex

A Transformer-based framework to evaluate off-the-shelf features (object-centric and dense representations) on the visual question answering (VQA) reasoning task.

Share With Thy Neighbors: Single-View Reconstruction by Cross-Instance Consistency
Tom Monnier, Matthew Fisher, Alexei A. Efros, Mathieu Aubry
ECCV 2022
paper | webpage | code | video | slides | bibtex

We present UNICORN, a self-supervised approach leveraging the consistency across different single-view images for high-quality 3D reconstructions.

Representing Shape Collections with Alignment-Aware Linear Models
Romain Loiseau, Tom Monnier, Mathieu Aubry, Loïc Landrieu
3DV 2021
paper | webpage | code | bibtex

We characterize 3D shapes as affine transformations of linear families learned without supervision, and showcase the advantages of this representation on large shape collections.

Unsupervised Layered Image Decomposition into Object Prototypes
Tom Monnier, Elliot Vincent, Jean Ponce, Mathieu Aubry
ICCV 2021
paper | webpage | code | video | slides | bibtex

We discover the objects recurrent in unlabeled image collections by modeling images as a composition of learnable sprites.

Deep Transformation-Invariant Clustering
Tom Monnier, Thibault Groueix, Mathieu Aubry
NeurIPS 2020 (oral presentation)
paper | webpage | code | video | slides | bibtex

A simple adaptation of K-means to make it work on pixels! We align prototypes to each sample image before computing cluster distances.
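The alignment-then-assign idea can be sketched in a few lines. This is a minimal illustration only: it uses brute-force integer translations as a stand-in for the learned transformation networks in the paper, and all function names are illustrative.

```python
import numpy as np

def best_aligned_distance(image, prototype, max_shift=2):
    # Align the prototype to the image by trying small translations
    # (a simplified stand-in for the learned transformations) and
    # keep the smallest squared distance.
    best = np.inf
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(prototype, (dy, dx), axis=(0, 1))
            best = min(best, float(((image - shifted) ** 2).sum()))
    return best

def dti_assign(images, prototypes, max_shift=2):
    # Transformation-invariant assignment step: each image goes to the
    # prototype with the smallest alignment-aware distance.
    return [
        int(np.argmin([best_aligned_distance(im, p, max_shift)
                       for p in prototypes]))
        for im in images
    ]
```

In the full method, the prototypes themselves are also updated, and the alignment is predicted by small networks rather than searched exhaustively.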

docExtractor: An off-the-shelf historical document element extraction
Tom Monnier, Mathieu Aubry
ICFHR 2020 (oral presentation)
paper | webpage | code | video | slides | bibtex

Leveraging synthetic training data to efficiently extract visual elements from historical document images.

Academic activities

Invited talks

© You are welcome to copy the code; please attribute the source with a link back to this page.
Template inspired by [1], [2], [3]. Misspellings: monier, monnie, monie, monniert.

Last updated: October 2023