Saturday, April 01, 2023

Explainable multi-modal 3D Object Detection with Transformers

Topic and Goal of the Thesis

3D object detection is the task of detecting and localizing objects in a three-dimensional environment, typically using data from multiple sensors such as cameras, lidar, and radar. It is a challenging problem with many applications, including robotics, autonomous vehicles, and augmented reality.

One approach to 3D object detection that has gained popularity in recent years is the use of transformers, a type of deep learning model that is particularly well suited to processing sequential data, such as the time-series data often generated by sensors. By using a transformer to combine the data from multiple sensors, it is possible to capture complex relationships and patterns that may not be apparent when each sensor is considered individually. However, the model's decision-making and reasoning are often not understandable to humans.
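To illustrate how a transformer can combine data from several sensors, the following minimal sketch runs a single head of scaled dot-product self-attention over the concatenated tokens of two modalities. It is not the detector developed in the thesis: the toy feature vectors, the names `camera_tokens` and `lidar_tokens`, and the omission of learned projection matrices are all assumptions made for brevity.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption: queries, keys, and values are the tokens themselves
    (identity projections); a real model would learn these projections.
    Returns (outputs, attention_weights).
    """
    d = len(tokens[0])
    scale = math.sqrt(d)
    weights = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in tokens]
        weights.append(softmax(scores))
    outputs = [
        [sum(w * v[j] for w, v in zip(row, tokens)) for j in range(d)]
        for row in weights
    ]
    return outputs, weights

# Hypothetical toy features, one 4-dimensional vector per token.
camera_tokens = [[1.0, 0.0, 0.5, 0.0], [0.0, 1.0, 0.0, 0.5]]
lidar_tokens = [[0.9, 0.1, 0.4, 0.0], [0.0, 0.0, 1.0, 1.0], [0.1, 0.9, 0.0, 0.6]]

# Fusion by concatenation: every token can attend to every other token,
# regardless of which sensor it came from.
tokens = camera_tokens + lidar_tokens
fused, attn = self_attention(tokens)

# Share of the first camera token's attention that goes to lidar tokens.
lidar_mass = sum(attn[0][len(camera_tokens):])
print(f"camera token 0 attends to lidar with weight {lidar_mass:.3f}")
```

Each row of `attn` sums to one, so `lidar_mass` can be read as the share of attention that one camera token assigns to the other sensor.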

A thesis on this topic could focus on developing explainable 3D object detection methods, which aim to provide insight into the reasoning behind the model's predictions and thereby improve its interpretability. This could involve techniques such as attention mechanisms or visualization tools to understand how the transformer processes the data from multiple sensors to make its predictions.
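One attention-based explanation technique that could serve as a starting point is attention rollout, a known heuristic that multiplies the per-layer attention matrices, each mixed with the identity to account for residual connections, to estimate how much each input token contributes to each output token. The sketch below is an illustration only: the posting does not name this method, and the two attention matrices and the token-to-modality assignment are made-up toy values.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def attention_rollout(layers):
    """Attention rollout over a list of row-stochastic attention matrices.

    `layers` is ordered from the first layer to the last. Each matrix is
    averaged with the identity (residual connection) before multiplication.
    """
    n = len(layers[0])
    rollout = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for attn in layers:
        mixed = [
            [0.5 * attn[i][j] + (0.5 if i == j else 0.0) for j in range(n)]
            for i in range(n)
        ]
        rollout = matmul(mixed, rollout)
    return rollout

def modality_share(relevance_row, modality_of_token):
    """Sum the relevance assigned to each modality for one output token."""
    shares = {}
    for weight, modality in zip(relevance_row, modality_of_token):
        shares[modality] = shares.get(modality, 0.0) + weight
    return shares

# Hypothetical attention matrices of a two-layer model with three tokens.
layer1 = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.1, 0.2, 0.7]]
layer2 = [[0.5, 0.4, 0.1], [0.3, 0.4, 0.3], [0.2, 0.2, 0.6]]
modality_of_token = ["camera", "lidar", "lidar"]  # assumed token layout

relevance = attention_rollout([layer1, layer2])
shares = modality_share(relevance[0], modality_of_token)
for modality, share in shares.items():
    print(f"{modality}: {share:.3f}")
```

Because every mixed matrix is row-stochastic, the shares for one output token sum to one and can be visualized directly, for example as a bar chart per detected object.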

[1] Explaining object detectors: the case of transformer architectures - Baptiste Abeloos, Stéphane Herbin – 2022

Working Points

  • Literature research on explainable 3D object detection with transformers
  • Development of a multi-modal 3D object detector with explainability features
  • Training of the neural network on publicly available datasets on our compute cluster
  • Implementation of visualization tools that help to understand how the transformer makes its predictions


  • Python / Machine Learning / Computer Vision
  • Enthusiasm for Machine Learning

What We Offer

  • Private NVIDIA A100 compute cluster with SSH access
  • Existing learning framework for machine learning applications
  • Remote work / Office

Note: Please attach a brief résumé and a grade summary.


Till Beemelmanns M.Sc.
+49 241 80 26533

Type of work

Bachelor's thesis, Master's thesis


Earliest possible date

Prior knowledge

Python, Machine Learning


German, English

Research area

Vehicle Intelligence & Automated Driving


Institute for Automotive Engineering
RWTH Aachen University
Steinbachstraße 7
52074 Aachen · Germany
+49 241 80 25600
