Research Landscape

This section gives an overview of our lab's active research projects, public datasets, and benchmarks. Our research focuses on perception, forecasting, and navigation/planning that enable embodied AI agents to perceive, predict, and interact with dynamic environments.

Basic Vision Tasks

Perception

Focuses on scene understanding tasks, including object detection, segmentation (instance, semantic, and panoptic), and depth estimation from multi-modal sequences, using supervised, semi-supervised, few-shot, and self-supervised learning.

Multi-Object Tracking

Perception

Designing end-to-end MOT frameworks to track an unknown and time-varying number of objects in crowded, unconstrained environments, addressing track initiation, termination, and occlusion handling.
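
To make the terminology concrete, the sketch below shows tracking-by-detection in its simplest form: greedy IoU association with explicit track initiation and termination. It is a minimal illustrative example rather than the lab's end-to-end framework; the box format, IoU threshold, and miss budget are assumptions made for the sketch.

```python
# Minimal tracking-by-detection sketch (illustrative only, not the lab's framework).
# Boxes are axis-aligned (x1, y1, x2, y2); thresholds are arbitrary example values.

def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

class Track:
    def __init__(self, track_id, box):
        self.id, self.box, self.misses = track_id, box, 0

def update_tracks(tracks, detections, iou_thr=0.3, max_misses=5):
    """One frame of greedy association: match existing tracks, then initiate/terminate."""
    next_id = max((t.id for t in tracks), default=0) + 1
    unmatched = list(detections)
    for t in tracks:
        best = max(unmatched, key=lambda d: iou(t.box, d), default=None)
        if best is not None and iou(t.box, best) >= iou_thr:
            t.box, t.misses = best, 0     # matched: update position, reset miss count
            unmatched.remove(best)
        else:
            t.misses += 1                 # occluded or undetected in this frame
    tracks = [t for t in tracks if t.misses <= max_misses]    # termination
    for det in unmatched:                                     # initiation
        tracks.append(Track(next_id, det))
        next_id += 1
    return tracks

tracks = update_tracks([], [[0.0, 0.0, 10.0, 10.0]])       # frame 1: initiates track 1
tracks = update_tracks(tracks, [[1.0, 0.0, 11.0, 10.0]])   # frame 2: matched (IoU ~ 0.82)
```

End-to-end frameworks learn the association and track life-cycle decisions that this sketch hard-codes.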

Human Face, Emotion, Action, and Social Group & Activity Detection

Perception

Human behaviour understanding in videos: simultaneously grouping people by their social interactions, predicting individual actions, and recognising the social activity of each group.

3D Reconstruction and Mapping

Perception

3D localisation, reconstruction, and mapping of objects and human bodies in dynamic environments for high-level 3D scene understanding.

Multi-task 3D Visual Perception System for a Mobile Robot

Perception

Designing a multi-task perception system for autonomous agents (e.g. social robots) that spans basic- to high-level perception and reasoning, and creating large-scale datasets for training and evaluation.

Visual Reasoning

Perception

Interpreting, analysing, and making sense of visual information: recognising patterns, spatial relationships, and logical structures in images or diagrams.

Human Trajectory/Body Motion Forecasting

Forecasting

Developing physically and socially plausible frameworks to predict human trajectory and body pose dynamics in complex environments.
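
As a point of reference, the sketch below shows the constant-velocity baseline commonly used in trajectory forecasting; physically and socially plausible models aim to improve on exactly this kind of naive extrapolation. The function name and array shapes are assumptions made for illustration.

```python
# Constant-velocity trajectory forecasting baseline (illustrative only).
import numpy as np

def constant_velocity_forecast(observed, horizon):
    """observed: (T, 2) past x/y positions; returns (horizon, 2) predicted positions."""
    velocity = observed[-1] - observed[-2]         # last observed displacement per step
    steps = np.arange(1, horizon + 1)[:, None]     # (horizon, 1) step indices
    return observed[-1] + steps * velocity         # linear extrapolation

# Example: a pedestrian moving ~1 m per step along x.
past = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
print(constant_velocity_forecast(past, horizon=3))  # [[3. 0.] [4. 0.] [5. 0.]]
```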

Active Visual Navigation in an Unexplored Environment

Navigation

Using deep learning to make informed predictions about scene layout, enabling robots to navigate unseen environments by following human-like instructions.

Single or Multi-UAV Planning for Discovering and Tracking Mobile Objects

Navigation

Online path planning for UAV-based localisation and tracking of an unknown and time-varying number of objects, addressing both single-UAV and multi-UAV (centralised or decentralised) settings.
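
To make the online-planning setting concrete, the sketch below shows a naive greedy waypoint choice for a single UAV given predicted target positions. It is purely illustrative and not the lab's planner; the candidate set and the distance-based cost are assumptions made for the example.

```python
# Greedy single-UAV waypoint selection sketch (illustrative only, not the lab's planner).
import numpy as np

def next_waypoint(predicted_targets, candidates):
    """Pick the candidate waypoint minimising total distance to predicted target positions."""
    costs = [np.sum(np.linalg.norm(predicted_targets - c, axis=1)) for c in candidates]
    return candidates[int(np.argmin(costs))]

uav = np.array([0.0, 0.0])
targets = np.array([[5.0, 5.0], [6.0, 4.0]])                              # predicted targets
candidates = uav + np.array([[1.0, 0], [0, 1.0], [1.0, 1.0], [-1.0, 0]])  # reachable next steps
print(next_waypoint(targets, candidates))                                 # -> [1. 1.]
```

A multi-UAV, decentralised variant would additionally need to divide targets among vehicles and coordinate their decisions, which is what the centralised/decentralised distinction above refers to.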

Datasets and Benchmarks

JRDB: JackRabbot Dataset and Benchmark

MOT20: Multi-object Tracking in Crowded Scenes

SoMoF: Social Motion Forecasting Benchmark

Completion3D: Stanford 3D Point Cloud Completion

LISC: Leukocyte Images for Segmentation and Classification
