Learning-based Localization through Monocular Camera
Detection
By performing object detection (YOLO / DeepSORT / FastMOT…), we can get the bounding-box location in the image frame:
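As a minimal sketch of this step, the snippet below turns a detection into the image-frame quantity used in the next section. The `(x1, y1, x2, y2)` box format and the frame size are assumptions, not tied to any particular detector wrapper:

```python
# Sketch: distance from bounding-box center to image center (pixels).
# Box format (x1, y1, x2, y2) is an assumed convention.
import math

def bbox_to_L_img(bbox, frame_w, frame_h):
    x1, y1, x2, y2 = bbox
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0   # bounding-box center
    return math.hypot(cx - frame_w / 2.0, cy - frame_h / 2.0)

# Example: one detection in a 1920x1080 frame
print(bbox_to_L_img((900.0, 500.0, 1000.0, 560.0), 1920, 1080))
```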
Inference
We found that there is an approximately quadratic relationship between L_img (the distance from the center of the bounding box to the center of the image, in the image frame) and L_earth (the distance from the object to the camera, in the earth frame). This is not surprising, since under perspective projection it is essentially a distance problem.
Thus, we developed a regression model with a quadratic kernel (Qua) to infer L_earth from L_img, sketched below.
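One way to realize such a quadratic regression is degree-2 polynomial features followed by least squares; the training pairs below are hypothetical placeholders, not the competition data:

```python
# Sketch: quadratic-kernel regression (Qua) from L_img to L_earth.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

# Hypothetical training samples: L_img in pixels, L_earth in meters
L_img = np.array([[50.0], [120.0], [200.0], [310.0], [450.0]])
L_earth = np.array([12.0, 30.0, 61.0, 115.0, 210.0])

# Degree-2 polynomial features + least squares = quadratic fit
qua = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
qua.fit(L_img, L_earth)

print(qua.predict(np.array([[260.0]])))  # L_earth estimate for a new L_img
```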
Comparing it against Gaussian process regression (GPR) and a neural network (NN) gives the results shown below:
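A sketch of how such a comparison could be run, with scikit-learn's GPR and MLP as stand-ins for the models above; the data here is synthetic, so the printed scores are illustrative only, not the competition results:

```python
# Sketch: cross-validated comparison of Qua vs. GPR vs. NN on synthetic data.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.linear_model import LinearRegression
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.uniform(20, 500, size=(200, 1))                            # L_img
y = 0.001 * X[:, 0] ** 2 + 0.1 * X[:, 0] + rng.normal(0, 2, 200)  # noisy quadratic

models = {
    "Qua": make_pipeline(PolynomialFeatures(2), LinearRegression()),
    "GPR": make_pipeline(StandardScaler(),
                         GaussianProcessRegressor(kernel=ConstantKernel() * RBF(),
                                                  normalize_y=True)),
    "NN":  make_pipeline(StandardScaler(),
                         MLPRegressor(solver="lbfgs", max_iter=5000)),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="neg_mean_absolute_error")
    print(f"{name}: MAE = {-scores.mean():.2f}")
```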
Interpolation and smoothing
This is done in three steps (a sketch follows the list):
- moving-average weighted smoothing and outlier-rejection (pandas)
- spline interpolation (scipy)
- extrapolation based on first-order kinematics
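The sketch below walks through all three steps under assumed data shapes: a pandas Series of noisy L_earth estimates indexed by time in seconds, with illustrative window sizes and thresholds:

```python
# Sketch: smoothing, interpolation, and kinematic extrapolation.
import numpy as np
import pandas as pd
from scipy.interpolate import UnivariateSpline

rng = np.random.default_rng(1)
t = np.arange(0.0, 10.0, 0.5)
est = pd.Series(2.0 * t + rng.normal(0, 0.3, t.size), index=t)
est.iloc[7] += 8.0                                    # inject an outlier

# 1) outlier rejection + moving-average smoothing (pandas)
med = est.rolling(5, center=True, min_periods=1).median()
clean = est.where((est - med).abs() < 3.0)            # reject far-off points
smooth = clean.rolling(3, center=True, min_periods=1).mean()

# 2) spline interpolation over the valid samples (scipy)
valid = smooth.dropna()
spline = UnivariateSpline(valid.index.values, valid.values, s=1.0)
t_dense = np.linspace(t[0], t[-1], 100)
interp = spline(t_dense)

# 3) extrapolation with first-order kinematics (constant velocity)
v = (valid.iloc[-1] - valid.iloc[-2]) / (valid.index[-1] - valid.index[-2])
t_future = 12.0
print(valid.iloc[-1] + v * (t_future - valid.index[-1]))
```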
Final example result
Accomplishment
Won 2nd place in the Navy AI-Track-at-Sea competition (accuracy: 57%). (Fall 2020)
Ongoing
A neural network has been added to the inference stage to learn the nonlinearity, improving the localization accuracy to 99.7%.
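A minimal sketch of such a network, assuming a small fully connected regressor on L_img; the layer sizes and training data below are illustrative placeholders, not the actual configuration:

```python
# Sketch: small MLP regressor learning the L_img -> L_earth nonlinearity.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPRegressor

X = np.linspace(20, 500, 300).reshape(-1, 1)     # placeholder L_img samples
y = 0.001 * X[:, 0] ** 2 + 0.1 * X[:, 0]         # placeholder L_earth targets

nn = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(32, 32), activation="relu",
                 solver="lbfgs", max_iter=5000, random_state=0),
)
nn.fit(X, y)
print(nn.predict(np.array([[260.0]])))           # L_earth estimate
```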