Providing 3D vision and recognition in robots can be hard and expensive. Good 3D stereo cameras (such as the ZED from StereoLabs) are still expensive for hobbyists; furthermore, the huge amount of data produced by this type of camera requires a dedicated high-performance processor (GPU), which increases the total cost even more.
In this post I will show a very cheap solution for 3D object localization: you only need a simple camera (such as a webcam) and printed QRCodes to attach to the objects you want to track.
How it works
The idea is really simple: each QRCode contains a set of points that the reader uses to locate the reading boundaries. Since QRCodes are two-dimensional, those points are enough to identify the QRCode plane. Note that a linear barcode (such as EAN-13 or Code 39) cannot be used for this purpose, since it provides only 2 points (i.e. it is one-dimensional).
The QRCode points can be used to locate the object in space, while the QRCode content can be used to identify the object.
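The plane argument can be made concrete with a few lines of Python. This is only a sketch: the function name plane_normal and the sample coordinates are made up for illustration and are not part of zxing or of the package described in this post. Three non-collinear points give a well-defined plane normal, while the two points of a linear barcode do not.

```python
import math
from typing import Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def plane_normal(points: Sequence[Vec3], eps: float = 1e-12) -> Optional[Vec3]:
    """Unit normal of the plane through the first three points.

    Returns None when there are fewer than three points or when the
    three points are collinear, because no unique plane exists then.
    """
    if len(points) < 3:
        return None
    p0, p1, p2 = points[0], points[1], points[2]
    u = (p1[0] - p0[0], p1[1] - p0[1], p1[2] - p0[2])
    v = (p2[0] - p0[0], p2[1] - p0[1], p2[2] - p0[2])
    # Cross product u x v is perpendicular to both edges of the triangle.
    n = (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )
    length = math.sqrt(n[0] ** 2 + n[1] ** 2 + n[2] ** 2)
    if length < eps:
        return None
    return (n[0] / length, n[1] / length, n[2] / length)


# Hypothetical coordinates, in meters.
qr_points = [(0.0, 0.0, 0.0), (0.05, 0.0, 0.0), (0.0, 0.05, 0.0)]
barcode_points = [(0.0, 0.0, 0.0), (0.05, 0.0, 0.0)]

print(plane_normal(qr_points))       # a unit vector along +z
print(plane_normal(barcode_points))  # None: two points do not define a plane
```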
To extract information from a QRCode I used the zxing library. Processing a QRCode with this reader produces:
- the content of the QRCode as a string
- a set of points
With the points and the content of the QRCode, we can now use OpenCV to calculate the position and orientation of the QRCode. The OpenCV function used for this purpose is solvePnP(…): it finds the pose of a 3D object (in this case the QRCode) from the 2D projection of the object itself (in this case the set of points that zxing extracts from the camera image). It is based on the pin-hole camera model and requires:
- a set of 3D points that compose the object. In our scenario these are the QRCode points, using the first point as origin.
- a set of 2D points that compose the 2D projection of the 3D object’s points. In our scenario these are the points extracted by zxing.
- information about the camera (focal length, distortion model etc…).
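To see how the three inputs relate, here is a minimal sketch of the pin-hole forward model in plain Python. solvePnP solves the inverse problem: given the 3D object points, their 2D projections and the camera parameters, it recovers the pose. The intrinsics (FX, FY, CX, CY), the distance d and the pose below are made-up values, the helper project_points is hypothetical, and lens distortion is ignored.

```python
from typing import List, Sequence, Tuple


def project_points(
    points_cam: Sequence[Tuple[float, float, float]],
    fx: float, fy: float, cx: float, cy: float,
) -> List[Tuple[float, float]]:
    """Project 3D points given in the camera frame (z pointing forward)
    to pixel coordinates with the ideal pin-hole model (no distortion)."""
    pixels = []
    for x, y, z in points_cam:
        if z <= 0:
            raise ValueError("point is behind the camera")
        pixels.append((fx * x / z + cx, fy * y / z + cy))
    return pixels


# Hypothetical camera intrinsics for a 640x480 image.
FX, FY, CX, CY = 600.0, 600.0, 320.0, 240.0

# 3D object points: QRCode points with the first point as origin (meters).
d = 0.05
object_points = [(0.0, 0.0, 0.0), (d, 0.0, 0.0), (0.0, d, 0.0)]

# Assumed pose: QRCode facing the camera, 0.5 m in front of it (no rotation).
tvec = (0.0, 0.0, 0.5)
points_cam = [(x + tvec[0], y + tvec[1], z + tvec[2]) for x, y, z in object_points]

# 2D image points: what a reader such as zxing would report, in pixels.
image_points = project_points(points_cam, FX, FY, CX, CY)
print(image_points)
```

With these numbers the origin lands on the image center and the other two points land about 60 pixels away from it, one along each image axis.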
The results of solvePnP are two vectors, a translation vector and a rotation vector, containing the object position and orientation in the camera reference frame.
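The rotation vector returned by solvePnP is in axis-angle form: its direction is the rotation axis and its length is the angle in radians. The sketch below converts such a vector to a rotation matrix with the Rodrigues formula and uses it to move an object point into the camera frame. It is written in plain Python for illustration only; the function names and the sample pose are made up, and OpenCV offers its own Rodrigues conversion.

```python
import math
from typing import List, Sequence, Tuple


def rodrigues(rvec: Sequence[float]) -> List[List[float]]:
    """Convert an axis-angle rotation vector to a 3x3 rotation matrix."""
    rx, ry, rz = rvec
    theta = math.sqrt(rx * rx + ry * ry + rz * rz)
    if theta < 1e-12:
        # No rotation: return the identity matrix.
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    kx, ky, kz = rx / theta, ry / theta, rz / theta
    c, s = math.cos(theta), math.sin(theta)
    v = 1.0 - c
    return [
        [c + kx * kx * v, kx * ky * v - kz * s, kx * kz * v + ky * s],
        [ky * kx * v + kz * s, c + ky * ky * v, ky * kz * v - kx * s],
        [kz * kx * v - ky * s, kz * ky * v + kx * s, c + kz * kz * v],
    ]


def object_to_camera(
    point: Sequence[float], rvec: Sequence[float], tvec: Sequence[float]
) -> Tuple[float, float, float]:
    """Apply p_cam = R * p_obj + t to one object point."""
    R = rodrigues(rvec)
    x = R[0][0] * point[0] + R[0][1] * point[1] + R[0][2] * point[2] + tvec[0]
    y = R[1][0] * point[0] + R[1][1] * point[1] + R[1][2] * point[2] + tvec[1]
    z = R[2][0] * point[0] + R[2][1] * point[1] + R[2][2] * point[2] + tvec[2]
    return (x, y, z)


# Made-up pose: QRCode rotated 90 degrees about the camera z axis, 0.5 m away.
rvec = (0.0, 0.0, math.pi / 2)
tvec = (0.0, 0.0, 0.5)

# The QRCode point at (d, 0, 0) ends up at roughly (0, d, 0.5) in the camera frame.
print(object_to_camera((0.05, 0.0, 0.0), rvec, tvec))
```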
The ROS implementation
Of course I implemented a ROS package to integrate this mechanism into ROS-based robots. All source code and examples are provided as open source on GitHub.
The implementation is based on two packages:
- zxing: contains the ROS port of zxing (including a utils class to import OpenCV-generated images)
- zxing_cv: contains nodes (like qr_detector) using zxing
The qr_detector_node and qr_detector_nodelet
The QR detection engine is provided by qr_detector_node and qr_detector_nodelet. The node and the nodelet provide exactly the same behavior: refer to the ROS Wiki for more information about nodelets. Below I will refer to the node only, but the same statements apply to the nodelet too.
qr_detector node subscribes to:
- /camera/image (sensor_msgs::Image): provides the raw image to be analyzed. Note that the camera info related to each image is also read in order to get the camera properties (focal length, distortion model…)
qr_detector node publishes:
- image_optimized (sensor_msgs::Image): provides the pre-processed, optimized image. Before QRCodes can be extracted from an image, the image must be pre-processed: the raw image is converted to grayscale and then passed through an adaptive threshold filter to reduce noise. This phase is really important for good results; the image itself is published for debug purposes only
- image_debug (sensor_msgs::Image): provides the raw camera image with the QRCode information overlaid. This image is intended for debug only
- qr_codes (zxing_cv::QRCodeArray): provides an array of zxing_cv::QRCode messages, one for every QRCode found in the raw image. Each zxing_cv::QRCode contains the QRCode content and its position and orientation in the camera reference frame
- markers (visualization_msgs::MarkerArray): provides markers to be shown in RViz
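The pre-processing behind image_optimized (grayscale conversion followed by an adaptive threshold) can be sketched in plain Python as follows. This is an illustration under assumptions, not the node's code: it uses a mean-based threshold with a window clipped at the image borders, whereas the node relies on OpenCV, whose border handling differs and which also offers a Gaussian-weighted variant. The names to_gray and adaptive_threshold and the parameters block_size and c are chosen for this example.

```python
from typing import List, Tuple


def to_gray(rgb: List[List[Tuple[int, int, int]]]) -> List[List[int]]:
    """Convert an RGB image (rows of (r, g, b) tuples) to grayscale."""
    return [
        [int(round(0.299 * r + 0.587 * g + 0.114 * b)) for r, g, b in row]
        for row in rgb
    ]


def adaptive_threshold(gray: List[List[int]], block_size: int, c: float) -> List[List[int]]:
    """Mean adaptive threshold: a pixel becomes 255 when it is brighter than
    the mean of its block_size x block_size neighborhood minus c, else 0."""
    if block_size < 3 or block_size % 2 == 0:
        raise ValueError("block_size must be an odd number >= 3")
    h, w = len(gray), len(gray[0])
    r = block_size // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0, 0
            # Neighborhood window, clipped at the image borders.
            for yy in range(max(0, y - r), min(h, y + r + 1)):
                for xx in range(max(0, x - r), min(w, x + r + 1)):
                    total += gray[yy][xx]
                    count += 1
            out[y][x] = 255 if gray[y][x] > total / count - c else 0
    return out


# Tiny made-up image: bright background with one dark "module" in the middle.
rgb = [[(200, 200, 200)] * 5 for _ in range(5)]
rgb[2][2] = (20, 20, 20)

gray = to_gray(rgb)
binary = adaptive_threshold(gray, block_size=3, c=5)
for row in binary:
    print(row)
```

Because every pixel is compared with its own neighborhood rather than with one global value, uneven lighting across the image has less influence on which pixels end up black or white.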
qr_detector node parameters:
- adaptive_threshold_block_size: size of the pixel neighborhood used to calculate the threshold value for each pixel: 3, 5, 7, and so on. See the OpenCV documentation for more info
- adaptive_threshold_threshold: the threshold. See the OpenCV documentation for more info
- qr_code_points: a set of (at least three) points representing the real QRCode. A simple way to get the values for this parameter is to measure the distance between the QRCode points with an ordinary ruler. You can use this formula, where d is the measured distance:
[0 0 0 d 0 0 0 d 0]
- ignore_point_in_excess: qr_detector requires that the number of points extracted from a QRCode matches the number of points provided by the qr_code_points parameter. Setting this parameter to true causes the qr_detector node to ignore the excess points in the QRCode
- marker_scale: an array of three doubles (x, y, z) representing the scale of the markers displayed by RViz
- marker_point_color: an array of four doubles (r, g, b, a) representing the color of the markers displayed by RViz
- marker_text_color: an array of four doubles (r, g, b, a) representing the color of the text markers displayed by RViz
adaptive_threshold_block_size and adaptive_threshold_threshold support dynamic reconfiguration.
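As a sketch of how qr_code_points and ignore_point_in_excess fit together, the Python below builds the flat array from a measured distance d and pairs detected points with configured points. Several things here are assumptions: the flat array is read as consecutive (x, y, z) triples, excess detected points are dropped from the end of the list, and the helper names are invented for this example. The actual node may behave differently.

```python
from typing import List, Optional, Sequence, Tuple

Point2 = Tuple[float, float]
Point3 = Tuple[float, float, float]


def qr_code_points_from_distance(d: float) -> List[float]:
    """Flat [0 0 0 d 0 0 0 d 0] array for a measured distance d."""
    return [0.0, 0.0, 0.0, d, 0.0, 0.0, 0.0, d, 0.0]


def to_triples(flat: Sequence[float]) -> List[Point3]:
    """Assumption: the flat array holds consecutive (x, y, z) triples."""
    if len(flat) % 3 != 0:
        raise ValueError("expected a multiple of three values")
    return [(flat[i], flat[i + 1], flat[i + 2]) for i in range(0, len(flat), 3)]


def match_points(
    image_points: Sequence[Point2],
    object_points: Sequence[Point3],
    ignore_point_in_excess: bool,
) -> Optional[List[Tuple[Point3, Point2]]]:
    """Pair each configured 3D point with a detected 2D point.

    Returns None when the counts do not match and the excess cannot be ignored.
    """
    detected = list(image_points)
    if len(detected) > len(object_points) and ignore_point_in_excess:
        # Assumption: the extra points are the trailing ones.
        detected = detected[: len(object_points)]
    if len(detected) != len(object_points):
        return None
    return list(zip(object_points, detected))


object_points = to_triples(qr_code_points_from_distance(0.05))

# Made-up detection with one point more than configured.
detected = [(320.0, 240.0), (380.0, 240.0), (320.0, 300.0), (370.0, 290.0)]

print(match_points(detected, object_points, ignore_point_in_excess=False))  # None
print(len(match_points(detected, object_points, ignore_point_in_excess=True)))  # 3
```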
zxing_cv contains a demo launch file that runs a cv_camera node to capture images, a qr_detector node to locate QRCodes in 3D space, and some GUIs to check what is going on.
In the screenshot you can see:
- in the bottom-right corner, the debug image: this is the image provided by the camera, on which qr_detector draws the points and content of each discovered QRCode
- in the top-right corner, the optimized image: before QRCode extraction, the image provided by the camera is converted to grayscale and then passed through an adaptive threshold so that the QRCode reader can extract codes more reliably
- in the top-left corner, the dynamic reconfigure GUI: you can use it to adjust the adaptive threshold parameters. Any change is automatically reflected in the optimized and debug images
- in the bottom-left corner, the topic GUI: it shows the qr_detector result published on the /qr_detector/qr_codes topic. It contains an array of qr_code messages (one for every QRCode found): each qr_code message contains the extracted data and the position and orientation of the QRCode with respect to the camera
Finally, we have reached the goal of locating an object in 3D space in an easy way using cheap hardware. Of course this system does not provide the same results as complex 3D vision systems based on 3D cameras, but it provides a simple method suitable for controlled environments where attaching a QRCode to every object of interest is not a problem.