Mission 7 took a monumental leap by requiring autonomous aerial robots to interact with and control autonomous ground robots. Teams were tasked with developing systems to herd ground robots out one end of an arena in the absence of 3D cues such as walls. The ground robots could only be interacted with by touch: a top touch commanded a 45° clockwise turn, and a blocking action resulted in a 180° turn. To complicate matters, the ground robots made a 180° turn every 20 seconds and added up to 15° of trajectory noise every 5 seconds. The ground robots also collided with one another, so their paths quickly devolved into non-deterministic travel. Four obstacle robots in the middle of the arena further complicated navigation and obstacle avoidance. The aerial robots had to dynamically determine the best course of action to keep the ground robots from exiting on three of the four sides of the arena.
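
The ground robot rules above can be sketched as a small heading-only model. This is an illustrative sketch, not the competition's reference implementation: the class and constant names are hypothetical, the noise is assumed to be uniformly distributed within ±15°, and the noise step is assumed to be skipped when it coincides with a periodic reversal.

```python
import random

TOP_TOUCH_TURN_DEG = 45.0   # clockwise turn commanded by a top touch
REVERSE_TURN_DEG = 180.0    # turn after a block or a periodic reversal
REVERSE_PERIOD_S = 20       # seconds between periodic reversals
NOISE_PERIOD_S = 5          # seconds between trajectory-noise updates
MAX_NOISE_DEG = 15.0        # largest noise magnitude (assumed uniform)


class GroundRobot:
    """Heading-only model; heading is in degrees, clockwise-positive."""

    def __init__(self, heading_deg=0.0, rng=None):
        self.heading_deg = heading_deg % 360.0
        self.elapsed_s = 0
        self.rng = rng or random.Random()

    def _turn(self, delta_deg):
        self.heading_deg = (self.heading_deg + delta_deg) % 360.0

    def top_touch(self):
        """Aerial robot touches the top plate."""
        self._turn(TOP_TOUCH_TURN_DEG)

    def block(self):
        """Aerial robot (or another robot) blocks the front."""
        self._turn(REVERSE_TURN_DEG)

    def tick(self):
        """Advance one second and apply any timed behaviour that is due."""
        self.elapsed_s += 1
        if self.elapsed_s % REVERSE_PERIOD_S == 0:
            self._turn(REVERSE_TURN_DEG)
        elif self.elapsed_s % NOISE_PERIOD_S == 0:
            self._turn(self.rng.uniform(-MAX_NOISE_DEG, MAX_NOISE_DEG))
```

A strategy planner could step such a model forward to estimate where a Roomba will be heading when the drone arrives, keeping in mind that collisions between robots are not modelled here.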


For target identification, the drone carries four Logitech C270 USB webcams and one Logitech C210 USB webcam. One camera is located at the bottom of the drone and faces straight downward. The other four cameras are mounted on the underside of each arm, facing outward. All five cameras are positioned to identify as many of the target Roombas surrounding the drone as possible so that their coordinates can be sent to the strategy node.

Two main objectives must be completed to successfully relay target information to the strategy node. The first objective is to take in the raw video data and detect the target Roombas so that the pixel coordinates of each Roomba's centroid in the frame can be sent as output. The second objective is to take these centroid pixel coordinates as input and output x and y coordinates relative to the drone, using a coordinate transformation and the known height and orientation of the camera with respect to the floor.


To accomplish the first objective, we use YOLO (You Only Look Once), an open-source software package developed by Joseph Redmon. Specifically, we chose the TinyYolo version because it is lighter and achieves higher frame rates. YOLO is a neural network that can identify user-defined objects once it has been trained. After training on approximately 2,000 photos of the target Roombas (with the Roombas labeled manually), YOLO can reliably identify target Roombas in real time. An example is displayed above.

After YOLO outputs the pixel locations of each identified target Roomba, the second objective is to transform these pixel coordinates into x-y coordinates relative to the drone. In addition to the pixel coordinate (PIX#), the known quantities are the camera height (z), the camera rotation angle relative to downward-facing (θ), the field of view (φ), and the total number of pixels in x and y (PIX_MAX).
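
One way to carry out this transformation is to cast a ray through the pixel and intersect it with the floor. The sketch below assumes an ideal pinhole camera with no lens distortion, a flat floor, a level drone, and separate horizontal and vertical fields of view. The function name and the mount_yaw_deg parameter (the clockwise heading of the camera on the airframe) are hypothetical and not taken from the team's code.

```python
import math


def pixel_to_ground(px, py, pix_max_x, pix_max_y, fov_x_deg, fov_y_deg,
                    height_m, tilt_deg, mount_yaw_deg=0.0):
    """Project a pixel onto a flat floor and return (x, y) in metres.

    x points forward and y points right in the drone frame. tilt_deg is
    the camera rotation away from straight down (0 = looking at the floor).
    """
    # Focal lengths in pixels from the field of view (pinhole model).
    fx = (pix_max_x / 2.0) / math.tan(math.radians(fov_x_deg) / 2.0)
    fy = (pix_max_y / 2.0) / math.tan(math.radians(fov_y_deg) / 2.0)

    # Normalised ray: u to the right of the image centre, v below it.
    u = (px - pix_max_x / 2.0) / fx
    v = (py - pix_max_y / 2.0) / fy

    theta = math.radians(tilt_deg)
    down = math.cos(theta) + v * math.sin(theta)  # downward ray component
    if down <= 1e-9:
        raise ValueError("pixel ray does not reach the floor")

    # Scale the ray so that it descends exactly height_m.
    forward = height_m * (math.sin(theta) - v * math.cos(theta)) / down
    right = height_m * u / down

    # Rotate from the camera's heading into the drone frame.
    psi = math.radians(mount_yaw_deg)
    x = forward * math.cos(psi) - right * math.sin(psi)
    y = forward * math.sin(psi) + right * math.cos(psi)
    return x, y
```

For example, with the camera tilted 45° from straight down at a height of 2 m, the centre pixel maps to a point 2 m ahead of the camera, since the forward distance is z·tan(θ) for a ray along the optical axis.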


Our system builds on the robust open-source project ArduCopter. ArduCopter, which runs on the Pixhawk, stabilizes the drone using data from the IMU, compass, altitude LiDAR, and optical flow sensor. The ArduCopter flight stack is maintained by hundreds of developers around the world, and the software is deployed on thousands of commercial and recreational drones. This large community of developers and users has produced refined, feature-rich control code.

We have made slight modifications to the ArduCopter firmware that allow us to set the EKF origin, which makes the drone capable of navigating in GPS-denied environments. For general waypoint navigation, the drone uses the ArduCopter EKF, which fuses data from the IMU, compass, altitude LiDAR, and optical flow sensor. The EKF position is queried via mavros so that our strategy code can use the drone's position to make intelligent decisions. The strategy node also takes in data from computer vision and the 2D LiDAR. It then determines a waypoint and mode, and the ROS controls node performs a sanity check on the waypoint and publishes it until the drone has reached its destination. Over time, the EKF estimate drifts from the true position. To combat this, the controls node applies a correction vector derived from our computer vision algorithm, which recognizes the absolute corners of the grid. The algorithm applies the following steps in order: Gaussian blur, color threshold, erosion, dilation, Canny edge detection, Hough transform, line intersection, and non-maximum suppression.
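
The last two steps of that pipeline can be sketched in plain Python. The sketch assumes the Hough transform returns each line in the common (rho, theta) form, where x·cos(theta) + y·sin(theta) = rho. Because the source does not say how corner candidates are scored, a greedy distance-based suppression stands in for non-maximum suppression here. All function names and thresholds are hypothetical.

```python
import itertools
import math


def intersect(line_a, line_b, min_angle_deg=30.0):
    """Intersect two (rho, theta) lines, theta in radians.

    Returns None when the lines meet at less than min_angle_deg, which
    filters out near-parallel pairs that cannot form a grid corner.
    """
    rho1, th1 = line_a
    rho2, th2 = line_b
    det = math.sin(th2 - th1)  # determinant of the 2x2 line system
    if abs(det) < math.sin(math.radians(min_angle_deg)):
        return None
    x = (rho1 * math.sin(th2) - rho2 * math.sin(th1)) / det
    y = (rho2 * math.cos(th1) - rho1 * math.cos(th2)) / det
    return x, y


def suppress(points, min_dist):
    """Keep a point only if no already-kept point lies within min_dist."""
    kept = []
    for p in points:
        if all(math.dist(p, q) >= min_dist for q in kept):
            kept.append(p)
    return kept


def grid_corners(lines, min_dist=10.0):
    """Intersect every pair of lines and merge duplicate corners."""
    candidates = []
    for line_a, line_b in itertools.combinations(lines, 2):
        point = intersect(line_a, line_b)
        if point is not None:
            candidates.append(point)
    return suppress(candidates, min_dist)
```

Duplicate Hough lines along the same painted grid line produce clusters of nearly identical intersections, which is why a suppression step is needed before the corners can be compared against the known grid layout.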


To safely test our software throughout the development process, the team implemented a simulation using the Gazebo simulator. Using this tool, TAR was able to run full-scale software-in-the-loop simulations without jeopardizing hardware with each design iteration.


To employ the Gazebo framework, the team imported a standard quadcopter model from the ArduCopter standard library and Roomba models from the Gazebo model library, which were then updated with the color-coded plates and obstacle tubes prescribed for the competition. Additionally, the ground plane in the simulated arena was changed to match the texture expected at the competition.


Gazebo was chosen because of its compatibility with ArduPilot and with ROS, the internal communication framework employed onboard our drone. Because of this compatibility, the exact same software that runs onboard the drone could be run within the simulation, complete with sensor feedback generated by the simulated environment. C++ scripts were then used to enforce the correct Roomba behaviors, scripting the robots in the same messaging environment as the quadcopter itself (ROS).


Overall, modeling our software behavior within a Gazebo simulation greatly expedited the development process. By standardizing the simulation setup, tests could be conducted on multiple computers at any time, instead of requiring the sluggish process of updating, calibrating, and flying the physical drone. Additionally, simulation protected the hardware from the accidents and unintentional flight behavior inherent to prototype software.


The vehicle uses a quad-rotor layout that is symmetrical about the x and y axes. This provides a balanced airframe optimized for omnidirectional maneuvering. The frame is lifted by four T-Motor U3 KV700 motors. Each motor provides 1.8 [kg] of thrust at 100% throttle when paired with a 4-cell battery and 13 [in] propellers, for a total of 7.2 [kg] of thrust. With a vehicle weight of 3.15 [kg], this gives a thrust-to-weight ratio of approximately 2.3, which suits battery life and flight time.
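
The thrust budget can be checked directly from the figures quoted above. The short sketch below (the function name is hypothetical) computes the total thrust, the thrust-to-weight ratio, and the fraction of maximum thrust needed to hover. With 1.8 kg per motor, four motors, and a 3.15 kg vehicle, the ratio comes out at about 2.29.

```python
def thrust_budget(thrust_per_motor_kg, motor_count, vehicle_mass_kg):
    """Return (total thrust in kg, thrust-to-weight ratio, hover fraction)."""
    total_thrust_kg = thrust_per_motor_kg * motor_count
    ratio = total_thrust_kg / vehicle_mass_kg
    hover_fraction = 1.0 / ratio  # share of maximum thrust needed to hover
    return total_thrust_kg, ratio, hover_fraction


total, ratio, hover = thrust_budget(1.8, 4, 3.15)
print(f"total thrust: {total:.1f} kg")
print(f"thrust-to-weight ratio: {ratio:.2f}")
print(f"thrust fraction needed to hover: {hover:.0%}")
```

The hover fraction refers to thrust, not throttle position, since thrust does not scale linearly with throttle.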


The propellers used with the motors are 13 [in] in diameter and made of carbon fiber. These T-Motor propellers provide excellent precision, durability, and efficiency. The rotors are arranged to counter-rotate for flight stability, with one propeller per motor.

The drone has been designed to carry all the necessary GNC sensors, a 2D LiDAR, an array of cameras, and three NVIDIA Jetson TX2s. In addition, the drone carries a telemetry radio, kill switch, and RC receiver for safe flight.

The quadcopter uses eight sensors for flight control. The main sensors used for the navigation of the quadcopter include a downward-facing 1D LiDAR module, a PX4FLOW optical sensor, and the built-in sensors of the Pixhawk 2 flight control board.


The 1D LiDAR module is the LIDAR-Lite v3 from Garmin. This one-dimensional sensor is directed downward and provides the data used to determine altitude. The PX4FLOW sensor measures velocity by comparing successive image frames to determine the distance and direction traveled between them. The Pixhawk 2 contains three 3-axis accelerometers, three 3-axis gyroscopes, and two 3-axis compasses. The Pixhawk onboard system uses the data from these sensors for flight control and maneuvering of the quadcopter.
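
The principle behind optical flow velocity estimation can be illustrated in a few lines: the apparent shift of the floor between two frames is converted to an angular rate and then scaled by the altitude. This is a simplified sketch, not the PX4FLOW firmware. It uses a small-angle approximation and ignores vehicle rotation, which a real sensor has to compensate for. All parameter names are hypothetical, and the sign of the result depends on the axis conventions chosen.

```python
def flow_to_velocity(shift_px, focal_length_px, frame_dt_s, height_m):
    """Estimate ground-relative speed along one image axis, in m/s.

    shift_px is the image shift between two consecutive frames,
    focal_length_px the focal length in pixels, frame_dt_s the time
    between the frames, and height_m the altitude above the floor.
    """
    if frame_dt_s <= 0 or focal_length_px <= 0:
        raise ValueError("frame interval and focal length must be positive")
    # Small-angle approximation: angle in radians is shift / focal length.
    angular_rate = (shift_px / focal_length_px) / frame_dt_s
    return angular_rate * height_m
```

The dependence on height_m shows why the downward-facing LiDAR matters for navigation as well as altitude hold: without a distance to the floor, the flow measurement gives only an angular rate, not a metric velocity.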

For completion of Mission 7, the drone must be able to detect Roombas and obstacles. The quadcopter is equipped with a single downward-facing camera and four outward-facing cameras for image recognition, as well as a two-dimensional plane LiDAR sensor for obstacle avoidance.


The single bottom camera is a Logitech C210 webcam. It is used to detect Roombas and to track the targeted Roomba during autonomous interaction. The four peripheral cameras are Logitech 960 C270s, which are used to recognize and track the Roombas around the playing field. Together, these five webcams provide enough information for the drone to incorporate the recognized Roombas into the strategy of the autonomous system. The 2D LiDAR sensor is the SWEEP sensor by Scanse. This is a single LiDAR that rotates to provide a full view of the x-y plane at the drone's z position for obstacle detection. We chose it over four separate LiDARs or webcams for obstacle detection in order to save computing power.
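
A planar scan of this kind is inexpensive to process, as the following sketch suggests. It assumes each scan arrives as a list of (angle in degrees, distance in metres) pairs, with zero or out-of-range distances marking invalid returns. This data layout, the function names, and the default range limit are assumptions made for illustration and do not represent the Scanse SDK.

```python
import math


def nearest_obstacle(scan, max_range_m=40.0):
    """Return (distance_m, x_m, y_m) of the closest valid return, or None.

    scan is an iterable of (angle_deg, distance_m) pairs measured in the
    drone's x-y plane. Non-positive or over-range distances are skipped.
    """
    best = None
    for angle_deg, distance_m in scan:
        if not 0.0 < distance_m <= max_range_m:
            continue  # invalid or out-of-range sample
        if best is None or distance_m < best[0]:
            a = math.radians(angle_deg)
            best = (distance_m,
                    distance_m * math.cos(a),
                    distance_m * math.sin(a))
    return best


def is_path_blocked(scan, safety_radius_m):
    """True when any valid return lies inside the safety radius."""
    hit = nearest_obstacle(scan)
    return hit is not None and hit[0] < safety_radius_m
```

A single pass over the scan yields both the closest obstacle and its direction, which the strategy or controls node could use to reject or adjust a waypoint.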