Tagging IoT Data in a Drone View

  • Author(s)

    Yu-Chee Tseng
  • Biography

    Yu-Chee Tseng received his Ph.D. in Computer and Information Science from the Ohio State University. He is the Founding Dean of the College of Artificial Intelligence, National Chiao-Tung University, Taiwan. As a Lifetime Chair Professor at NCTU, his h-index is higher than 60. He has recently published several works on integrating IoT and computer vision.

  • Academy/University/Organization

    National Chiao-Tung University
  • Source

    Demo paper “Tagging IoT Data in a Drone View”
  • Share this article

    You are free to share this article under the Attribution 4.0 International license

Both cameras and IoT devices are capable of tracking moving objects, but how their data correlate has been unclear. With state-of-the-art deep learning technologies, tracking human objects in videos is not a difficult job. At the same time, people use wearable smart watches to track their own daily activities. In this work, we consider using a drone to track ground objects. We demonstrate how to retrieve IoT data from devices attached to human objects and correctly tag the data on the human objects captured in the drone view. Observing and tracking humans “from the air” thus becomes possible, and we can even see their identities and personal profiles directly in drone videos. This is the first work to correlate IoT data with computer vision from a drone camera. Potential applications of this work include aerial surveillance, people tracking, and intelligent human-drone interaction.


Drones (also known as Unmanned Aerial Vehicles, UAVs) have been applied to a wide range of security and surveillance applications. When conducting video surveillance, one fundamental issue is person identification (PID), where human objects in videos need to be immediately tagged with their IDs and personal profiles. The cover image, taken by a drone, is an example. With a Convolutional Neural Network (CNN) alone, we can only tell that there are four persons. With our IoT-video fusion technology, we not only know that these are people, but can also tag their names, personal profiles, and past activities of the day on the image. This augmented information is clearly more user-friendly and provides a very intuitive human-computer interface.

Traditional technologies such as RFID and fingerprint/iris/face recognition have their limitations or require close contact with specific devices, so they are not applicable to a drone whose height and view angle keep changing. In this work, we present an approach that detects human objects in videos taken by a drone and correctly tags them with the personal profiles retrieved, through wireless communications, from the wearable devices attached to those human objects. By combining IoT and AI, we can display IoT data directly on the recognized image. Our target application is future aerial surveillance with much richer information; the displayed information can even include users' activities from before they entered the camera view.

To achieve this goal, we propose a data fusion approach that combines inertial sensors with videos. The above figure shows our system architecture, which integrates an aerial camera and users' wearable devices. A drone equipped with a camera continuously records video. A crowd of people is in the drone view, and some of them may be wearing such devices. Each wearable device has a 4G communication interface and several inertial sensors, which allow it to transmit user profiles and motion data. The video and inertial data are sent to a fusion server for correlation analysis. To correlate the two, we conduct four major procedures. First, we retrieve human objects using a deep learning network. Second, we use ArUco markers to transform the human objects from pixel space to real-world coordinates, which compensates for changes in the drone's position. Third, we extract human motion features from both the video data and the inertial sensor data. Fourth, we design a fusion algorithm that measures the similarity of any pair of normalized motion feature series, one from a human object in the video and the other from a wearable device. By computing all-pair similarity scores, we can couple human objects with their IoT data, achieving our goal of tagging wearable device data on human objects in drone-recorded videos.
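For illustration, here is a minimal sketch of the last two procedures in Python. It assumes that a per-object motion feature series (for example, speed over time) has already been extracted from each video track and from each wearable device's inertial sensors; the correlation-based similarity measure and the Hungarian-assignment pairing shown here are illustrative assumptions, not necessarily the exact algorithm used in our system.

import numpy as np
from scipy.optimize import linear_sum_assignment

def normalize(series):
    # Zero-mean, unit-variance normalization of one motion feature series.
    series = np.asarray(series, dtype=float)
    std = series.std()
    return (series - series.mean()) / std if std > 0 else series - series.mean()

def similarity(a, b):
    # Correlation-style similarity of two equally long, normalized series.
    return float(np.dot(normalize(a), normalize(b)) / len(a))

def pair_objects(video_features, imu_features):
    # Couple each video track with at most one wearable device by
    # maximizing the total all-pair similarity (Hungarian assignment).
    scores = np.array([[similarity(v, w) for w in imu_features]
                       for v in video_features])
    rows, cols = linear_sum_assignment(-scores)  # negate to maximize
    return list(zip(rows, cols)), scores

# Toy example: two video tracks and two devices (hypothetical speed series).
video_features = [[0.1, 0.5, 0.9, 0.4], [0.8, 0.2, 0.1, 0.7]]
imu_features = [[0.7, 0.3, 0.2, 0.8], [0.2, 0.6, 1.0, 0.5]]
pairs, scores = pair_objects(video_features, imu_features)
print(pairs)  # expected pairing: track 0 with device 1, track 1 with device 0

Because each pairing decision uses the whole feature series, momentary tracking errors or occlusions do not immediately break the association.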
 
To validate our idea, we conducted a number of experiments. The experiment scene is shown in the figure above. We place four ArUco markers at the corners of a square on the ground. The drone view after coordinate transformation is shown in part (a). Two users (named Neil and Katy) walked along straight lines and circular paths, as shown in (b) and (c). Their walking paths often interleave, causing occasional occlusions in the videos, as shown in (d). Nevertheless, with our fusion technology, we still obtain correct pairing results in the drone view.
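As a rough illustration of the pixel-to-ground transformation used in this setup, the sketch below detects the four ArUco markers with OpenCV and builds a homography from pixel coordinates to ground coordinates. It assumes OpenCV 4.7+ (the ArucoDetector API), markers with IDs 0-3, and a square of 10 m per side; these values are hypothetical choices for the example, not the actual experiment settings.

import cv2
import numpy as np

SQUARE_SIDE_M = 10.0  # assumed side length of the marker square, in meters
# Assumed ground-plane positions of marker IDs 0..3 (corners of the square).
GROUND_POINTS = np.array([[0, 0],
                          [SQUARE_SIDE_M, 0],
                          [SQUARE_SIDE_M, SQUARE_SIDE_M],
                          [0, SQUARE_SIDE_M]], dtype=np.float32)

def ground_homography(frame):
    # Detect the four markers in one drone frame and return the homography
    # that maps pixel coordinates to ground-plane coordinates.
    detector = cv2.aruco.ArucoDetector(
        cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50))
    corners, ids, _ = detector.detectMarkers(frame)
    # Use each marker's center as one correspondence, ordered by marker ID.
    centers = {int(i): c.reshape(4, 2).mean(axis=0)
               for i, c in zip(ids.flatten(), corners)}
    pixel_points = np.array([centers[i] for i in range(4)], dtype=np.float32)
    H, _ = cv2.findHomography(pixel_points, GROUND_POINTS)
    return H

def to_ground(H, pixel_xy):
    # Map one pixel location (e.g., a detected person's feet) to meters.
    p = np.array([pixel_xy[0], pixel_xy[1], 1.0])
    gx, gy, w = H @ p
    return gx / w, gy / w

Because the homography can be recomputed from the markers in every frame, the mapping remains valid even as the drone's position and viewing angle change.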
 
To conclude, our results point to an interesting new direction: future human identification is not limited to face recognition; it is also possible to identify people using IoT. Furthermore, in addition to human identity, we can provide personal profiles, preferences, and hobbies from IoT devices as well. This would make future surveillance applications even more informative.