Google today penned an explainer on the Soli radar-based technology that ships inside its Pixel 4 smartphones. While many of the hardware details were previously known, the company for the first time pulled back the curtain on Soli’s AI models, which are trained to detect and recognize motion gestures with low latency. It’s early days (the Pixel 4 and the Pixel 4 XL are the first consumer devices to feature Soli), but Google claims the tech could enable new forms of context and gesture awareness on devices like smartwatches, paving the way for experiences that better accommodate users with disabilities.
The Soli module within the Pixel 4, a collaborative effort between Google’s Advanced Technology and Projects (ATAP) group and the Pixel and Android product teams, contains a 60GHz radar and antenna receivers with a combined 180-degree field of view. The receivers record positional information in addition to things like range and velocity. (Over a window of multiple transmissions, displacements in an object’s position cause a timing shift that manifests as a Doppler frequency proportional to the object’s velocity.) Reflected electromagnetic waves carry this information back to the antennas, and custom filters (including one that accounts for audio vibrations caused by music) boost the signal-to-noise ratio while attenuating unwanted interference and separating target reflections from noise and clutter.
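To make the Doppler relationship concrete, here is a minimal Python sketch of the textbook radar calculation. It is not Google's pipeline. It assumes a single ideal point target, the standard two-way Doppler formula (shift = 2 × velocity / wavelength), and a hypothetical rate of 2,000 transmissions per second chosen only for illustration. The function names are likewise made up for this example. At 60GHz the wavelength is about 5 millimeters, so a hand moving at half a meter per second produces a shift of roughly 200Hz.

```python
import cmath
import math

SPEED_OF_LIGHT = 299_792_458.0            # meters per second
CARRIER_HZ = 60e9                         # Soli's radar band, per the article
WAVELENGTH = SPEED_OF_LIGHT / CARRIER_HZ  # about 5 mm

# Hypothetical values for illustration only; not taken from Google's explainer.
PULSE_INTERVAL_S = 1.0 / 2000.0
NUM_PULSES = 64


def doppler_shift_hz(radial_velocity_mps):
    """Two-way Doppler shift for a target at the given radial velocity."""
    return 2.0 * radial_velocity_mps / WAVELENGTH


def simulate_echoes(radial_velocity_mps, num_pulses=NUM_PULSES,
                    pulse_interval_s=PULSE_INTERVAL_S):
    """One noise-free complex sample per transmission for an ideal point target."""
    f_d = doppler_shift_hz(radial_velocity_mps)
    return [cmath.exp(2j * math.pi * f_d * n * pulse_interval_s)
            for n in range(num_pulses)]


def estimate_velocity(samples, pulse_interval_s=PULSE_INTERVAL_S):
    """Pick the strongest DFT bin across transmissions and convert it to m/s."""
    n = len(samples)
    best_bin, best_power = 0, -1.0
    for k in range(n):
        acc = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                  for i, s in enumerate(samples))
        power = abs(acc) ** 2
        if power > best_power:
            best_bin, best_power = k, power
    # Bins in the upper half of the spectrum represent negative frequencies.
    if best_bin > n // 2:
        best_bin -= n
    f_d = best_bin / (n * pulse_interval_s)
    return f_d * WAVELENGTH / 2.0


if __name__ == "__main__":
    echoes = simulate_echoes(0.5)
    print(f"Doppler shift: {doppler_shift_hz(0.5):.1f} Hz")
    print(f"Estimated velocity: {estimate_velocity(echoes):.3f} m/s")
```

The estimate is only as fine as the spacing between frequency bins, which with these assumed numbers works out to about 0.08 meters per second. A longer window of transmissions narrows that spacing.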
The transformed signals are fed into Soli’s machine learning models for “sub-millimeter” gesture classification. Developing the models required overcoming several major challenges, according to Google:
- Every user performs even simple motions like swipes in myriad ways.
- Extraneous motions within the sensor’s range sometimes appear similar to gestures.
- From the point of view of the sensor, when the phone moves, it looks like the whole world is moving.
The teams behind Soli devised a system comprising models trained using millions of gestures recorded from thousands of Google volunteers, which were supplemented with hundreds of hours of radar recordings containing generic motions from other Google volunteers. The AI models were trained using Google’s TensorFlow machine learning framework and optimized to run directly on Pixel 4’s low-power digital signal processor, allowing them to track up to 18,000 frames per second even when the main processor is powered down.
“Remarkably, we developed algorithms that specifically do not require forming a well-defined image of a target’s spatial structure, in contrast to an optical imaging sensor, for example. Therefore, no distinguishable images of a person’s body or face are generated or used for Motion Sense presence or gesture detection,” Google research engineer Jaime Lien and Advanced Technology and Projects software engineer Nicholas Gillian wrote. “[For this reason,] we are excited to continue researching and developing Soli to enable new radar-based sensing and perception capabilities.”
While Soli on the Pixel 4 remains in its infancy (we were somewhat underwhelmed when we tested it late last year), it continues to improve through software updates. Support for Soli came to Japan in early February (Google must certify Soli with regulatory authorities before it can legally transmit at the required frequencies), and later that month, a new “tapping” gesture that pauses and resumes music and other media made its debut.