Google has announced Robotics Transformer 2 (RT-2), which it describes as a first-of-its-kind vision-language-action (VLA) model.
RT-2, according to Google, ‘brings us closer to a future of helpful robots.’
What is Google’s ‘RT-2’ model?
The tech giant describes RT-2 as a model that translates vision and language into robot actions, enabling machines to perform novel, complex, and more generalised tasks.
A Transformer-based model trained on text and images from the web, RT-2 transfers knowledge from web data to inform robot behaviour.
The need for ‘RT-2’
Unlike chatbots, robots must be ‘grounded’ in the real world and in their own abilities. They must be able to handle complex, abstract tasks in highly variable environments, particularly situations the machines have never encountered before.
This is where Robotics Transformer 2 comes in.
How does ‘RT-2’ function?
It eliminates the complexity that comes from the many separate systems needed to run a robot. For example, to throw away a piece of trash, existing systems would have to be separately trained to identify the trash, pick it up, and throw it away.
RT-2, on the other hand, already knows what trash is, and can identify it without any explicit training. In turn, it can make the robot perform the task.
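The contrast between the two approaches can be sketched conceptually. This is a minimal, hypothetical illustration, not Google's actual code or API: the function names, the `Action` type, and the three-step plan are invented for clarity.

```python
# Conceptual sketch only: contrasts a traditional modular robot stack
# with a single vision-language-action (VLA) model like RT-2.
# All names and interfaces here are hypothetical.
from dataclasses import dataclass


@dataclass
class Action:
    """A simplified robot command, e.g. move toward or grasp a target."""
    name: str
    target: str


def pipeline_approach(image: str, instruction: str) -> list[Action]:
    """Traditional stack: each stage is a separately trained system."""
    detected = "trash"                      # 1. perception module, trained on its own
    plan = ["approach", "grasp", "drop"]    # 2. planning module, trained on its own
    return [Action(step, detected) for step in plan]  # 3. control module


def vla_approach(image: str, instruction: str) -> list[Action]:
    """VLA-style model: one web-pretrained model maps the instruction
    straight to actions, drawing on knowledge (e.g. what 'trash' looks
    like) absorbed from web-scale text and image data."""
    return [
        Action("approach", "trash"),
        Action("grasp", "trash"),
        Action("drop", "bin"),
    ]


actions = vla_approach("camera_frame.png", "throw away the trash")
```

The practical difference is where the knowledge lives: the pipeline needs explicit training for each module, while the unified model already carries the relevant concepts from web pretraining.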
Trials for ‘RT-2’
Robotics Transformer 2 was tested across more than 6,000 trials and was found to perform as well as its predecessor, RT-1, on tasks it had seen in training. On novel, unseen scenarios, RT-2 scored almost double what RT-1 achieved (62% vs 32%).

