On January 23, OpenAI, a major player in generative artificial intelligence, previewed a new AI agent designed to help users with online tasks, as the company seeks to improve its chatbot amid growing competition.
The tool, named Operator, utilizes a model that enables it to interact with on-screen elements like buttons, menus, and text boxes.
"This breakthrough represents a significant advancement in AI progress, empowering models to utilize familiar human tools and paving the way for a wide array of innovative applications," the company stated in a blog post.
Operator can handle tasks such as creating to-do lists and helping plan vacations. It also asks for user input when it finishes a task and seeks confirmation before certain actions, such as entering login information on websites.
The tool is currently accessible to Pro users in the U.S. as a research preview, according to the Microsoft-backed startup.
AI agents, capable of carrying out tasks like making purchases and scheduling meetings autonomously, have become a focal point for many companies.
On the same day, OpenAI’s competitor Perplexity unveiled an agent-based assistant for Android devices. The assistant can help with dinner reservations, ride-hailing, setting reminders, and more.
Last year, Apple integrated Apple Intelligence into its voice assistant, Siri, and, through a collaboration with OpenAI, enabled the use of ChatGPT with user consent.
Such tasks were previously considered a challenge for researchers, but the advent of step-by-step reasoning methods like those employed in OpenAI's o1 model has made them feasible, business executives told Reuters in December.