On Thursday, OpenAI unveiled a feature for its ChatGPT chatbot that lets the AI perform actions on behalf of its users. The launch fits a broader industry push in which major technology companies want users to rely less on manual searches and app-hopping and more on intelligent agents that manage tasks for them.
The new agent mode for ChatGPT, which has begun rolling out, comes as tech giants work to give digital assistants more sophisticated capabilities. It also intensifies competition between leading firms such as OpenAI and Google, which is developing a Gemini agent with similar functionality.
According to OpenAI, ChatGPT’s agent mode can “think” and “act” using a dedicated virtual computer, which lets it handle complex, action-oriented requests. Users can issue more involved commands than before, such as asking the agent to “review my calendar and summarize upcoming client meetings in light of recent news” or to “organize and purchase ingredients necessary for preparing a traditional Japanese breakfast for a party of four.” The company highlighted these examples in a blog post.
To demonstrate the range of possibilities, OpenAI presented a video in which employees wrote a long, detailed request asking the AI to help them prepare for an upcoming wedding. The request included specific directives, such as “locate an outfit adherent to the event’s dress code,” and asked the agent to propose five suitable options, along with hotels for a stay that would include a couple of extra days around the event.
The feature will be available to users who subscribe to OpenAI’s Pro, Plus, or Team plans. It builds on capabilities in ChatGPT’s existing Operator and Deep Research tools: Operator assists with web browsing, while Deep Research analyzes online resources for tasks such as compiling reports.
The feature is another step in OpenAI’s ambition to turn ChatGPT into a universal assistant. The broader AI sector, however, continues to grapple with safety and privacy concerns surrounding the technology. AI systems can still hallucinate, reflect biases, and act unpredictably, as shown by incidents such as xAI’s Grok chatbot disseminating antisemitic content after being prompted.
OpenAI acknowledged the risks associated with ChatGPT’s new capabilities in its blog post. The company said it has limited the information accessible to the model and that certain actions, such as sending emails, must be performed under the user’s supervision. As a further safety measure, the AI is designed to decline “high-risk tasks” such as executing bank transfers.
OpenAI’s CEO, Sam Altman, offered a measured perspective on the update, saying he would describe the technology to his own family as “cutting-edge and experimental; a chance to experience the future, but not yet suitable for high-stakes scenarios or sharing sensitive personal data until we have a better understanding and opportunity to refine it through real-world use.”
He also advised users to be cautious when granting ChatGPT access to their personal information. For instance, giving the AI access to a calendar may be practical for coordinating group dinners, but that access may not be necessary when it is shopping for clothing on the user’s behalf.
The announcement comes as tech companies race to build advanced AI agents and gain a competitive edge. Google, for instance, made a series of AI announcements at its developer conference in May, including an agent capable of making restaurant reservations and securing event tickets, among other tasks. Meanwhile, Apple is reportedly working on a next-generation version of Siri that aims to carry out user tasks through apps; however, that release has been delayed indefinitely.