Artificial intelligence became eerily adept at replicating human behavior this year.
Humans saw chatbots answer complex, even philosophical, questions with lifelike insight. AI-generated images became so high quality, humans were tricked into thinking one of their own made them. Software visualized nearly the entirety of human proteins, a potential boon for drug discovery.
“I probably started this year expecting 2022 to be more of the same,” said Peter Clark, the interim chief executive of the Allen Institute for AI. “But there’s been some fairly ground-shaking developments this last year which I wasn’t expecting.”
Though the advances may have seemed sudden, they were the product of years of research, artificial intelligence experts said. The field of generative artificial intelligence — where software creates content like texts or images based on descriptions — had the most notable breakthroughs in 2022, experts said, largely because advances in math and computing power enabled new ways to train the software.
But AI skeptics continued to worry. Their long-standing complaints that AI models, trained on human-created data, would imitate the racism and sexism in society were proven true. There were renewed concerns that running AI software is energy intensive and harming the climate. The battle between humans creating the content which feeds AI models and the companies that profit off them became more fierce and entered the courts.
“You see a hint of the storm to come,” said Margaret Mitchell, chief ethics scientist at Hugging Face, an open source AI start-up.
Here are a few notable innovations in artificial intelligence that happened this year.
Several weeks ago, a new chatbot took the internet by storm. It was called ChatGPT and was created by OpenAI, an organization launched several years ago with funding from Elon Musk and others.
Humans were asking it all kinds of questions. Some were humorous (write song lyrics about AI taking jobs in the style of rapper Eminem); others more innocent (how should a mother tell her 6-year-old son Santa isn’t real, one person asked); while others seemed useful (complete this tricky computer code, people requested). More than a million users explored it within days of its Nov. 30 launch, touting how lifelike the answers were.
Some answers were creative, others were outright wrong and some carried racist and sexist assumptions. OpenAI said it installed filters to restrict answers that the chatbot spits out, but people found creative ways to bypass safeguards, exposing underlying bias.
It also wasn’t the only chatbot that made waves. Earlier this year, a former Google engineer claimed that LaMDA, the company’s artificially intelligent chatbot generator, was sentient. Character.ai was a chatbot start-up launched this year that let anyone talk with impersonations of people such as Donald Trump, Albert Einstein and Sherlock Holmes.
Still, ChatGPT was notable due to its prose and OpenAI’s marketing prowess. It is powered by a large language model, an AI system trained to predict the next word in a sentence by ingesting massive amounts of text from the internet and finding patterns through trial and error. The model was refined in a new way, with humans conversing with the model, playing user and chatbot, and ranking the quality of the bot’s responses to reinforce lifelike answers.
Craving a photo of a Dachshund puppy in space in the style of painted glass?
Before, you might have needed to commission an artist to get that, but now you can simply type that request into a text-to-image generator, and out pops an AI-generated image, seemingly from thin air, of such high quality that even AI doubters concede it is impressive, though they still note their many concerns.
This year saw an explosion of text-to-image generators.
Dall-E 2, created by OpenAI and named after painter Salvador Dali and Disney Pixar’s WALL-E, shook the internet after launching in July. In August, the start-up Stability AI launched its own version, Stable Diffusion, essentially an anti-Dall-E with fewer restrictions on how it could be used. Research lab Midjourney released another during the summer, which created the picture that sparked a controversy in August when it won an art competition at the Colorado State Fair.
What these models do isn’t new, but how they do it is, experts said, and that explains the sharp increase in image quality. They were trained in a novel way, using a method called diffusion, which essentially breaks down the images a model is trained on and then reverses that process to generate new ones, making the models faster, more flexible and better at photorealism.
Predictably, experts said, the surge in use came with problems. Artists felt these models were trained on images they had created and posted online, and that they weren’t sharing in the profits. People quickly used the tools to create images of school shootings, war photos, and even child porn, according to a Reddit group and Discord chat channel.
DeepMind, Alphabet’s artificial intelligence subsidiary, announced over the summer it had visualized nearly every human protein in existence and made it available to scientists to examine, potentially turbocharging drug discovery.
Proteins are essential to developing medicine. Visualizing their shape helps scientists uncover how proteins operate, and can help them create drugs to counter disease. In the past, the technique was cumbersome, requiring laborious X-rays and microscopic examination.
The announcement followed a string of advances by DeepMind in the field. In 2020, the lab first announced it could predict the shape of proteins using AI software. A year later, it announced a tool it had created to do so, called AlphaFold, along with roughly 350,000 proteins it had visualized, including all proteins in the human genome.
AlphaFold is trained on data in the Protein Data Bank, a global database of protein information.
Earlier this year, the Microsoft-owned company GitHub widely released Copilot, a tool built on OpenAI technology that can translate basic human instructions into functional computer code.
The tool works similarly to other AI software, such as ChatGPT and Dall-E 2, in that it analyzes large troves of data, much of it culled from the public internet. But experts said Copilot is most notable because it’s at the center of a lawsuit that essentially calls that kind of learning a form of piracy.
Matthew Butterick, a programmer and lawyer, is part of a team filing a class-action lawsuit against Microsoft and other companies behind the tool. In the lawsuit, Butterick and his team claim millions of programmers who wrote the original code Copilot is trained on are having their legal rights violated.
Microsoft’s GitHub, which allows programmers to share and collaborate on computer code, said it has been “committed to innovating responsibly with Copilot from the start, and will continue to evolve the product to best serve developers across the globe.”
Mitchell, of Hugging Face, said it’s “one of the most important lawsuits happening right now,” because it could impact U.S. copyright law and start setting a precedent related to public data being used with or without the informed consent of people generating the text.
“It’s sort of a pivotal moment,” she said. “It’s really important to pay attention to right now.”