Medical workers put on PPE at the beginning of their shift at the emergency field hospital run by Samaritan's Purse and Mount Sinai Health System in Central Park on April 8, 2020, in New York. Misha Friedman/Getty Images

New research indicates that the coronavirus began to circulate in the New York area by mid-February, weeks before the first confirmed case, and that travelers brought in the virus mainly from Europe, not Asia.

“The majority is clearly European,” said Harm van Bakel, a geneticist at Icahn School of Medicine at Mount Sinai, who co-wrote a study awaiting peer review.

A separate team at NYU Grossman School of Medicine came to strikingly similar conclusions, despite studying a different group of cases. Both teams analyzed genomes from coronaviruses taken from New Yorkers starting in mid-March.

The research revealed a previously hidden spread of the virus that might have been detected if aggressive testing programs had been put in place.

On Jan. 31, President Donald Trump barred foreign nationals from entering the country if they had been in China during the prior two weeks.

It would not be until late February that Italy began locking down towns and cities, and not until March 11 that Trump said he would block travelers from most European countries. But New Yorkers had already been traveling home with the virus.

“People were just oblivious,” said Adriana Heguy, a member of the NYU team.

Heguy and van Bakel belong to an international guild of viral historians. They ferret out the history of outbreaks by poring over clues embedded in the genetic material of viruses taken from thousands of patients.

Viruses invade a cell and take over its molecular machinery, causing it to make new viruses.

The process is quick and sloppy. As a result, new viruses can gain a new mutation that wasn’t present in their ancestor. If a new virus manages to escape its host and infect other people, its descendants will inherit that mutation.

Tracking viral mutations demands sequencing all the genetic material in a virus — its genome. Once researchers have gathered the genomes from a number of virus samples, they can compare their mutations.
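The comparison step can be sketched in a few lines of Python. This is only an illustration, not the researchers' actual pipeline: the function name `mutations` and the three short sequences are made up, and the genomes are assumed to be already aligned to the same length (real coronavirus genomes run to roughly 30,000 bases).

```python
def mutations(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """List (1-based position, reference base, sample base) for every difference."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i + 1, ref_base, sample_base)
        for i, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


# Toy, invented sequences used purely for illustration.
reference = "ATGGCTAGCTTACGGA"
sample_a = "ATGGCTAGTTTACGGA"
sample_b = "ATGGCTAGTTTACGAA"

muts_a = set(mutations(reference, sample_a))
muts_b = set(mutations(reference, sample_b))

# Mutations carried by both samples hint at a shared ancestor.
shared = muts_a & muts_b
# Mutations found only in sample_b arose later on its own branch.
only_b = muts_b - muts_a

print("shared:", sorted(shared))
print("only in b:", sorted(only_b))
```

In this toy case both samples carry the same change at position 9, while the second sample has one additional change of its own, the kind of pattern that lets researchers group viruses into lineages.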

Sophisticated computer programs can then figure out how all of those mutations arose as viruses descended from a common ancestor. If they get enough data, they can make rough estimates about how long ago those ancestors lived. That’s because mutations arise at a roughly regular pace, like a molecular clock.
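The molecular clock idea comes down to simple arithmetic, sketched here in Python. This is a back-of-the-envelope illustration, not the method the researchers used: the function name is hypothetical, and the rate of 24 mutations per genome per year is a placeholder assumption, not a figure from the studies. The factor of two reflects that both lineages have been accumulating mutations since they split from their common ancestor.

```python
def years_since_common_ancestor(differences: int, mutations_per_year: float) -> float:
    """Rough time since two viruses shared an ancestor, under a steady clock.

    differences: number of positions at which the two genomes differ
    mutations_per_year: assumed mutations per genome per year on one lineage
    """
    if mutations_per_year <= 0:
        raise ValueError("mutation rate must be positive")
    # Both lineages mutate independently, so differences pile up at twice the rate.
    return differences / (2 * mutations_per_year)


# Placeholder inputs: two genomes that differ at 4 positions, and an
# assumed (illustrative) clock of 24 mutations per genome per year.
years = years_since_common_ancestor(differences=4, mutations_per_year=24.0)
print(f"about {years * 365:.0f} days since the common ancestor")
```

Real analyses fit the rate and the family tree together from many genomes and report a range of plausible dates rather than a single number.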

Maciej Boni of Penn State University and his colleagues recently used this method to see where the coronavirus, designated SARS-CoV-2, came from in the first place. While conspiracy theories might falsely claim the virus was concocted in a lab, the virus’s genome makes clear that it arose in bats.

There are many kinds of coronaviruses, which infect both humans and animals. Boni and his colleagues found that the genome of the new virus shares a number of mutations with strains of coronaviruses that infect bats.

The most closely related coronavirus is in a Chinese horseshoe bat, the researchers found. But the new virus has gained some unique mutations since splitting off from that bat virus decades ago.

Boni said that ancestral virus probably gave rise to a number of strains that infected horseshoe bats and perhaps sometimes other animals.

“Very likely there’s a vast unsampled diversity,” he said.

Copying mistakes aren’t the only way for new viruses to arise. Sometimes two kinds of coronaviruses will infect the same cell. Their genetic material gets mixed up in new viruses.

It’s entirely possible, Boni said, that in the past 10 or 20 years a hybrid virus arose in some bat that was well suited to infect humans, too. Later, that virus somehow managed to cross the species barrier.

“Once in a while, one of these viruses wins the lottery,” he said.

In January, a team of Chinese and Australian researchers published the first genome of the new virus. Since then, researchers around the world have sequenced over 3,000 more. Some are genetically identical to each other, while others carry distinctive mutations.

That’s just a tiny sampling of the full diversity of the virus. As of April 8, there were 1.5 million confirmed cases of COVID-19, and the true total is probably many millions more. But already, the genomes of the virus are revealing previously hidden outlines of its history over the past few months.

As new genomes come to light, researchers upload them to an online database called GISAID. A team of virus evolution experts is analyzing the growing collection of genomes in a project called Nextstrain. They continually update the virus family tree.

The deepest branches of the tree all belong to lineages from China. The Nextstrain team has also used the mutation rate to determine that the virus probably first moved into humans from an animal host in late 2019. On Dec. 31, China announced that doctors in the city of Wuhan were treating dozens of cases of a mysterious new respiratory illness.

In January, as the scope of the catastrophe in China became clear, a few countries started aggressive testing programs. They were able to track the arrival of the virus on their territory and follow its spread through their populations.

But the United States fumbled in making its first diagnostic kits and initially limited testing to people who had come from China and displayed symptoms of COVID-19.

“It was a disaster that we didn’t do testing,” Heguy said.

A few cases came to light starting at the end of January. But it was easy to dismiss them as rare imports that did not lead to local outbreaks.

The illusion was dashed at the end of February by Trevor Bedford, an associate professor at the Fred Hutchinson Cancer Research Center and the University of Washington, and his colleagues.

Using Nextstrain, they showed that a virus identified in a patient in late February had a mutation shared by one identified in Washington state on Jan. 20.

The Washington viruses also shared other mutations with ones isolated in Wuhan, suggesting that a traveler had brought the coronavirus from China.

With that discovery, Bedford and his colleagues took the lead in sequencing coronavirus genomes. Sequencing more genomes around Washington gave them a better view of how the outbreak there got started.

“I’m quite confident that it was not spreading in December in the United States,” Bedford said. “There may have been a couple other introductions in January that didn’t take off in the same way.”

As new cases arose in other parts of the country, other researchers set up their own pipelines. The first positive test result in New York came on March 1, and after a couple of weeks, patients surged into the city’s hospitals.

“I thought, ‘We need to do this for New York,’” Heguy said.

Heguy and her colleagues found some New York viruses that shared unique mutations not found elsewhere. “That’s when you know you’ve had a silent transmission for a while,” she said.

Heguy estimated that the virus began circulating in the New York area a couple of months ago.

And researchers at Mount Sinai started sequencing the genomes of viruses from patients coming through their hospital. They found that the earliest cases identified in New York were not linked to later ones.

“Two weeks later, we start seeing viruses related to each other,” said Ana Silvia Gonzalez-Reiche, a member of the Mount Sinai team.

Gonzalez-Reiche and her colleagues found that these viruses were practically identical to viruses found around Europe. They cannot say on what particular flight a particular virus arrived in New York. But they write that the viruses reveal “a period of untracked global transmission between late January to mid-February.”

So far, the Mount Sinai researchers have identified seven separate lineages of viruses that entered New York and began circulating. “We will probably find more,” van Bakel said.

The coronavirus genomes are also revealing hints of early cross-country travel.

Van Bakel and his colleagues found one New York virus that was identical to one of the Washington viruses found by Bedford and his colleagues. In a separate study, researchers at Yale found another Washington-related virus. Combined, the two studies hint that the coronavirus has been moving from coast to coast for several weeks.

Sidney Bell, a computational biologist working with the Nextstrain team, cautions people not to read too much into these new mutations themselves. “Just because something is different doesn’t mean it matters,” Bell said.

Mutations do not automatically turn viruses into new, fearsome strains. They often don’t bring about any change at all. “To me, mutations are inevitable and kind of boring,” Bell said. “But in the movies, you get the X-Men.”

Peter Thielen, a virologist at the Applied Physics Laboratory at Johns Hopkins University, likes to compare the spread of viruses to a dandelion seed landing on an empty field.

The flower grows up and produces seeds of its own. Those seeds spread and sprout. New mutations arise over the generations as the dandelions fill the field. “But they’re all still dandelions,” Thielen said.

While the coronavirus mutations are useful for telling lineages apart, they don’t have any apparent effect on how the virus works.

That’s good news for scientists working on a vaccine.

Vaccine developers hope to fight COVID-19 by teaching our bodies to make antibodies that can grab onto the virus and block its entry into cells.

Some viruses evolve so quickly that they require vaccines that can produce several different antibodies. That’s not the case for the coronavirus that causes COVID-19. Like other coronaviruses, it has a relatively slow mutation rate compared with some viruses, such as influenza.

As hard as the fight against it may be, its mutations reveal that things could be a whole lot worse.

Of course, the coronavirus will continue to mutate as long as it still infects people. It’s possible that vaccines will have to change to keep up with the virus. And that’s why scientists need to keep tracking its history.
