What’s in a (gene) name



Look, no one is trying to get a dick joke into the human genome. If it happens, it won’t be by design. No one even really thought it was a possibility until the late 1990s, when the physical chemistry professor Paul W. May was having a beer with some other science friends and they got around to talking about funny molecules. Everyone knew about the ring-shaped molecule called Arsole. It didn’t take long to conjure up several more funny science terms. May began to collect these, and soon had so many that he turned the collection into a blog. By 2008 the blog had become a book. (NB: both blog and book are written in Comic Sans. Main text, sure, but also table of contents, acknowledgments, and references. The references!) The book has a whole separate section on gene names, and here you will find some of the spiciest names in science. By the time the book was published, however, some of them were already out of date – the HUGO Gene Nomenclature Committee (HGNC) had begun to take a keen interest in what geneticists were calling their new genes, and by 2006 had put the kibosh on 10 deemed the most offensive. They didn’t know how much stranger it could get.

Back when researchers and their grad students first started the project of identifying genes and their variants, there was no governing body to supervise the naming process. So, as Elah Feder and Helen Zaltzman recount in a recent episode of the Science Diction podcast, any names they conjured up became instant canon. No one cared too much about standards because a lot of the first studies were in fruit flies. And so we got gene names like ether a go go, eyeless, and antikevorkian.

But there was a method in their madness. A gene is often named after its function, or the loss thereof. Eyeless, well, you can imagine. Some of them got a little fancy, for example the tin man gene, so called because it caused a fruit fly to develop without a heart (geddit?).

Others got extremely fancy, probably the humanities double majors. Amontillado causes fruit fly larvae not to be able to hatch, a reference to Edgar Allan Poe’s gruesome story of a man buried alive in a small space. Things go full arcana with thisbe and pyramus. I don’t have the training for this explanation, either in genetics or in the Latin poetry canon.

What’s interesting about paging through May’s book is how these names reflect the culture in which they were conceived. Can you guess the decade they found Sonic hedgehog? Similarly of its time is Evander – a zebrafish with this mutant gene is missing an ear – after Evander Holyfield, whose ear was bitten off in a 1990s-famous fight with Mike Tyson. Antikevorkian prevents programmed cell death in plant cells. Where Sonic is a little flip, antikevorkian starts to edge right into the realm of bad taste. Similarly, in 2022 you’d probably think twice before christening a gene that makes fruit flies less tolerant of alcohol “cheap date”.

Which brings us to the dick jokes.

Celibate: male flies are attracted to females but never mate.
Dissatisfaction: “Involved in many aspects of sexual behaviour.”
Farinelli: after the castrato. A plant gene that produces sterile male flowers.
Fruity: makes male flies uninterested in females (female version: icebox).
Superman: a gene whose mutation gives you extra dicks (where “you” = “a flower”).
Kryptonite: you know what this does.

Finding one dick joke in a gene name is an unexpected delight. But as I leafed through pages of them in May’s book, I started to feel like I was trapped in a room full of seventh grade boys in 1997.

The HGNC was established to get rid of these poorly conceived names, striking when a gene named in a fruit fly or other model organism turned out to exist also in humans. (The disease caused by a mutation in lunatic fringe isn’t remotely funny.) But if they thought they could get rid of bad taste and call it a day, they had another thing coming: and that thing was Microsoft Excel and its draconian autocorrect feature.

Now the problem wasn’t the rude names – it was the boring ones. Namely, the ones that reminded Microsoft of dates. Dutifully, the software turned the gene MARCH1 into the date March 1, and SEPT1 into the first day of September. So in 2019, 27 more genes got the chop, renamed something that wouldn’t confuse Clippy. That should have been the end of it. Ah, but they only got rid of the genes that fell afoul of American autocorrect. Clippy speaks Finnish too. In a paper called “Gene Name Errors: Lessons Not Learned”, Mandhri Abeysooriya and her colleagues last year pointed out “a variety of additional novel error modes,” some of which were “likely related to locale language settings.”

A few papers had the human gene AGO2 converted to Aug-02 (where Excel was in Italian, Spanish or Portuguese). The gene MEI1 was converted to May-01 when Excel was speaking Dutch (mei). And “TAMM41 was apparently converted to ‘Jan-41’ due to similarity with the month of January in Finnish (tammikuu).”
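For the curious, here is a rough sketch of how you might flag gene symbols at risk of this treatment before they ever meet a spreadsheet. It is emphatically not Excel’s actual logic, which isn’t described here: the month table is a hand-picked, hypothetical sample covering only the examples above, and the rule (three or more letters that begin a month name, followed by one or two digits) is my own assumption.

```python
import re

# Hypothetical, hand-picked month names per locale; NOT Excel's real tables.
# Only the locales and months mentioned in the post are covered.
MONTHS = {
    "en": {"january": 1, "february": 2, "march": 3, "april": 4, "may": 5,
           "june": 6, "july": 7, "august": 8, "september": 9,
           "october": 10, "november": 11, "december": 12},
    "it": {"agosto": 8},
    "nl": {"mei": 5},
    "fi": {"tammikuu": 1},
}

# Assumed shape of a risky symbol: 3+ letters, optional space or hyphen,
# then one or two digits (e.g. SEPT1, "MARCH 1", TAMM41).
SYMBOL_PATTERN = re.compile(r"^([A-Za-z]{3,})[ -]?(\d{1,2})$")


def date_like_locales(symbol):
    """Return {locale: month_number} for locales where the symbol's
    letters are a prefix of a month name in the sample table."""
    match = SYMBOL_PATTERN.match(symbol.strip())
    if match is None:
        return {}
    letters = match.group(1).lower()
    hits = {}
    for locale, months in MONTHS.items():
        for name, number in months.items():
            if name.startswith(letters):
                hits[locale] = number
                break
    return hits


for gene in ["SEPT1", "MARCH1", "AGO2", "MEI1", "TAMM41", "TP53"]:
    print(gene, date_like_locales(gene))
```

A symbol that comes back with an empty result is only safe under these toy assumptions; a real check would need the full month tables for every locale your collaborators use.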

Dick jokes and Microsoft-incompatible nomenclature are not the only treats lurking among the tens of thousands of genes in the literature.

This is because there are far more elegant ways to troll the genome. A few decades ago German fruit-fly geneticists started naming their genes in ways that seemed almost calculated to twist English-speaking tongues into Spätzle (Exhibit A: the gene spätzle). The plant biologist Edward Farmer told me that, rather than being an unavoidable consequence of a global genetics community, this was in fact a very intentional (and very funny) deployment of weaponised linguistics: knirps, krüppel and spätzle don’t exactly roll off the American or British tongue. The plant community soon followed suit, hatching their own weaponised umlauts in genes like knolle, wüschel, and zwille.

Farmer himself is no innocent here – when it was time to name his own plant gene, he went for gene name trolling Olympic gold, executing the linguistic double entendre. Fou2 is involved in plant defense mechanisms. It’s a completely unremarkable thing you can say anywhere in the world with a straight face and a clean conscience. Except in France, where the pronunciation foutu is a homonym of “fucked up”. “We did it deliberately for when we spoke in front of French speakers,” he says cheerfully. “With a bit of a theatrical presentation it worked quite well.” He shrugs: “geneticists like to have fun.”

Someday someone is going to do a PhD on the semiotics of the gene names, and I will be here for it.

Photo credit: “Suprised Tin Man” by Thomas Hawk, licensed under CC BY 2.0

2 thoughts on “What’s in a (gene) name”

  1. Thanks Sally, a good fillet of the subject. ken-and-barbie aka “ken” (because mutant flies have no external genitalia) fits your theme, not least because “ken” dumps the female part of the gene name although barbie was clearly original and best.
    Slipping naughty names past the editors goes back a long way, though, for 30 years Genbank the DNA database has retained this annotation in the fucose operon of E. coli: /note=”unnamed protein product; fucK ORF (AA 1-482)”

    1. Andrew thanks for this – those are great additions. Also, this post was already running way overlong, so one thing I really wanted to get into (but didn’t) was how all these name changes affect working scientists. Is it a big pain in the behind? Are there like, gene name autocorrect programs to automate re-naming? Is there a reference database? How do you deal with all the old literature?


Categorized in: Curiosities, LWON, Miscellaneous, Sally, Science Culture
