Sunday, 16 August 2009

Investigating the Smallest Bacterial Genome

During the first two terms of my final year at university I took on a dry project based on bacterial genetics. In retrospect this was quite a cheap move since it avoids so many of the problems involved in normal lab work, but in my defence it was the project that interested me most and I really wanted some computing experience before making PhD applications. My work was based on bacterial endosymbionts of insects, which have been of recent interest due to their extremely small genomes. Bacteria in these symbiotic relationships are given a protected, nutrient-rich environment and in return they allow insects to survive on unbalanced diets by synthesising scarce biomolecules.

Primary endosymbiotic bacteria live their entire lives inside insects and are vertically transmitted from generation to generation, a process that leads to coevolution between the bacteria and the insect. One result of this coevolution was a major change to the original bacterial genome, which contained many genes that are essential for free-living bacteria but unnecessary for life within an insect. Consequently, endosymbiont genomes commonly show severe gene loss, genome compaction and skewed GC content compared to those of free-living bacteria.

Electron micrograph showing bacteriocytes taken from P. venusta

1 – Bacteriocyte; 2 – C. ruddii; 3 – Unidentified electron-dense mass

My project focused on Carsonella ruddii, the only bacterial endosymbiont of the psyllid Pachypsylla venusta. Its genome was hailed as the smallest bacterial genome characterised when it was sequenced in 2006 and still holds that record. It contains only 182 ORFs and less than 3% intergenic DNA, and has a GC content of 16.5%. The bacterium appears to be provided with many nutrients by its host and its metabolism has been reduced to a few pathways: ATP synthesis, a section of the pentose phosphate pathway and biosynthesis of certain amino acids.
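As a rough illustration of how figures like the GC content and the intergenic percentage are calculated, here is a minimal Python sketch. The function names and the toy sequence are made up for this example and are not the scripts or data used in the project; it also ignores the fact that a real bacterial chromosome is circular.

```python
def gc_content(seq: str) -> float:
    """Return the percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


def intergenic_fraction(genome_length: int, orfs: list[tuple[int, int]]) -> float:
    """Return the percentage of the genome not covered by any ORF.

    ORFs are (start, end) pairs, 0-based and end-exclusive.
    Overlapping ORFs are only counted once.
    """
    covered = 0
    current_end = 0
    for start, end in sorted(orfs):
        start = max(start, current_end)  # skip bases already counted
        if end > start:
            covered += end - start
            current_end = end
    return 100.0 * (genome_length - covered) / genome_length


# Toy input (hypothetical sequence and ORF coordinates, not C. ruddii data)
toy = "ATATTAGCATTTAAATATTA"
print(gc_content(toy))
print(intergenic_fraction(len(toy), [(0, 9), (6, 18)]))
```

On a real genome the sequence would come from a FASTA file and the ORF coordinates from the annotation, but the arithmetic is the same.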

The early stages of my project involved a reannotation of the C. ruddii genome followed by a sequence-based functional analysis of its metabolic enzymes. Using the enzymes deemed functional in this analysis I built an updated model of the C. ruddii metabolism which could be divided into six pathways involved in amino acid biosynthesis, five of which were incomplete. The only fully intact pathway led to the production of isoleucine and valine. These are both essential amino acids for insects and are severely under-represented in the adult psyllid diet.

Four of the incomplete amino acid pathways were missing only one reaction and the conservation of the rest of each of the pathways suggested that they might still be functional in C. ruddii. The ‘missing’ reactions might occur spontaneously under some conditions or could be catalysed by unidentified enzymes. For three of these four missing reactions I found an example in the literature of a different bacterial endosymbiont which had lost that reaction but had retained the rest of the pathway. This seemed to suggest that the enzymes catalysing these reactions might be expendable and subject to loss during genome reduction in endosymbionts. Based on this and some other evidence from similar situations in endosymbionts I predicted that these pathways are probably functional in C. ruddii and that its main symbiotic role is to provide the psyllid with essential amino acids.
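The sorting of pathways into intact, missing one reaction, or missing several can be pictured as a simple set comparison. The sketch below assumes each pathway is a list of reaction identifiers and that the functional analysis has produced a set of reactions with a working enzyme. The pathway and reaction names are placeholders, not the actual C. ruddii annotation.

```python
def missing_reactions(pathways: dict[str, list[str]], functional: set[str]) -> dict[str, list[str]]:
    """Map each pathway to the sorted list of its reactions lacking a functional enzyme."""
    return {name: sorted(set(steps) - functional) for name, steps in pathways.items()}


def classify(pathways: dict[str, list[str]], functional: set[str]) -> dict[str, tuple[str, list[str]]]:
    """Label each pathway as 'complete', 'single gap' or 'multiple gaps'."""
    report = {}
    for name, missing in missing_reactions(pathways, functional).items():
        if not missing:
            status = "complete"
        elif len(missing) == 1:
            status = "single gap"
        else:
            status = "multiple gaps"
        report[name] = (status, missing)
    return report


# Placeholder pathways and reactions (hypothetical, for illustration only)
pathways = {
    "pathway_A": ["r1", "r2", "r3"],
    "pathway_B": ["r4", "r5"],
    "pathway_C": ["r6", "r7", "r8"],
}
functional = {"r1", "r2", "r3", "r4", "r6"}

for name, (status, missing) in classify(pathways, functional).items():
    print(name, status, missing)
```

Pathways flagged as having a single gap are the ones worth checking against other endosymbionts, as described above.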

The fourth of these incomplete pathways was the most interesting because I was unable to locate another endosymbiont which was missing the same reaction. The reaction was catalysed by the product of a gene, AS, which was present on the C. ruddii genome but which I had labelled as a pseudogene during functional analysis. Although it’s difficult to say conclusively that an enzyme is inactive from sequence analysis alone, multiple alignments showed that this copy of AS was extensively degraded and was missing both of its key catalytic residues as well as its substrate binding residues. However, later in the project when I was scanning an EST (expressed sequence tag) set taken from the insect host of C. ruddii I located another copy of AS which also had a bacterial origin but which was not present on the C. ruddii genome. Sequence analysis showed that this version of AS seemed to be active and could potentially fill the gap in the pathway.
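To give a feel for the residue check, here is a small Python sketch that takes two rows of a gapped alignment, finds the alignment columns of known key residues in a reference sequence, and reports whether the query keeps them. The sequences and positions are invented for the example. They are not the real AS alignment, and this is not necessarily how the original analysis was carried out.

```python
def column_of(ref_aligned: str, ref_pos: int) -> int:
    """Return the alignment column of the ref_pos-th (1-based) ungapped reference residue."""
    seen = 0
    for col, ch in enumerate(ref_aligned):
        if ch != "-":
            seen += 1
            if seen == ref_pos:
                return col
    raise ValueError(f"reference has fewer than {ref_pos} residues")


def check_residues(ref_aligned: str, query_aligned: str, key_positions: list[int]) -> dict[int, tuple[str, str, bool]]:
    """Map each key reference position to (reference residue, query residue, conserved?)."""
    if len(ref_aligned) != len(query_aligned):
        raise ValueError("aligned sequences must have the same length")
    result = {}
    for pos in key_positions:
        col = column_of(ref_aligned, pos)
        ref_res, query_res = ref_aligned[col], query_aligned[col]
        result[pos] = (ref_res, query_res, ref_res == query_res)
    return result


# Invented alignment rows and key positions (hypothetical)
ref = "MK-HDLSE"
query = "MKA-DL-Q"
for pos, (r, q, ok) in check_residues(ref, query, [3, 4, 6]).items():
    print(pos, r, q, "conserved" if ok else "lost")
```

A gap or a different residue at a catalytic position is only a hint that the enzyme is inactive, which is why the pseudogene call above stays tentative.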

Where did this copy of AS originate? It aligned well with the version of AS from P. aeruginosa and appeared to have a bacterial origin but was not found on the C. ruddii genome or the psyllid mitochondrial genome, both of which have been sequenced. Several lines of evidence ruled out the presence of a second bacterial endosymbiont in this symbiosis and since no plasmids had been reported during DNA sequencing of C. ruddii the source of this sequence appeared to be the nuclear genome of P. venusta itself. The presence of this bacterial sequence in the eukaryotic genome suggests that lateral gene transfer (LGT) may have taken place between a bacterial genome and the insect nuclear genome. This would be one explanation for the fact that C. ruddii has only 182 ORFs, which is significantly lower than the predicted minimal bacterial genome. However, it is also possible that C. ruddii uses mitochondrial proteins to survive and so LGT is not the only explanation for the low ORF count.

This was my favourite line of investigation during my project but the symbiosis between C. ruddii and P. venusta had many more interesting features that I read about over the year. One of the questions I got in my viva was whether C. ruddii should be labelled as a bacterium or an organelle. I think this question is only really important when considering a minimal bacterial genome and if C. ruddii does turn out to be importing essential proteins from elsewhere then I think that the label organelle is definitely more appropriate. However, the definition of an organelle doesn't seem to be well-established and so whether or not C. ruddii really does have the smallest bacterial genome is a matter of opinion.

Nakabachi A, Yamashita A, Toh H, Ishikawa H, Dunbar HE, Moran NA, & Hattori M (2006). The 160-kilobase genome of the bacterial endosymbiont Carsonella. Science (New York, N.Y.), 314 (5797) PMID: 17038615

Gil, R., Silva, F., Pereto, J., & Moya, A. (2004). Determination of the Core of a Minimal Bacterial Gene Set Microbiology and Molecular Biology Reviews, 68 (3), 518-537 DOI: 10.1128/MMBR.68.3.518-537.2004

Glass, J. (2006). Essential genes of a minimal bacterium Proceedings of the National Academy of Sciences, 103 (2), 425-430 DOI: 10.1073/pnas.0510013103

Thao, M., Moran, N., Abbot, P., Brennan, E., Burckhardt, D., & Baumann, P. (2000). Cospeciation of Psyllids and Their Primary Prokaryotic Endosymbionts Applied and Environmental Microbiology, 66 (7), 2898-2905 DOI: 10.1128/AEM.66.7.2898-2905.2000


Anonymous said...

very nice post. Sounds like an exciting project with many implications.

Catarina Vicente said...

So this is what you spent all your time doing downstairs... very interesting indeed!

James Lloyd said...

really interesting. so any ETA on a genome sequence of the insect so we can look for the enzyme in question and if it has a signal peptide sending it to the bacteria (maybe something similar to sending it to a mito)? anything known about protein targeting to these types of microbes in other examples?

it has always been one of my fav things to discuss the definitions of important things that no one can define like what is an organelle or life. no clear definitions means we cannot truly say what a virus is or an organelle etc.

Alejandro Montenegro-Montero said...

Hi Joseph.

Very interesting post.
I selected your post as one of my "picks of the week" in molecular biology over at my blog (

Joseph Boyle said...

I don't know if the psyllid genome is being sequenced at the moment unfortunately. Most of the research in this area seems to be directed at a similar symbiosis between an aphid and an endosymbiont named Buchnera aphidicola and I believe the aphid genome is being sequenced and annotated right now. LGT has been implicated in this symbiosis but the smallest strains of Buchnera still have about 400 genes, so I don't think that that genome sequence will be as interesting as the psyllid one could be.

As far as I know these endosymbionts have traditionally been thought of as genetically independent from their hosts, at least in insects, so I haven't read much about protein targeting in these cases.

Interestingly though, C. ruddii has no cell membrane/wall metabolism so this must be provided by the host. If it has similar cell walls to mitochondria then it's possible it could use the same protein import mechanism. I tried to examine this at one point but I couldn't make much out from the available electron micrographs. Also of interest is whether C. ruddii has peptidoglycan cell walls (some early electron microscopy papers listed its cell wall as gram negative). If so this would also point to LGT because eukaryotes wouldn't normally have the necessary biosynthetic enzymes to make peptidoglycan.

I'll have to catch up on the news in this area when I have access to papers again...

'Organelle' is an interesting discussion point. Out of interest do you know who, if anyone, has the right to define the more ambiguous terms in biology, or is it just popular use in textbooks and papers? The definitions that I found around the internet were extremely varied. I think that the meaning of 'Life' has got to be the toughest definition though :P

Thanks Alejandro, that's very kind :)

Menelaos Symeonides said...

I take it that these endosymbionts live in the insect gut. How can LGT be an explanation for the gene loss in the bacteria and the AS gene in the insect unless the endosymbionts live in or near the reproductive organs? Wouldn't LGT have to take place in the germline for this to be meaningful?

Also, did you ever have a look at human (or mammalian) endosymbionts in terms of ORF count or LGT possibility? I remember a TED talk about how we have essentially been domesticated by bacteria (you can view it that way considering the number of bacteria outnumbers the number of "human" cells in a human body by orders of magnitude) and have been meaning to find out more about it. Obviously we are talking about gut microbes here so the germline transmission problem is still there but I'm sure we have some endosymbionts where it could happen?

Fascinating post, btw, and quite an honour to be featured on MolBio Research Highlights!

Joseph Boyle said...

C. ruddii cells are actually stored in a specialised organ called the bacteriome which is placed between the alimentary canal and the gonads of the psyllid. Psyllids only reproduce sexually and, before they do, endosymbiont cells travel from the bacteriome to the oocyte, which carries them onto the next generation. EMs seem to show that during this process the bacteria are actively taken up by the follicle cell via phagocytosis, demonstrating the extent of coevolution between the two organisms. This seems like the most opportune point in the life cycle for LGT to take place.

So it's quite different from the gut symbiosis in humans, since we only acquire our bacteria during and after birth if I remember correctly. I remember at least one study claiming that the human genome contains bacterial genes. This was based on the fact that some genes are found in humans and bacteria but not in any other eukaryotes. I don't have much paper access at the moment unfortunately so I can't say whether they did an exhaustive search of eukaryotic genomes...

Unfortunately the final year Immunology module didn't go into human endosymbionts as much as I'd hoped so I still don't know a great deal about them. My project for next year is examining a family of proteins that may regulate those types of bacteria though, so I'll hopefully be learning a lot more :)

Your blog pictures look really cool btw, especially the beach. I hope you have a good time over there.