How scientists sequenced the complete genome of Streptomyces leeuwenhoekii, revealing its potential for novel antibiotic discovery
In the hyper-arid, salt-crusted landscape of Chile's Atacama Desert—one of the most extreme environments on Earth—scientists made a remarkable discovery. From the saline soils of the Chaxa Lagoon, they isolated a tiny bacterium with an outsized potential: Streptomyces leeuwenhoekii 2 6 .
This microorganism, named after the pioneering microscopist Antonie van Leeuwenhoek, would eventually reveal itself to be a treasure trove of genetic innovation, capable of producing novel antibiotics and other valuable compounds 1 6 .
What makes this bacterium so remarkable isn't just what it can produce, but how we've come to understand its full potential. Through cutting-edge genomic technologies, researchers have now mapped its entire genetic blueprint, uncovering secrets that could lead to new medicines and a deeper understanding of how life adapts to extreme environments 2 .
This is the story of how scientists sequenced the genome of Streptomyces leeuwenhoekii and why it matters for the future of drug discovery.
For decades, scientists discovered new antibiotics and other natural products by growing microorganisms in the lab and analyzing what compounds they produced. This process was like fishing in the dark—sometimes you'd catch something valuable, but often you'd come up empty-handed.
With the advent of DNA sequencing technologies, a more systematic approach emerged: genome mining 1 2 .
Genome mining allows researchers to scan a bacterium's complete genetic code to find all the instructions it contains for making specialized compounds. Think of it as reading the entire cookbook of a master chef rather than just tasting whatever dishes happen to be prepared that day.
This approach has revealed that bacteria, particularly actinomycetes like Streptomyces, have the genetic potential to produce far more compounds than we ever suspected from traditional methods 2 7 .
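To make the idea of "finding clusters" concrete, here is a minimal sketch of one ingredient of genome mining: grouping neighbouring biosynthetic genes into candidate clusters. It is a toy illustration only. The `Gene` record, the gene names, the coordinates, and the 10 kb `max_gap` threshold are all invented for this example, and real tools rely on curated detection rules rather than simple proximity.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    name: str
    start: int          # position on the chromosome, in bp
    end: int
    biosynthetic: bool  # hypothetical flag: gene resembles a known biosynthetic enzyme


def group_into_clusters(genes, max_gap=10_000):
    """Group flagged genes whose start lies within max_gap bp of the previous gene's end."""
    hits = sorted((g for g in genes if g.biosynthetic), key=lambda g: g.start)
    clusters = []
    for gene in hits:
        if clusters and gene.start - clusters[-1][-1].end <= max_gap:
            clusters[-1].append(gene)   # close enough: extend the current cluster
        else:
            clusters.append([gene])     # too far away: start a new cluster
    return clusters


# Made-up annotations, for illustration only.
genes = [
    Gene("pksA", 1_000, 6_000, True),
    Gene("pksB", 6_500, 12_000, True),
    Gene("rpoB", 50_000, 53_500, False),
    Gene("nrpsA", 200_000, 209_000, True),
    Gene("tailoring1", 210_000, 211_200, True),
    Gene("lone", 400_000, 401_000, True),
]

clusters = group_into_clusters(genes)
for number, cluster in enumerate(clusters, start=1):
    names = [g.name for g in cluster]
    print(f"cluster {number}: {names} ({cluster[0].start}-{cluster[-1].end} bp)")
```

In this toy data the six annotated genes collapse into three candidate clusters. The point is only that a complete, correctly ordered sequence is what makes this kind of neighbourhood analysis possible.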
While the concept of genome mining sounds straightforward, sequencing the genomes of Streptomyces and similar bacteria presents unique challenges, notably repetitive regions that short sequence reads struggle to assemble.
Early attempts to sequence S. leeuwenhoekii using Illumina technology resulted in a draft genome broken into 658 fragments, with many important genes incorrectly assembled 6 . Researchers needed a better approach to obtain the complete, accurate genome required for effective genome mining.
| | Traditional screening | Genome mining |
|---|---|---|
| Goal | Screen thousands of microbes for activity | Sequence DNA to predict compound production |
| Step 1 | Grow in laboratory conditions | Obtain complete DNA sequence |
| Step 2 | Isolate chemical products | Find biosynthetic gene clusters |
| Step 3 | Screen against pathogens | Determine potential products |
| Step 4 | Identify active compounds | Focus on promising candidates |
To overcome the limitations of previous sequencing attempts, researchers employed a sophisticated strategy that combined two complementary technologies 2 :
PacBio sequencing: this "third-generation" technology produces very long sequence reads, well suited to assembling repetitive regions and obtaining large contiguous segments.
Illumina sequencing: this "second-generation" technology provides shorter reads but with higher accuracy, ideal for correcting errors in the PacBio assembly.
By marrying these approaches, scientists could leverage the strengths of each technology while mitigating their weaknesses.
Researchers began by isolating high-quality DNA from S. leeuwenhoekii cells grown in laboratory cultures. The long PacBio reads were then assembled into three large contigs representing the chromosome and two plasmids 1 . Finally, the more accurate Illumina data was used to identify and correct errors, particularly in homopolymer regions 2 .
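A homopolymer region is simply a run of one repeated base, such as GGGGG. As a rough illustration of why these regions matter, the sketch below locates such runs in a sequence and compares run lengths between a draft and a corrected version. The two short sequences, the function names, and the pair-in-order comparison are assumptions made for this example; the actual study corrected errors by aligning Illumina reads to the PacBio assembly, which this toy does not attempt.

```python
import re


def homopolymer_runs(seq, min_len=3):
    """Return (start, base, length) for every run of at least min_len identical bases."""
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [(m.start(), m.group(1), len(m.group(0)))
            for m in pattern.finditer(seq.upper())]


def changed_runs(draft, polished, min_len=3):
    """Pair runs in order and report those whose length differs.

    Toy logic: assumes both sequences contain the same runs in the same order.
    """
    pairs = zip(homopolymer_runs(draft, min_len), homopolymer_runs(polished, min_len))
    return [(d_start, base, d_len, p_len)
            for (d_start, base, d_len), (_, p_base, p_len) in pairs
            if base == p_base and d_len != p_len]


# Invented sequences: the draft is missing one G from a run of six.
draft = "ACGGGGGTCACCCCTGA"
polished = "ACGGGGGGTCACCCCTGA"

for start, base, draft_len, polished_len in changed_runs(draft, polished):
    print(f"run of {base} at draft position {start}: {draft_len} -> {polished_len} bases")
```

A single missing base like this shifts the reading frame of any gene it falls in, which is why even a small number of such omissions can garble gene predictions.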
This innovative approach marked a significant milestone in microbial genomics—it was the first time an actinomycete's genome had been assembled into single contigs for all its replicons without needing additional laboratory work to fill gaps 2 .
The final assembled genome revealed the complete genetic architecture of S. leeuwenhoekii:
| Component | Type | Size | Accession Number |
|---|---|---|---|
| Chromosome | Linear | 7,903,895 bp | LN831790 |
| pSLE1 | Circular plasmid | 86,370 bp | LN831788 |
| pSLE2 | Linear plasmid | 132,226 bp | LN831789 |
Data source: 5
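For a sense of scale, the replicon sizes in the table can be added up with a few lines of Python. The sizes are taken from the table above; the variable names are only illustrative.

```python
# Replicon sizes in base pairs, copied from the table above.
replicons = {
    "chromosome": 7_903_895,
    "pSLE1": 86_370,
    "pSLE2": 132_226,
}

total_bp = sum(replicons.values())

for name, size in replicons.items():
    share = 100 * size / total_bp
    print(f"{name:<11}{size:>10,} bp  {share:6.2f}% of genome")

print(f"{'total':<11}{total_bp:>10,} bp")
```

The three replicons together come to 8,122,491 bp, with the linear chromosome accounting for the overwhelming majority of the sequence.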
| Technology | Read Length | Accuracy | Best For | Limitations |
|---|---|---|---|---|
| PacBio | Long reads | ~99% | Assembling repetitive regions, large contigs | Errors in homopolymer regions |
| Illumina | Short reads | >99.9% | Error correction, variant detection | Difficulty with repetitive areas |
| Combined Approach | Both long and short | Highest | Complete, accurate genomes | More expensive, complex analysis |
The power of the combined sequencing approach became evident when considering the improvements over previous attempts. The PacBio assembly alone contained approximately 2,976 errors in the 7.9 Mb chromosome, primarily single-base omissions in homopolymer runs 2 . These errors were systematically corrected using the Illumina data, resulting in a highly accurate final sequence.
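To put that error count in perspective, a short calculation converts it into an error rate. The figures are the ones quoted above; the Phred-style quality score is a standard way of expressing error probability on a logarithmic scale and is added here only as an illustration, not as a number reported by the study.

```python
import math

errors = 2_976               # errors reported in the PacBio-only assembly
chromosome_bp = 7_903_895    # chromosome length from the genome table

error_rate = errors / chromosome_bp
bases_per_error = chromosome_bp / errors
phred_quality = -10 * math.log10(error_rate)  # Q = -10 * log10(P_error)

print(f"error rate:      {error_rate:.2e} per base")
print(f"one error every: {bases_per_error:,.0f} bases")
print(f"accuracy:        {100 * (1 - error_rate):.3f}%")
print(f"Phred quality:   Q{phred_quality:.1f}")
```

That works out to roughly one error every 2,656 bases, or about Q34. Note that this describes the assembled consensus sequence, which is a different quantity from the accuracy of individual reads.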
The complete genome sequence revealed S. leeuwenhoekii to be what scientists call a "gifted" strain—exceptionally rich in genes for producing specialized metabolites 6 . Bioinformatic analysis uncovered an astonishing 35 biosynthetic gene clusters—sets of genes that work together to produce specific compounds 1 2 .
Chaxamycins: novel ansamycin-type polyketides with activity against antibiotic-resistant bacteria like MRSA 6
Chaxalactins: 22-membered macrolactone polyketides discovered through cultivation in different growth media 6
Lasso peptides: three clusters for these structurally complex compounds with potential pharmaceutical applications 1
Known compounds: hygromycin A and desferrioxamine E, previously known compounds also produced by this strain 2
Perhaps most exciting was that most of these 35 gene clusters were completely novel: their corresponding compounds remain uncharacterized and potentially represent new chemical structures with useful biological activities 1 .
Interestingly, researchers discovered that S. leeuwenhoekii produces different compounds depending on growth conditions—a phenomenon known as the "One Strain Many Compounds" (OSMAC) principle 6 7 .
Only produced in modified ISP 2 medium
Detected in multiple media but not all
Produced in all eight media tested, making it a reliable chemical marker
This flexibility suggests the bacterium can activate different biosynthetic pathways in response to environmental conditions, significantly expanding its chemical repertoire beyond what might be detected in standard laboratory conditions.
| Tool/Method | Function in Research | Specific Example from Study |
|---|---|---|
| PacBio SMRT sequencing | Generates long reads to assemble large contigs | Assembled chromosome and plasmids into single contigs 2 |
| Illumina MiSeq | Provides high-accuracy short reads for error correction | Corrected homopolymer errors in PacBio assembly 2 |
| antiSMASH | Bioinformatics tool to identify biosynthetic gene clusters | Predicted 35 gene clusters in S. leeuwenhoekii 7 |
| Artemis/ACT | Genome visualization and annotation | Used to manually inspect and correct genome assembly 2 |
| GAP5 | Assembly editing tool | Manually edited alignment between PacBio and Illumina data 2 |
| GC-Frame Plot | Identifies frameshifts in coding sequences | Verified corrections to PacBio sequence errors 2 |
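The idea behind a GC frame plot can be sketched in a few lines. In GC-rich genomes such as those of Streptomyces, the third base of each codon is especially likely to be G or C, so plotting GC content separately for the three codon positions reveals which reading frame a gene is in. A sudden switch of the high-GC position partway through a gene can hint at a frameshift error. The code below is a simplified illustration with an invented sequence and arbitrary window settings, not the implementation used in Artemis or in the study.

```python
def gc_frame_windows(seq, window=90, step=18):
    """Yield (start, [gc1, gc2, gc3]) for sliding windows along seq.

    gc1..gc3 are the GC fractions at sequence positions whose index modulo 3
    is 0, 1 and 2, so the phase stays fixed relative to the whole sequence.
    """
    seq = seq.upper()
    last_start = max(len(seq) - window, 0)
    for start in range(0, last_start + 1, step):
        counts = [0, 0, 0]
        totals = [0, 0, 0]
        for i in range(start, min(start + window, len(seq))):
            phase = i % 3
            totals[phase] += 1
            if seq[i] in "GC":
                counts[phase] += 1
        yield start, [c / t if t else 0.0 for c, t in zip(counts, totals)]


# Invented coding-like sequence: every codon ends in G or C.
toy_gene = "GCCGACCTGACGTTCAAG" * 10

for start, (gc1, gc2, gc3) in gc_frame_windows(toy_gene):
    print(f"window at {start:>3}: pos1={gc1:.2f} pos2={gc2:.2f} pos3={gc3:.2f}")
```

In this toy sequence the third position scores 1.00 in every window. Deleting a single base somewhere in the middle would move that signal to a different position downstream of the deletion, which is the pattern an annotator looks for.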
The successful sequencing of S. leeuwenhoekii demonstrated an important proof-of-concept: complete genome assembly of actinomycetes was not only possible but practical 2 .
This opened the floodgates for similar approaches with other microorganisms, particularly those from extreme environments that might harbor novel biochemistry 6 .
The hybrid sequencing approach pioneered with S. leeuwenhoekii has since been applied to numerous other Streptomyces species, including strains isolated from disease-suppressive soils and marine environments 8 .
These studies continue to reveal the incredible genetic diversity within this genus and its capacity to produce valuable compounds.
The S. leeuwenhoekii genome project exemplifies a broader shift in natural product discovery toward genome-guided approaches 7 .
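Instead of randomly screening microorganisms for activity, researchers can now sequence a genome, find its biosynthetic gene clusters, predict their likely products, and focus on the most promising candidates.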
This more rational approach has accelerated the discovery of new compounds, helping to address the ongoing need for novel antibiotics and other therapeutics in an era of increasing drug resistance 7 .
The story of Streptomyces leeuwenhoekii reminds us that important discoveries often come from the most unexpected places. From the harsh soils of the Atacama Desert to its complete genetic sequence, this bacterium has revealed how much we still have to learn from the microbial world.
As sequencing technologies continue to advance and become more accessible, the approach used to decipher S. leeuwenhoekii—combining complementary technologies to overcome their individual limitations—serves as a model for future genomic studies. Each sequenced genome adds another chapter to our understanding of life's chemical diversity and provides new leads in the endless quest for useful molecules.
The 35 biosynthetic gene clusters identified in S. leeuwenhoekii represent both a culmination of years of sequencing innovation and a starting point for future discovery. Most of these clusters remain uncharacterized, their chemical products unknown and their biological activities unexplored. The genomic treasure map has been drawn—now the chemical treasure hunt begins.