Background & Summary

Trichoptera, commonly known as caddisflies, represent the largest order of completely aquatic insects within Endopterygota1. Encompassing approximately 17,000 extant species, Trichoptera are distributed across all continents except Antarctica2. Their larvae exhibit remarkably diverse behavior, constructing various nest structures or living freely in aquatic environments3. Their adaptability to varying water conditions, including temperature and dissolved oxygen, differs significantly among families, genera, and individual species4. Consequently, they serve as vital indicator organisms in water quality monitoring efforts. Additionally, the varied feeding habits of trichopteran larvae contribute to the energy dynamics within stream ecosystems5,6.

Trichoptera is divided into two suborders, Annulipalpia and Integripalpia, based on morphology and habit. Annulipalpian larvae typically inhabit running water or wave-washed riverbanks, using pin silk along with plant debris and small stones to construct fixed shelters. Integripalpia includes “cocoon-makers” and “Phryganides”7,8. Cocoon-maker larvae are either free-living or construct purse cases or saddle cases and are usually found in fast-flowing rivers and streams. Their last-instar larvae produce closed, semipermeable cocoons for pupation. In contrast, most Phryganides larvae thrive in stagnant or slow-moving water, combining stones, leaves, and twigs with silk proteins to construct mobile nests9,10. Rhyacophilidae and Phryganeidae are representative cocoon-makers and Phryganides, respectively, and differ markedly in ecological habit and lifestyle.

The family Rhyacophilidae originated in the Palaearctic region and is primarily distributed in the Northern Hemisphere11. Their predatory larvae exhibit high sensitivity to environmental changes12. In contrast, the majority of phryganeid larvae are shredders, feeding on detritus and plant material in aquatic environments13. These larvae tend to be less sensitive to environmental changes than rhyacophilid larvae, and some species can survive in humid terrestrial environments after leaving the water10. Himalopsyche anomala Banks and Eubasilissa splendida Yang & Yang are typical representatives of Rhyacophilidae and Phryganeidae, respectively. Despite extensive studies on their biological characteristics, their precise phylogenetic positions and the molecular mechanisms underlying their adaptive evolution remain uncertain. High-quality reference genomes are crucial for advancing genetics and genome research. To date, nearly 30 trichopteran species have had their genomes sequenced and published, including two Himalopsyche species and Eubasilissa regina. However, chromosome-level assemblies have been achieved for only some species from five families (Glossosomatidae, Hydropsychidae, Leptoceridae, Limnephilidae, and Odontoceridae).

To enhance our understanding of the adaptive evolution and ecology of holometabolous aquatic insects, we used PacBio long-read sequencing, Illumina short-read sequencing, and Hi-C sequencing to achieve the first chromosome-level genome assemblies for H. anomala Banks and E. splendida Yang & Yang, with assembly sizes of 663.43 and 859.28 Mb and scaffold N50 lengths of 28.44 and 31.17 Mb, respectively. Hi-C scaffolding resulted in chromosome-level assemblies, with 99.29% (2,697 contigs) and 99.61% (643 contigs) of the initially assembled sequences anchored to 24 and 29 pseudochromosomes for H. anomala and E. splendida, respectively. In total, 288.10 Mb (43.43%) and 471.23 Mb (54.84%) of the sequences were identified as repetitive elements in the two assemblies, respectively. Moreover, integrating three prediction methods enabled the identification of 11,469 and 10,554 protein-coding genes (PCGs) in H. anomala and E. splendida, respectively. The high-quality genomes of these species not only advance our understanding of adaptive evolution in Trichoptera but also serve as resources for comparative genomic research in evolutionary biology and ecology. Furthermore, they contribute to elucidating the phylogenetic relationships between the cocoon-maker and Phryganides groups.

Methods

Sample collection

Himalopsyche anomala and E. splendida specimens were collected using ultraviolet light tubes from ** was performed using Minimap2 v2.1720, and the assembled genome underwent two rounds of polishing with NextPolish v1.1.021. Redundant sequences were removed using Purge_Dups v1.2.522 with the haploid cutoff set at 60 (-s 60) based on the aforementioned short-read mapping. Before chromosome anchoring, Hi-C read alignment and quality control were conducted using Juicer v1.6.223 with its default parameters. Subsequently, 3D-DNA v18092224 was employed to automatically anchor the majority of contigs into pseudochromosomes. Mis-joins were corrected using Juicebox v1.11.0823 through manual inspection and refinement. In total, 97.68% and 99.58% of assembly contigs were anchored into 24 and 29 pseudochromosomes, with lengths of 11.53–39.79 Mb for H. anomala and 9.92–51.78 Mb for E. splendida (Fig. 1).
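As an illustration of how the scaffold N50 and the anchoring rate reported here are typically derived, the following minimal Python sketch computes both from lists of sequence lengths. It is not the pipeline used in this study: the function names `n50` and `anchored_fraction` and the toy lengths are hypothetical, and the anchoring rate is assumed to be length-based (anchored bases divided by total bases), which may differ from a count-based definition.

```python
from typing import Iterable, Sequence


def n50(lengths: Iterable[int]) -> int:
    """Return the N50: the length L such that sequences of length >= L
    together cover at least half of the total assembly length."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no sequence lengths given")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return ordered[-1]


def anchored_fraction(chrom_lengths: Sequence[int],
                      unplaced_lengths: Sequence[int]) -> float:
    """Fraction of assembled bases placed on pseudochromosomes
    (assumption: the rate is measured in bases, not in contig counts)."""
    anchored = sum(chrom_lengths)
    total = anchored + sum(unplaced_lengths)
    return anchored / total


# Toy lengths (arbitrary units), not values from either assembly.
toy_chroms = [40, 30, 20]
toy_unplaced = [5, 3, 2]
toy_n50 = n50(toy_chroms + toy_unplaced)
toy_anchored = anchored_fraction(toy_chroms, toy_unplaced)
print(toy_n50, toy_anchored)
```

In practice the lengths would be read from the FASTA index of the final assembly, with pseudochromosomes and unplaced contigs separated by name.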

Fig. 1

Genome-wide chromosomal interactive heatmap. Each chromosome and contig is framed in blue and green, respectively. (a) Himalopsyche anomala. (b) Eubasilissa splendida.

Potential contaminants were screened using MMseqs2 v1125 with the parameter “--min-seq-id 0.8” against the National Center for Biotechnology Information (NCBI) nt and UniVec databases. Sequences with > 90% alignment were removed. The final assembly lengths were 663.43 Mb (H. anomala) and 859.28 Mb (E. splendida) (Table 1). To identify sex chromosomes, Illumina reads of the female individual were mapped against the assembly, and the sequencing depth for each chromosome was calculated. Because Trichoptera follows the ZO female sex determination system26, chromosomes with half the sequencing depth were identified as sex chromosomes (Tables S1, S2). The GC content of the H. anomala and E. splendida assemblies was 31.55% and 32.76%, respectively. Notably, the estimated genome size closely matched the assembly size, with the genome assembly size of H. anomala resembling that of other Himalopsyche species27,28, whereas the genome size of E. splendida exceeded that of Eubasilissa regina (440.07 Mb)29. Genome completeness was assessed at each stage of the assembly using Benchmarking Universal Single-Copy Orthologs (BUSCO) v3.0.230 with the parameter “-m genome”. The completeness was 98.1% and 98.2% for H. anomala and E. splendida, respectively, indicating high-quality assembled genomes (Table 2).
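The depth-based identification of sex chromosomes can be sketched as follows. This is an illustrative example under stated assumptions rather than the procedure used in this study: the function name `flag_half_depth`, the 0.4–0.6 ratio window, and the toy depth values are all hypothetical, and the genome-wide median of per-chromosome mean depth is assumed to represent the autosomal depth.

```python
from statistics import median
from typing import Dict, List


def flag_half_depth(mean_depth: Dict[str, float],
                    low: float = 0.4,
                    high: float = 0.6) -> List[str]:
    """Return chromosomes whose mean depth, relative to the median across
    all chromosomes, falls inside [low, high], i.e. close to one half."""
    reference = median(mean_depth.values())
    return [name for name, depth in mean_depth.items()
            if low <= depth / reference <= high]


# Toy per-chromosome mean depths from a female sample (not study data).
toy_depth = {"chr1": 60.2, "chr2": 59.1, "chr3": 30.4,
             "chr4": 61.0, "chr5": 58.7}
candidates = flag_half_depth(toy_depth)
print(candidates)
```

In a ZO system the female carries a single Z chromosome, so reads from a female are expected to cover the Z at roughly half the autosomal depth. The ratio window would need tuning to the observed depth variance.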

Table 1 Genome assembly statistics for Himalopsyche anomala and Eubasilissa splendida.
Table 2 Statistical result of BUSCO for Himalopsyche anomala and Eubasilissa splendida.

Repetitive sequence and noncoding RNAs annotation

RepeatModeler v2.0.231 and the LTR discovery pipeline (-LTRstruct) of GenomeTools32 were used to build a de novo repetitive element database. Subsequently, we merged this database with known repeat element databases (Repbase-2018102633 and Dfam 3.134). RepeatMasker v4.0.735 was used to annotate the repeat elements of the two assemblies based on the custom database, identifying 288.10 Mb (approximately 43.43%) and 471.23 Mb (approximately 54.84%) of repetitive sequences for H. anomala and E. splendida, respectively. Among these elements, the largest proportion comprised unclassified elements, accounting for 21.43% and 28.44% of the total genomes of the respective species. Details regarding other common repetitive elements are provided in Tables S3, S4. To annotate the non-coding RNAs, we employed Infernal v1.1.436 and tRNAscan-SE v2.0.937; low-confidence tRNAs were filtered out using “EukHighConfidenceFilter”. A total of 717 and 766 ncRNAs were annotated in the H. anomala and E. splendida genomes, respectively, with tRNAs constituting more than 50% (384 and 420) of these ncRNAs. Details regarding other noncoding RNAs are provided in Tables S5, S6.
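The proportions quoted in this subsection can be re-derived from the reported totals. The short Python sketch below does so, assuming that the denominators are the final assembly lengths (663.43 and 859.28 Mb) and the total ncRNA counts given in the text. The helper name `percent` is hypothetical.

```python
def percent(part: float, whole: float) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Repeat content: masked length (Mb) over final assembly length (Mb).
repeat_pct = {
    "H. anomala": percent(288.10, 663.43),
    "E. splendida": percent(471.23, 859.28),
}

# tRNA share: number of tRNAs over total annotated ncRNAs.
trna_pct = {
    "H. anomala": percent(384, 717),
    "E. splendida": percent(420, 766),
}

for species in repeat_pct:
    print(f"{species}: repeats {repeat_pct[species]:.2f}%, "
          f"tRNAs {trna_pct[species]:.2f}% of ncRNAs")
```

Rounded to two decimals, the repeat proportions agree with the reported 43.43% and 54.84%, and both tRNA shares exceed 50%.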

Genome annotation

We combined ab initio prediction, homologous protein evidence, and transcriptomic evidence to predict gene structures in the H. anomala and E. splendida genomes. Initially, we used BRAKER v2.1.638, which integrated results from Augustus v3.3.339 and GeneMark v4.3240. In this process, we utilized the arthropod reference proteins from OrthoDB10 v1041 to perform ab initio predictions. Additionally, we downloaded the protein sequences of model organisms and closely related species (Table 3), including Drosophila melanogaster Meigen, Bombyx mori (Linnaeus), and Spodoptera litura (Fabricius), among others. These sequences were used for homologous gene prediction, employing GeMoMa v1.7.142 with the parameter “GeMoMa.c = 0.5 GeMoMa.p = 10”. Transcriptome sequencing reads underwent the same quality control methods used for DNA sequencing. Subsequently, HISAT2 v2.2.043 and samtools were employed to produce BAM alignments against the reference assembly, and StringTie v2.1.644 was used to perform transcriptome assembly. Finally, we used MAKER v3.01.0345 to integrate the results of the three strategies. A total of 11,469 and 10,554 PCGs were predicted in the H. anomala and E. splendida genomes, respectively (Table 4). The average number of exons and introns per gene was similar in H. anomala (9.4 exons and 8.2 introns) and E. splendida (7.1 exons and 8.3 introns). Gene density varied across chromosomes, with the highest gene density on chromosome 21 and chromosome 23 in the H. anomala and E. splendida genomes, respectively (Fig. 2a,b). BUSCO assessment of the predicted protein sequences in protein mode yielded a completeness of 98.4% for both genomes, attesting to the high quality of the annotation.
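Per-gene exon and intron statistics of the kind summarized in Table 4 are commonly computed from the annotation file. The following Python sketch illustrates one way to do this, assuming a GFF3 annotation in which each `exon` feature carries a `Parent` attribute naming its transcript. The function name `exon_counts` and the toy records are hypothetical, and the counting convention used in this study may differ (for example, per gene rather than per transcript).

```python
from collections import Counter
from statistics import mean
from typing import Iterable


def exon_counts(gff3_lines: Iterable[str]) -> Counter:
    """Count exon features per parent transcript in GFF3-formatted lines."""
    counts: Counter = Counter()
    for line in gff3_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9 or fields[2] != "exon":
            continue
        # Column 9 holds semicolon-separated key=value attributes.
        attrs = dict(item.split("=", 1)
                     for item in fields[8].split(";") if "=" in item)
        for parent in attrs.get("Parent", "").split(","):
            if parent:
                counts[parent] += 1
    return counts


# Toy annotation: transcript t1 has three exons, t2 has one.
toy_gff3 = [
    "##gff-version 3",
    "chr1\ttoy\tmRNA\t1\t900\t.\t+\t.\tID=t1;Parent=g1",
    "chr1\ttoy\texon\t1\t100\t.\t+\t.\tID=e1;Parent=t1",
    "chr1\ttoy\texon\t300\t400\t.\t+\t.\tID=e2;Parent=t1",
    "chr1\ttoy\texon\t800\t900\t.\t+\t.\tID=e3;Parent=t1",
    "chr1\ttoy\texon\t2000\t2500\t.\t-\t.\tID=e4;Parent=t2",
]
counts = exon_counts(toy_gff3)
mean_exons = mean(counts.values())
# A transcript with n exons has n - 1 introns.
mean_introns = mean(n - 1 for n in counts.values())
print(dict(counts), mean_exons, mean_introns)
```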

Table 3 Species taxonomic information and accession code of all samples used in this study.
Table 4 Structural annotation information of protein-encoding genes of Himalopsyche anomala and Eubasilissa splendida.
Fig. 2

Characterization of the assembled Himalopsyche anomala and Eubasilissa splendida genome, phylogenetic relationship, and gene family evolution. (a) Himalopsyche anomala. (b) Eubasilissa splendida. From the inner to outer layers: gene density, GC content (GC), DNA transposons (DNA), long-interspersed elements (LINE), long-terminal repeat elements (LTR), short-interspersed elements (SINE), chromosome length (Chr).

To functionally annotate the PCGs, Diamond v2.0.11.1491). Additionally, the final annotated gene BUSCO completeness was 98.4% for both H. anomala and E. splendida. Collectively, these results confirm the high quality and accuracy of the new chromosome-level assemblies.