
Oct 4

Mining the equine gut metagenome: poorly-characterized taxa associated with cardiovascular fitness in endurance athletes | Communications Biology -…

Ethical approval

The local animal care and use committee (ComEth EnvA-Upec-ANSES, reference: 11-0041, dated July 12, 2011) approved the study protocol, and protocols were conducted following the EU regulation (no 2010/63/UE). Owners and riders provided their informed consent before the start of sampling procedures with the animals. The horses (Equus caballus) used in this research study were pure-breed or half-breed Arabian (three females, one male, and seven geldings; age: 10 ± 1.69 years old).

Eleven endurance horses were selected from a cohort previously studied by our team [6,24,25,70]. All equine athletes started training for endurance competitions at age 4 and presented a similar training history, level of physical fitness, and training environment. The 11 horses were selected according to the following criteria: (1) enrollment in the same 160 km endurance category; (2) blood sample collection before and after the race; (3) feces collection before the race; (4) absence of gastrointestinal disorders during the four months before enrollment; (5) absence of antibiotic treatment during the four months before enrollment and absence of anthelmintic medication within 60 days before the race; and (6) a complete questionnaire about diet composition and intake.

Subject metadata, including morphometric characteristics and daily macronutrient diet intake records, are provided in Supplementary Data 1. Daily nutrient intake calculations are described elsewhere [24].

The endurance race was split into phases of ~30–40 km. At the end of each phase, veterinarians checked the horses (a stop referred to as a vet gate). The heart recovery time was the primary criterion evaluated at the vet gate, as it has been shown to be an excellent complement to a physical assessment of an individual. The heart rate was measured at each vet gate by the riders and a veterinarian using a heart rate meter and a stethoscope. Any horse deemed unfit to continue (due to a heart rate above 64 bpm after 20 min of recovery) was immediately withdrawn from the event.

It should be noted that the time interval between arrival at the vet gate and the time needed to decrease the heart rate below 64 bpm was counted as part of the overall riding time. Therefore, the cardiac recovery time was calculated as the difference between the arrival time (at the end of the phase) and the time of veterinary inspection (referred to as the "time in" by the FEI endurance rules). The average speed of each successive phase was calculated at the vet gate.

Changes in these three variables (heart rate, cardiac recovery time, and average speed) during endurance events have been shown to predict whether a horse is aerobically fit or not [71]. We consider these variables to estimate cardiovascular capacity linked to performance capability and achievement. Therefore, these three variables were first scaled through a Z-score; that is, the number of standard deviation units a horse's score is below or above the average score. Such a computation creates a unitless score that is no longer related to the original units of analysis (e.g., minutes, beats, km/h) and can more readily be used for comparisons. A composite based on such Z-scores was then created to estimate cardiovascular fitness. Specifically, the composite() function of the multicon R package (v.1.6) was used to develop a unit-weighted composite of the three variables listed above.
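The Z-score scaling and unit-weighted composite described above can be illustrated with a minimal Python sketch. This is not the multicon implementation: the function names and the four-horse values are hypothetical, the sample standard deviation (n − 1) is assumed, and no variable is reverse-scored because the text does not state whether any sign was flipped before averaging.

```python
from statistics import mean, stdev


def z_scores(values):
    """Scale one variable to z-scores using the sample standard deviation."""
    mu = mean(values)
    sd = stdev(values)
    return [(v - mu) / sd for v in values]


def unit_weighted_composite(*variables):
    """Average the z-scores of several variables, giving each equal weight.

    Each argument is a list with one value per horse, in the same order.
    """
    scaled = [z_scores(v) for v in variables]
    return [mean(per_horse) for per_horse in zip(*scaled)]


# Hypothetical values for four horses (not data from the study).
heart_rate = [52.0, 58.0, 61.0, 55.0]       # beats per minute
recovery_time = [4.0, 7.5, 9.0, 5.5]        # minutes
average_speed = [17.2, 15.8, 14.9, 16.5]    # km/h

composite = unit_weighted_composite(heart_rate, recovery_time, average_speed)
for horse, score in enumerate(composite, start=1):
    print(f"horse {horse}: {score:+.2f}")
```

Because every variable is centered before averaging, the composite scores sum to zero across horses, which makes them directly comparable regardless of the original units.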

The kinship2 [72] (v.1.8.5) R package was used to calculate the pedigree kinship matrix of all individual pairs, plot the pedigree, and trim the pedigree object. The kinship coefficient for any two subjects was calculated as the probability that an allele chosen at random for both subjects at a given locus is identical-by-descent, that is, inherited from a common ancestor [72]. The pedigree was calculated using six generations back for the 11 Arabian horses of the study. The pedigree kinship matrix was then visualized using the plot_popkin() function from the popkin (v.1.3.17) R package. The inbr_diag() function was used to modify the kinship matrix, with inbreeding coefficients along the diagonal, preserving column and row names.
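To make the identical-by-descent definition concrete, the following Python sketch computes kinship coefficients with the classic recursive pedigree rule. It is an illustration only, not the kinship2 code: the five-animal pedigree is hypothetical, founders are assumed unrelated and non-inbred, and individuals are assumed to be listed so that parents precede offspring.

```python
from functools import lru_cache

# Hypothetical pedigree: id -> (sire, dam); None marks an unknown parent.
# Individuals are listed so that parents always precede their offspring.
PEDIGREE = {
    "A": (None, None),
    "B": (None, None),
    "C": ("A", "B"),
    "D": ("A", "B"),
    "E": ("C", "D"),
}
ORDER = {name: position for position, name in enumerate(PEDIGREE)}


@lru_cache(maxsize=None)
def kinship(i, j):
    """Probability that alleles drawn at random from i and j are identical-by-descent."""
    if i is None or j is None:
        return 0.0  # unknown parents are treated as unrelated founders
    if i == j:
        sire, dam = PEDIGREE[i]
        return 0.5 * (1.0 + kinship(sire, dam))
    # Always expand the individual born later, so an ancestor is never expanded.
    if ORDER[i] < ORDER[j]:
        i, j = j, i
    sire, dam = PEDIGREE[i]
    return 0.5 * (kinship(sire, j) + kinship(dam, j))


def inbreeding(i):
    """Inbreeding coefficient derived from the self-kinship: F = 2 * phi_ii - 1."""
    return 2.0 * kinship(i, i) - 1.0


print(kinship("C", "D"))   # full siblings
print(inbreeding("E"))     # offspring of a full-sibling mating
```

The inbreeding() helper mirrors the idea of placing inbreeding coefficients on the diagonal of the kinship matrix: the self-kinship of an animal is (1 + F) / 2, so F is recovered as 2 × self-kinship − 1.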

Blood samples were collected from each horse the day before the event (Basal, T0) and immediately after the end of the competition (T1) for transcriptomic, biochemical, metabolomic, and acylcarnitine assays. As described elsewhere [24], pretreatment of the blood samples was carried out immediately after the collection because field conditions provided access to refrigeration and electrical power supply. Briefly, blood samples for RNA extraction were collected using Tempus Blood RNA tubes (Thermo Fisher) and stored at −80 °C. Whole blood samples were taken in EDTA tubes (10 mL; Becton Dickinson, Franklin Lakes, NJ, USA) to determine biochemical parameters, while for the metabolome profiling, sodium fluoride and oxalate tubes were used to inhibit further glycolysis that may increase lactate levels after sampling. Clotting time at 4 °C was then strictly controlled for all samples to avoid cell lysis that affects metabolome components. After clotting at 4 °C, the plasma was separated from the blood cells, transported to the lab at 4 °C, and frozen at −80 °C (no more than 5 h later, in all cases). For the acylcarnitine assays, blood samples were collected in plain tubes. After clotting, the tubes were centrifuged, and the harvested serum was stored at 4 °C for no more than 48 h and subsequently stored at −80 °C.

Total RNA was isolated using the Preserved Blood RNA Purification Kit I (Norgen Biotek Corp., Ontario, Canada) according to the manufacturer's instructions. RNA purity and concentration were determined using a NanoDrop ND-1000 spectrophotometer (Thermo Fisher), and RNA integrity was assessed using a Bioanalyzer 2100 (Agilent Technologies, Santa Clara, CA, USA). All 22 RNA samples were processed. The transcriptome microarray data production, pre-processing, and analysis are described in Mach et al. [25].

Transcriptome profiling was performed using an Agilent 4X44K horse custom microarray (Agilent Technologies, AMADID 044466). All of the steps are detailed elsewhere [73,74]. We refer to our previous work for more details on the pre-processing, normalization, and application of linear models [25]. Given our interest in understanding the role played by mitochondria during exercise, the set of 801 differentially expressed mitochondrial genes reported by our team [25] was selected for the downstream steps of analysis (Supplementary Data 15).

As described elsewhere [24,70], the plasma metabolic phenotype of endurance horses was obtained from 1H NMR spectra at 600 MHz. The 1H NMR spectra were acquired at 500 MHz with an AVANCE III (Bruker, Wissembourg, France) equipped with a 5 mm reversed QXI Z-gradient high-resolution probe. Further details on sample preparation, data acquisition, quality control, and spectroscopic data pre-processing, including bin alignment, normalization, scaling, and centering, are broadly discussed elsewhere [75]. Details on metabolite identification are described in our previous work [24,25].

Sera were assayed for total bilirubin, conjugated bilirubin, total protein, creatinine, creatine kinase, β-hydroxybutyrate, aspartate transaminase (ASAT), γ-glutamyltransferase, and serum amyloid A levels on an RX Imola analyzer (Randox, Crumlin, UK).

As a proxy for mitochondrial β-oxidation, the serum acylcarnitine profiles were produced and analyzed as described elsewhere [6]. In the positive mode, free carnitine and 27 acylcarnitines were analyzed as their butyl ester derivatives by electrospray tandem mass spectrometry (ESI-MS-MS) on a triple quadrupole mass spectrometer (Xevo TQ-S Waters, Milford, MA, USA) using deuterated water.

Fresh fecal samples were obtained while monitoring the horses before the race. One fecal sample from each animal was collected immediately after defecation [24,76], and three aliquots (200 mg) were prepared. The dehydration experienced by most horses after the race altered intestinal motility and feces shedding, making it impossible to recover the feces immediately after the race.

Aliquots for SCFA analysis and DNA extraction were snap-frozen.

SCFA levels were determined by gas chromatography using the method described elsewhere [77].

Total DNA from the 11 samples was extracted from ~200 mg of fecal material using the EZNA Stool DNA Kit (Omega Bio-Tek, Norcross, Georgia, USA) following the manufacturer's instructions. DNA was then quantified using a Qubit and a dsDNA HS assay kit (Thermo Fisher).

As detailed in our previous studies [24,25], concentrations of bacteria, anaerobic fungi, and protozoa in fecal samples were quantified by qPCR using a QuantStudio 12K Flex platform (Thermo Fisher Scientific, Waltham, USA). The primers used for real-time amplification were as follows: bacteria (FOR: 5′-CAGCMGCCGCGGTAANWC-3′; REV: 5′-CCGTCAATTCMTTTRAGTTT-3′), anaerobic fungi (FOR: 5′-TCCTACCCTTTGTGAATTTG-3′; REV: 5′-CTGCGTTCTTCATCGTTGCG-3′), and protozoa (FOR: 5′-GCTTTCGWTGGTAGTGTATT-3′; REV: 5′-CTTGCCCTCYAATCGTWCT-3′). The standard dilution series, the thermal cycling conditions, and the estimation of the number of copies are described elsewhere [24,25].

A detailed description of the DNA isolation process and of the PCR amplification and sequencing of the V3–V4 region of the 16S rRNA gene has been presented by our group [19,20,24,25,76,78,79]. A negative control sample was processed alongside the biological samples at the DNA extraction and PCR steps to control for DNA contamination before and after sequencing. In addition, contamination was minimized through laboratory techniques such as UV irradiation of material, ultrapure water, DNA-free Taq DNA polymerase, and the separation of pre- and post-PCR areas.

The Divisive Amplicon Denoising Algorithm (DADA) was implemented using the DADA2 plug-in for QIIME 2 (v.2021.2) to perform quality filtering and chimera removal and to construct a feature table consisting of read abundance per amplicon sequence variant (ASV) by sample [80]. Taxonomic assignments were given to ASVs by importing the Greengenes 16S rRNA Database (release 13.8) to QIIME 2 and classifying representative ASVs using the naive Bayes classifier plug-in [81]. The phyloseq (v.1.36.0) [82], vegan (v.2.5.7) [83], and microbiome (v.1.14.0) packages were used in R (v.4.1.0) for the downstream steps of analysis. A total of 364,026 high-quality sequence reads were recovered for the 11 horses of the study (mean per subject: 33,093 ± 17,437, range: 12,052–62,670). Reads were clustered into 5412 chimera- and singleton-filtered ASVs at 99% sequence similarity. The genera taxonomic assignments and counts for each individual are presented in Supplementary Data 10.

The negative control sample did not yield a band on the agarose gel, and the concentration of the purified amplicon was undetectable (<1 ng/µL). Nevertheless, the decontam (v.1.14.0) R package was used to identify and visualize possible contaminating DNA features in the negative control sample. The function isContaminant() was used to determine the distribution of the frequency of each contaminant feature as a function of the input DNA concentration. Only six ASVs were statistically classified (p < 0.05) as contaminants, although their frequency plots showed they were non-contaminants (Supplementary Fig. 11).

Metagenomic sequencing was performed using the same DNA extractions. For each individual, a paired-end metagenomic library was prepared from 100 ng of DNA using the DNA PCR-free Library Prep Kit (Illumina, San Diego, CA, USA). The size was selected at about 400 bp. The pooled indexed library was sequenced in an Illumina HiSeq3000 using a paired-end read length of 2 × 150 bp with the Illumina HiSeq3000 Reagent Kits at the PLaGe facility (INRAe, Toulouse).

Raw metagenomic reads were quality-trimmed, assembled, binned, and annotated using the ATLAS pipeline, v.2.4.4 [84]. In short, using tools from the BBmap suite v.37.99 [85], reads were quality trimmed with the following ATLAS parameters: preprocess_minimum_base_quality=10, preprocess_minimum_passing_read_length=51, preprocess_minimum_base_frequency=0.05, preprocess_adapter_min_k=8, preprocess_allowable_kmer_mismatches=1, and preprocess_reference_kmer_match_length=27. The contamination from the horse genome (available at the NCBI sequence archive under accession number GCA_002863925.1; Equus_caballus.EquCab3.0) was filtered out using the following settings: contaminant_max_indel=20, contaminant_min_ratio=0.65, contaminant_kmer_length=13, contaminant_minimum_hits=1, and contaminant_ambiguous=best. Reads were error corrected and merged before assembly with metaSPAdes v.3.13.1 [86] with the following parameters: spades_k=auto, prefilter_minimum_contig_length=300, minimum_average_coverage=1, minimum_percent_covered_bases=20, and minimum_contig_length=500 after filtering. QUAST 5.0.2 [87] was used to evaluate the quality of each sample assembly. Since a high diversity between individuals was described through 16S rRNA amplicon analysis, we first assembled each sample independently. Contigs from single samples were clustered into metagenomic bins using MetaBAT 2 (v.2.14) [88] with the parameters sensitivity=sensitive and min_contig_length=1500, and MaxBin 2.0 (v.2.2.7) [89] with the parameters max_iteration=50, prob_threshold=0.9, and min_contig_length=1000. Contig predictions were combined using DAS Tool v.1.1.2-1 [90] with the diamond engine and score_threshold set to 0.5.

The ATLAS configuration file, summaries of individual sample quality control, contigs from the individuals, and detected bins are available at the INRAE data repository and are contained in the files ATLAS_config.yalm, ATLAS_dag.pdf, notebook.html, ATLAS_QC_report.html, and ATLAS_bin_report_DASTool.html.

Assembly statistics for the predicted MAGs such as completeness, redundancy, size, number of contigs, contig N50, length of the longest contig, average GC content, and the number of predicted genes were computed using the lineage workflow from CheckM v.1.1.3 [92]. MAGs were designated as near-complete drafts if they had completeness ≥90%, redundancy <5%, and transfer RNA gene sequences for at least 18 unique amino acids, or as medium-quality drafts if they had completeness ≥50% and redundancy <10%. A summary of the assembly statistics for the predicted MAGs is available at the INRAE data repository as ATLAS_assembly_report.htlm.
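The two quality tiers can be expressed as a small decision rule. The Python sketch below is illustrative only: the function name and the example values are hypothetical, the completeness thresholds are assumed to be inclusive lower bounds, and MAGs that meet neither definition are simply labeled "unclassified" here.

```python
def classify_mag(completeness, redundancy, trna_amino_acids):
    """Assign a draft-quality tier to one MAG.

    completeness and redundancy are percentages (as estimated by a tool
    such as CheckM); trna_amino_acids is the number of unique amino acids
    for which a tRNA gene was found.
    """
    if completeness >= 90 and redundancy < 5 and trna_amino_acids >= 18:
        return "near-complete"
    if completeness >= 50 and redundancy < 10:
        return "medium-quality"
    return "unclassified"


# Hypothetical MAG statistics (not taken from the study).
examples = {
    "MAG_001": (96.4, 1.2, 19),
    "MAG_002": (92.0, 1.0, 15),   # complete, but too few tRNA types
    "MAG_003": (61.5, 7.8, 12),
    "MAG_004": (45.0, 2.0, 10),
}
for name, stats in examples.items():
    print(name, classify_mag(*stats))
```

Note that a highly complete MAG with too few tRNA types falls back to the medium-quality tier rather than being discarded.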

Because the same MAG may be identified in multiple samples, dRep v.2.2.2 [93] was used to obtain a non-redundant set of MAGs by clustering genomes to a defined average nucleotide identity (ANI) and returning the representative with the highest dRep score in each cluster. The parameters used were set to ANI=0.95, overlap=0.6, length=5000, completeness=50, contamination=10, and N50=0.5. Only the highest-scoring MAG from each secondary cluster was retained as the winning genome in the dereplicated set. The abundance of each MAG was then quantified across samples by mapping the reads to the non-redundant MAGs using the BBmap suite v.37.99 [85] (pairlen=100, minid=0.9, mdtag=t, xstag=fs, nmtag=t, sam=1.3, ambiguous=best, secondary=t, saa=f, maxsites=10). The sample-specific median coverage of each MAG was then computed using pileup within BBMap with default parameters.

For the taxonomic annotation, ATLAS predicted the genes of each MAG sequence using Prodigal v.2.6.3 [94] with single-mode and closed-end parameters. The taxonomy of the predicted MAGs was inferred using the genome taxonomy database toolkit (GTDB-Tk) [43] (v.5.0, release 95 (July 17, 2020)). As such, GTDB-Tk taxonomy names were used throughout this paper. In addition, domain-specific trees incorporating the predicted MAGs were inferred by constructing a maximum-likelihood tree using the de novo workflow in GTDB-Tk v.5.0 with the following parameters: --bacteria | --archaea, min_perc_aa=50, prot_model=WAG. Trees were visualized using the ggtree R package (v.3.0.2).

To assess the contribution of the constructed MAGs to the functional potential of the gut microbiome, the predicted genes and proteins extracted by Prodigal during the CheckM pipeline were compared to the EggNOG database 5.0 using eggnog-mapper (v.2.0.1). KEGG (Kyoto Encyclopedia of Genes and Genomes) and CAZyme (carbohydrate-active enzyme) annotations were extracted from this output. Since the detection of KOs and CAZyme families is likely influenced by sequencing depth, their relative abundance was normalized to the abundance of the MAG they derived from. Pathways attributed to each KO were annotated from the KEGG database (downloaded 23 October 2021).

The uniqueness of our predicted MAG catalog was confirmed by dereplicating them with the 121 MAGs produced by Gilroy et al. [44] and three reported by Youngblut et al. [45] using dRep v.3.2.0 [93] with parameters: P_ani=0.9, S_algorithm ANImf, S_ani=0.99, clusterAlg average, cov_thresh=0.1, coverage_method larger. dRep performed pairwise genomic comparisons by sequentially applying an estimation of genome distance and an accurate measure of average nucleotide identity. Highly similar genomes were visualized and compared using the CGView family of tools.

The establishment and assessment of the quality and representation of the microbiome gene catalog were performed through the metagenomic ATLAS pipeline (v.2.4.4) [84]. As described above, we first assembled the clean reads into longer contigs.

Genes were predicted by Prodigal v.2.6.3 and then clustered using linclust [95] to generate a non-redundant gene catalog. Redundant genes were removed with linclust using the following parameters: minlength_nt=100, minid=0.95, coverage=0.9, and subsetsize=500,000. The quantification of genes per sample was done through the combine_gene_coverages() function in the ATLAS workflow, which aligned the high-quality clean reads to the gene catalog using the BBmap suite v.37.99 [85] (minid=0.95, mdtag=t, xstag=fs, nmtag=t, sam=1.3, ambiguous=all, secondary=t, saa=f, maxsites=4). Taxonomic and functional annotations were based on the EggNOG database 5.0 using eggnog-mapper (v.2.0.1) (emapper.py --annotate_hits_table {input.seed} --no_file_comments). The eggNOG numbers corresponding to CAZymes, based on homology searches against the CAZyme database, were retrieved from this output. We used the derived eggNOG abundance matrix to obtain a CAZyme profile per sample. Similarly, KEGG annotation was recovered from the EggNOG output. KEGG gene IDs were mapped to KEGG KOs and used to get the KEGG functional pathway hierarchy. Furthermore, using mmseqs2 (v.13.45111) to find genes at a 95% similarity threshold and 80% overlap, we compared our gene catalog with a previously published gene catalog containing ~4 million genes [30]. The parameters used were the following: easy-search --search-type 3 --min-seq-id 0.95 --cov-mode 0 -c 0.8 --threads 16 --alignment-mode 3 --max-seq-len 100000.

The annotated gene catalog fasta file is deposited at DDBJ/ENA/GenBank Whole Genome Shotgun under the BioProject ID PRJNA438436 and is also available at the INRAE data repository as Genecatalog_with-note.fna.gz. The KO and CAZymes derived from the gene catalog are available in the same INRAE data repository.

The k-mer-based Kaiju v.1.8.0 approach was used for microbial taxonomic profiling of the trimmed shotgun metagenomes and the microbial gene catalog. The microbial gene catalog fasta, the core group genes fasta, and the paired reads obtained after quality trimming and decontamination from the horse genome were annotated against the NCBI nr_euk reference database (released on May 25, 2020), which contains all proteins belonging to archaea, bacteria, fungi, microbial eukaryotes, and viruses. Classification was run in Greedy mode (-a greedy -e 3), allowing for a maximum of three mismatches. By default, Kaiju returned NA if it could not find a taxonomic classification at certain ranks. Kaiju's tab-separated output files were imported into Krona and converted into HTML files. They are available in the data repository under raw-samples.nr_euk.kaiju.html.

To circumvent the problem of false-positive species predictions due to misalignment and contamination, we defined an abundance threshold of 25%: species ranked among the top 25% most abundant in at least 50% of the individuals were retained using the filterfun_sample() function in the phyloseq R package. This reduced background noise but kept information on poorly-described species if they were ubiquitously found in the samples. The abundance and taxonomy of the dominant phylotypes and the associated metadata are available in the data repository as Ecaomic_dominant_phylotypes_nonrariefied.rds.
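One way to read this filter is sketched below in Python. It is an interpretation under stated assumptions rather than the phyloseq implementation: "top 25%" is taken to mean the top quarter of species by rank within each sample, the prevalence cut-off is treated as inclusive, and the function names and counts are hypothetical.

```python
import math


def top_fraction(abundances, fraction=0.25):
    """Return the species ranked in the top `fraction` of one sample."""
    n_keep = math.ceil(len(abundances) * fraction)
    ranked = sorted(abundances, key=abundances.get, reverse=True)
    return set(ranked[:n_keep])


def dominant_species(samples, fraction=0.25, min_prevalence=0.5):
    """Keep species that are in the top `fraction` in enough samples."""
    hits = {}
    for abundances in samples.values():
        for name in top_fraction(abundances, fraction):
            hits[name] = hits.get(name, 0) + 1
    needed = min_prevalence * len(samples)
    return sorted(name for name, count in hits.items() if count >= needed)


# Hypothetical read counts for four species in four horses.
samples = {
    "horse_1": {"sp_a": 50, "sp_b": 30, "sp_c": 15, "sp_d": 5},
    "horse_2": {"sp_a": 40, "sp_b": 35, "sp_c": 20, "sp_d": 5},
    "horse_3": {"sp_a": 10, "sp_b": 60, "sp_c": 25, "sp_d": 5},
    "horse_4": {"sp_a": 20, "sp_b": 10, "sp_c": 65, "sp_d": 5},
}
print(dominant_species(samples))
```

In this toy example only sp_a is retained, because it is the only species in the top quarter of at least half of the samples.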

The high-quality clean paired reads were aligned to the ResFinder database (accessed March 2018, v.4.0) using bowtie2 (v.2.3.5). ResFinder is a manually curated database of horizontally acquired antimicrobial resistance (AMR) genes. It contains many genes with numerous highly similar alleles (e.g., β-lactamases). To avoid random assignment of read pairs on these high-identity alleles, the database was clustered at a 95% identity level over 200 bp using CD-HIT-EST (options -G 0 -A 200 -d 0 -c 0.95 -T 6 -g 1) [96], and a reference sequence was attributed to each cluster. Two successive mappings were done: (i) a first mapping with standard parameters (bowtie2 --end-to-end --no-discordant --no-overlap --no-dovetail --no-unal) on the complete ResFinder database, and (ii) a second mapping on the clustered database using the reads from the first mapping, with less stringent parameters (bowtie2 --local --score-min L,10,0.8). More than 99% of the reads from the first mapping correctly aligned on a cluster reference sequence in the second mapping.

Counts from the second mapping were normalized by computing the RPKM (reads per kilobase reference per million bacterial reads) value for each ResFinder reference sequence. The RPKM values were calculated by dividing the mapping count on each reference by its gene length and the total number of bacterial read pairs for the samples and multiplying by 10^9. A minimum of 20 mapped reads was considered to validate the presence of an AMR gene cluster.
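As a worked example of this normalization, the Python sketch below applies the formula count × 10^9 / (gene length × total bacterial read pairs) together with the 20-read presence threshold. The function names and input values are hypothetical, and gene length is assumed to be expressed in base pairs.

```python
def rpkm(mapped_reads, gene_length_bp, total_bacterial_read_pairs):
    """Reads per kilobase of reference per million bacterial reads."""
    return mapped_reads * 1e9 / (gene_length_bp * total_bacterial_read_pairs)


def amr_cluster_rpkm(mapped_reads, gene_length_bp, total_bacterial_read_pairs,
                     min_reads=20):
    """Return the RPKM of an AMR gene cluster, or None if it is not validated.

    A cluster is considered present only when at least `min_reads` reads map to it.
    """
    if mapped_reads < min_reads:
        return None
    return rpkm(mapped_reads, gene_length_bp, total_bacterial_read_pairs)


# Hypothetical cluster: 20 reads on a 1,000 bp reference, 1 million read pairs.
print(amr_cluster_rpkm(20, 1_000, 1_000_000))   # 20.0
print(amr_cluster_rpkm(19, 1_000, 1_000_000))   # None (below the threshold)
```

The factor 10^9 combines the two unit conversions: 10^3 for base pairs to kilobases and 10^6 for reads to millions of reads.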

The microbiome R package allowed us to study global indicators of the gut ecosystem state, including measures of evenness, dominance, divergences, and abundance. Comparison of the gut α-diversity indices between groups was performed by a two-sided Wilcoxon rank-sum test (pairwise comparison). A Benjamini–Hochberg-adjusted p < 0.05 was set as the significance threshold for comparison between groups.
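For readers unfamiliar with the Benjamini–Hochberg correction, a minimal Python sketch of the step-up adjustment is given below. It is a generic illustration with hypothetical p-values, not the R code used in the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    n = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values from four pairwise comparisons.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adjusted]
print([round(p, 4) for p in adjusted])
print(significant)
```

Each raw p-value is multiplied by n / rank, and the running minimum taken from the largest p-value downward guarantees that the adjusted values never decrease with rank.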

To estimate β-diversity, Bray–Curtis dissimilarity was calculated using the phyloseq R package. All samples were normalized using the rarefy_even_depth() function in the phyloseq R package, which is implemented as an ad hoc means to normalize features resulting from libraries of widely differing sizes. The PerMANOVA test (a non-parametric method of multivariate analysis of variance based on pairwise distances) was implemented using the adonis() function in the vegan R package and the pairwise.adonis2() function from the pairwiseAdonis (v.0.4) R package to test the global association between ecological or functional community structure and groups. The model was adjusted by factors affecting the microbiome: age, sex, and dietary macronutrient intake.
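The Bray–Curtis dissimilarity itself is simple to compute by hand. The Python sketch below shows the standard formula, the sum of absolute count differences divided by the total counts of both samples, on hypothetical abundance vectors; it does not reproduce the rarefaction step or the phyloseq code.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors.

    0 means identical composition; 1 means no shared features.
    """
    if len(a) != len(b):
        raise ValueError("abundance vectors must have the same length")
    total = sum(a) + sum(b)
    if total == 0:
        return 0.0  # two empty samples are treated as identical
    return sum(abs(x - y) for x, y in zip(a, b)) / total


# Hypothetical counts for three taxa in two samples of equal depth.
sample_1 = [10, 0, 5]
sample_2 = [5, 5, 5]
print(round(bray_curtis(sample_1, sample_2), 4))
```

Because the measure depends on raw counts, samples are usually brought to a common depth first, which is the role rarefaction plays in the workflow described above.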

The core group of genes in the catalog was defined as the genes present in all individuals.

The dominant core microbiome at the genus level was calculated using a detection threshold of 0.1% and a prevalence threshold of 95% in the microbiome R package.
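A minimal Python sketch of such a detection/prevalence filter follows. It is not the microbiome package implementation: the function name and data are hypothetical, detection is assumed to be a strict "greater than" comparison on relative abundance (0.1% = 0.001), and the prevalence cut-off is assumed to be inclusive.

```python
def core_genera(abundance_by_genus, detection=0.001, prevalence=0.95):
    """Return genera detected above `detection` in at least `prevalence` of samples.

    abundance_by_genus maps a genus name to its relative abundance
    (as a fraction) in each sample.
    """
    core = []
    for genus, values in abundance_by_genus.items():
        detected = sum(1 for v in values if v > detection)
        if detected / len(values) >= prevalence:
            core.append(genus)
    return sorted(core)


# Hypothetical relative abundances in four samples.
abundances = {
    "genus_a": [0.20, 0.10, 0.05, 0.30],
    "genus_b": [0.20, 0.0005, 0.10, 0.10],   # below detection in one sample
    "genus_c": [0.002, 0.002, 0.002, 0.002],
}
print(core_genera(abundances))
```

With only 11 individuals, a 95% prevalence threshold effectively requires a genus to be detected in every horse.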

The SParse InversE Covariance Estimation for Ecological Association Inference method (SPIEC-EASI) [97] was used to identify sub-populations (modules) of co-abundance and co-exclusion relationships between the abundance matrices of dominant phylotypes and CAZy classes. Specifically, the method allows microorganisms and functions to interact differently, from bidirectional competition to mutualism or not interacting at all. The statistical method SPIEC-EASI comprises two steps: a transformation for compositionality correction of the feature matrices and estimation of the interaction graph from the transformed data using sparse inverse covariance selection. The sparse graphical modeling framework was constructed using the spiec.easi() function of the SpiecEasi package (v.1.1.1). The features were clustered using method=mb, lambda.min.ratio=1e-5, nlambda=100, and pulsar.params=list(thresh=0.001). Regression coefficients from the SPIEC-EASI output were extracted and used as edge weights to generate a feature co-occurrence network using the igraph R package (v.1.2.6) and Cytoscape (v.3.8.2).

Data integration was carried out using several approaches and different combinations of datasets. Before the integration, we applied some additional pre-processing steps to our exploratory datasets. In particular, to eliminate intra-individual variability and focus on the differential signals between T1 and T0, we considered Δ values (T1 − T0) for each of these datasets, namely the biochemical assay data, the metabolome and acylcarnitine profiles, and the gene expression data. For the transcriptome, we constructed a matrix of log-transformed expression differences between T1 and T0 (i.e., the difference in log2-normalized expression between T1 and T0).

The integration of data was then performed using complementary methods and working with the different datasets available, namely: (1) Δ values of mitochondrial-related genes; (2) Δ values of 1H NMR metabolites; (3) Δ values of the biochemical assay metabolites; (4) Δ values of plasmatic acylcarnitines; (5) the fecal SCFAs at T0; (6) the bacterial, ciliate protozoa, and fungal loads at T0; (7) the dominant gut phylotypes at T0; (8) the CAZyme profiles at T0; (9) the KOs at T0; and (10) the athletic performance data.

As a first integration approach, a global non-metric multidimensional scaling (NMDS) ordination was used to extract and summarize the variation in microbiome composition using the metaMDS() function in the vegan R package. Stress values were calculated to determine the number of dimensions for each NMDS.

The explanatory datasets were then fit to the ordination plots using the envfit() function in the vegan R package [98] with 10,000 permutations. Each covariate's effect size and significance were determined, and all p-values derived from the envfit() function were adjusted using the Benjamini–Hochberg procedure. Variation partitioning was performed using the varpart() function in the vegan R package.

The N-integration algorithm DIABLO of the mixOmics R package (v.6.12.2) was used as a second integrative approach. It is to be noted that, in the case of the N-integration algorithm DIABLO, the variables of all the datasets were also centered and scaled to unit variance before integration. In this case, the relationships among all datasets were studied by adding a categorical variable, i.e., the cardiovascular fitness of horses. Horses with poor cardiovascular fitness (n=8) were compared to horses with enhanced cardiovascular fitness (n=3). DIABLO seeks to estimate latent components by modeling and maximizing the correlation between pairs of pre-specified datasets to unravel similar functional relationships [99]. To predict the number of latent components and the number of discriminants, the block.splsda() function was used. The model was first fine-tuned using leave-one-out cross-validation by splitting the data into training and testing sets. Then, classification error rates were calculated as balanced error rates (BERs) between the predicted latent variables and the centroid of the class labels using the maximum distance (max.dist).
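Because the two fitness groups are unbalanced (8 versus 3 horses), the balanced error rate is more informative than the overall error rate. The Python sketch below shows how a BER is computed from true and predicted labels; the function name and the predictions are hypothetical and do not come from the mixOmics output.

```python
def balanced_error_rate(true_labels, predicted_labels):
    """Average of the per-class error rates, so each class counts equally."""
    classes = sorted(set(true_labels))
    per_class_errors = []
    for label in classes:
        members = [i for i, t in enumerate(true_labels) if t == label]
        wrong = sum(1 for i in members if predicted_labels[i] != label)
        per_class_errors.append(wrong / len(members))
    return sum(per_class_errors) / len(per_class_errors)


# Hypothetical leave-one-out predictions for 8 "poor" and 3 "enhanced" horses.
truth = ["poor"] * 8 + ["enhanced"] * 3
predicted = ["poor"] * 7 + ["enhanced"] + ["enhanced"] * 2 + ["poor"]

ber = balanced_error_rate(truth, predicted)
overall = sum(t != p for t, p in zip(truth, predicted)) / len(truth)
print(f"BER = {ber:.3f}, overall error = {overall:.3f}")
```

In this example one misclassified horse in the small group weighs as much as several in the large group, which is why the BER exceeds the overall error rate.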

Finally, the DESeq2 (v.1.32.0) [100] R package was used to test for differential abundance between groups for the dominant phylotypes, MAGs, and the genetic functionalities derived from KOs and CAZymes at the basal time. DESeq2 assumes counts can be modeled as a negative binomial distribution with a mean parameter, allowing for size factors, and a dispersion parameter. The p-values were adjusted for multiple testing using the Benjamini–Hochberg procedure. DESeq2 comparisons were run with the parameters fitType=parametric and sfType=Wald.

The validation set consisted of 22 pure-breed or half-breed Arabian horses (12 females, three males, and seven geldings; age: 9.2 ± 1.27 years) not included in the experimental set, to ensure that the observed effects were reproducible in a broader context (Supplementary Data 20). Among the horses in the validation set, five were enrolled in a 160 km endurance competition, while 17 were in a 120 km race. The management practices throughout the endurance ride, the International Equestrian Federation (FEI) compulsory examinations, the weather conditions, the terrain difficulty, and the altitude were the same as those of the experimental set. All the participants enrolled in the study (experimental and validation set) competed in the same event in October 2015 in Fontainebleau (France). The cardiovascular capacity was computed as described in the Performance measurement section, as a composite of post-exercise heart rate, cardiac recovery time, and average speed during the race. Then, the HIGH, MEDIUM, and LOW groups were determined according to the interquartile range of the composite cardiovascular fitness values. HIGH included individuals with cardiovascular fitness values above the 75th percentile, LOW those below the 25th percentile, and MEDIUM the individuals ranging in between.
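The grouping by quartiles can be sketched in Python as follows. This is an illustration under assumptions: the function name and scores are hypothetical, quartiles are computed with the "inclusive" interpolation method of the standard library (the text does not specify the quantile definition), and values exactly equal to a quartile are placed in MEDIUM.

```python
from statistics import quantiles


def assign_fitness_groups(scores):
    """Label each composite score as HIGH, MEDIUM, or LOW by quartile."""
    q1, _median, q3 = quantiles(scores, n=4, method="inclusive")
    groups = []
    for score in scores:
        if score > q3:
            groups.append("HIGH")
        elif score < q1:
            groups.append("LOW")
        else:
            groups.append("MEDIUM")
    return groups


# Hypothetical composite cardiovascular fitness values for nine horses.
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]
groups = assign_fitness_groups(scores)
for score, group in zip(scores, groups):
    print(score, group)
```

With these nine values the 25th and 75th percentiles are 3 and 7, so two horses fall in LOW, two in HIGH, and the remaining five in MEDIUM.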

The PerMANOVA test was implemented using the pairwise.adonis2() function from the pairwiseAdonis R package. The model was adjusted by factors affecting the microbiome: age and sex. The homogeneity of group dispersions (variance) was tested via the betadisper() function of the vegan package to account for the confounding dispersion effect. The one-way ANOVA with Tukey's honest significant differences (HSD) method for pairwise comparisons was performed using the TukeyHSD() function in the stats R package (v.3.6.2).

PLS-DA was used to identify the key genera responsible for the differences between the groups using the mixOmics [101] R package (v.6.18.1). In addition, as PLS-DA loadings may be misleading with highly correlated variables, the differences in the relative abundance of each genus between the groups were quantified with the DESeq2 R package.

Further information on research design is available in the Nature Research Reporting Summary linked to this article.
