• Mon - Sat 8:00 - 6:30, Sunday - CLOSED

# Evolution during primary HIV infection does not require adaptive immune selection Evolution during primary HIV infection does not require adaptive immune selection

## Significance

HIV evolution within infected individuals creates large barriers to successful vaccination and therapy. Here, we used a model that matches viral loads and mutation rates to characterize the driving forces behind HIV evolution early during infection. Surprisingly, the best model of the data did not require explicit pressure from the host immune system. Instead, the model predicts most new viral variants are intrinsically worse at infecting new cells relative to their parents. Thus, most variants do not persist and only by occasional chance does a new fit variant come to dominate. These findings also highlight the tight connection between viral population dynamics and evolution, warranting more modeling to disentangle these processes in the future.

## Abstract

Modern HIV research depends crucially on both viral sequencing and population measurements. To directly link mechanistic biological processes and evolutionary dynamics during HIV infection, we developed multiple within-host phylodynamic models of HIV primary infection for comparative validation against viral load and evolutionary dynamics data. The optimal model of primary infection required no positive selection, suggesting that the host adaptive immune system reduces viral load but surprisingly does not drive observed viral evolution. Rather, the fitness (infectivity) of mutant variants is drawn from an exponential distribution in which most variants are slightly less infectious than their parents (nearly neutral evolution). This distribution was not largely different from either in vivo fitness distributions recorded beyond primary infection or in vitro distributions that are observed without adaptive immunity, suggesting the intrinsic viral fitness distribution may drive evolution. Simulated phylogenetic trees also agree with independent data and illuminate how phylogenetic inference must consider viral and immune-cell population dynamics to gain accurate mechanistic insights.

Longitudinal sequencing of HIV over time in an infected person provides invaluable insights into the pathogenesis of disease. Phylogenetic tools (1) help illuminate evolutionary relationships between viral sequences and phylodynamics leverages such relationships to further infer underlying processes governing evolution (2, 3). However, phylodynamic tools often do not encompass the details of within-host HIV infection, which include massive exponential expansions and contractions of viral populations, mounting immune responses, target cell limitation, and existence of short- and long-lived cell populations. Mechanistic mathematical models of HIV explicitly include these predator/prey interactions and are amenable to generalized nonlinear processes including therapeutic interventions (4). Therefore, unifying intrahost mechanistic modeling with phylodynamics is a potentially powerful approach to reveal the mechanisms underlying intrahost viral evolution that hinder HIV prevention and/or cure.

There is an extensive history of modeling unified phylodynamics for viruses within and between hosts (5⇓–7). Within-host HIV models have modeled nucleotide sequences and assumed multistrain phenotypes with fitness distributions (6⇓⇓⇓⇓⇓–12). Particular aspects of HIV biology including recombination (13, 14), drug resistance (15), antibody evolution (16, 17), adaptive immunity (18, 19), and latency (20) have been considered. Several general forward simulation packages are available (21⇓–23). Our work grows from these and other models. Here we sought to identify mechanistic drivers of HIV evolution by identifying and validating a within-host phylodynamic (WiPhy) model against a range of experimentally collected HIV data. By building in capabilities of the model to simulate data that can be exported and analyzed by existing phylogenetic inference software, we corroborate the body of work showing that without accurate models for population dynamics, phylogenetic trees can inaccurately infer phylodynamics.

The WiPhy model is then used to address the ongoing question of how and how much the host immune system influences within-host HIV evolution. Previously, immune control over viremia has been elegantly demonstrated in nonhuman systems through rapid increases in viral loads following CD8+ T cell depletion (24⇓⇓–27). Yet, the extent of immune pressure on viral evolution is harder to observe directly. Selective pressure and coevolution of CD8+ T cells has been inferred from matching HIV mutations to circulating CD8 epitopes (28⇓⇓–31), computing the ratio of synonymous to nonsynonymous mutations in HIV sequences (32), linking the prevalence of escape mutations to host-genetic predispositions such as HLA type (33) (human leukocyte antigen, genes that regulate immune function), and modeling (19, 34). However, explicit escape from cellular immunity is not always obvious. Using data from 125 adults in the SPARTAC trial (33) Roberts et al. found most individuals had no detectable escape mutants within 2 y of infection. Early mutations (<6 mo after seroconversion) were mostly transmitted, rather than arising from rapid de novo escape in the new host. Even in an HLA-matched host who mounted a measurable and HIV-specific CD8 response, the average time before the targeted epitope evolved an escape mutation was longer than 2 y. Lee et al. also found four clade-C-infected individuals had little indication of cytotoxic T cell-driven immune selections in the first year (35). Neutralizing and broadly neutralizing antibodies (bNAbs) also are thought to interact and coevolve with founder viruses over the course of HIV infection (36, 37). Yet, bNAbs can arise quickly without many mutations (38) and in infants (39). Recently, Strauli et al. analyzed data of unprecedented detail on both HIV and antibody repertoire sequences, ultimately finding that HIV/Ab coevolution is at minimum hard to detect, if not rare entirely (40).

In that context, we find that the most parsimonious model of HIV primary infection requires adaptive immunity to control viral load but does not require the adaptive immune system to directly select for certain variants. Sequence evolution can instead be controlled by a distribution of intrinsic viral fitness where most variants are less infectious than their parents. We show the model-predicted viral fitness distributions agree with that of HIV deep mutational scanning (DMS) which quantifies fitness in vitro, necessarily in the absence of immune pressure. By building a model that includes host and virus population dynamics as well as mutation, our work highlights crucial questions about HIV evolution relevant to vaccines and therapeutics.

## Results

### Viral Dynamics and Phylogenetics during Primary HIV Infection.

We sought to identify an optimal model for HIV primary infection phylodynamics. Therefore, we first collected four datasets relevant to early HIV infection: 1) nonlinear viral dynamics during early HIV infection, 2) longitudinal divergence and diversity, 3) post antiretroviral therapy HIV reservoir size and composition in terms of defective and intact sequences, and 4) tree-balance measures (see SI Appendix, Table S1 for details and references).

Next, to quantitatively score models against these data, we developed 10 phylodynamic metrics: viral kinetic measures (peak, set point, and set point variability), evolutionary measures (HIV envelope, or env, divergence and diversity on days 20 and 40 after infection), and ratio of intact to all proviral sequences in the HIV reservoir (see Materials and Methods and all definitions and values in SI Appendix, Table S2). Because no individual dataset had sufficiently granular data for all types, we opted to fit to population data across types. This decision implies the model describes a typical HIV infection. Parameter values are therefore less specific and have higher variance than those that might be estimated by fitting to individuals. Alternatively, tighter individual estimates could be more biased by features of the specific cohort and potentially less reflective of the entire range of HIV infections.

### WiPhy Model for HIV Primary Infection.

We began with a general stochastic WiPhy model that extends the canonical viral dynamics model (41) by adding latency and adaptive immunity and includes viral variants that mutate (Fig. 1A). Each variant corresponds to a unique genotype (signified by an integer g). Evolution is then tracked by recording the complete genealogy (or transmission record) of these genotypes ($g\to g\prime \to g″\dots$). We record attributes for each genotype that allow reconstruction of phylogenetic trees and calculation of evolutionary summary statistics: parent genotype, infectivity, Hamming distance (HD) to the founder sequence, number of each nucleic acids (ACTG), and age.

Fig. 1.

Mechanistic WiPhy models and the optimal fit to experimental data. (A) Mechanistic model schematic. Susceptible cells are infected by viral variant with a genotype $V\left(g\right)$, generating new infected cells (latent and active), producing more virus, and engendering immune responses. (B) Mutation model governs new variants that are defective (probability τ) or intact. (C) If intact, point mutations (probability μ) can occur that change variant fitness (infectivity) based the viral fitness (vf) model—exponential model was optimal. (D) Adaptive immune (ai) models were also varied—global was optimal. (E) Three stochastic replicate simulations of the best model (exponential-global, purple) against the five types of experimental data (gray, see SI Appendix, Table S1 for details on data and cohorts).

Upon cell infection, the virus can mutate (probability μ). Most mutants are terminally defective but some variants remain intact (with probability τ) (Fig. 1B). New intact variants are given a new genotype ($g\prime$) and a new infectivity ($\beta g\prime$) drawn from a distribution (Fig. 1C). Throughout, we characterize variant fitness using infectivity. Changes cannot be ascribed to mutation of a certain genomic locus, that is, there is no genotype/phenotype link in the model. However, nucleotide sequences can be reconstructed from the genealogy to enable alignment and phylogenetic tree reconstruction (see Fig. 4). Together, this formulation allowed us to simulate large population sizes (109 viruses in 10 mL of blood) with good temporal resolution (Δt = 0.01 d), compute all phylodynamic metrics, and connect simulated data to phylogenetic trees. All code is freely available at https://github.com/FredHutch/WiPhy_HIV.

### Mathematical Model Selection against Phylodynamic Metrics from Primary HIV Infection.

To determine the mechanisms required to accurately match experimental data, we attempted to fit 24 distinct mechanistic models. For viral fitness (vf, Fig. 1C), we tested four models: all new variants have the same fitness (vf-identical), all new variants have a randomly assigned fitness (vf-random), and two models where variant fitness was inherited, either based upon an exponentially (vf-exp) or normally distributed (vf-normal) change $\Delta \beta$ from its parent sequence. For adaptive immunity (ai, Fig. 1D), we tested six models: one with no adaptive immunity; two where immune pressure was implicit, either based upon the size of a certain sequence population (ai-size) or the length of time that a certain sequence existed (ai-age); and three where immune pressure was explicit—meaning a compartment of the model $E\left(g\right)$ was added—either based upon a scenario where adaptive immune cells can kill any HIV sequence (ai-global) or adaptive immune cells have a specific cognate genotype which they can only kill (ai-specific) or adaptive immune cells can kill a range of genotypes (ai-groups). In the third category, a typical dynamical model for adaptive immune cells was included that allows adaptive cells to grow and shrink in number based on the commensal infected cell count. Immune cell generation is not limitless, and creation saturates if a certain value is reached (see Eq. 3).

For each model, 100 parameter sets per model parameter were tested and 20 stochastic replicates were attempted, stopping if 10 were reached. This amounted to 104,497 simulations (or roughly 72 simulation days at 1 min per run). We determined successful models as those with a balanced fit to all phylodynamic metrics: an approximate Bayesian computation (ABC) approach (see Materials and Methods) (42). SI Appendix, Fig. S1 shows the total scores of the best single stochastic runs, as well as the best parameter sets that fit well across stochastic runs.

The best single run and the best average score across stochastic replicates was achieved by the “exponential-global” model that is governed by an exponential intrinsic fitness distribution for viral offspring and a single adaptive immune compartment that kills globally, i.e., removes all viral strains equally. This model had a normalized residual sum of squares (RSS) two or more points lower than all other models. Regardless of the adaptive immune model, most of the top models had the exponential fitness distribution. The normal viral fitness model with no adaptive immunity at all came in third for the stochastic runs, suggesting it can do well on a given stochastic run, but did not fall into the top five models when averaged across stochastic runs. Individual model traces are compared to data metrics in SI Appendix, Fig. S2, which shows that visually several of the top five models appear to fit reasonably well. To go further, in SI Appendix, Fig. S3 we illustrate that viral load set point and diversity later than 40 d appear to put the strongest filter on models, with exponential-global outperforming all other models in these categories.

We were particularly interested in why a model in which individual adaptive immune compartments kill specific viral strains—the most literal version of host-on-pathogen selective force—was unsuccessful. SI Appendix, Fig. S4 shows that the exponential-strain model was not capable of achieving a low enough viral load set point, suggesting there is a balance between maintaining diversity and set-point level that is hard to achieve through asymmetric immune pressure to certain strains.

In summary, although it appeared that single stochastic runs of several models could perform reasonably well, averaged across stochastic replicates the exponential fitness distribution was optimal for all adaptive immune models. Furthermore, by this quantitative scoring system, global immune pressure was optimal, with total RSS greater than four points lower than the nearest competing models.

### Implications of an Optimal Model with Nonspecific Immunity and Inherited Exponential Viral Fitness.

Three stochastic replicate simulations of the best model and best parameter set are compared to data in Fig. 1E. Individual traces are imperfect but all metrics (viral load peak, nadir and set point, set point variability, sequence divergence, sequence diversity, and intact and defective latent reservoir size) were captured within our tolerance.

Several mechanistic results are implied by the optimal WiPhy model. Requiring inherited fitness indicates that lineages persist by chance beyond single-cell lifetimes. Quantitatively, our model predicts that advantageous mutation occurs in 11% of intact mutations, and intact are only ∼5% of all mutations. The average intact mutant has roughly half (0.47) the infectivity of its parent [e.g., a nearly neutral process (43)] such that most lineages die out and leave room for new variants. Importantly, strain-specific adaptive immune pressure was not necessary to capture the major features of within-host HIV phylodynamics in early infection. Neither a model where viral strains implicitly lost fitness over time nor models with explicit strain-specific immunity matched the data as well as one with a broad immune response that killed all variants. Rather than immune-mediated selection sweeps, our model favors constant mutations and fluctuations in intrinsic viral fitness as the primary mechanistic driver of observed HIV evolution during primary infection.

### Self-Consistency of Model Selection.

To check that the model selection process was robust, we performed a self-consistency exercise. Data simulated by a given model and parameter set were used as the experimental data and the model selection process was repeated. For most models (and most parameter sets), it was possible to correctly select the model that generated the data (SI Appendix, Fig. S5). We also assessed the effective dimensionality of model output, finding that a substantial amount (∼80%) of model variation is encompassed by two principal components (pc1 and pc2, SI Appendix, Fig. S6) and that within these components many models overlap. The relatively low effective dimensionality helps to explain why one (or a few) metrics can provide unique signatures that differentiate between models. The overlap shows how many models can fit reasonably well (though not optimally). Together, these checks emphasize the uniqueness of model output and strengthen confidence in the selection process.

### Model-Estimated Fitness Distribution Resembles In Vivo Fitness Distributions Not Restricted to Primary Infection.

Next, our model’s prediction that HIV evolution can be explained in the absence of specific adaptive immunity was tested against a different dataset: in vivo entropy distributions from sequences not restricted to primary infection. A reranked Shannon’s entropy was employed as a quantitative estimate of the relative effect of reducing fitness after a point mutation. After putting experimental and model-predicted distributions on the same scale (because there is no absolute scale for entropy), there the shape of the two distributions was not significantly different (Fig. 2). Disagreement arose only in ranges of larger fitness costs ($\Delta \beta$ < 0.25). Fortunately, this range is less biologically relevant because variants experiencing large fitness costs, regardless of precise value, are subdominant and do not substantially influence data or simulations. Importantly, although our original model validation used primary infection data, the model-derived intrinsic viral fitness distribution also approximated fitness distributions from chronic infection data, suggesting that our conclusions about viral evolution might pertain in other stages of infection.

Fig. 2.

Model-predicted fitness distribution resembles in vivo data not restricted to primary infection as well as in vitro data without the influence of adaptive immune pressure. The exponential distribution predicted by the model was compared to available in vivo sequence entropy and in vitro DMS data that quantified the relative fitness of all amino acid changes within env. Ranked fitness has similar fractions advantageous. Distributions were not significantly different by paired Kolmogorov–Smirnov tests. (Inset) Cumulative distribution functions (cdf).

### Model-Estimated Fitness Distribution Resembles In Vitro Fitness Distributions.

It is difficult to know whether the viral fitness distribution could be conflating viral fitness and adaptive immune selection. Thus, we next compared the model-estimated distribution with another dataset: DMS of HIV env (44, 45). DMS quantifies the relative fitness of in vitro-generated variants (perturbing nearly all amino acids in env). The distribution resembles the model-estimated fitness distribution and both distributions contain a similar proportion (∼10 to 15%) of advantageous mutations (Fig. 2) and distributions were not significantly different (Fig. 2, Inset). The largest fitness enhancements in vivo were generally less than those in vitro, such that we cannot rule out that intrinsically fit variants are reduced by immunity. This does not conflict with the model in which adaptive immunity reduces all variants. It also is unclear whether the same variants that are extremely fit in vitro would necessarily succeed as well in vivo for reasons aside from immunity. These data reinforce that it is sufficient to describe HIV phylodynamics during primary infection without including positive selection by adaptive immunity.

### Global sensitivity analysis shows adaptive immune response has a strong impact on viral load but limited impact on viral evolution.

To study the impact of the adaptive immune compartment further, we performed a global sensitivity analysis by simultaneously varying all parameters of the best model and calculating the Spearman correlation coefficient between all parameters and all summary statistics (SI Appendix, Fig. S7A). Importantly, of all parameters, the adaptive immune killing rate κ had the strongest impact on the drop to nadir and setpoint, illustrating the importance of the immune system on controlling viral dynamics, especially after a high peak viral load. Parameters regulating the maximal intensity of the immune response (saturation terms for both killing $hg$ and recruitment $hE$) had minimal impact on all metrics compared with other parameters. There was minimal correlation among adaptive immune parameters and phylodynamic measures (diversity and divergence at days 20 and 40) and average infectivity was the strongest determinant of phylodynamic metrics—fitting scores for phylodynamic measures show this pattern even more strongly (SI Appendix, Fig. S7B). Together, these observations suggest that adaptive immunity’s effect on evolution is indirect through viral load modulation. SI Appendix, Fig. S7C shows predicted connections between population dynamic and phylodynamic measures that cannot be calculated from these data (because metrics are from different individuals). In general, there were strong correlations within population dynamic measures and phylodynamic measures but little correlation between these two broad categories. There was a notable lack of correlation between peak viral load and phylodynamics at or after day 20. While relatively weak, nadir and set-point viral load were correlated with phylodynamics, emphasizing the secondary impact of adaptive immune pressure on evolution through reduced viral load.

### Single-Variant Viral Dynamics during Early HIV-1 Infection.

The best-fit model was employed to investigate individual variant viral dynamics. First, we tracked variants’ viral loads by genotype (Fig. 3A). During the first 3 wk of infection there were only 1,000 variants, whereas by day 60 >300,000 productively infectious viral genotypes had been produced, meaning many more defective variants had been created. At approximately day 40, population sweeps appeared (a new variant achieving top abundance) and abundances of concurrent sequences became increasingly even. As explained above, the population sweeps are not caused by strain-specific targeting by the immune system but by continual mutations following the exponential intrinsic viral fitness distribution. Next, realigning variants to their time of emergence (setting $t=0$ when the variant entered the top 10; Fig. 3B) identified two dominant kinetic profiles. The first were variants from before and during peak viremia, which have a large spike of >105 viral copies (red/yellow); the second were mostly generated after day 60 (when global adaptive immunity was appreciable), which peak at ∼104 viral copies and slowly decay (blue).

Fig. 3.

Visualizing evolutionary dynamics in the optimal model. Example simulation of the best model (variants ever in top 10 and total viral loads). (A) Coloring by genotype number illustrates population sweeps and >106 intact variants; many more defective variants have been created. (B) Variant trajectories shifted to the time they entered the top 10 by abundance; variants emerging later in infection have different kinetic profiles than those from early infection (compare red and blue). (C) Coloring by HD to founder sequence illustrates most early (red) variants have approximately one point mutation from the founder sequence, whereas later sequential evolution has occurred, with variants emerging with more than two mutations from the founder sequence. (D) Proportional abundance colored by HD illustrates the stark shift from founder predominance to more evenness after viral load nadir. (E) The complete transmission record, or genealogy illustrates the “true tree”—the parental genotype of each variant created on each day. Certain lineages persist for more than a hundred days, meaning that offspring are generated from a parental sequence that was created months prior. (F) The tMRCA of 50 randomly sampled sequences on a given day is bimodal: Variants are created by both more ancestral and more recent parents.

Coloring the variants instead by their HD from the founder virus (Fig. 3C) revealed a starlike phylogeny that dominates for approximately the first 40 d—meaning that while many distinct variants have emerged, they are all only one or two mutations away from the founder virus, and that the founder virus remains the mutual common ancestor. A shift from a starlike phylogeny arrives as sequential mutations occur; variants emerge with three or four base pair mutations from the founder around day 50. The predominance of the founder virus for the first 40 d is also evident by examining proportional abundance (calculated as the ratio of each variant viral load to the total; Fig. 3D). When the founder loses dominance the other variants are similarly competitive, and thus a more even balance of several variants becomes apparent. The timing of these results agrees with independent data showing shifts from demographic to selective effects around day 50 (46).

### Highly Granular Simulated Phylogenetic Trees.

The ability to access the complete transmission record from these simulations allows examination of evolutionary relationships with high granularity. Fig. 3E demonstrates how long certain lineages persist by plotting the parental genotype of each variant sampled on a given day. For example, the founder variant (g = 0) and other variants (e.g., g = 100) are prolific, producing new direct descendant variants that can be found for months. This timescale far outlasts the lifespan of any single infected cell (∼1 d) and this mechanistic model has no adaptive immune selection. Therefore, lineage persistence is a probabilistic balance between viral production and deleterious mutation of offspring. Additionally, some variants do persist at subdominant levels; gaps on the x axis indicate times between which a parental variant was not dominant to the point where its progeny were guaranteed to be sampled.

Calculating the times to most common ancestor (tMRCA) throughout infection for all pairs of subsampled sequences (n = 50 at each time point) revealed a bimodal distribution with cocirculating lineages (Fig. 3F). For example, most sampled sequences on day 200 (green) coalesced to common ancestors ∼10 to 40 d prior to the sampling date. This represents a time-localized quasi-species that is generated actively by a dominant circulating variant. However, a minority coalesced to more ancestral sequence, representing the continuing impact of prolific early variants. Note bimodality is not driven by latency and reactivation; similar results were found using a model without latent compartments.

### Model Validation with Estimated Phylogenetic Trees in the First Year of Infection.

Phylogenetic trees are commonly used to illustrate patterns in HIV evolution. We therefore tested whether the selected exponential-global model could show reasonable agreement with another independent dataset, a phylogenetic tree from a highly sampled individual in the first year of infection (47) (p1362, Fig. 4A). To accommodate all sources of variability in comparing to the experimental data, three simulations with the best model, three sequence samplings from each simulation, and three tree estimation replicates [in BEAST (48)] were performed on each sample set. This process admits 27 phylogenetic trees (1.1.1 → 3.3.3); two examples are illustrated in Fig. 4B, which appeared visually similar to the individual p1362.

Fig. 4.

Comparative analysis of experimental and model tree estimation. (A) Experimental tree (C1V2 env, p1362). All sampling schemes are based on this individual. (B) Running the best model three times (i), sampling sequences with identical timing and sample size three times (j), and with three tree estimate replicates (k) resulted in 27 trees enumerated i.j.k. Two example simulated trees visually match the experimental tree. (C) Quantitative comparison of trees using phylogenetic summary statistics show some simulations (dots) agree with data (dashed line) and that model run introduces the most variability (solid colored lines are medians across sequence sampling and BEAST run).

To quantitatively compare trees, phylogenetic summary statistics from each simulated tree and the experimental tree were examined (Fig. 4C): average tMRCA, Sackin’s index (a tree balance statistic calculated as the sum over the number of internal nodes between root and tip for all tips in the tree) (49), and the dominant eigenvalue of the tree’s modified graph Laplacian spectrum (MGL). MGL is a robust measurement of tree shape that quantifies deep/shallow branching events and importantly was shown to be a surrogate for synonymous to nonsynonymous (dN/dS) ratio, a metric often used to quantify selection (50). Model run 3 (greens) matched experimental statistics well. Although sampling and tree estimation stochasticity affected summary statistic values, the most significant variability was introduced by rerunning the model—particularly average tMRCA (Fig. 4C; horizontal lines show median across sampling and tree estimation).

This process also highlights the potential for misclassifications in the absence of detailed population dynamics. This exercise employed the simplest assumption of constant population size in BEAST. Because viral loads peak early in infection, the tree inference substantially overestimated the distance between the root and the founder sequence: The purple samples observed at day 8 of infection were placed in the maximum clade credibility tree at ∼100 d. Such artifacts might be overcome with more complicated population dynamic models in BEAST. Yet, recent work on birth–death models with time-varying rates showed different scenarios generate the same trees such that scenarios are not distinguishable even with infinite data (51). Another immediate challenge arises from building bifurcating rather than polytomic trees on data from the simulation. The present model allows for a single ancestor to produce many different offspring variants without intermediates (polytomy)—thus additional internal nodes inferred by a bifurcating tree may be artifacts—a point warranting further investigation.

## Discussion

By modeling human HIV data including viral population sizes and evolutionary dynamics, we uncovered several important characteristics of HIV pathogenesis. The most parsimonious model was governed by 1) an inherited distribution of viral infectivity drawn from an exponential distribution such that 2) most mutants are less fit than parental sequences. This distribution in turn implies a nearly neutral evolutionary process driven by intrinsic fluctuations in viral fitness. The optimal model also carried an adaptive immune system that was equally potent against all variants, suggesting that 3) although adaptive immunity is needed to control viremia, within-host pressure against specific strains was not needed to accurately model viral evolution.

Together these findings paint a picture of what is sufficient to describe early HIV infection: a viral quasi-species in which a fit variant can dominate or cocirculate with other dominant strains. However, any mutant progeny of currently dominant variants are probabilistically likely to be less fit such that new variants emerge and take over [similar to nearly neutral evolution (43)]. Such population sweeps are sufficiently modeled without any additional pressure from the immune system against specific variants. The imprint of the founder virus is also long-lasting (Fig. 3), which leads to a bimodal distribution of circulating variant sequence age (i.e., there is creation of infected cells by recent and ancestral strains) This finding might be relevant to understanding the discordance of within- and between-host evolutionary rates, but more work is warranted.

Next, using sequence entropy from individuals not necessarily sampled within primary infection, we found our model-estimated distribution was similar, suggesting that although we fit our model to primary infection, this distribution may hold during other stages of infection, and that evolution might be driven by intrinsic fitness in those stages too. Moreover, it might be questioned whether exponentially distributed fitness in the model is effectively modeling adaptive immune selection. Thus, we showed that in vitro DMS data (which guarantees no influence from adaptive immunity) were also relatively similar to the model-estimated distribution (Fig. 2). These results corroborated our hypothesis that much of HIV evolution is controlled at the viral rather than adaptive immune level.

Although our results imply that adaptive immunity to HIV may be broader and less directly influential on evolution than previously imagined, it remains a key component of viral control. This agrees with past experimental work: The timing of CD8+ T cell expansion correlated with reductions in viral loads (52) and depleting CD8+ T cells in SIV infected macaques led to viral expansion (53) [we note other studies show inefficient infected cell killing by CD8+ T cells, suggesting a more nuanced interpretation (54)]. Additionally, because our model effector cell killing rate κ was a strong determinant of viral load setpoint (SI Appendix, Fig. S7), we hypothesize that the overall HLA–antigen match (which depends on the specific host and the specific virus) determines disease severity. This agrees with the finding that certain host HLA genotypes are associated with delayed progression to AIDS (55) but that the founder sequence is correlated to pathogenesis (56).

There is strong evidence pointing to selection by adaptive immunity during chronic HIV infection. Observations range from fixed mutations that can be linked to detectable CD8+ T cell responses (31), a dose–response relationship (in one individual) between immune pressure and escape rate (57), an increase over time of escape mutations in HLA-matched hosts relative to HLA-mismatched hosts (33), and the emergence of bNAbs (37). Our results do not invalidate these findings. Instead, in the context of the “red-queen phenomenon” (58), a constant escape and chase, it may be that [as others have observed (33, 35)] sequential viral/host coevolution is not particularly relevant for early HIV pathogenesis. A surprisingly similar message arose from a deep analysis of HIV and antibody repertoires sequenced from the same individuals (40).

Our modeling has several limitations. The magnitude of our modeled adaptive immune response cannot be directly compared to existing values from the various studies because $E\left(g\right)$ is not precisely representing any specific cell type (e.g., CD8+ T cells, anti-HIV antibodies, or natural killer cells) and likely only captures the HIV-specific arm of the immune system. The landscape of HIV fitness costs has been modeled in more detail previously (59). Susceptible cells are also not clearly defined phenotypes. In ∼15% of cell infections viral progeny share genetic material from two parental sequences that infected the same cell (60, 61); we do not explicitly simulate such genetic recombination. While explicit modeling of recombination could be added as described previously (13, 62), the present approach effectively allows for some recombination signatures. For example, since all mutational distances are small during early infection, by allowing for many point mutations in a single infection event this could be seen as a recombination. We do not attempt to incorporate compartmental anatomy, instead relying on past studies that show HIV dynamics are reasonably consistent across tissues (63⇓–65). We do not directly model nonsynonymous to synonymous ratio (dN/dS), which has been used to demonstrate selective pressure. However, dN/dS can be tricky to interpret for nonequilibrium scenarios (66) and dN/dS > 1, which implies mutations that meaningfully change proteins (nonsynonymous mutations) are more likely to survive, is not obvious in the first years of HIV infection (32). Additionally, the MGL summary statistic [a surrogate for dN/dS (50)] agreed between our model-derived phylogenetic trees and human trees sampled in the first year of infection.

In building simulated trees (Fig. 4), we also highlighted several challenges of tree estimation from real data in which depth and granularity of sampling is limited. Others have argued that positive selection can be obscured by or conflated with demography (67), have shown misclassification of phylodynamic parameters (51, 68⇓–70), and demonstrated that nonequilibrium population dynamic “jackpot” events can resemble selection (71). Bearing these complexities in mind, we advocate for inclusion of population dynamic data whenever possible and continual enhancement of phylodynamic methods such as ours to disentangle these exquisitely coupled processes in practice.

Future applications of WiPhy models abound from optimizing sampling depth for phylogenetic inference using simulated data, estimating infection timing, and modeling therapies. Our results have important ramifications for vaccine design and therapeutic application of bNAbs to supplement the adaptive immune system. As within-host viral genetic data continue to be collected in treatment and prevention trials, phylodynamic models will be crucial for precise and comprehensive interpretation.

## Materials and Methods

### Mathematical Description of the Model.

The model (Fig. 1) contains cells susceptible to HIV infection S, which are created with rate $\alpha S$ and die with rate $\delta S$. HIV infection begins with the introduction of a founder HIV sequence with genotype g as an intact actively infected cell $Ag*$ (superscript * denotes intactness). Infected cells produce virions and intact virions $Vg*$ infect new cells with rate $\beta gSVg*$. Unproductive virions $Vg\left(\right)$ are also produced (hence empty superscript parentheses) from defective active infected cells (see third equation in Eq. 1) but cannot go on to infect other cells.

When a new cell is infected, mutations occur with rate μ. Given mutation, the proviral sequence is intact with probability τ. A small proportion of infected cells enter one of two latent states $Ls,g \left(*\right)$ with the small probability $\lambda s$, where subscript s further subdivides latent classes to satisfy observed multiphasic decay patterns (72). Thus, we have three possible infected cell states, which can each be intact or defective—for brevity we express as $Ig\left(*\right)=\left\{Ag\left(*\right),L1,g\left(*\right),L2,g\left(*\right)\right\}$. The rules of the mechanistic model can be approximately expressed as set of differential equations ($\partial t$ denotes time derivative) that grows as genotypes are added. After each time step, new sequences are added by mutation such that $\left\{g\right\}\to \left\{g,g\prime \right\}$.$\partial tS=\alpha S-\delta SS-\Sigma g\beta gVg*S\partial tIg\left(*\right) =bs\left(g,\tau ,\mu ,\lambda s\right)Vg*S-ds\left(Ig\left(*\right)\right)Ig\left(*\right)\partial tVg\left(*\right)=\pi Ag\left(*\right)-\gamma Vg\left(*\right)$[1]

The generic creation and removal rates of each type of infected cell is governed by the birth and death vectors $bs$ and $ds$ such that, for example,$bs=A=\beta g\left[\lambda A\tau \mu , \lambda A\left(1-\tau \right)\mu ,\lambda A\tau \left(1-\mu \right), \lambda A\left(1-\tau \right)\left(1-\mu \right)\right]$[2]represents the rate of creation of active cells of four types: defective mutated, intact mutated, defective nonmutated, and intact nonmutated. There are copies of these birth and death vectors for each overall infected cell state $s\in \left\{A,L1,L2\right\}$.

The removal rate of each type of infected cells depends on their state, intactness, and genotype $ds\left(Ig\left(*\right)\right)$, and the rate itself can also be different functions of the number of cells of that state. These rules vary in each of the adaptive immune (ai) models described below. Additionally, latently infected cells proliferate (added to the birth vector with rate $\alpha s$ for all intactness/genotypes) and die (added to the death vector with rate $\delta s$ for all intactness/genotypes) and reactivate to an active state (added to the death vector with rate $\xi s$ for intact genotypes).

To model mutational changes, we modify the HDs of mutated sequences by drawing a Poisson distributed number of nucleotide changes $\wp \left(\Delta g\right)$. For intact mutants we use an average of one nucleotide substitution $Hg\prime =Hg+\wp \left(\Delta g;1\right)$ and for defectives—typically generated through APOBEC hypermutation or large insertions/deletions—we use $Hg\prime =Hg+\wp \left(\Delta g;40\right)$, where 40 is the average number of base pair changes for DNA (73).

### Viral Fitness (vf) Models.

We created four models for the viral fitness (vf) of mutated intact sequences (Fig. 1C). The first is a trivial model where each viral strain has the same fitness. Thus, $p\left(\Delta \beta \right)=d\left(\Delta \beta \right)$, where d is the Dirac delta function equal to zero unless $\Delta \beta =1$. The second assumes no inheritance from parental strains such that fitness changes are uniformly distributed up to a maximum value $p\left(\Delta \beta \right)=U\left[0,\beta max\right]$. The third and fourth models assume heritability of viral fitness, either with an exponential distribution $p\left(\Delta \beta \right)=exp \left(-\Lambda \Delta \beta \right)/\Lambda$ with rate Λ or a Gaussian distribution $p\left(\Delta \beta \right)=N\left(1,\sigma \beta \right)$. However, in both cases we also enforce the constraint that $\beta g \prime \in \left[0,\beta max\right]$. From a biological point of view, these models encompass a broad range of possibilities for phenotypic variation, ranging from simplest (constant), to most complicated (normal and exponential) that have symmetric or asymmetric fitness (74). These choices are also justified by maximum entropy distributions (75).

The genotype-dependent death rate of actively infected cells $dA\left(Ag\right)$ is used to incorporate six models of the ai response. The first model has no adaptive immunity such that $dA\left(Ag\right)=\delta A$.

The next two models have “implicit immunity,” meaning that there are no additional compartments to represent immune cells. In the strain-size model, $dA\left(Ag\right)=\varphi AAgAg+hA$. We interpret this to mean that the number of actively infected cells with viral genotype g attracts immune cells relative to that genotype’s abundance. More abundant genotypes are removed faster. Rate $\varphi A$ is the maximum and $hA$ parameterizes maximal rate saturation. In the strain-age model infected cell death depends on genotype age, $dA\left(Ag\right)=\delta Aexp \left[\kappa a\left(t-ag\right)\right]$. This can be interpreted to mean older sequences have had more time to accrue adaptive immune pressure and thus are eliminated more rapidly—a mechanism which enforces strain replacement based on magnitude of rate constant $\kappa a$.

The remaining versions explicitly model immunity. We add a state variable compartment representing effector cells $Eg\left(t\right)$ governed by$\partial tEg=\omega AgAg+hgEg-\delta EEg-\varphi E\sum gEg\sum gEg+hE.$[3]

This part of the biology is the least understood and our mechanistic implementations may effectively capture several types of cells or molecules (CD8+ T cells, NK cells, and antibodies). We draw inspiration for construction from our prior work and published models of immune systems in viral dynamics (76⇓⇓–79).

In strainwise immune models, effector cells have their own genotype g which matches a viral genotype. Then, immune cells grow based on the prevalence of their cognate antigenic genotype (term with nonlinear growth rate ω and saturation constant $hg$), die naturally with rate $\delta E$, and have another death term such that the total adaptive immune response (sum over genotypes) is constrained in size (term with saturation constant $hE$). In the global model, $dA\left(Ag\right)=\left[\delta A+\kappa \sum gEg\right]$. We interpret this to imply that there is a single adaptive immune compartment that can kill any strain. In the strain-specific model, $dA\left(Ag\right)=\left[\delta A+\kappa gEg\right]$. We interpret this to imply that for each viral strain there is an adaptive immune compartment that can kill only that strain. The killing ability of each strain-specific adaptive immune compartment is $\kappa g$. In the strain-group model, $dA\left(Ag\right)=\left[\delta A+\kappa G\sum g\in GiEg\right]$. We interpret this to mean there is some cross-immunity such that sequences with similar sequence numbers (within a group $Gi$ where each group has the same size G) can be killed by a single immune compartment with killing rate $\kappa G$ which is assumed the same for all groups. The death rate of latently infected cells is simpler, and rates are not dependent on values, instead $dLs=\left[\delta s,\delta s+\xi s,\delta s,\delta s+\xi s\right]$ such that intact cells die slightly faster in accordance with observed values.

### Parameters from the Literature.

By using previous estimates, we constrained the parameters that must be estimated. All information on fixed parameters, initial conditions, and fit parameter ranges is contained in SI Appendix, Table S3. Three parameters are estimated in all models, and different models have further parameters that must be estimated such that results range from three to nine estimated parameters.

### Simulation Implementation.

The model is implemented in C++ and is freely available (https://github.com/FredHutch/WiPhy_HIV). We use a discrete stochastic τ-leap simulation scheme in 10 mL of plasma and a simulation time interval of Δt = 0.01 d. The state variables $X=\left\{S,Ig\left(*\right),Vg\left(*\right),E\dots \right\}$ represent the numbers of susceptible, active/latent infected cells and virions (for each genotype) and adaptive immunity if explicitly in the model. Thus, in each time interval, a Poisson-distributed number of events of each mechanistic transition is chosen by the reaction propensities $pX$ (80) such that $e=\wp \left(pX\Delta t\right)$. Then, the state variables are updated using the event transition matrix T as $\Delta X=eT$. For example, in an interval we might observe the creation of a new latently infected cell of a new genotype by viral infection. For this example, $T=\left[S-1,\dots ,L1,g\prime *+1,\dots ,Vg*-1\right]$, meaning removal of a susceptible cell, removal of an intact virion of genotype g, and the creation of an intact first phase latently infected cell with genotype $g\prime$. In this same interval many other events could occur simultaneously.

### Tracking the Complete Transmission Record (Genealogy).

To capture evolution, each viral strain is given a genotype number (an integer g). This number specifies an fitness/infectivity ($\beta g$), the number of base-pair mutations for this genotype relative to the founder virus (the HD $Hg$), its age ($ag$ the date of its emergence in time since the start of infection), and the number of each nucleic acids ($nx$ where $x\in \left\{A,C,T,G\right\}$ and the initial number is taken from the reference HXB2 sequence). The number of each state variable (e.g., infected cells) associated to that genotype is recorded at each time step. A substantial computational enhancement was achieved by tracking population size but not attributes and transmission records for defective variants. This choice is valuable because hypermutants and/or large deletions are typically removed before analysis of experimental data, but modeling the number of defective sequences was crucial to accurately populate the latent reservoir, which is well known to be predominantly defective (81).

### Model Fitting Procedure.

The best parameterization of each model was achieved by testing $k×$ 100 values of each parameter, where the number of model parameters is k. This approach means that for a model with eight parameters, a total of 800 × 8 = 6,400 parameterizations were tested, i.e., more complex models had more opportunities to find an optimum. Values were drawn from a grid search evenly spaced between a lower and upper bound for each parameter (often several orders of magnitude) based on previously determined HIV model rates. Because the model is stochastic, we attempted 20 replicate simulations for each parameter set, stopping if 10 replicates were successful. Each replicate was scored by computing the model value of each metric ($mi$) and computing a variance-normalized residual sum of squares ($RSSi$, also called the $\chi 2$ statistic) against the metric from the data: $RSSi=\left(mi-M¯i\right)2/var\left(Mi\right)$, where $Mi$ is the experimentally determine metric; the overbar denotes the mean and var denotes the variance or squared SD. We also calculated the total RSS as the sum $RSS=\sum iRSSi$.

### Model Selection.

Because we fit to correlated metrics and combined data sources, we ruled out typical model selection procedures based on likelihoods and information criteria (e.g., Akaike information criterion). Instead, we applied an ABC approach. The normalized RSS was calculated for each metric and runs with all individual metrics fitting reasonably well ($RSSi<5\forall i$) were included. SI Appendix, Fig. S1 shows the RSS summed over individual metrics. We determined the best single run for the best parameter set, for each model. Then, to pick models that consistently work well, we ultimately accepted model parameterizations for which model output averaged across stochastic runs was within our RSS tolerance.

### Global Sensitivity Analysis.

Using the best model, we quantified the influence of model parameters on all metrics, and on the RSS error of all metrics, using global sensitivity analysis by calculating the Spearman correlation coefficient (SI Appendix, Fig. S7) (82).

### Modeling Comparison to Entropy Distributions.

We obtained filtered HIV-1 env alignments (type M without recombinants) from the LANL database (https://www.hiv.lanl.gov/content/index) and removed all but subtype B sequences resulting in 2,339 sequences. Entropy was calculated with default options on the LANL website and gaps are removed to resolve the consensus sequence and its entropy values. The relative abundance of each base b at each position ψ in the env is expressed such that a perfectly even distribution at some position is written $p\psi \left(b\right)=\left[0.25, 0.25, 0.25, 0.25\right]$, whereas a perfectly uneven distribution at some position is written $p\psi \left(b\right)=\left[0, 1, 0, 0\right]$, which represents that a single base (e.g., T) is found at that position for all individuals in the database. We calculated Shannon’s entropy $S\psi =-\sum bp\psi \left(b\right)log p\psi \left(b\right)$ for each position. Next, because our model is agnostic to nucleotide-position-specific biology, we reranked entropy from most to least variable. We then use the distribution of entropy as a quantitative estimate of the relative effect of reducing fitness after a point mutation to positions in env. We then identified the factor y that would scale entropy (assumed constant over position) such that $yS$ most closely resembled our best-fit viral-fitness distribution $p\left(\Delta \beta \right)$. We minimized the RSS between data and model distributions to find $y\sim$ 1.5.

### Sequence Sampling.

To simulate sampling, we randomly select virions (recapitulating experimental sampling of viral RNA) from the simulation. At the time intervals matching the experimental data, cells are computationally sampled from the present virus until a given number of sequences are represented or until all active sequences are represented if the actual number is less.

### Calculating tMRCA.

Time to most recent common ancestor (Fig. 3) was calculated by examining all sequence pairs and determining their parental sequence. If the parent is identical, the procedure halts and the birth date of the parent is recorded as tMRCA. If the parent is different, we track back to parents of the parental sequences and repeat.

### Calculation of Divergence and Diversity.

Because we do not track nucleotide sequences, to calculate the number of base-pair differences between a pair of sequences we compute the sum of their HDs and subtract off the HD of their parental sequence $\Delta \left(i,j\right)=Hi+Hj-HP\left(i,j\right)$. The divergence is calculated as the $max\Delta \left(i,j\right)$ where $i=0$ is the founder sequence. Diversity is calculated as the average pairwise distance (83):$\nabla =2l\Sigma i=2N\Sigma j=2Nfifj\Delta \left(i,j\right),$[4]where the frequency of each sampled variant is $fi=NiN$, where N is the total sample size and the number of nucleotides l is the length of HIV env.

### Integration with BEAST.

To harmonize simulation output with phylogenetic inference tools, we used genealogies and HDs to output a list of sampled nucleotide sequences. We applied an HKY nucleotide substitution model beginning with the HXB2 reference and keeping a fixed length genome to export a time labeled FASTA file which can be input to BEAST. Note that this coarse post facto site model could be expanded to include insertions and deletions but not recombination currently. Our model allowed for HKY variable frequency of transitions and transversions such that forward simulation and backward inference was congruent. We chose a strict molecular clock and a fixed population size and ensured convergence by testing different burn-in sizes. An example XML file in which BEAST settings can be found is provided within our GitHub repository.

## Data Availability

All code is freely available at GitHub (https://github.com/FredHutch/WiPhy_HIV). Previously published data were used for this work (all citations for all data are given in the text and SI Appendix, Table S1). All other study data are included in the article and/or SI Appendix.

## Acknowledgments

This work was funded by a Washington Research Foundation postdoctoral fellowship and a National Institute of Allergy and Infectious Diseases K25 (AI155224) to D.B.R. D.B.R. is grateful to numerous colleagues for conversations including F. Boshier, A. Dingens, T. Bedford, B. Dearlove, E. Lewitus, A. Feder, P. Roychoudury, P. Edlefsen, and J. Mullins.

## Footnotes

• Author contributions: J.T.S. and D.B.R. designed research; D.A.S. and D.B.R. performed research; M.R. and D.B.R. contributed new reagents/analytic tools; D.A.S. and D.B.R. analyzed data; and D.A.S., M.R., J.T.H., J.T.S., and D.B.R. wrote the paper.

• The authors declare no competing interest.