This week we built a phylogeny using our fish samples and COI sequences from NCBI downloaded through geneious.
First we created an alignment of all of our sequences, then edited it so that we had trimmed edges, similar to last week. Additionally, we looked to see if any sequences lined up especially poorly. This may happen because the sequence is a reverse read relative to the others. Two of my sequences lined up especially poorly, so I reveresed them. They still lined up poorly, so I removed them and redid the alignment. Below is an image of the alignment after editing:
MODEL of MOLECULAR EVOLUTION
Next, we inputted our alignment into the program jModelTest2 to find the best model of molecular evolution. We exported the alignment as a phylip file and computed likelihood scores. We then used the program to calculate Aikake Information Criterion and Bayesian Information Criterion scores. The lower the AIC or BIC score, the better the fit. For both AIC and BIC, the model TPM2uf+I+G scored the lowest. This model was ranked 40, so it was moderately complex.
We used the MrBayes plugin in Geneious to build our first tree. First, I chose a subsitution model. Since TPM2uf was not available, I chose HKY85, which is similarly complex. For the rate variation, I chose invgamma because my model above had “+I+G” in it.
I ran a short analysis to test the waters, with a chain length of only 10,100 and a burn in of 100. The tree produces had many polytomies, branches with low probabilities, and the density and trace plots showed signs of not running the analysis long enough. I then ran the analysis again with a chain length of 110,000 and burn in of 10,000. This tree had fewer polytomies and greater probabilities. The clades were similar, but not the same. Lastly, the density plot showed a better distribution of probabilities and the trace plot looked like a fuzzy caterpillar. After class, I ran the analysis at 3,000,000 with a burn-in of 300,000. This produced a tree with even fewer polytomies and better probabilites.
Next I used the RAxML plugin to build a maximum likelihood tree. We used rapid bootstrapping with rapid hill climbing with a bootstrapping parameter of 100. This created a file with 100 trees, so I created a consensus tree of those. This tree had many, many polytomies but the clades with high support did match clades in the MrBayes analysis.
Now I used PHYML to build another maximum likelihood tree. Again we used 100 bootstraps. This tree produced many fewer polytomies than RAxML. Many clades had agreement with the MrBayes trees, but there were a several taxa placed in pretty different parts of the tree.