Re: [R-sig-phylo] seemingly conflicting output from BAMM

Dan Rabosky Thu, 14 Aug 2014 18:10:26 -0700

Hi Chris-

Just to add to what Jonathan wrote.

This is a good question. You have two basic issues that are being confounded: 
(1) how much evidence is there for a rate shift overall, versus (2) how much 
evidence do you have bearing on the locations of specific shifts. In your case, 
you have limited evidence for rate variation: Bayes factors of 3-5 versus a 
model with 0 shifts. That's rather weak evidence for rate variation, but it's 
(in my opinion) at least worth considering further. You can also see that this 
is not especially strong evidence from considering the posterior distribution 
of shifts, which gives a posterior probability of 0.18 to a model with 0 shifts 
(as an aside, Bayes factors are always going to be more reliable for these 
types of comparisons because they explicitly take into consideration whatever 
prior distribution you've specified on the number of shifts).
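
To make the aside about Bayes factors concrete, here is a minimal sketch in 
base R (it makes no BAMMtools calls). It uses the posterior probabilities and 
the first column of the computeBayesFactors matrix quoted further down in this 
thread, and assumes the usual definition, Bayes factor = posterior odds / prior 
odds. The variable names are mine, and the posterior probabilities are rounded 
in the quoted output, so the implied prior odds are only approximate.

```r
# Posterior probabilities for 0..5 and 7 shifts, copied from the quoted
# summary(edata) output.
post <- c("0" = 0.1800, "1" = 0.4300, "2" = 0.2800, "3" = 0.0840,
          "4" = 0.0240, "5" = 0.0025, "7" = 0.0005)

# Bayes factors of each model versus the 0-shift model, copied from the
# first column of the quoted computeBayesFactors output.
bf_vs_0 <- c("0" = 1.0000000, "1" = 3.4958818, "2" = 4.3978396,
             "3" = 3.1975924, "4" = 2.6033098, "5" = 0.9162216,
             "7" = 2.7732786)

# Posterior odds of each model versus the 0-shift model.
post_odds <- post / post[["0"]]

# Assuming BF = posterior odds / prior odds, the prior odds implied by
# the quoted numbers are posterior odds / BF.
implied_prior_odds <- post_odds / bf_vs_0

print(round(cbind(post_odds, bf_vs_0, implied_prior_odds), 3))
```

For one shift versus none, the posterior odds are about 2.4 but the Bayes 
factor is about 3.5. The difference arises because the implied prior odds are 
below 1, which means the prior itself favoured the 0-shift model.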

In your case, you have weak overall evidence for a shift somewhere in your 
data. However, you have even less evidence that a shift occurs at any 
particular location. This is what you see in your credible shift set: each 
shift configuration has a shift in a different location. The overall "best" 
shift configuration, i.e., the one with the maximum a posteriori (MAP) 
probability, does not have any "core" shifts. This is a bit confusing, but is 
explained at length here: http://bamm-project.org/rateshifts.html
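
A toy calculation may help show how this can happen. The sketch below is base 
R only. The two posterior probabilities (0.18 for 0 shifts, 0.43 for 1 shift) 
come from the output quoted in this thread, but the way the one-shift mass is 
divided among five candidate branches is purely hypothetical, and models with 
two or more shifts are ignored for simplicity.

```r
p_zero <- 0.18   # posterior probability of 0 shifts (from the quoted output)
p_one  <- 0.43   # posterior probability of 1 shift  (from the quoted output)

# HYPOTHETICAL: the single shift is spread over five candidate branches
# with these relative weights (they sum to 1).
w <- c(0.30, 0.25, 0.20, 0.15, 0.10)

# Posterior probability of each distinct configuration.
config <- c(no_shift = p_zero,
            setNames(p_one * w, paste0("one_shift_on_branch_", seq_along(w))))

print(round(config, 3))

# The single most probable configuration is the one with no shift, even
# though one-shift configurations are jointly more probable.
print(names(which.max(config)))
print(sum(config[-1]) > config[["no_shift"]])
```

Under these made-up weights, no single one-shift configuration exceeds 0.13, 
so the no-shift configuration at 0.18 is the MAP even though a shift somewhere 
is more probable than no shift at all.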

Basically, we only consider "significant" shifts when we enumerate the set of 
distinct shift configurations in the posterior, where "significant" means that 
a shift occurs on a given branch with substantially elevated frequency relative 
to what you'd expect under the prior (again, this is all explained on the 
documentation page). These are termed "core shifts" in BAMM terminology. In 
your case, the shift configuration with the highest posterior probability 
actually has no core shifts.

However, you can decrease the threshold used to identify core shifts. You can 
lower the Bayes factor criterion used to identify shifts with elevated 
posterior probabilities relative to the prior expectation in the 
credibleShiftSet function by changing the default value for the BFcriterion 
argument (e.g., if you set BFcriterion = 1, you will probably see a shift in 
your MAP shift configuration). 
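
As a rough illustration of what the criterion does, here is a base R sketch 
that makes no BAMMtools calls. It assumes that a branch counts as a core shift 
when its marginal odds ratio (posterior odds of a shift on that branch divided 
by prior odds) exceeds the criterion. All branch names and probabilities are 
hypothetical, and core_at is a helper I made up for this example.

```r
# HYPOTHETICAL marginal shift probabilities for four branches.
branch_post  <- c(b1 = 0.12, b2 = 0.10, b3 = 0.08, b4 = 0.01)  # posterior
branch_prior <- c(b1 = 0.03, b2 = 0.04, b3 = 0.05, b4 = 0.02)  # prior

odds <- function(p) p / (1 - p)

# Marginal odds ratio per branch: posterior odds / prior odds.
branch_bf <- odds(branch_post) / odds(branch_prior)
print(round(branch_bf, 2))

# Hypothetical helper: branches treated as core shifts at a given criterion.
core_at <- function(criterion) names(branch_bf)[branch_bf > criterion]

print(core_at(5))  # strict criterion
print(core_at(1))  # relaxed criterion
```

With these numbers no branch passes a criterion of 5, while b1, b2 and b3 pass 
a criterion of 1. This mirrors the situation described above, where a shift 
shows up in the best configuration only after the threshold is relaxed.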

However, tweaking the BFcriterion argument won't change the fact that your 
dataset has only weak evidence overall for rate heterogeneity among clades. As 
you decrease the BFcriterion, you will find that the posterior probability of 
your MAP shift configuration will also drop. Jonathan makes a good point, 
especially relevant in this case, because the credible shift set here is 
telling you something important that you won't get out of a single point 
estimate: you don't have much confidence at all in any particular rate shift.
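
The drop in MAP probability can be seen in a small base R sketch. It assumes 
that distinct configurations are counted after discarding non-core shifts from 
each posterior sample. The ten posterior samples, the branch names and the 
map_prob helper are all hypothetical.

```r
# HYPOTHETICAL posterior samples: each element lists the branches that
# carry a shift in that sample (character(0) means no shift).
samples <- list(character(0), "b1", "b2", "b3", "b4",
                "b1", "b2", character(0), "b4", "b3")

# Hypothetical helper: probability of the most frequent configuration
# when only shifts on `core` branches are counted.
map_prob <- function(core) {
  keys <- vapply(samples, function(s) {
    kept <- sort(intersect(s, core))
    if (length(kept) == 0) "none" else paste(kept, collapse = "+")
  }, character(1))
  max(table(keys)) / length(samples)
}

print(map_prob(character(0)))          # no core branches
print(map_prob("b1"))                  # one core branch
print(map_prob(c("b1", "b2", "b3")))   # three core branches
```

As more branches are treated as core, the same samples are split among more 
distinct configurations, so the share held by the most frequent one falls (in 
this toy case from 1 to 0.8 to 0.4).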

~Dan Rabosky 

On Aug 14, 2014, at 8:28 PM, Jonathan Chang wrote:

> Hi Krzysztof,
> 
> It certainly can be true that the most credible shift configuration is
> one where there are no inferred rate shifts, while the data still prefer
> a model with one rate shift. The issue is mentioned in the BAMM documentation
> <http://bamm-project.org/rateshifts.html>
> 
> BAMM looks like it found evidence of a rate shift on your phylogeny.
> However, the exact location of that rate shift is not certain. In your
> plots, shift configuration #2 shows a rate increase on the upper
> clade, whereas #3 shows a rate decrease in the lower clade. #4 and #5
> tell similar stories. Note that the configurations with 1 rate shift
> (#2-#5) combined are seen more often than configurations with 0 rate
> shifts (#1).
> 
> Personally I'm unclear on how useful the most credible shift
> configuration actually is. To me that throws away a lot of the power
> of BAMM by reducing its inference down to a point estimate.
> 
> Jonathan
> 
> On Thu, Aug 14, 2014 at 5:08 PM, Krzysztof Kozak <kk...@cam.ac.uk> wrote:
>> Dear All,
>> 
>> I have been asked to analyse my chronogram using BAMM, and I like the idea.
>> Sadly, I am puzzled by the output. I worked through the example and read the
>> entire documentation, but still don't grasp why different analyses suggest
>> different answers.
>> 
>> 1. On one hand, several functions suggest that there are 1-2 rate shifts in
>> my data.
>> - Plotting netdiv rate shows it changing somewhat at two times.
>> - plot.bammdata(edata) shows increased rate on the branch leading to a
>> disproportionately large clade
>> - rescaling the branch lengths by the Bayes Factor of a rate shift
>> (bayesFactorBranches) also shows that branches leading to more speciose
>> clades are very long
>> - computeBayesFactors gives this output:
>> 0 1.0000000 0.2860509 0.2273844 0.3127353 0.3841264 1.091439 0.3605840
>> 1 3.4958818 1.0000000 0.7949089 1.0932856 1.3428605 3.815542 1.2605592
>> 2 4.3978396 1.2580058 1.0000000 1.3753596 1.6893262 4.799974 1.5857908
>> 3 3.1975924 0.9146741 0.7270825 1.0000000 1.2282796 3.489977 1.1530008
>> 4 2.6033098 0.7446790 0.5919520 0.8141469 1.0000000 2.841354 0.9387120
>> 5 0.9162216 0.2620860 0.2083345 0.2865348 0.3519449 1.000000 0.3303749
>> 7 2.7732786 0.7932987 0.6306002 0.8673021 1.0652895 3.026865 1.0000000
>> 
>> - simple summary of the posterior summary(edata) also favours models with
>> shifts
>> Shift posterior distribution:
>>         0     0.1800
>>         1     0.4300
>>         2     0.2800
>>         3     0.0840
>>         4     0.0240
>>         5     0.0025
>>         7     0.0005
>> 
>> 2. On the other hand, the plot of Credible Shift Sets always shows the model
>> with no shifts as most frequent (??? - an example is attached).
>> - ...and the best shift configuration is indeed without shifts, as checked
>> with
>> priorshifts <- getBranchShiftPriors(tree, prior)
>> best <- getBestShiftConfiguration(edata, prior, BFcriterion  = 5)
>> 
>> To summarise: I do not understand how it is possible to find substantial
>> Bayes Factors in support of a model with two rate shifts, and yet have the
>> model without shifts as the "best configuration".
>> I hope this is not too naive and I will appreciate any feedback.
>> 
>> Best,
>> __
>> Krzysztof "Chris" Kozak
>> PhD Candidate, Department of Zoology
>> University of Cambridge, CB2 3EJ
>> http://heliconius.zoo.cam.ac.uk/people/krzysztof-kozak/
> 
> _______________________________________________
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

_____________________
Dan Rabosky
Assistant Professor & Curator of Herpetology
Museum of Zoology &
Department of Ecology and Evolutionary Biology
University of Michigan
Ann Arbor, MI 48109-1079 USA

drabo...@umich.edu
http://www-personal.umich.edu/~drabosky
http://www.lsa.umich.edu/ummz/

