Jarrod and Dan, While I see what Dan is saying and I agree that evaluating this with non-phylogenetic data is not entirely useful, I think you have stumbled upon a known issue but one that is not widely appreciated.
While the MK model is a useful model for discrete characters in many ways, it may be inappropriate for such a test. If you assume that the root is equally likely to be in either state with a lot of tips (i.e. approaching the limit), the ML estimate for the ratio of q01 (transition rate from 0 to 1) to q10 will converge on the ratio between number of tips in state 1 to number of tips in state 0. So if you have a tree where most of the tips are in state 1 then you will get support for the asymmetric change model as this best explains the data. It is a very weird problem and perhaps I am incorrect in this. Regardless, this result does not hold if the discrete character is modeled simulataneously with the branching process of the phylogeny (i.e. BiSSE; Maddison et al. 2007). So, in summation, I mostly agree with you. But this is not shocking behavior of the model. If you are interested, a Bayesian implementation of the Mk model can be found in the package diversitree in the function make.mk . again, i could be off base here. curious to see what others think? matt On Thu, Aug 16, 2012 at 10:58 AM, Dan Rabosky <drabo...@umich.edu> wrote: > > HI Jarrod- > > It isn't immediately obvious to me why the exercise below reflects > something problematic. In the first scenario, you have a random binary > state but with strong differences in frequency. Because there is > effectively no phylogenetic signal (as data are simply random), this > suggests a high rate of transition between states. I think that such > asymmetry in frequencies would be highly unlikely under a symmetric model > of character change regardless of the root state. I think it is useful to > think about whether a symmetric process could have given rise to a truly > random distribution of tip data with strong frequency differences - my > intuition is that this is highly unlikely. However, I would be happy to > know what others think. > > Cheers, > ~Dan > > > On Aug 16, 2012, at 10:09 AM, Jarrod Hadfield wrote: > > > Hi, > > > > I have had a few replies off-list which have made me try and clarify > what I mean. I think the distinction needs to be made between two types of > probability: the probability that an outcome is 0 or 1 Pr(y| \theta) and > the probability density of \theta, Pr(\theta | \gamma), indexed by > parameter(s) \gamma. It seems to me that in order to make the problem > identifiable an informative prior (or an assumption) is required for the > root state. It seems to me that the strong prior Pr(\theta=0.5|\gamma) =1 > is used, and then justified because int Pr(y=0| \theta)Pr(theta)dtheta=int > Pr(y=1| \theta)Pr(theta)dtheta=0.5. A non-informative prior distribution > for \theta could be U(0,1). This also induces a prior on y of the same > form, int Pr(y=0| \theta)Pr(theta)dtheta=int Pr(y=1| > \theta)Pr(theta)dtheta=0.5, but that is not the main motivation for > choosing U(0,1). > > > > For example, is this not worrying: > > > > library(ape) > > n<-100 > > tree<-rcoal(n) # random tree > > y<-rbinom(n, 1, 0.2) # random data unconnected to the tree > > m1<-ace(y, tree, type = "d", model="SYM") > > m2<-ace(y, tree, type = "d", model="ARD") > > anova(m1, m2) # asymmetric evolutionary transition rates strongly > supported > > > > y<-rbinom(n, 1, 0.5) # random data unconnected to the tree but p=0.5 > > m1<-ace(y, tree, type = "d", model="SYM") > > m2<-ace(y, tree, type = "d", model="ARD") > > anova(m1, m2) # asymmetric evolutionary transition not supported > > > > Cheers, > > > > Jarrod > > > > > > > > > > > > > > Quoting Jarrod Hadfield <j.hadfi...@ed.ac.uk> on Thu, 16 Aug 2012 > 12:30:30 +0100: > > > >> Hi, > >> > >> I have been helping someone with some analyses and came across some > routines to estimate asymmetric transition rates between discrete > characters. This surprised me because its fairly straightforward to prove > that asymmetric transition rates cannot be identified using data collected > on the tips of a phylogeny. However when I run these routines (e.g. ace) > likelihood ratio tests often suggest strong support for asymmetric rates. > This seems to arise because there is an implicit (and unjustified) prior > point mass on the ancestral state *probability*. If anyone could point me > to reference that states this that would be great? I really don't want to > be saying we have support for processes that we need a fossil record, not > just a phylogeny, to understand. > >> > >> Cheers, > >> > >> Jarrod > >> > >> > >> > >> -- > >> The University of Edinburgh is a charitable body, registered in > >> Scotland, with registration number SC005336. > >> > >> _______________________________________________ > >> R-sig-phylo mailing list > >> R-sig-phylo@r-project.org > >> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo > >> > >> > > > > > > > > -- > > The University of Edinburgh is a charitable body, registered in > > Scotland, with registration number SC005336. > > > > _______________________________________________ > > R-sig-phylo mailing list > > R-sig-phylo@r-project.org > > https://stat.ethz.ch/mailman/listinfo/r-sig-phylo > > > > > > > > > [[alternative HTML version deleted]] > > _______________________________________________ > R-sig-phylo mailing list > R-sig-phylo@r-project.org > https://stat.ethz.ch/mailman/listinfo/r-sig-phylo > [[alternative HTML version deleted]] _______________________________________________ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo