Re: [R-sig-phylo] Stochastic mapping & discrete characters

Liam J. Revell Tue, 01 Jul 2014 12:08:06 -0700

Hi Naama.

I can answer the three questions that pertain to phytools:


(1) The function make.simmap has a bug for non-reversible models when
the input character has more than 2 states. This has to do with the
algorithm for simulating changes along edges where the correct waiting
time is simulated, but then the states at the end of the waiting time
are chosen with the incorrect probabilities. This will sometimes cause
make.simmap to hang for a long time. This does not affect the states
simulated at nodes (which occurs first) and should not affect stochastic

character mapping at all for binary characters or any reversible modelof character evolution. I have posted a fixed version here:

http://www.phytools.org/make.simmap/v1.9/make.simmap.R - but you can
also install a version of phytools containing this update here:
http://www.phytools.org/nonstatic/. Use phytools >= 0.4-22

(2) You should look into the function describe.simmap to summarize the
time spent in each state, etc., from stochastic character mapping using
make.simmap.

(3) To simulate stochastic character histories you can use the phytools
function sim.history. Note, though, that the matrix Q is the *transpose*
of the fitted value of Q from make.simmap. Sorry about this. That means
to simulate stochastic character histories on your tree you can do:

## stochastic mapping:
mtrees<-make.simmap(tree,x,model="ARD",nsim=100)
## summarize results
obj<-describe.simmap(mtrees)
obj
plot(obj) ## PP at nodes from stochastic mapping
fitted.Q<-mtrees[[1]]$Q
## simulate under fitted model
simhistories<-sim.history(tree,t(fitted.Q),nsim=100,anc=rstate(obj$ace[1,]))

The last part (anc=rstate(obj$ace[1,])) is necessary if you havenon-reversibility, because this way you are picking from the posteriordistribution at the root from your real data.

In addition, sim.history does not permit any columns (rows in the matrixfrom make.simmap) to be equal to 0.0 (which we will have if we have atruly non-reversible character). To resolve this, you can do something like:


ii<-which(diag(fitted.Q)==0)
fitted.Q[ii,-ii]<-max(nodeHeights(tree))*1e-12
fitted.Q[ii,ii]<--sum(fitted.Q[ii,])
simhistories<-sim.history(tree,t(fitted.Q),nsim=100,anc=rstate(obj$ace[1,]))

Hope this is helpful.

All the best, Liam

Liam J. Revell, Assistant Professor of Biology
University of Massachusetts Boston
web: http://faculty.umb.edu/liam.revell/
email: liam.rev...@umb.edu
blog: http://blog.phytools.org

On 7/1/2014 7:54 AM, Naama Kopelman wrote:

Hi all,

I am trying to examine a correlation between two discrete traits,
using stochastic mapping.

I tried two options and ran into problems with both:



1. Using stochastic mapping of Phytools â€“

make.simmap - The problem is that I need to use constraints - for
one character

I need to set the state of the root to 0 (I figured how to do that),
while for the second

character I need to set the reverse rate to 0. Couldnâ€™t figure out
this second constraints.

Tried this

mtrees<-make.simmap(tree,mtree$states,nsim=10,
model=matrix(c(0,1,0,0),2))

But model=matrix(c(0,1,0,0)) leads to the following error:

BLAS/LAPACK routine 'DGEBAL' gave error code -3

In addition: Warning message:

In matrix(XX$rates[II], m, m, dimnames = list(lvls, lvls)) :

data length [3] is not a sub-multiple or multiple of the number of
rows [2]

The same command works if I use , model=matrix(c(0,1,0,0),2))

In addition, I wonder if anyone has examined the distribution of two
characters - i.e. how can I calculate the percentage of time spent in
(0,0), (0,1), etc. for the

classes generated by phytools.



2. Using Diversitree -

I used asr.stoch with a likelihood function and parameters. I managed
to make constraints in the

two needed ways (i.e. on the root, and on the rates)

pars <- c(.1, .1, .03, .03, .03, .06)

set.seed(1)

phy <- trees(pars, "bisse", max.taxa=4, max.t=Inf, x0=0)[[1]]

h <- history.from.sim.discrete(phy, 0:1)

plot(h, phy, main="True history")

lik <- make.mk2(phy, phy$tip.state)

#lik<-constrain(q13~q12,q21~q12,q23~q12,q31~q12,q32~q12)

argnames(lik)

lik2<-constrain(lik, q10~0)

argnames(lik2)

fit = find.mle(lik2,c(.1))

st.s1 <- asr.stoch(lik2, fit$par)

plot(st.s1, phy)

st.s2 <- asr.stoch(lik2, fit$par)



What I do not understand is how to simulate stochastic mappings on my
trees randomly â€“ I want to

use the same rate parameters, but with no constraints on the tips.
These simulations will help me

Determine whether the statistic I am examining is significantly large
for my data.



In addition, I wonder if anyone has examined the distribution of two
characters -

i.e. how can I calculate the percentage of time spent in (0,0),
(0,1), etc. for the

classes generated by Diversitree.



Thanks a lot for your help!

[[alternative HTML version deleted]]



_______________________________________________ R-sig-phylo mailing
list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive
at http://www.mail-archive.com/r-sig-phylo@r-project.org/


_______________________________________________
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] Stochastic mapping & discrete characters

Reply via email to