Re: [R-sig-phylo] Plotting ancestral states from estimation on multiphylo object

2015-11-05 Thread Arbuckle, Kevin
Hi Liam,

Thanks for that update to the function, as far as I can tell it worked 
perfectly! As a slight aside, I'm glad that now exists (even if you want to 
tidy it up or run more checks on it). I think it is a great addition to 
phytools; given that you can implement stochastic mapping across multiple and 
different trees, it's a huge advantage to be able to sensibly plot the results 
of such analyses.

Thanks again,

Kev


From: Liam J. Revell [liam.rev...@umb.edu]
Sent: 04 November 2015 19:21
To: Arbuckle, Kevin; r-sig-phylo@r-project.org
Subject: Re: [R-sig-phylo] Plotting ancestral states from estimation on 
multiphylo object

Hi Kevin.

This is almost certainly because your input trees have different
topologies. describe.simmap was just not designed for this.

However, I have just posted a new version of the function that does
accept trees with different topologies. It does this by computing the
posterior probabilities of nodes present in a reference tree (e.g., a
consensus or MCC tree). That is, it goes through all the nodes, and for
each one first asks of each stochastic map tree (1) is the node present
in the tree and then (2) what is the state of the node. The
probabilities are then the conditional probabilities - conditioned on
the node being present!

I have posted a demo here:
http://blog.phytools.org/2015/11/describesimmap-when-summarizing-results.html.
Note that I just did this today, so there is a good chance there will be
bugs. Please report back.

Finally, if you are using large trees setting check.equal to TRUE will
be very slow - so it is probably better in this case to compute a
consensus or reference tree externally, and then input it as the
optional argument ref.tree. This should circumvent the check if
check.equal is set to FALSE (the default).

All the best, Liam

Liam J. Revell, Associate Professor of Biology
University of Massachusetts Boston
web: http://faculty.umb.edu/liam.revell/
email: liam.rev...@umb.edu
blog: http://blog.phytools.org

On 11/4/2015 5:44 AM, Arbuckle, Kevin wrote:
> Hi folks,
>
>
>
> I am currently trying to estimate ancestral states for a binary trait on a 
> set of 500 ultrametric trees (obtained from birdtree.org) of 105 bird species 
> (and in some cases there are multiple records per species). I have completed 
> this part using make.simmap in phytools for which I created 10 stochastic 
> maps for each of the 500 trees. This worked fine. However, the issue comes 
> when I try to plot the posterior probabilities for each state as pie charts 
> at nodes using the following code:-
>
>
>
> cols<-setNames(c("black", "red"),sort(unique(mapdat)))
> plot(trees[[1]],lwd=3,no.margin=T,label.offset=2,cex=0.5)
> nodelabels(pie=sm$ace,piecol=cols,cex=0.3)
> tiplabels(pie = to.matrix(mapdat, sort(unique(mapdat))), piecol = c("black", 
> "red"), cex = 0.2)
> add.simmap.legend(leg=c("No","Yes"),colors=cols,vertical=FALSE)
>
>
>
> where mapdat is the named vector used for the stochastic mapping, trees is 
> the original set of 500 trees,  and sm is the output from describe.simmap.
>
>
>
> The problem is that the plot produced 
> (https://www.dropbox.com/s/ese4nu3yqb66rvv/asr.tiff?dl=0) is wrong; some of 
> the ancestral states don't seem to even remotely correspond to anything 
> tipward of them and, more telling, some of the tips have been assigned the 
> wrong state (the two species on the bottom, Rhea pennata and Dromalus 
> novaehollandiae, both lack the trait in the data frame and so should both be 
> black, but R. pennata is shown as red!).
>
>
>
> I then tried changing the pie argument for tiplabels to pie=sm$tips in order 
> to use the data for tips from the simmap rather than the raw data in case 
> that fixed the problem. It seemed to fix part of the problem as the new plot, 
> for example, now appropriately classifies R. pennata as lacking the trait 
> (https://www.dropbox.com/s/3ikfw4v5yyjacqx/ace%20asr.tiff?dl=0). However, 
> this has introduced a new oddity. Some of the tips have simple dots whereas 
> others have pie charts with varying probabilities of displaying the trait or 
> not. I had originally assumed that this was for cases where I had multiple 
> data points for a single species and where there was variation within a 
> species. However,some species with only a single datapoint such as Vultur 
> gryphus at the very top of the figure not only display a pie chart, but also 
> display greatest support for lacking the trait despite the data specifying 
> that it possesses the trait!
>
>
>
> This is exceptionally confusing and I am not entirely sure where to start as 
> I have no idea what's going wrong. My best guess is that because the 
> different trees in the multiphylo objec

Re: [R-sig-phylo] Plotting ancestral states from estimation on multiphylo object

2015-11-04 Thread Liam J. Revell

Hi Kevin.

This is almost certainly because your input trees have different 
topologies. describe.simmap was just not designed for this.


However, I have just posted a new version of the function that does 
accept trees with different topologies. It does this by computing the 
posterior probabilities of nodes present in a reference tree (e.g., a 
consensus or MCC tree). That is, it goes through all the nodes, and for 
each one first asks of each stochastic map tree (1) is the node present 
in the tree and then (2) what is the state of the node. The 
probabilities are then the conditional probabilities - conditioned on 
the node being present!


I have posted a demo here: 
http://blog.phytools.org/2015/11/describesimmap-when-summarizing-results.html. 
Note that I just did this today, so there is a good chance there will be 
bugs. Please report back.


Finally, if you are using large trees setting check.equal to TRUE will 
be very slow - so it is probably better in this case to compute a 
consensus or reference tree externally, and then input it as the 
optional argument ref.tree. This should circumvent the check if 
check.equal is set to FALSE (the default).


All the best, Liam

Liam J. Revell, Associate Professor of Biology
University of Massachusetts Boston
web: http://faculty.umb.edu/liam.revell/
email: liam.rev...@umb.edu
blog: http://blog.phytools.org

On 11/4/2015 5:44 AM, Arbuckle, Kevin wrote:

Hi folks,



I am currently trying to estimate ancestral states for a binary trait on a set 
of 500 ultrametric trees (obtained from birdtree.org) of 105 bird species (and 
in some cases there are multiple records per species). I have completed this 
part using make.simmap in phytools for which I created 10 stochastic maps for 
each of the 500 trees. This worked fine. However, the issue comes when I try to 
plot the posterior probabilities for each state as pie charts at nodes using 
the following code:-



cols<-setNames(c("black", "red"),sort(unique(mapdat)))
plot(trees[[1]],lwd=3,no.margin=T,label.offset=2,cex=0.5)
nodelabels(pie=sm$ace,piecol=cols,cex=0.3)
tiplabels(pie = to.matrix(mapdat, sort(unique(mapdat))), piecol = c("black", 
"red"), cex = 0.2)
add.simmap.legend(leg=c("No","Yes"),colors=cols,vertical=FALSE)



where mapdat is the named vector used for the stochastic mapping, trees is the 
original set of 500 trees,  and sm is the output from describe.simmap.



The problem is that the plot produced 
(https://www.dropbox.com/s/ese4nu3yqb66rvv/asr.tiff?dl=0) is wrong; some of the 
ancestral states don't seem to even remotely correspond to anything tipward of 
them and, more telling, some of the tips have been assigned the wrong state 
(the two species on the bottom, Rhea pennata and Dromalus novaehollandiae, both 
lack the trait in the data frame and so should both be black, but R. pennata is 
shown as red!).



I then tried changing the pie argument for tiplabels to pie=sm$tips in order to 
use the data for tips from the simmap rather than the raw data in case that 
fixed the problem. It seemed to fix part of the problem as the new plot, for 
example, now appropriately classifies R. pennata as lacking the trait 
(https://www.dropbox.com/s/3ikfw4v5yyjacqx/ace%20asr.tiff?dl=0). However, this 
has introduced a new oddity. Some of the tips have simple dots whereas others 
have pie charts with varying probabilities of displaying the trait or not. I 
had originally assumed that this was for cases where I had multiple data points 
for a single species and where there was variation within a species. 
However,some species with only a single datapoint such as Vultur gryphus at the 
very top of the figure not only display a pie chart, but also display greatest 
support for lacking the trait despite the data specifying that it possesses the 
trait!



This is exceptionally confusing and I am not entirely sure where to start as I 
have no idea what's going wrong. My best guess is that because the different 
trees in the multiphylo object that was fed into make.simmap were different (in 
topology as well as branch lengths), the node allocation from describe.simmap 
is not correct. But I am at a loss as to whether this is indeed the explanation 
and even more so as to how to fix it. Please let me know if you need/want any 
additional details. Any help would be gratefully received.



Thank you and best wishes,



Kev

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/



___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


[R-sig-phylo] Plotting ancestral states from estimation on multiphylo object

2015-11-04 Thread Arbuckle, Kevin
Hi folks,



I am currently trying to estimate ancestral states for a binary trait on a set 
of 500 ultrametric trees (obtained from birdtree.org) of 105 bird species (and 
in some cases there are multiple records per species). I have completed this 
part using make.simmap in phytools for which I created 10 stochastic maps for 
each of the 500 trees. This worked fine. However, the issue comes when I try to 
plot the posterior probabilities for each state as pie charts at nodes using 
the following code:-



cols<-setNames(c("black", "red"),sort(unique(mapdat)))
plot(trees[[1]],lwd=3,no.margin=T,label.offset=2,cex=0.5)
nodelabels(pie=sm$ace,piecol=cols,cex=0.3)
tiplabels(pie = to.matrix(mapdat, sort(unique(mapdat))), piecol = c("black", 
"red"), cex = 0.2)
add.simmap.legend(leg=c("No","Yes"),colors=cols,vertical=FALSE)



where mapdat is the named vector used for the stochastic mapping, trees is the 
original set of 500 trees,  and sm is the output from describe.simmap.



The problem is that the plot produced 
(https://www.dropbox.com/s/ese4nu3yqb66rvv/asr.tiff?dl=0) is wrong; some of the 
ancestral states don't seem to even remotely correspond to anything tipward of 
them and, more telling, some of the tips have been assigned the wrong state 
(the two species on the bottom, Rhea pennata and Dromalus novaehollandiae, both 
lack the trait in the data frame and so should both be black, but R. pennata is 
shown as red!).



I then tried changing the pie argument for tiplabels to pie=sm$tips in order to 
use the data for tips from the simmap rather than the raw data in case that 
fixed the problem. It seemed to fix part of the problem as the new plot, for 
example, now appropriately classifies R. pennata as lacking the trait 
(https://www.dropbox.com/s/3ikfw4v5yyjacqx/ace%20asr.tiff?dl=0). However, this 
has introduced a new oddity. Some of the tips have simple dots whereas others 
have pie charts with varying probabilities of displaying the trait or not. I 
had originally assumed that this was for cases where I had multiple data points 
for a single species and where there was variation within a species. 
However,some species with only a single datapoint such as Vultur gryphus at the 
very top of the figure not only display a pie chart, but also display greatest 
support for lacking the trait despite the data specifying that it possesses the 
trait!



This is exceptionally confusing and I am not entirely sure where to start as I 
have no idea what's going wrong. My best guess is that because the different 
trees in the multiphylo object that was fed into make.simmap were different (in 
topology as well as branch lengths), the node allocation from describe.simmap 
is not correct. But I am at a loss as to whether this is indeed the explanation 
and even more so as to how to fix it. Please let me know if you need/want any 
additional details. Any help would be gratefully received.



Thank you and best wishes,



Kev

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/