One thing you could do is take all nonoverlapping pairs of taxa (Felsenstein's other technique in the contrasts paper): that is, for a tree (A,(B,(C,(D,E)))), you can look at D-E and B-C, *or* D-E and A-B, *or* D-E and A-C, *or* A-B and C-D, etc. (so, still leaving out one taxon each time, but only using each edge once) and then compare the states within each pair of taxa. If state 0 has freq p and state 1 has freq q, then with truly random sampling you should see 0-0, 0-1, and 1-1 pairs in proportions p^2, 2pq, and q^2; if it's maximally stratified, you should see only two of these pair types (i.e., if 1 is less common, you should see only 0-1 and 0-0 pairs). You could rig up a test statistic (probably comparing two multinomial models) from this.
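A rough sketch of that pair comparison, in base R only. The helper names count_pair_types and expected_pair_probs are hypothetical, the pairs are picked by hand for the five-taxon example tree, and the "maximally stratified" proportions (1 - 2q, 2q, 0) are an assumption about what stratification implies when state 1 has frequency q <= 0.5. The p^2, 2pq, q^2 proportions also ignore the fact that tips are sampled without replacement, so treat this as an illustration rather than a worked-out test.

```r
# Hypothetical helper: tally 0-0, 0-1 and 1-1 pairs.
# `states` is a named 0/1 vector of tip states; `pairs` is a two-column
# character matrix of nonoverlapping tip pairs (chosen by hand here).
count_pair_types <- function(states, pairs) {
  ones <- states[pairs[, 1]] + states[pairs[, 2]]  # number of 1s in each pair
  counts <- tabulate(ones + 1, nbins = 3)
  names(counts) <- c("0-0", "0-1", "1-1")
  counts
}

# Expected pair proportions under random sampling, given frequency q of state 1.
expected_pair_probs <- function(q) {
  p <- 1 - q
  c("0-0" = p^2, "0-1" = 2 * p * q, "1-1" = q^2)
}

# Toy example on (A,(B,(C,(D,E)))) using the pairs D-E and B-C.
states <- c(A = 0, B = 1, C = 0, D = 1, E = 0)
pairs  <- rbind(c("D", "E"), c("B", "C"))

obs <- count_pair_types(states, pairs)
q   <- mean(states)

random_probs     <- expected_pair_probs(q)
stratified_probs <- c("0-0" = 1 - 2 * q, "0-1" = 2 * q, "1-1" = 0)  # assumption, needs q <= 0.5

# Log-likelihood ratio of the two multinomial models (stratified vs. random).
llr <- dmultinom(obs, prob = stratified_probs, log = TRUE) -
       dmultinom(obs, prob = random_probs, log = TRUE)
print(obs)
print(llr)
```

A positive llr means the observed pair counts fit the stratified proportions better than the random ones. With a real tree you would want many more pairs than this, and probably a null distribution from permuting tip states rather than a chi-square approximation.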
I have some code that purportedly gets all independent such pairs in R at https://r-forge.r-project.org/scm/viewvc.php/pkg/R/independentTaxa.R?view=markup&revision=366&root=omearalab. However, I haven't rigorously tested it (this was in the dark ages before we all used testthat, too), so feel free to take, hack, republish, etc., but test first [anyone else should feel free to take this, too -- it could naturally go into phytools, ape, or phangorn, for example, assuming it actually works].

Best,
Brian

_______________________________________________________________________
Brian O'Meara, http://www.brianomeara.info, especially Calendar <http://brianomeara.info/calendars/omeara/>, CV <http://brianomeara.info/cv/>, and Feedback <http://brianomeara.info/teaching/feedback/>

Associate Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Director for Postdoctoral Activities, National Institute for Mathematical & Biological Synthesis <http://www.nimbios.org> (NIMBioS)
Communication Director, Society of Systematic Biologists

On Tue, Mar 14, 2017 at 1:37 PM, Ross Mounce <ross.mou...@gmail.com> wrote:

> So I tried a 12-taxon fully pectinate tree with Blomberg's K as calculated
> by picante::Kcalc()
>
> library(picante)
> library(ape)
> aa<-"(A,(B,(C,(D,(E,(F,(G,(H,(I,(J,(K,L)))))))))));"
> t1<-read.tree(text=aa)
> t4 <- compute.brlen(t1,method="Grafen",1)
> tipvals <- c(0,1,0,1,0,1,0,1,0,1,0,1)
> Kcalc(tipvals,t4)
>
> K = 0.3487135
>
> There are possible 924 permutations of 6 'painted' tips from 000000111111
> to 111111000000 (each of those two extreme distributions gives the maximum
> value of K for this particular tree & number [6] of painted tips: 2.37703)
>
> There are 336 discrete integer value results of K for this tree, and
> painting 6 tips:
>
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>  0.3438  0.3676  0.4489  0.5332  0.5945  2.3770
>
> A histogram of all 924 possible values of K (for 6 painted tips) shows that
> Blomberg's K in terms of value distribution is extremely positively skewed
> (it has a skewness of 2.671139), which is great if you're looking to test
> phylogenetic signal without false positives, but not so great if you're
> trying to assess "evenness" at the other tail of the distribution. In the
> case of this exact tree, because I've enumerated all possible permutations
> of 6 painted tips, I can calculate the 5% significance threshold as values
> of K that are 0.3480158 or less. It seems some normalisation procedure
> might be needed before safely using Blomberg's K to assess the significance
> of evenness, if one is not going to exhaustively examine/enumerate all
> possible values (which I can't do for a 1000+ tip tree).
>
> Does that make sense?
>
> Certainly interesting...
>
> Ross
>
> On 14 March 2017 at 14:53, Ross Mounce <ross.mou...@gmail.com> wrote:
>
> > Thanks Dave,
> >
> > I'll try Blomberg's K with small simulated fully-bifurcating trees of
> > simple shape (e.g. fully pectinate), where I can easily paint the tips
> > myself in what I believe to be a "maximally stratified manner" e.g.
> > 010101010 to see if Blomberg's K does actually reach minimum (i.e.
> > 0.00000?) for such a distribution. If it does, great! This is the
> > measure I need.
> >
> > I still wonder though, for a complex tree structure in terms of
> > balance/shape somewhere intermediate between fully balanced and fully
> > pectinate; how does one arrive empirically at _the_ most optimal
> > stratified/even sampling ('painting') of tips if say only 25% of tips
> > are/can be 'painted'. I guess a lot depends on how one defines what 'even
> > sampling' on a phylogeny actually is, does it include branch lengths et
> > cetera...
> >
> > I'll give it a try anyway,
> >
> > Thanks again,
> >
> > Ross
> >
> > On 14 March 2017 at 14:33, David Bapst <dwba...@gmail.com> wrote:
> >
> >> Ross,
> >>
> >> An interesting question. I understand it as that you want to test if
> >> the trait is overdispersed relative to phylogeny, which still makes me
> >> think that measures of 'phylogenetic signal' might be still be useful,
> >> even though the typical interpretation is 'signal' as 'heritability'.
> >> I would try some toy examples with smallish trees and artificial data
> >> and play with different signal measures; particularly your idea
> >> regarding that the variance is high at the level of closest
> >> relatedness suggests that you perhaps should investigate Blomberg's K
> >> as a measure, rather than Pagel's lambda:
> >>
> >> Blomberg, S. P., T. Garland, and A. R. Ives. 2003. Testing for
> >> phylogenetic signal in comparative data: behavioral traits are more
> >> labile. Evolution 57.
> >>
> >> However, your soft polytomies are worrisome; I suggest using the MPT
> >> or posterior tree sample, if such exists, or considering resolving
> >> those polytomies somehow.
> >>
> >> Cheers,
> >> -Dave
> >>
> >> On Tue, Mar 14, 2017 at 5:45 AM, Ross Mounce <ross.mou...@gmail.com> wrote:
> >> > Hi all,
> >> >
> >> > I'm interested in the distribution of a non-heritable binary
> >> > trait/observation across a large tree 1000+ tip tree. The tree is
> >> > non-distinct in shape and balance, it is neither fully pectinate nor
> >> > fully balanced. It has many soft polytomies too.
> >> >
> >> > I believe the distribution of this trait to be significantly stratified
> >> > such that just for the sake of explanation, every other tip is "present"
> >> > for the trait. So essentially I'm interested in testing the evenness of
> >> > distribution of "present" tips across the tree.
> >> >
> >> > In this instance it doesn't seem to me that I should be testing for
> >> > "phylogenetic signal" or using models that do that, nor am I testing the
> >> > randomicity of distribution of the trait.
> >> > Specifically, I want to test if the observed distribution is significantly
> >> > close to "perfect" stratification for the given number of "presences"
> >> > (which is ~33% of the tips of the tree), on the given fixed tree shape.
> >> >
> >> > TL;DR
> >> >
> >> > How can I meaningfully test the evenness of the distribution of a binary
> >> > trait across a tree, with R?
> >> >
> >> > Any ideas?
> >> >
> >> > Thanks,
> >> >
> >> > Ross
> >> >
> >> > --
> >> > -/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-
> >> > Ross Mounce, PhD
> >> > Software Sustainability Institute Fellow
> >> > Dept. of Plant Sciences, University of Cambridge
> >> > www.rossmounce.co.uk <http://rossmounce.co.uk/>
> >> > -/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-
> >> >
> >> > [[alternative HTML version deleted]]
> >> >
> >> > _______________________________________________
> >> > R-sig-phylo mailing list - R-sig-phylo@r-project.org
> >> > https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> >> > Searchable archive at http://www.mail-archive.com/r-sig-ph...@r-project.org/
> >>
> >> --
> >> David W. Bapst, PhD
> >> Adjunct Asst. Professor, Geology and Geol. Eng.
> >> South Dakota School of Mines and Technology
> >> 501 E. St. Joseph
> >> Rapid City, SD 57701
> >>
> >> http://webpages.sdsmt.edu/~dbapst/
> >> http://cran.r-project.org/web/packages/paleotree/index.html

[[alternative HTML version deleted]]

_______________________________________________
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/