Hi Jacob,

ape::makeNodeLabel(phy, method = "md5sum") returns 'phy' with node labels that depend on the tips descended from each node. For instance:

    library(ape)
    tr3 <- makeNodeLabel(rtree(3), method = "md5sum")
    tr4 <- makeNodeLabel(rtree(4), method = "md5sum")
    any(tr3$node.label %in% tr4$node.label)

If you repeat these 3 commands several times, you should get TRUE about 20% of the time. In your case, match() should make more sense.

Also, I suppose your trees are rooted. If they are unrooted, you should consider using splits (or rooting them).

Best,

Emmanuel

----- On 18 Feb 21, at 0:59, Jacob Berv jakeberv.r.sig.ph...@gmail.com wrote:

> Dear R-sig-phylo,
>
> Over the weekend, I asked Liam Revell if he had a solution to use matchNodes
> for a particular problem I’m trying to solve: finding all phylogenetically
> equivalent nodes when comparing trees that have uneven taxon samples and
> different topologies. Liam was kind enough to take some time to write a blog
> post about this, and got me started with some code:
>
> http://blog.phytools.org/2021/02/on-matching-nodes-between-trees-using.html
>
> On its face this seems like a simple problem, but I’m running into some
> issues and thought I would reach out to the broader group. The code linked
> above seems to work, but only for comparing trees that start out as
> topologically identical. For my purposes, I’m trying to match nodes from a
> given reference to nodes in and across several hundred gene trees that
> differ in topology and taxon sample relative to the reference.
>
> Here is a function definition based on Liam’s example:
>
> #function to match nodes from consensus
> #to individual gene trees with uneven sampling
> #derived from Liam Revell's example-- need to test
> match_phylo_nodes<-function(t1, t2){
>   ## step one drop tips
>   t1p<-drop.tip(t1,setdiff(t1$tip.label, t2$tip.label))
>   t2p<-drop.tip(t2,setdiff(t2$tip.label, t1$tip.label))
>
>   ## step two match nodes "descendants"
>   M<-matchNodes(t1p,t2p)
>
>   ## step three match nodes "distances"
>   M1<-matchNodes(t1,t1p,"distances")
>   M2<-matchNodes(t2,t2p,"distances")
>
>   ## final step, reconcile
>   MM<-matrix(NA,t1$Nnode,2,dimnames=list(NULL,c("left","right")))
>
>   for(i in 1:nrow(MM)){
>     MM[i,1]<-M1[i,1]
>     nn<-M[which(M[,1]==M1[i,2]),2]
>     if(length(nn)>0){
>       MM[i,2]<-M2[which(M2[,2]==nn),1]
>     }
>   }
>   return(MM)
> }
>
> When t1 and t2 are trees that have topological conflicts, this function
> returns an error:
>
> Error in MM[i, 2] <- M2[which(M2[, 2] == nn), 1] :
>   replacement has length zero
>
> I think(?) this happens because a particular node doesn’t exist in one or
> the other tree, and it returns integer(0) at that line, but I’m not sure I
> really understand what is going on here.
>
> I modified Liam’s code slightly to get it to run without error in the
> described case, by making that particular line conditional:
>
> Modified version
>
> #function to match nodes from consensus
> #to individual gene trees with uneven sampling
> #derived from Liam Revell's example-- need to test
> match_phylo_nodes<-function(t1, t2){
>   ## step one drop tips
>   t1p<-drop.tip(t1,setdiff(t1$tip.label, t2$tip.label))
>   t2p<-drop.tip(t2,setdiff(t2$tip.label, t1$tip.label))
>
>   ## step two match nodes "descendants"
>   M<-matchNodes(t1p,t2p)
>
>   ## step three match nodes "distances"
>   M1<-matchNodes(t1,t1p,"distances")
>   M2<-matchNodes(t2,t2p,"distances")
>
>   ## final step, reconcile
>   MM<-matrix(NA,t1$Nnode,2,dimnames=list(NULL,c("left","right")))
>
>   for(i in 1:nrow(MM)){
>     MM[i,1]<-M1[i,1]
>     nn<-M[which(M[,1]==M1[i,2]),2]
>     if(length(nn)>0){
>       if(length(which(M2[,2]==nn))>0){
>         MM[i,2]<-M2[which(M2[,2]==nn),1]
>       }
>     } else {
>     }
>   }
>   return(MM)
> }
>
> I’ve been experimenting with this and some downstream code for the last few
> days, but I’ve run into some weird inconsistent results (not easily
> summarized) that make me think that this function is not working as
> intended.
>
> I was wondering: have any of you dealt with a similar problem? In principle
> this seems like it should be similar to concordance analysis, but I care
> less about identifying the proportion of nodes that exist in gene trees
> given a reference, and instead I need the actual node numbers in a given
> gene tree that are phylogenetically equivalent to particular nodes in a
> reference.
> Happy to try to hack away at something…
>
> Best,
> Jake Berv
>
> _______________________________________________
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
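
For readers of the archive, the match() suggestion at the top of this thread could be sketched roughly as below. This is only one possible reading of it, not code from Emmanuel or Liam. The helper name match_nodes_md5 is hypothetical (it is not part of ape or phytools). The sketch assumes both trees are rooted and share at least two tips, and it reports, for each clade of the pruned reference, the MRCA of that clade's shared tips in each full tree, which may not be the definition of "equivalent node" you need.

```r
library(ape)

# Hypothetical helper (not part of ape or phytools): for each internal node of
# `ref` (after pruning to shared taxa), find the node of `gene` that subtends
# the same set of shared tips. Assumes rooted trees with >= 2 shared tips.
match_nodes_md5 <- function(ref, gene) {
  # Prune both trees to their shared taxa so that clades are comparable.
  shared <- intersect(ref$tip.label, gene$tip.label)
  ref_p  <- makeNodeLabel(keep.tip(ref, shared),  method = "md5sum")
  gene_p <- makeNodeLabel(keep.tip(gene, shared), method = "md5sum")

  # Tip labels descending from each internal node of a pruned tree.
  clade_tips <- function(phy) {
    pp <- prop.part(phy)
    lapply(pp, function(i) attr(pp, "labels")[i])
  }
  ref_clades  <- clade_tips(ref_p)
  gene_clades <- clade_tips(gene_p)

  # Identical md5sum labels mean identical sets of descendant tips;
  # match() returns NA where the reference clade is absent from the gene tree.
  hit <- match(ref_p$node.label, gene_p$node.label)

  # Map each clade back to a node number in the full (unpruned) tree
  # as the MRCA of its shared tips.
  to_full <- function(phy, tips) as.integer(getMRCA(phy, tips))
  data.frame(
    ref_node  = vapply(ref_clades, function(tp) to_full(ref, tp), integer(1)),
    gene_node = vapply(hit, function(j) {
      if (is.na(j)) NA_integer_ else to_full(gene, gene_clades[[j]])
    }, integer(1))
  )
}

# Usage: trees with uneven sampling (D is missing from gene, E from ref).
ref  <- read.tree(text = "((A,B),(C,D));")
gene <- read.tree(text = "(((A,B),C),E);")
res  <- match_nodes_md5(ref, gene)
```

Reference clades that are contradicted by the gene tree come back with NA in gene_node, so topological conflict does not raise an error. Several reference nodes can map to the same full-tree node when pruning collapses them, so it is worth checking the result for duplicates.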