Hello all, Unable to get phyloch's read.mrbayes() to read my .con.tre files (for reasons unknown), I've gone with Graham's suggestion to use the simple output format from sumt. However, as I have a large number of analyses which took considerable computational time (and I am stubborn to learn how to run these on a cluster) I decided the more economical thing to do was to write an R script that took the existing nexus files and wrote a nexus file with MrBayes blocks that would output 'simple' format consensus trees. I realize such string manipulation might be easier in some other language, but I know R best.
Here it is, in case anyone else ever needs such a thing: _________________________________________________ #change to dir for MB files setwd("C:/dave/workspace/mrbayes") #get files files<-list.files() #get tree files treeFiles<-files[grep(x=files,pattern="run..t")] #get the nexus files inputs<-sort(unique(sub(".run..t","",treeFiles))) inputs<-sort(unique(sub(".tree.","",inputs))) #now start constructing sumt lines bayesBlock<-"#NEXUS" for(i in 1:length(inputs)){ origNex<-scan(inputs[i],what="character",sep="\n",blank.lines.skip=FALSE) cropNex<-origNex[2:grep(origNex,pattern="mcmcp")] #remove log line cropNex<-cropNex[-grep(cropNex,pattern="log start")] sumtLine<-paste0("sumt filename=",inputs[i]," outputname=", paste0(inputs[i],".simple")," conformat=simple;") bayesBlock<-c(bayesBlock,cropNex,sumtLine,"","end;","") } write(bayesBlock,file="simple_sumt.nex") _______________________________________________ Some caveats: this script assumes you already have mrbayes blocks in your nexus files (all of which are stored with all output in the same directory), including a line where you set MCMC parameters (mcmcp) just prior to calling the MCMC itself and after any other modifications to the dataset, model or prior settings. If you execute the nexus file created, it should run sumt and output new output files with 'simple' in the file name. Analyses where there are multiple trees (such as partitioned analyses with unlinked branch lengths) will have multiple consensus trees estimated for them, as expected if we had originally done this immediately after running mcmc. Also note I crop out any lines that create new log files, because I didn't want to write over my old log files. The resulting simple-format consensus tree files can be read into ape with read.nexus just fine. For whatever reason, these nexus files contain two trees, otherwise identical other than that one contains the posterior probabilities as its node-labels. ....However! There is a big however here. If we re-root the tree in ape (and there are many reasons why we might want to do so), the posterior probabilities, which are stored as node labels, no longer make sense as those node labels are not properties technically of the nodes but of the edges preceding those nodes. Change the directionality of the graph by rerooting and those posterior probabilities end up in the wrong place. (TreeFig apparently figures this out on its own, automatically, if your reroot in TreeFig.) I haven't found any option in an R package that slides node-labels around as if they were support values when the tree is rerooted (and if there is one, please let me know!) so I had to write my own thing for doing that to the newly created 'simple' consensus tree files... So here's another script. _____________________________________________ #change dir setwd("C:/dave/workspace/mrbayes") library(ape) files<-list.files() #get consensus nexus files (ugh regexp) conFiles<-files[grep(x=files,pattern=".con.tre")] #get only 'simple' files conFiles<-conFiles[grep(x=conFiles,pattern="simple")] conName<-sub(".nex.simple","",conFiles) conName<-sub(".con.tre","",conName) trees<-list() for(i in 1:length(conFiles)){ treesRead<-read.nexus(conFiles[i]) treeA<-treesRead[[2]] treeB<-treesRead[[1]] #now we're going to reroot and let's use fake species names hahah treeA<-root(treeA,resolve.root=TRUE,outgroup=c("OutOfTheGroup","OutThere", "outgroupy","groupout")) #okay basically need to figure out #which taxa are included by a node #and match between treeB and tree A splitsB<-lapply(prop.part(treeB),function(x) if(any(x==1)){treeB$tip.label[x] }else{treeB$tip.label[-x]}) splitsA<-lapply(prop.part(treeA),function(x) if(any(x==1)){treeA$tip.label[x] }else{treeA$tip.label[-x]}) #plot(treeB,show.node.label=TRUE) treeA$node.label<-treeB$node.label[match(splitsA,splitsB)] trees[[i]]<-treeA } class(trees)<-"multiPhylo" #plot(trees) pdf("consensus_trees_10-07-14.pdf",height=8,width=8) for(i in 1:length(trees)){ plot(trees[[i]],main=conName[i],show.node.label=TRUE) add.scale.bar() } dev.off() _____________________________________________________ The last bit just plots it in a pdf, which isn't important at all but I included it for the heck of it. The important stuff is the innards of the first for loop, which reads in the 'simple' consensus tree nexus file, reroots it and then rearranges the posterior probabilities to match the same splits across both trees. I've checked this against the read out logged by MrBayes and the result of rerooted trees in FigTree, and the resulting posterior probabilities match those shown across all three programs. Anyway, I hope this helps the next person trying to figure out how to plot posterior probabilites from MrBayes in R! Cheers, -Dave On Fri, Oct 3, 2014 at 10:29 AM, David Bapst <dwba...@gmail.com> wrote: > Just to clarify, off-list discussion with Graham reveals (after some > confusion on my part) that if the simple format option is used in sumt > in MrBayes, then ape's read.nexus will natively read the posterior > probability values as node-labels. Which is interesting and I had not > come across that information previously. > > Graham's comment is not in reference to the functioning of > read.annotated.nexus. > > Cheers, > -Dave > > > On Fri, Oct 3, 2014 at 8:49 AM, Slater, Graham <slat...@si.edu> wrote: >> Dave, >> >> I’ve had the same trouble with shuffling. However, all of this can be >> avoided if you specify the simple format for your .con file in the mrBayes >> block. >> >> sumt conformat = simple; >> >> The resulting tree will correctly display posterior probabilities in a >> phyloformat tree. >> >> Graham >> >> >> >> ------------------------------------------------------------ >> Graham Slater >> Peter Buck Post-Doctoral Fellow >> Department of Paleobiology >> National Museum of Natural History >> The Smithsonian Institution [NHB, MRC 121] >> P.O. Box 37012 >> >> >> (202) 633-1316 >> slat...@si.edu >> www.fourdimensionalbiology.com >> >> >> >> >> >> On Oct 3, 2014, at 10:42 AM, David Bapst <dwba...@gmail.com> wrote: >> >> Hello all, >> >> Recently, I wanted to display posterior probabilities on a 50% >> compatibility tree from a MrBayes run, created with the 'sumt' >> command. I looked around for ways to do this and found this email >> thread from last year: >> >> https://stat.ethz.ch/pipermail/r-sig-phylo/2013-June/002825.html >> >> ...which suggests using read.annotated.nexus() in package epibase, >> which has been renamed OutbreakTools more recently. >> >> Unfortunately, it appears read.annotated.nexus in OutbreakTools >> (v0.1-11, in R 3.1.1) improperly creates the new edge matrix, >> resulting in an apparent shuffling of tip labels. As this function was >> discussed on R-Sig-Phylo, I am sending this bug report to both the >> list and to the package maintainer (Thibaut J.). >> >> I don't have any BEAST output files, so I cannot test if this occurs >> with files that are not created by MrBayes. >> >> To show this phenomen, I have created an example .con.tre file using >> an example matrix of standard characters for several elephants that I >> stole off of Joe F.'s website. You can find the .con.tre file here on >> gdrive: >> >> https://drive.google.com/file/d/0B_xvEcEvKno_LTFhSVJrdHMtNFU/view?usp=sharing >> >> And now in R, a comparison with a tree: >> >> library(OutbreakTools) >> library(ape) >> >> tree1<-read.nexus("mat.nex.con.tre") >> tree2<-read.annotated.nexus("mat.nex.con.tre") >> >> layout(1:2) >> plot(tree1,no.margin=TRUE) >> plot(tree2,no.margin=TRUE) >> >> identical(tree1$tip.label,tree2$tip.label) >> identical(tree1$edge.length,tree2$edge.length) >> identical(tree1$edge,tree2$edge) >> >> Inspection of the tree file with FigTree suggests that the ape >> function is producing the correct tree and the OutbreakTools function >> is not. We can see the why with last three lines in the console: >> >> identical(tree1$tip.label,tree2$tip.label) >> >> [1] TRUE >> >> identical(tree1$edge.length,tree2$edge.length) >> >> [1] TRUE >> >> identical(tree1$edge,tree2$edge) >> >> [1] FALSE >> >> Closer inspection suggests that the issue appears to be that some >> terminal edges are inappropriately shuffled, thus shuffling the >> terminal tip labels on those edges, while leaving the tip.label order >> and the edge lengths the same. Also, larger trees seem to produce >> larger inconsistencies in the tree produced. >> >> So... now for the list, are there any other methods on CRAN for >> obtaining the posterior probabilities from a .con.tre file? Otherwise, >> next I guess I'll be trying phyloch. >> >> Cheers, >> -Dave >> >> -- >> David W. Bapst, PhD >> Adjunct Asst. Professor, Geology and Geol. Eng. >> South Dakota School of Mines and Technology >> 501 E. St. Joseph >> Rapid City, SD 57701 >> >> http://webpages.sdsmt.edu/~dbapst/ >> http://cran.r-project.org/web/packages/paleotree/index.html >> >> _______________________________________________ >> R-sig-phylo mailing list - R-sig-phylo@r-project.org >> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo >> Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ >> >> > > > > -- > David W. Bapst, PhD > Adjunct Asst. Professor, Geology and Geol. Eng. > South Dakota School of Mines and Technology > 501 E. St. Joseph > Rapid City, SD 57701 > > http://webpages.sdsmt.edu/~dbapst/ > http://cran.r-project.org/web/packages/paleotree/index.html -- David W. Bapst, PhD Adjunct Asst. Professor, Geology and Geol. Eng. South Dakota School of Mines and Technology 501 E. St. Joseph Rapid City, SD 57701 http://webpages.sdsmt.edu/~dbapst/ http://cran.r-project.org/web/packages/paleotree/index.html _______________________________________________ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/