Re: [R-sig-phylo] Simulation function from Discrete
Hi Ekaphan, See the function rTraitMult. You have to program your model though. Cheers, Emmanuel -Original Message- From: Ekaphan Kraichak ekraic...@gmail.com Sender: r-sig-phylo-boun...@r-project.org Date: Tue, 6 Dec 2011 09:58:34 To: r-sig-phylo@r-project.org Subject: [R-sig-phylo] Simulation function from Discrete Hi all, I am currently doing analyses on phylogenetic correlations of multiple discrete traits (based on Pagel 1994 paper). For the most traits, there were only a few transitions on the tree, making it difficult to use likelihood ratio tests for hypothesis testing. I was wondering if there is a way to simulate character data under Pagel's Independent Model in R. This function used to be in the Discrete software, which is no longer distributed. I am aware of rTraitDisc and sim.char in ape and geiger, but not sure how to make it simulate data for correlated traits. Any help on this would greatly be appreciated. Thank you, Ekaphan Kraichak ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] taxonomic tables and trees
Hi David, It's in ape: ?as.phylo.formula Cheers, Emmanuel -Original Message- From: David Buckley dbuck...@mncn.csic.es Sender: r-sig-phylo-boun...@r-project.org Date: Tue, 13 Dec 2011 12:56:18 To: r-sig-phylo@r-project.org Subject: [R-sig-phylo] taxonomic tables and trees Hi all, I remember seeing an R script/application to transform a taxon table (say an excell table containing taxonomic information for a group organized as orders, families, genera, etc. in columns) onto a 'phylogenetic/taxonomic' tree (groups defined by the nested taxonomic ranks), but I can't remember where was that. Any hint? best and thanks david David Buckley Dpt. Biodiversidad y Biología Evolutiva Museo Nacional de Ciencias Naturales, CSIC c/José Gutiérrez Abascal 2 28006-Madrid Spain Phone: +34 91 411 13 28 ext. 1124 dbuck...@mncn.csic.es ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] Meaning of Transformed Phylogenetically Dependent Variables in GLS?
Hi Nicolai, Nicolai Cryer wrote on 06/01/2012 01:14: Hi all, I am under the impression that, given two continuous variables that can be shown (Pagel's Lambda or Gittleman Kot's autocorrelation) to be dependent on phylogeny, calling gls(variable1 ~ variable 2, correlation = corStruct) returns a regression on derived variables that have been transformed given the known correlation structure (under some evolutionary model). Yes, your impression is correct: these are the phylogenetically independent contrasts. My question is: is there any point to plotting a regression on this internally transformed data? Yes, see ?pic. I have some continuous variables that are quite meaningful, but when I manually perform the transformation I get values that are meaninglessly high (like a forearm length of 1,2 meters, compared to 6,5 centimeters originally). I might be doing the manual transformation the wrong way, in which case I'd appreciate a comment. I guess you want to ask a quite different question that is: given the phenotypes for species 1, species 2, ..., what would be the phenotype of species 2 if it were /not/ evolved from a common ancestor with species 1? I think it's a more complicated issue that cannot be solved with a simple back-transformation. Best, Emmanuel For variables X and Y varmatrix- vcv.phylo(phy = tree, correlation = corGrafen(1, tree)) transmatrix- chol(varmatrix) X.transformed- transmatrix %*% X Y.transformed- transmatrix %*% Y mylm- lm(Y ~ X) plot(Y ~ X) abline(coef(mylm)) The resulting estimates are different than from a normal gls, gls(y ~ x, correlation = corGrafen(1, tree)), so I would be happy to know what it is I'm looking for. Thanks a lot, Best, NC [[alternative HTML version deleted]] ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo -- Emmanuel Paradis IRD, Jakarta, Indonesia http://ape.mpl.ird.fr/ ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] 2 root edges in phylocom tree, but suggested fix in ape FAQ doesn't help
Hi Ben Megan, It's not the labelling of internal nodes which is the problem here: the 'extra' right parentheses imply the presence of nodes of degree two (ie, non-bifurcating). read.tree(text=(((abc),(ghi)),(mno));) also fails (for sure, the error message is not very explicit). It's been some time that this issue is around: I think it's time to create a new class derived from phylo to solve it. This will involve some recoding of course. Currently, I'm working on finalizing the release of ape 3.0, so this will have to wait a couple of months. Best, Emmanuel Ben Bolker wrote on 13/01/2012 02:10: On 12-01-12 01:54 PM, Megan Bartlett wrote: Hi, I'm trying to use R package ape to manipulate the phylocom megatree, version R20091110. However, whenever I try to read in the tree with: read.tree(R20091110.new.txt) I get the error message: Error in if (sum(obj[[i]]$edge[, 1] == ROOT) == 1 dim(obj[[i]]$edge)[1] : missing value where TRUE/FALSE needed I know that the suggested fix is to delete ( and )euphyllophyte from the beginning and end of the file, but this doesn't solve the problem. Trying to read the new tree gives me exactly the same error message. What should I be doing instead? Thanks so much for your help! Best, Megan Bartlett [[alternative HTML version deleted]] I started to respond to this on r-h...@r-project.org ... it's best not to cross-post if you can help it, or at least to say that you're crossposting (and why). Based on this small experiment: library(ape) r- read.tree(url(http://svn.phylodiversity.net/tot/megatrees/R20091110.new;)) r- read.tree(text=(((abc)def,(ghi)jkl),(mno)pqr);) r- read.tree(text=((def,jkl),pqr);) it looks like ape's read.tree() doesn't like the internal-node labeling convention that phylocom is using. Someone else will probably respond more helpfully ... Ben Bolker -- Emmanuel Paradis IRD, Jakarta, Indonesia http://ape.mpl.ird.fr/ ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] Add new leaves to a tree
Hi Brian, See bind.tree. Best, Emmanuel Brian Pellerin wrote on 10/01/2012 21:36: Hello R users, How do I grow a whole new set of leaves onto an existing tree? ### example with random tree library(ape) nleaf-1000 #Number of leaves on a tree br-sample(0:20,2*(nleaf-1),replace=T)#Branch lengths for tree tr-rtree(nleaf,br=br) #Tree crown Thanks for your help. Sincerely, Brian ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo -- Emmanuel Paradis IRD, Jakarta, Indonesia http://ape.mpl.ird.fr/ ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] randomizing clades within a node
Hi Juan, Juan Antonio Balbuena wrote on 13/01/2012 21:57: Hello I would very much appreciate if someone can help me with this one. I wish to compare two additive trees that are identical for some but not all the clades. So I use rtree of the ape package with, say, 10 tips to generate tree A. Then I wish to modify A to get tree B by randomizing the clades and patristic distances in a given number of nodes. If the number of nodes is, for instance, three, the 3 nodes furthest apart from the origin will be randomized. Then A and B will be compared. The whole procedure needs to be carried out a large number of times (10,000). The functions you need are extract.clade and bind.tree. The both have options to allow you to flexibly handle eventual root edges, so have a look at their help pages. HTH Emmanuel Any help would be much appreciated. Juan A. Balbuena -- Dr. Juan A. Balbuena Marine Zoology Unit Cavanilles Institute of Biodiversity and Evolutionary Biology University of Valencia http://www.uv.es/~balbuena http://www.uv.es/%7Ebalbuena P.O. Box 22085 http://www.uv.es/cavanilles/zoomarin/index.htm 46071 Valencia, Spain http://cetus.uv.es/mullpardb/index.html e-mail: j.a.balbu...@uv.es mailto:j.a.balbu...@uv.es tel. +34 963 543 658 fax +34 963 543 733 NOTE! For shipments by EXPRESS COURIER use the following street address: C/ Catedrático José Beltrán 2, 46980 Paterna (Valencia), Spain. ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo -- Emmanuel Paradis IRD, Jakarta, Indonesia http://ape.mpl.ird.fr/ ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] plotting character states on a circular cladogram
Hi Rafael, Rafael Rubio de Casas wrote on 02/02/2012 06:24: Hi Nicolai, Thanks for your message. I guess I wasn't completely clear in my previous message. tiplabels() does do a similar thing to what I want. In fact, it is what I ahve been using. The problem is that I think that with tiplabels you cannot place the colored dots at a particular position relative to the tips on a circular cladogram. You are right. I thought that points() would be easier to adapt for that, but maybe not. tiplabels() actually calls points() so you have to tell the function how to offset the coordinates which is not obvious because they are circular, -- though plot.phylo() does it when label.offset is used. I guess what you want is to prevent the labels to hide the terminal edges. Here's a suggestion for this: plot(phy, type=fan, label.offset=2, plot=FALSE) tiplabels(pch=21, cex=1.5, bg=label[phy$tip.label]) par(new=TRUE) plot(phy, type=fan, label.offset=2) The idea is to first draw the tip labels, and then the tree, so that the terminal edges are not hidden by the labels. The first line sets the drawing region but nothing is plotted so the labels can be drawn. The trick is to call par(new=TRUE) so that plot() does not refresh the graph. HTH Emmanuel Thanks for your help, anyways. Cheers, Rafa On 2/1/2012 5:25 PM, Nicolai Cryer wrote: Hey Rafael, (By the way, thanks for the easily reproducible code). I think you're looking for the tiplabels() function, so instead of using points, try: tiplabels(pch=21,cex=1.5, bg=label[phy$tip.label]) This should amount to the same kind of color coding, but the layout doesn't break when plotting a radial tree. Hope that helps, Best, Nicolai On 2/1/2012 2:40 PM, Rafael Rubio de Casas wrote: Dear R-Sig-Phylo users, I have what I believe should be an easy problem: I want to plot character states on the tips of a fan cladogram produced by ape. What I am looking for, I believe, is something similar to what you can do with points on a rectangular phylogram. That is, assign points to tips colored according to their trait value. For instance, the following code generates what I want for a rectangular phylogram: library(ape); library(geiger) phy=rcoal(25, tip.label=c(letters[1:25])) phy=rescaleTree(phy,25) trait=as.vector(sample(0:1,25,replace=T)) names(trait)=phy$tip.label label=character(length(phy$tip.label)) names(label)=names(trait) label[trait==0]=dodgerblue label[trait==1]=firebrick plot(phy, show.tip.label=T, label.offset=1.5,no.margin=TRUE) points(rep(26, length(phy$tip.label)), 1:length(phy$tip.label), pch=21, cex=1.5, bg=label[phy$tip.label]) My problem is that I don't know how to get the same sort of mapping of trait states at the tips when using plot(phy, type=fan) Thanks in advance, Rafa -- Emmanuel Paradis IRD, Jakarta, Indonesia http://ape.mpl.ird.fr/ ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] Estimating MRCA dates
Hi Dan, This is indeed an issue. I had the opportunity a few months ago to get the outputs from r8s and chronopl on the same data: this showed some difficulties with r8s, namely the starting and final values of the PL were identical. I'm currently working on chronopl because it fails to find decent starting values when several nodes are defined within intervals (it's looping endlessly). I've more or less solved this problem and this should be fixed in ape 3.0. I admit that chronopl has some weaknesses. I can see three explanations (all non-exclusive): 1) The implemantation in chronopl is flawed in which case it can be fixed. 2) The model of correlated rates is not appropriate in most cases. I remember to have seen a paper stating that the relaxed clock model (without correlation of rates) better fits real data. 3) The problem of optimizing a function of so many parameters is too difficult for the currently used algorithm in which case a Bayesian approach might be more appropriate, or using a better optimization algorithm. Cheers, Emmanuel Dan Rabosky wrote on 02/02/2012 22:55: Hi Brian- It has been awhile since I played with chronopl(), but at the time, I convinced myself that results were substantially different from penalized likelihood as implemented in r8s. I haven't explored this exhaustively, but at the time, I was very much convinced that the results of chronopl were potentially problematic. Other folks have mentioned to me that they've observed similar differences between r8s and chronopl and that they've recovered bizarre smoothed trees from chronopl. Perhaps someone with more recent chronopl experience can weigh in on this... ~Dan On Feb 2, 2012, at 6:06 AM, Brian O'Meara wrote: Ape can do this, function chronopl. In general, one good way to find out which package can do something is by looking at the task view for phylogenetics: http://cran.r-project.org/web/views/Phylogenetics.html . This will also allow you to install all the phylogenetics packages (at least those on CRAN) with just a couple of commands. An updated version should be coming out later today (and if someone notices other things missing, please let me know). Brian ___ Brian O'Meara Assistant Professor Dept. of Ecology Evolutionary Biology U. of Tennessee, Knoxville http://www.brianomeara.info Students wanted: Applications due Dec. 15, annually Postdoc collaborators wanted: Check NIMBioS' website Funding wanted: Want to collaborate on a grant? ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo -- Emmanuel Paradis IRD, Jakarta, Indonesia http://ape.mpl.ird.fr/ ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] plotting character states on a circular cladogram
Rafael Rubio de Casas wrote on 02/02/2012 22:02: Hi, Thanks, Emmanuel. I did try to play around with tip.labels (by the by, I did not know it called points, although I guess it makes a lot of sense). The problem I run into is that, for some reason, label.offset does not seem to work with the graphical labels. For instance, I would expect two concentric circles of colored dots with: It is indeed possible to do what you want (the code is in plot.phylo), but since you are working on a textbook example, my humble suggestion is to use a linear tree for this; it'll be also much easier to plot information on several traits. You'll get the same information with a fan tree but more difficult to decrypt by the human eye -- though others might have different opinions on this. plot(phy, type=fan, label.offset=5, plot=FALSE) tiplabels(pch=21, cex=1.5, bg=label[phy$tip.label]) par(new=TRUE) plot(phy, type=fan, label.offset=2, plot=FALSE) tiplabels(pch=21, cex=1.5, bg=label[phy$tip.label]) However only one is produced, and those dots are placed right on the tips. You can see that if you then follow with par(new=TRUE) plot(phy, type=fan, label.offset=2) I guess this is mostly irrelevant, But I am working on a figure for a book and I would like it to be a textbook example Also, when node states are drawn with nodelabels(), if if the terminal branches are very short, the plot of the nodes get confused with those of the tips For instance, if you try: ace.ER=ace(trait,phy,type=discrete) nodelabels(pie=ace.ER$lik.anc, piecol=c(dodgerblue,firebrick), cex=0.5,lwd=0.5) You can see that the last node plots are at the same level of the tip plots. If tips were denser (i.e., if the tree had more terminals) that would make the plot virtually unreadable. You might prefer to plot your tree with: plot(phy, c, FALSE) You might want to try 'thermo' instead of 'pie'. Best, Emmanuel Cheers, Rafa On 2/2/2012 7:03 AM, Emmanuel Paradis wrote: Hi Rafael, Rafael Rubio de Casas wrote on 02/02/2012 06:24: Hi Nicolai, Thanks for your message. I guess I wasn't completely clear in my previous message. tiplabels() does do a similar thing to what I want. In fact, it is what I ahve been using. The problem is that I think that with tiplabels you cannot place the colored dots at a particular position relative to the tips on a circular cladogram. You are right. I thought that points() would be easier to adapt for that, but maybe not. tiplabels() actually calls points() so you have to tell the function how to offset the coordinates which is not obvious because they are circular, -- though plot.phylo() does it when label.offset is used. I guess what you want is to prevent the labels to hide the terminal edges. Here's a suggestion for this: plot(phy, type=fan, label.offset=2, plot=FALSE) tiplabels(pch=21, cex=1.5, bg=label[phy$tip.label]) par(new=TRUE) plot(phy, type=fan, label.offset=2) The idea is to first draw the tip labels, and then the tree, so that the terminal edges are not hidden by the labels. The first line sets the drawing region but nothing is plotted so the labels can be drawn. The trick is to call par(new=TRUE) so that plot() does not refresh the graph. HTH Emmanuel Thanks for your help, anyways. Cheers, Rafa On 2/1/2012 5:25 PM, Nicolai Cryer wrote: Hey Rafael, (By the way, thanks for the easily reproducible code). I think you're looking for the tiplabels() function, so instead of using points, try: tiplabels(pch=21,cex=1.5, bg=label[phy$tip.label]) This should amount to the same kind of color coding, but the layout doesn't break when plotting a radial tree. Hope that helps, Best, Nicolai On 2/1/2012 2:40 PM, Rafael Rubio de Casas wrote: Dear R-Sig-Phylo users, I have what I believe should be an easy problem: I want to plot character states on the tips of a fan cladogram produced by ape. What I am looking for, I believe, is something similar to what you can do with points on a rectangular phylogram. That is, assign points to tips colored according to their trait value. For instance, the following code generates what I want for a rectangular phylogram: library(ape); library(geiger) phy=rcoal(25, tip.label=c(letters[1:25])) phy=rescaleTree(phy,25) trait=as.vector(sample(0:1,25,replace=T)) names(trait)=phy$tip.label label=character(length(phy$tip.label)) names(label)=names(trait) label[trait==0]=dodgerblue label[trait==1]=firebrick plot(phy, show.tip.label=T, label.offset=1.5,no.margin=TRUE) points(rep(26, length(phy$tip.label)), 1:length(phy$tip.label), pch=21, cex=1.5, bg=label[phy$tip.label]) My problem is that I don't know how to get the same sort of mapping of trait states at the tips when using plot(phy, type=fan) Thanks in advance, Rafa -- National Evolutionary Synthesis Center *NESCent http://www.nescent.org/* 2024 W. Main Street, Suite A200 Durham, NC27705 r...@nescent.org mailto:r...@duke.edu 919.668.9107 -- Emmanuel Paradis IRD, Jakarta, Indonesia http://ape.mpl.ird.fr
[R-sig-phylo] query about the option 'original.data' in write.nexus()
Hi all, I am thinking to remove the option 'original.data' in write.nexus(). The reasons are: 1. It doesn't seem useful because most NEXUS files read by this function store only trees (at least those reported to me). I believe most users keep their data (seqs, morpho, ...) and trees in separate files. 2. If some tips are deleted/removed from the tree, keeping track with the original file might be tricky (currently nothing is done to check this). 3. phylobase has better support for NEXUS files, so trying to fix the above in ape will be redundant. 4. Removing this option will make write.tree() slightly more efficient. Any advice/comment welcome. Cheers, Emmanuel -- Emmanuel Paradis IRD, Jakarta, Indonesia http://ape.mpl.ird.fr/ ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] collapse nodes based on node labels
Hello Ben, Ben Weinstein wrote on 13/02/2012 00:36: Hello all, Is there an analogous function to ape's di2multi which collapses descendant branches based on an input node? Instead of minimum branch tolerance to create polytomies, what would the best strategy to collapse all descendant branches of node X. I've played around with node.leaves, but haven't had great luck. I'm considering pruning the tree, collapsing the entire subtree and regrafting it on. which.edge() might be useful here. You may need to combine it with prop.part() though, and a way to identify internal edges, eg: phy$edge[, 2] Ntip(phy) Similarly to this, is there a way to maintain the original node numbering, even after setting nodelabels? For example if i set my nodelabels to my support values, but still want to highlight the first node in the tree. The support values are non unique (multiple 99s), so it becomes harder to specify commands to specific nodes. I don't follow. nodelabels() only draws labels on a plotted tree and doesn't modify the node numbering. See the examples to see how to call this function several times on the same tree. Best, Emmanuel I appreciate the help, Ben Weinstein -- Emmanuel Paradis IRD, Jakarta, Indonesia http://ape.mpl.ird.fr/ ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] adding a new taxa to an existing clade in a tree set
Hi Annemarie, Annemarie Verkerk wrote on 16/02/2012 09:57: Hi all, I have a question regarding the manual addition of a leaf to an existing clade in a set of trees. The reason I ask is because I want to test the influence of adding ancestral data to an ancestral state estimation. I have some ancestral data on the feature I want to investigate, and I know from other sources where the leaf is situated on the trees, but I don't have any data that I can use to actually build new trees which also incorporate this leaf at the moment. So I'm hoping to add it manually and see whether it makes a difference. I know it is possible to add a leaf to any of the external edges (I mean, edges leading to the leaves) of a tree using bind.tree() in ape. However, if I wanted to add a leaf to a lower position in the tree, say on an branch that splits and leads to two other leaves, it becomes difficult: not all the trees in the sample might feature the grouping of those two leaves, and even if they do, the relevant edge to which I would want to add the new leaf will have a different number in each tree. So I guess I have two questions: 1. Is there an easy way to select the trees that have a certain clade, so I can try to add a new leaf to only those trees? (I suspect a lot of error messages if I try to do it for the complete tree sample, as there will be trees that do not have the relevant clade.) see ?is.monophyletic 2. Is there a way (for those trees that have the clade I want to add the new leaf to) to define the relevant edge using only the leaf numbers and not the internal nodes/edge numbers? I.e., I want to add the leaf to the internal edge that leads to the leaves with number 5, 6, and 7, for instance, is there a way identify that internal edge? Yes, you can use mrca() like this: m - mrca(phy) phy$edge[, 2] == m[tip1, tip2] A more general way (ie, with clades made of 2 or more tips) is to use prop.part(phy): it will return a list of the tips descendant from each node (in other words, the clades of the tree). This allows you also to combine both steps: list_of_tips - sort(tip1, tip2, ...etc pp - prop.part(phy) node - NULL for (i in seq_along(phy)) { if (isTRUE(all.equal(pp[[i]], list_of_tips))) { node - i + Ntip(phy) break } } if (is.null(node)) ## go to next tree else { phy$edge[, 2] == node ## etc... HTH Emmanuel Many thanks for any suggestions, Annemarie ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] How to replace a clade with a single node?
Hello Ania, See the functions drop.tip and bind.tree. They both have an 'interactive' option. If you want to use node labels, if available, you can find a node number from its label, say xyz, with: grep(xyz, phy$node.label) + Ntip(phy) Best, Emmanuel -Original Message- From: Ania Tassinari anna.krentow...@gmail.com Sender: r-sig-phylo-boun...@r-project.org Date: Thu, 1 Mar 2012 13:58:23 To: r-sig-phylo@r-project.org Subject: [R-sig-phylo] How to replace a clade with a single node? Hello everyone, I am working with a large tree with ~450 tips. Some leaves are redundant as the organisms come from same Genus. I would like to be able to replace the redundant clades with a single leaf representative of the members of the clade. For example, if I have: library(ape) myNewick = (A,(B,(C1,C2))); myTree = read.tree(text=myNewick) plot(myTree) I would like to be able to replace the C1 and C2 clade with a single C leaf, so that: myNewick2 = (A,(B,C)); I would like to not rely on the edge.length if possible but instead on the node label. Although, for some reason myTree$node.label isn't always an available option. Thank you very much in advance for any help! Best, Ania ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] 3. partial correlation with gls residuals? (Tom Schoenemann)
of Anthropology Indiana University Bloomington, IN 47405 Phone: 812-855-8800 E-mail: t...@indiana.edu Open Research Scan Archive (ORSA) Co-Director Consulting Scholar Museum of Archaeology and Anthropology University of Pennsylvania Homepage: http://mypage.iu.edu/~toms/ [[alternative HTML version deleted]] ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo __ Alejandro Gonzalez Voyer Post-doc Estación Biológica de Doñana Consejo Superior de Investigaciones Científicas (CSIC) Av Américo Vespucio s/n 41092 Sevilla Spain Tel: + 34 - 954 466700, ext 1749 E-mail: alejandro.gonza...@ebd.csic.es Web site (Under construction): Personal page: http://consevol.org/members/alejandro.html Group page: http://consevol.org/index.html For PDF copies of papers see: http://csic.academia.edu/AlejandroGonzalezVoyer [[alternative HTML version deleted]] ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo -- Emmanuel Paradis IRD, Jakarta, Indonesia http://ape.mpl.ird.fr/ ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] plot.phylo x.lim/y.lim problem
Hi Alex, To fix this, do fix(plot.phylo) and look for this bit (around line #259): if (phyloORclado direction == downwards) yy - y.lim[2] - yy Change the second line for: yy - max(yy) - yy This will be in the next release. Thank you for the report. Cheers, Emmanuel -Original Message- From: Alex Perkins taperk...@ucdavis.edu Sender: r-sig-phylo-boun...@r-project.org Date: Wed, 21 Mar 2012 15:51:50 To: r-sig-phylo@r-project.org Subject: [R-sig-phylo] plot.phylo x.lim/y.lim problem I am trying to plot a phylogram along a specified time axis. For example, if a phylogeny spans 30 time units, I would like to be able to set the time axis to span 50 time units. This can be achieved with the x.lim or y.lim options, depending on the direction of the tree. However, this always adds the extra time on the node side, whereas I need to add it on the root side. In other words, the tree always gets placed on the top of the plot in the code below, but I need to have it on the bottom. Any help would be greatly appreciated! library(ape) data(bird.orders) # As you can see below, it doesn't matter what I set y.lim to. I get the same thing! # I'm sure this is a rookie mistake, but I don't know what else to do. plot(bird.orders, show.tip.label = FALSE, direction = 'downwards', y.lim = c(-70, 30)) plot(bird.orders, show.tip.label = FALSE, direction = 'downwards', y.lim = c(0, 100)) plot(bird.orders, show.tip.label = FALSE, direction = 'downwards', y.lim = 100) ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] behavior of is.monophyletic()
Hi Bret, This is fixed and will be in the next release. In the meantime, you can edit the function and change the last line to: all.equal(tips, desc) Thanks for the report. Cheers, Emmanuel -Original Message- From: Bret Larget lar...@stat.wisc.edu Sender: r-sig-phylo-boun...@r-project.org Date: Thu, 22 Mar 2012 18:05:24 To: jnylan...@users.sourceforge.net Cc: r-sig-phylo@r-project.org Subject: [R-sig-phylo] behavior of is.monophyletic() Please note the following discrepency in the behavior of is.monophyletic(). -- foo = read.tree(text=((1:2.0,(2:1.0,3:1.0):1.0):1.0,4:3.0);) is.monophyletic(foo,tips=2:3) [1] TRUE is.monophyletic(foo,tips=c(2,3)) [1] FALSE - I am aware that the function works properly when tips are specified as characters. However, the documentation states that tips may be specified as numeric. The difference between c(2,3) and 2:3 is that the former is class numeric and the latter is class integer. I cannot think of any good reason for why this distinction should result in different behavior of the function. It would be best if the function worked correctly in all situations, but in particular when the input matches the usage specified by the documentation. -- Usage is.monophyletic(phy, tips, reroot = !is.rooted(phy), plot = FALSE, ...) Arguments phy a phylogenetic tree description of class phylo. tips a vector of mode numeric or character specifying the tips to be ^^^ tested. reroot a logical. If FALSE, then the input tree is not unrooted before the test. plot a logical. If TRUE, then the tree is plotted with the specified group tips highlighted. ... further arguments passed to plot. - I hope that this can be corrected in a future version of ape. -Bret Larget ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] Error with PGLS under OU model using corMartins
Hi Nick, You can fix the value of alpha in corMartins: corMartins(1, tree, fixed=TRUE) You can try with different values to see which one gives the best fit. Best, Emmanuel -Original Message- From: Nicholas Mason nicholas.albert.ma...@gmail.com Sender: r-sig-phylo-boun...@r-project.org Date: Thu, 22 Mar 2012 12:57:59 To: r-sig-phylo@r-project.org Subject: [R-sig-phylo] Error with PGLS under OU model using corMartins Dear Rsigphylo users, I am having issues running PGLS (with nlme function gls()) under an OU model using the corMartins command from ape. I've included links to a reproducible example via my dropbox account below. Previous posts have been made about the same error I've encounter, but neither thread seems to have relevant replies or have resolved the issue. Apologies if I've missed something else from previous rsigphylo threads. https://stat.ethz.ch/pipermail/r-sig-phylo/2011-January/000875.html http://www.mail-archive.com/r-sig-phylo@r-project.org/msg00577.html In short, the data set I've linked to will run gls() under a correlation structure as defined by corBrownian and corPagel but not corMartins. The following error is encountered with corMartins: Error in recalc.corStruct(object[[i]], conLin) : NA/NaN/Inf in foreign function call (arg 4) Has anyone else experienced this issue or know how to resolve it? What does this message mean? This happens with each variable I examine, not just the example included here. Thanks in advance for any help or answers provided. Cheers, Nick Mason The R object below contains a list called sampledata. sampledata[[1]] is a data frame with tip values and sampledata[[2]] is the corresponding tree. Link to R objects: http://db.tt/ablyDBt1 Link to Rscript via public dropbox: http://db.tt/tJ8jg0JQ require(ape) require(nlme) load(sampledata.Rdata) sampledata[[1]]-tipdata sampledata[[2]]-tree gls(TraitA~TraitB,data=tipdata,correlation=corMartins(1,tree)) gls(TraitA~TraitB,data=tipdata,correlation=corBrownian(1,tree)) gls(TraitA~TraitB,data=tipdata,correlation=corPagel(1,tree)) - Nicholas Albert Mason MS Candidate, Burns Lab http://kevinburnslab.com/ Department of Biology - EB Program San Diego State University 5500 Campanile Drive San Diego, CA 92182-4614 845-240-0649 (cell) [[alternative HTML version deleted]] ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] trouble with subtreeplot returning a tree object on exit
Hi Casey, It works for me under Linux. I guess it must be a specific problem to Windows. Have you tried with the trex() function? Cheers, Emmanuel Casey Bergman wrote on 12/04/2012 13:21: Hello - I have a question about the use of the subtreeplot function in APE. The example code generates an interactive tree as expected, but I do not get a tree2 object returned when I click the close button on the window or use cmd+w. I am wondering if I am exit-ing improperly or if you can help make sense of the following error message: Error in identify.default(coor[, 1], coor[, 2], labels = labs, n = 1) : No graphics device is active Full transcript and platform details are below. Many thanks in advance for your help with this query. Best regards, Casey Bergman, Ph.D. Faculty of Life Sciences University of Manchester Michael Smith Building Oxford Road, M13 9PT Manchester, UK Tel: +44-(0)161-275-1713 Lab: +44-(0)161-275-5980 Fax: +44-(0)161-275-5082 skype: caseymbergman Email: casey.berg...@manchester.ac.uk Web: http://bergmanlab.smith.man.ac.uk/ Twitter: http://twitter.com/bergmanlab *** R version 2.15.0 (2012-03-30) Copyright (C) 2012 The R Foundation for Statistical Computing ISBN 3-900051-07-0 Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit) tree1-rtree(50) #random tree with 50 leaves tree2-subtreeplot(tree1, wait=TRUE) # on exit, tree2 will be a subtree of tree1. wait... Node 1 out of 49 treated wait... Node 2 out of 49 treated wait... Node 3 out of 49 treated wait... Node 4 out of 49 treated wait... Node 5 out of 49 treated wait... Node 6 out of 49 treated wait... Node 7 out of 49 treated wait... Node 8 out of 49 treated wait... Node 9 out of 49 treated wait... Node 10 out of 49 treated wait... Node 11 out of 49 treated wait... Node 12 out of 49 treated wait... Node 13 out of 49 treated wait... Node 14 out of 49 treated wait... Node 15 out of 49 treated wait... Node 16 out of 49 treated wait... Node 17 out of 49 treated wait... Node 18 out of 49 treated wait... Node 19 out of 49 treated wait... Node 20 out of 49 treated wait... Node 21 out of 49 treated wait... Node 22 out of 49 treated wait... Node 23 out of 49 treated wait... Node 24 out of 49 treated wait... Node 25 out of 49 treated wait... Node 26 out of 49 treated wait... Node 27 out of 49 treated wait... Node 28 out of 49 treated wait... Node 29 out of 49 treated wait... Node 30 out of 49 treated wait... Node 31 out of 49 treated wait... Node 32 out of 49 treated wait... Node 33 out of 49 treated wait... Node 34 out of 49 treated wait... Node 35 out of 49 treated wait... Node 36 out of 49 treated wait... Node 37 out of 49 treated wait... Node 38 out of 49 treated wait... Node 39 out of 49 treated wait... Node 40 out of 49 treated wait... Node 41 out of 49 treated wait... Node 42 out of 49 treated wait... Node 43 out of 49 treated wait... Node 44 out of 49 treated wait... Node 45 out of 49 treated wait... Node 46 out of 49 treated wait... Node 47 out of 49 treated wait... Node 48 out of 49 treated wait... Node 49 out of 49 treated Error in identify.default(coor[, 1], coor[, 2], labels = labs, n = 1) : No graphics device is active ls() [1] tree1 ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo -- Emmanuel Paradis IRD, Jakarta, Indonesia http://ape.mpl.ird.fr/ ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] birthdeath {ape} fails every time.
Hi Simon, This is fixed and you can find the new version on ape's SVN. Another thing I fixed: it could happen that the calculation of the Hessian-based SEs fails: the error is now caught and the MLEs are returned with SEs set to NA (the profile-likelihood CIs are still computed of course). Matt is right that the distribution of branching times is weird with a coalescent tree, but an answer is still expected from birthdeath(). Regarding the transformation on mu and lambda, the present version imposes r (= lambda - mu) 0 and/or a (= mu/lambda) 1 because the likelihood computation uses log(r) and log(1 - a). This also avoids most warnings now. diversitree uses log(abs(... which, at first sight, is strange to me. But there may be some justification for this. Rich may comment on it? Best, Emmanuel Simon Greenhill wrote on 20/04/2012 06:52: Morning all, Whenever I try to use ape's (v 3.0-2) birthdeath function e.g. birthdeath(rcoal(sample(10:100, 1))) I get the following error: Error in while (foo(up) 0) up- up + inc : missing value where TRUE/FALSE needed The warnings all appear to be this sort of thing: 1: In log(1 - a) : NaNs produced 2: In log(exp(r * x[2:N]) - a) : NaNs produced 3: In nlm(function(p) dev(p[1], p[2]), c(0.1, 0.2), hessian = TRUE) : NA/Inf replaced by maximum positive value and seem to be generated from this line: out- nlm(function(p) dev(p[1], p[2]), c(0.1, 0.2), hessian = TRUE) Is there a fix/workaround for this? Kind regards, Simon ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo -- Emmanuel Paradis IRD, Jakarta, Indonesia http://ape.mpl.ird.fr/ ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] problems with assign(), paste(), and data.frame() for folders containing trees
Hi John, You seem to transform a (relatively) simple problem into a complicated one. First, you can get all the tree file names in one command, such as (I moved to a directory with one subdir with trees estimated by ML and another one by NJ; this is slightly arranged): f - grep(\\.tre, list.files(recursive = TRUE), value = TRUE) f [1] ML/trees_55species.tre ML/treesNADH2_45species.tre [3] ML/TR.ML.Dloop_55sp.tre ML/TR.ML.NADH2_45sp.tre [5] nj/trees_55species.tre nj/treesNADH2_45species.tre [7] nj/TR.NJ.Dloop_55sp.tre nj/TR.NJ.NADH2_45sp.tre Because in R file paths are resolved relatively, there is no need to navigate with setwd(). Second, I think you should use a list instead of a data frame because (I presume) you may have files with different numbers of trees (if this is not the case, you can transform the list in a data frame later). You may have commands like this, eg, if you want to get the mean branch length of each tree: ntree - length(f) L - list() for (i in 1:ntree) { tr - read.tree(f[i]) if (class(tr) == phylo) L[[i]] - mean(tr$edge.length) if (class(tr) == multiPhylo) L[[i]] - sapply(tr, function(x) mean(x$edge.length)) } Finally, you may name your list with the file names: names(L) - f This has the advantage that you can select some of the results, eg, the trees that were estimated by NJ: grep(nj/, names(L)) [1] 5 6 7 8 or those from D-loop: grep(Dloop, names(L)) [1] 3 7 You get the number of trees in each file with sapply(L, length). There can be many variations around this scheme. For instance, if you want to extract the branch lengths as in your example, the two lines above with mean would become: if (class(tr) == phylo) L[[i]] - tr$edge.length if (class(tr) == multiPhylo) L[[i]] - lapply(tr, $, edge.length) Best, Emmanuel -Original Message- From: John Denton jden...@amnh.org Sender: r-sig-phylo-boun...@r-project.org Date: Tue, 24 Apr 2012 20:30:26 To: R Sig Phylo Listservr-sig-phylo@r-project.org Subject: [R-sig-phylo] problems with assign(), paste(), and data.frame() for folders containing trees Hi folks, I am trying to recurse through several numbered subfolders in a directory. Each folder has many trees that I want to display summary values for. I have been expanding data frames using code with the structure name - rbind(name, newvals) to produce a data frame with n rows equal to the number of files in one of the folders, and n column equal to the number of values in the file. I can loop over the values within a single subdirectory fine with, for example, library(ape) trees - list.files(pattern=*.tre) iters=length(trees) branchdata.5 - data.frame() iterations - as.character(c(1:length(trees))) for (i in 1:iters) { tree - read.tree(trees[i]) iteration.edges.5 - as.numeric(tree$edge.length) branchdata.5 - rbind(branchdata.5, iteration.edges.5) } The problem comes when I want to iterate through the numbered subdirectories while also iterating through the files in a given directory. I want to recursively assign these data frames as well, with something like f - list.dirs(path = /.../.../etc, full.names = FALSE, recursive = FALSE) for (j in 1:length(f)) { setwd(paste(/.../.../.,j,sep=)) assign( paste(branchdata.5,j,sep=), data.frame() ) iterations - as.character(c(1:length(trees))) for (i in 1:iters) { tree - read.tree(trees[i]) assign(paste(iteration.edges.5,j,sep=), as.numeric(tree$edge.length) ) paste(branchdata.5,j,sep=) - rbind(paste(branchdata.5,j,sep=), paste(iteration.edges.5,j,sep=)) } names(iterations) - NULL boxplot(t(paste(branchdata.5,j,sep=)) , horizontal=TRUE , names=iterations , ylim=c(0,2), xlab=Branch Lengths , ylab=Iterations , main = ) } The problem seems to be in the rbind() when using values with assign() and paste(). I would love some help on this! John S. S. Denton Ph.D. Candidate Department of Ichthyology and Richard Gilder Graduate School American Museum of Natural History www.johnssdenton.com ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] read.dna warnings and pitfalls
Hi Dan, The reason for this implementation (searching the first 10 IUPAC-coded bases) is because the exact formatting is not inconsistent among different programs. Some files have: 0123456789acgt. that is a 10-character name and the sequence starting on the 11th position. I think this is typical for Phylip. Other software (e.g., PhyML) accepts longer taxa names and require a space before the start of the sequence. About your example: it depends on the order of the data. The following file can be read: 2 10 x AA madagascarAA But if you invert the two sequence lines, it fails. It is the first time I hear about this problem in 9 years, maybe because it requires a particular combination of circumstances. Another drawback of this implementation is that files with less than 10 bases cannot be read. How to solve this? If it were left only to me, I would deprecate the interleaved and sequential formats. FASTA is more flexible, more widespread, easier to parse, can store exactly the same information, and labels are only constrained to be on a single line (but can contain any characters including \n, \t, ...) But I guess many programs use the Phylip formats, so I'd be glad to read other suggestions. As for your 2nd problem, it is now fixed in ape. Best, Emmanuel -Original Message- From: Dan Rabosky drabo...@umich.edu Sender: r-sig-phylo-boun...@r-project.org Date: Wed, 25 Apr 2012 17:51:35 To: r-sig-phylo@r-project.org Subject: [R-sig-phylo] read.dna warnings and pitfalls Hi All- I have spent an inordinate and embarrassing amount of time tracking down an excruciatingly cryptic issue with read.dna, which I rarely use. Here are two key problems: 1) The function automatically assumes it is reading DNA sequences when it encounters a string of 10 continuous DNA-like characters. This includes all characters in the set (ACGTUMRWSYKVHDBN-). This function, unlike the phylip original, does not have limits on taxon name lengths. Hence, I had - in the middle of a large alignment - a species whose name included the string MADAGASCAR, which caused a failure. To be fair, the documentation warns of this, but I think this is extremely easy to overlook, and - moreover - it seems unfortunate to have to parse all your taxon names for a potential IUPAC match before trying to use the function. Presumably, most users who specify sequential spacing will be using whitespace to separate taxon names from DNA sequences, and perhaps it is better to exploit this rather than IUPAC matching. 2) The function is whitespace-sensitive. if you tab-separate the numbers on the first line (numbers of taxa, numbers of sites), you'll receive an errror with the message: the first line of the file must contain the dimensions of the data. It appears that spaces are OK, however. Hopefully this post will be useful to somewhere in the future with a similar issue. Perhaps these can be addressed in a future update to ape? -Dan Rabosky ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
[R-sig-phylo] ape's web site
Hi all, ape.mpl.ird.fr has been refreshed. Besides the new look, two things worth noting: There is a column section entitled EVENTS on the welcome page: it is intended to give information on forthcoming meetings, workshops, courses, seminars, ... If you organize such an event or would like to let others know about it, just tell me. The page on APER has been updated with supplementary materials for the second edition: you can get the data files used in the book and the R scripts of the case studies. There are some additional coloured figures and some animations made from the figures plotted with rgl. Best, Emmanuel -- Emmanuel Paradis IRD, Jakarta, Indonesia http://ape.mpl.ird.fr/ ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] read.dna warnings and pitfalls
I made some changes in read.dna which, I hope, solve the problems. The taxa names can be of any length and must be separated from the sequences by at least one space (or tabulation). write.dna() now follows the same rule. Files with less than 10 nucleotides can now be read by read.dna (bug fixed). I removed the option 'seq.names' of read.dna since it doesn't seem particularly useful and this helped to clarify the code. The new versions are now on ape's SVN: https://svn.mpl.ird.fr/ape/dev/ape/R/read.dna.R https://svn.mpl.ird.fr/ape/dev/ape/R/write.dna.R Tests welcome! Best, Emmanuel Dan Rabosky wrote on 26/04/2012 22:01: Hi Emmanuel- Thanks for fixing the whitespace issue. I think this fix will be useful to many users. On the issue of recognizing 10 IUPAC characters: I think this is a real problem, and may come up again in short order. Maybe it is just that use of this function has been limited? In the single dataset with a modest number of sequences that caused me problems yesterday, I had the following species and/or genus names - all of which constitute 10 character strings drawn from the set of IUPAC codes: brachyurus (x 2) savannarum graduacauda caudacutus Camarhynchus (x 3) madagascariensis I don't suggest deprecating the phylip sequential, but rather, using something that is compatible with raxml (surely one of the most widely used phylogenetics programs today). I think raxml uses a relaxed sequential version of the phylip format with whitespace delimitation. I could read the same alignment in raxml with no problems, but I had multiple issues when reading the same file with read.dna (including the whitespace character on the first line). My guess is that very few people are using the original phylip format, with its limit of 10 characters per taxon name, and with dna seqs beginning immediately after this. So maybe deprecate sequential phylip, but you could use what Stamatakis calls relaxed sequential PHYLIP, which appears to be: (1) taxon names cannot include spaces but can be up to 100 characters; and (2) names separated from sequences by whitespace character (ideally, this should recognize any number of spaces or tabs to prevent user confusion). For users with tab-delimited raxml files (eg each taxon name separated from its dna sequence by a tab), you can use a regular-expressions enabled text editor (like textwrangler) to quickly find potential problems. Just search for [ACGTUMRWSYKVHDBN]{10}.+\t with grep matching enabled. Cheers, ~Dan On Apr 26, 2012, at 2:16 AM, Emmanuel Paradis wrote: Hi Dan, The reason for this implementation (searching the first 10 IUPAC-coded bases) is because the exact formatting is not inconsistent among different programs. Some files have: 0123456789acgt. that is a 10-character name and the sequence starting on the 11th position. I think this is typical for Phylip. Other software (e.g., PhyML) accepts longer taxa names and require a space before the start of the sequence. About your example: it depends on the order of the data. The following file can be read: 2 10 x AA madagascarAA But if you invert the two sequence lines, it fails. It is the first time I hear about this problem in 9 years, maybe because it requires a particular combination of circumstances. Another drawback of this implementation is that files with less than 10 bases cannot be read. How to solve this? If it were left only to me, I would deprecate the interleaved and sequential formats. FASTA is more flexible, more widespread, easier to parse, can store exactly the same information, and labels are only constrained to be on a single line (but can contain any characters including \n, \t, ...) But I guess many programs use the Phylip formats, so I'd be glad to read other suggestions. As for your 2nd problem, it is now fixed in ape. Best, Emmanuel -Original Message- From: Dan Raboskydrabo...@umich.edu Sender: r-sig-phylo-boun...@r-project.org Date: Wed, 25 Apr 2012 17:51:35 To:r-sig-phylo@r-project.org Subject: [R-sig-phylo] read.dna warnings and pitfalls Hi All- I have spent an inordinate and embarrassing amount of time tracking down an excruciatingly cryptic issue with read.dna, which I rarely use. Here are two key problems: 1) The function automatically assumes it is reading DNA sequences when it encounters a string of 10 continuous DNA-like characters. This includes all characters in the set (ACGTUMRWSYKVHDBN-). This function, unlike the phylip original, does not have limits on taxon name lengths. Hence, I had - in the middle of a large alignment - a species whose name included the string MADAGASCAR, which caused a failure. To be fair, the documentation warns of this, but I think this is extremely easy to overlook, and - moreover - it seems unfortunate to have to parse all your taxon names for a potential IUPAC match before trying to use the function. Presumably, most users who specify
Re: [R-sig-phylo] compar.gee: how to extract SE, t, and P from output?
Maybe check that the code of foo has not been altered with a line break from the email (eg, by editing it with fix(foo)). If this is ok, try it with the examples from the help page ?compar.gee. You may also check that: GEE$W prints a 2 by 2 matrix. Best, Emmanuel -Original Message- From: Nina Hobbhahn n.hobbh...@gmail.com Date: Wed, 9 May 2012 16:15:24 To: Emmanuel Paradisemmanuel.para...@ird.fr Cc: r-sig-phylo@r-project.org Subject: Re: [R-sig-phylo] compar.gee: how to extract SE, t, and P from output? Hi Emmanuel, thanks for your reply. Works mostly, however, I get an error message I can't seem to figure out: Error in coef[, 2] - sqrt(diag(x$W)) : replacement has length zero. What does R mean or want here? Thanks again, Nina On 2012-05-09, at 3:19 PM, Emmanuel Paradis wrote: Hi Nina, You cannot do it directly: you have to get the code from print.compar.gee; from R: print.compar.gee From this you can create a function, eg: foo - function(x) { nas - is.na(x$coef) coef - x$coef[!nas] cnames - names(coef) coef - matrix(rep(coef, 4), ncol = 4) dimnames(coef) - list(cnames, c(Estimate, S.E., t, Pr(T |t|))) df - x$dfP - dim(coef)[1] coef[, 2] - sqrt(diag(x$W)) coef[, 3] - coef[, 1]/coef[, 2] coef } Then do: foo(GEE) HTH. Best, Emmanuel Nina Hobbhahn wrote on 08/05/2012 18:55: Dear fellow list users, I would like to extract the values for SE, t, and Pr from the compar.gee output, but seem to unable to do so. The command lines GEE-Ââcompar.gee(trait~reward-Ââ1, phy=phy2, data=DF.Disa, family=gaussian) GEE return the following output: Beginning Cgee S-Ââfunction, @(#) geeformula.q 4.13 98/01/27 running glm to get initial regression estimate rewardN rewardY 2.689142 3.421509 Beginning Cgee S-Ââfunction, @(#) geeformula.q 4.13 98/01/27 running glm to get initial regression estimate rewardN rewardY 2.689142 3.421509 Call: compar.gee(formula = trait ~ reward -Ââ 1, data = DF.Disa, family = gaussian, phy = phy2) Number of observations: 31 Model: Link: identity Variance to Mean Relation: gaussian QIC: 165.2011 Summary of Residuals: Min 1Q Median 3Q Max 3.1462585 0.2560335 1.5256031 2.2012787 4.6689764 Coefficients: Estimate S.E. t Pr(T |t|) rewardN 1.959375 1.091801 1.794626 0.1016430 rewardY 1.908384 1.117150 1.708261 0.1170666 Estimated Scale Parameter: 4.862036 Phylogenetic df (dfP): 12.45332 When typing GEE$coefficients I get rewardN rewardY 1.959375 1.908384 but I'm unable to access SE, t, and Pr(T |t|). Please can any of you help? Thank you very much, Nina Dr. Nina Hobbhahn Post-doctoral fellow Lab of Prof. S. D. Johnson School of Life Sciences University of KwaZulu-Natal Private Bag X01 Scottsville, Pietermaritzburg, 3201 South Africa [[alternative HTML version deleted]] ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo -- Emmanuel Paradis IRD, Jakarta, Indonesia http://ape.mpl.ird.fr/ [[alternative HTML version deleted]] ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] compar.gee: how to extract SE, t, and P from output?
Hi Nina, Indeed. Here's a more complete version: foo - function(x) { nas - is.na(x$coef) coef - x$coef[!nas] cnames - names(coef) coef - matrix(rep(coef, 4), ncol = 4) dimnames(coef) - list(cnames, c(Estimate, S.E., t, Pr(T |t|))) df - x$dfP - dim(coef)[1] coef[, 2] - sqrt(diag(x$W)) coef[, 3] - coef[, 1]/coef[, 2] if (df 0) { warning(not enough degrees of freedom to compute P-values.) coef[, 4] - NA } else coef[, 4] - 2 * (1 - pt(abs(coef[, 3]), df)) coef } Best, Emmanuel Nina Hobbhahn wrote on 10/05/2012 17:18: Hi Emmanuel, thanks, the problem was indeed a line break from the email. The foo function now works, however: If I now request foo(GEE), I get the following output: Estimate S.E. t Pr(T |t|) (Intercept) 1.59575105 1.1944651 1.33595449 1.59575105 rewardY -0.02952008 0.6772938 -0.04358534 -0.02952008 It looks to me as if the P values are not accessed correctly, and instead the Estimate column is printed in its place. Indeed, in the foo function code foo - function(pGEE) + { + nas - is.na(x$coef) + coef - x$coef[!nas] + cnames - names(coef) + coef - matrix(rep(coef, 4), ncol = 4) + dimnames(coef) - list(cnames, c(Estimate, S.E., t, Pr(T |t|))) + df - x$dfP - dim(coef)[1] + coef[, 2] - sqrt(diag(x$W)) + coef[, 3] - coef[, 1]/coef[, 2] + coef + } pGEE-foo(pGEE) pGEE coef[, 4] is not defined. Perhaps that's the problem? If so, could you please advise how coef[, 4] should be defined? Thank you very much, Nina On 2012-05-09, at 5:24 PM, Emmanuel Paradis wrote: Maybe check that the code of foo has not been altered with a line break from the email (eg, by editing it with fix(foo)). If this is ok, try it with the examples from the help page ?compar.gee. You may also check that: GEE$W prints a 2 by 2 matrix. Best, Emmanuel *From: * Nina Hobbhahn n.hobbh...@gmail.com mailto:n.hobbh...@gmail.com *Date: *Wed, 9 May 2012 16:15:24 +0200 *To: *Emmanuel Paradisemmanuel.para...@ird.fr mailto:emmanuel.para...@ird.fr *Cc: *r-sig-phylo@r-project.org mailto:r-sig-phylo@r-project.org *Subject: *Re: [R-sig-phylo] compar.gee: how to extract SE, t, and P from output? Hi Emmanuel, thanks for your reply. Works mostly, however, I get an error message I can't seem to figure out: Error in coef[, 2] - sqrt(diag(x$W)) : replacement has length zero. What does R mean or want here? Thanks again, Nina On 2012-05-09, at 3:19 PM, Emmanuel Paradis wrote: Hi Nina, You cannot do it directly: you have to get the code from print.compar.gee; from R: print.compar.gee From this you can create a function, eg: foo - function(x) { nas - is.na(x$coef) coef - x$coef[!nas] cnames - names(coef) coef - matrix(rep(coef, 4), ncol = 4) dimnames(coef) - list(cnames, c(Estimate, S.E., t, Pr(T |t|))) df - x$dfP - dim(coef)[1] coef[, 2] - sqrt(diag(x$W)) coef[, 3] - coef[, 1]/coef[, 2] coef } Then do: foo(GEE) HTH. Best, Emmanuel Nina Hobbhahn wrote on 08/05/2012 18:55: Dear fellow list users, I would like to extract the values for SE, t, and Pr from the compar.gee output, but seem to unable to do so. The command lines GEE-‐compar.gee(trait~reward-‐1, phy=phy2, data=DF.Disa, family=gaussian) GEE return the following output: Beginning Cgee S-‐function, @(#) geeformula.q 4.13 98/01/27 running glm to get initial regression estimate rewardN rewardY 2.689142 3.421509 Beginning Cgee S-‐function, @(#) geeformula.q 4.13 98/01/27 running glm to get initial regression estimate rewardN rewardY 2.689142 3.421509 Call: compar.gee(formula = trait ~ reward -‐ 1, data = DF.Disa, family = gaussian, phy = phy2) Number of observations: 31 Model: Link: identity Variance to Mean Relation: gaussian QIC: 165.2011 Summary of Residuals: Min 1Q Median 3Q Max 3.1462585 0.2560335 1.5256031 2.2012787 4.6689764 Coefficients: Estimate S.E. t Pr(T |t|) rewardN 1.959375 1.091801 1.794626 0.1016430 rewardY 1.908384 1.117150 1.708261 0.1170666 Estimated Scale Parameter: 4.862036 Phylogenetic df (dfP): 12.45332 When typing GEE$coefficients I get rewardN rewardY 1.959375 1.908384 but I'm unable to access SE, t, and Pr(T |t|). Please can any of you help? Thank you very much, Nina Dr. Nina Hobbhahn Post-doctoral fellow Lab of Prof. S. D. Johnson School of Life Sciences University of KwaZulu-Natal Private Bag X01 Scottsville, Pietermaritzburg, 3201 South Africa [[alternative HTML version deleted]] ___ R-sig-phylo mailing list R-sig-phylo@r-project.org mailto:R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo -- Emmanuel Paradis IRD, Jakarta, Indonesia http://ape.mpl.ird.fr/ Dr. Nina Hobbhahn Post-doctoral fellow Lab of Prof. S. D. Johnson School of Life Sciences University of KwaZulu-Natal Private Bag X01 Scottsville, Pietermaritzburg, 3201 South Africa -- Emmanuel Paradis IRD, Jakarta
Re: [R-sig-phylo] Problems useing query() function of package seqinr
Hi Augusto, I was able to reproduce some of the errors you encountered. I suggest, if you haven't done yet, to contact directly the authors of seqinr. They have their own mailinglist and their web site is: http://pbil.univ-lyon1.fr/software/seqinr/ Best, Emmanuel Augusto Ribas wrote on 11/05/2012 01:13: As i was searching endemic species for sequences, i was supposed to find 0 results. And got stuck in this part of the loop. while (length(res) == 0) { if (verbose) cat(... reading again.\n) res- readLines(socket, n = 1) } I got solutions here: http://r-br.2285057.n4.nabble.com/R-br-Duvida-sobre-a-funcao-query-do-pacote-seqinr-td4619241.html Hope it could help if anyone else get the same problem. Best Wishes Augusto Ribas 2012/5/9 Augusto Ribasribas@gmail.com If it helps library(seqinr) s- choosebank(genbank) query(bb,SP=Rhinella rubescens AND K=16S RIBOSOMAL RNA,s$socket,verbose=T) I'm checking the arguments... ... and everything is OK up to now. I'm checking the status of the socket connection... ... and everything is OK up to now. I'm sending query to server... ... answer from server is: code=0lrank=2count=1type= SQlocus=F I'm trying to analyse answer from server... ... and everything is OK up to now. ... and the rank of the resulting list is: 2 . ... and there are 1 elements in the list. ... and the elements in the list are of type SQ . ... and there are *not* only parent sequences in the list. I'm trying to get the infos about the elements of the list... ... and I have received 1 lines as expected. Now with query(bb,SP=Leptodactylus marmoratus AND K=16S RIBOSOMAL RNA,s$socket,verbose=T) I keep in an eternal loop saying: ... reading again. ... reading again. ... reading again. 2012/5/9 Augusto Ribasribas@gmail.com Hello, i'm new in this list, and new in genetics field too. I'm reading a book Analysis of Phylogenetics and Evolution with R and in the book i saw about the function query() of the package seqinr to search databases about specific sequences of our interest. #A got a list of some frog species: lista.spp-c(Rhinella rubescens, Rhinella fernandezae, Elosia rustica, Crossodactylus gaudichaudii, Leptodactylus pustulatus, Leptodactylus syphax, Dendropsophus cachimbo, Leptodactylus marmoratus, Physalaemus soaresi, Hylodes nasus, Scinax fuscomarginatus, Leptodactylus petersii, Chiasmocleis capixaba, Ceratophrys aurita, Pleurodema diplolister, Cystignatus gigas, Scinax trilineatus, Elosis nasus, Dryadophis bifossatus ) #And stared looking for 16s RIBOSOMAL RNA. #My problem is that for some species it worked great. #For example: library(seqinr) s- choosebank(genbank) query(SP,SP=Rhinella rubescens AND K=16S RIBOSOMAL RNA,s$socket) SP$nelem [1] 1 getName(SP) [1] GU907196.RR1 #1 sequence and the name to get the sequence later query(SP,SP=Leptodactylus syphax AND K=16S RIBOSOMAL RNA,s$socket) print(SP$nelem) [1] 2 print(getName(SP)) [1] JF789923 JF789924 #2 sequences and names to get them later. #The problems is that for some species like query(SP,SP=Leptodactylus marmoratus AND K=16S RIBOSOMAL RNA,s$socket) R stops, i thought it would return too many information, but even using the argument virtual=T it stops the same way, and i have to turn R off. I looked in the genbank site and this species should return no sequence but i don't know what is wrong. Also, in my understanding, Leptodactylus syphax should return 2 sequences, looking in genbank site, but it also stops. But from the 19 species, only some stops, the others work, i cant figure out what is happening. I use windows 7 OS, a residential internet, R 2.15.0 and Tinn-R as a R-gui Also, in the university the port query() function use is blocked, only ports like 80 are open in the firewall, are there other functions like this in R? In the task view, biocondutor project was cited but i still searching there. Could someone enlighten me? Best Wishes Augusto Ribas -- Grato Augusto C. A. Ribas Site Pessoal: http://augustoribas.heliohost.org Lattes: http://lattes.cnpq.br/7355685961127056 -- Grato Augusto C. A. Ribas Site Pessoal: http://augustoribas.heliohost.org Lattes: http://lattes.cnpq.br/7355685961127056 -- Emmanuel Paradis IRD, Jakarta, Indonesia http://ape.mpl.ird.fr/ ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] nonparametric PGLS
Hi Pascal, The assumption of normality in GLS, as in OLS, concerns the residuals not the data. A recent reference on this issue is: Alain F. Zuur, Elena N. Ieno and Chris S. Elphick: A protocol for data exploration to avoid common statistical problems. Methods in Ecology Evolution 2010, 1, 3–14. doi: 10./j.2041-210X.2009.1.x To answer your specific question, I'm not aware of such nonparametric method though a Monte Carlo procedure might be feasible. Best, Emmanuel Pascal Title wrote on 11/05/2012 10:46: Hi everyone I have character data for species on a phylogeny, but simple transformations like log-transformations or square-root transformations are not proving to be sufficient to get the data to be normal. If this were a non-phylogenetic test, I would resort to something like a Spearman correlation, which is non-parametric. But with phylogenetic generalized least squares, is there any method that is nonparametric? Thanks! -Pascal Title -- Emmanuel Paradis IRD, Jakarta, Indonesia http://ape.mpl.ird.fr/ ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] exporting nexus tree in APE
Hi Lucinda, You don't give enough information allowing us to help you efficiently. However, see below. Lucinda Kirkpatrick wrote on 15/05/2012 07:38: Dear all, I am attempting to export a tree I have pruned from 916 taxa to the 53 taxa I am interested. I used this script: overlap- name.check(tree, d) tree- drop.tip(tree, overlap$Tree.not.data) d2- d[!(rownames(d) %in% overlap$Data.not.tree),] plot(tree) name.check(tree,d2) That was successful but now I want to export the reduced tree as a nexus file I get the following error: write.nexus(tree, file=./Results/Tree.nex) Read 921 items This suggests you use an old version of ape (which reads the original data from the NEXUS fil: this feature has been deprecated). I suggest you update ape first; if your problem persists give us more details, possibly with some example data. Best, Emmanuel Error in 1:(start - 1) : argument of length 0 I need the reduced tree in a nexus format for another analysis - how is the best way to do that? I am very new to R so please make any explanation very simple! Thanks, Luci [[alternative HTML version deleted]] ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo -- Emmanuel Paradis IRD, Jakarta, Indonesia http://ape.mpl.ird.fr/ ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] unrooted tree, spread tips
Hi, Walter, Mathias wrote on 18/05/2012 16:42: Hi, I use ape to plot my phylogenetic trees. Often I have to plot unrooted trees with few clades, but a lot of tips per clade. The tip labels are hard to read, even if I use lab4ut=axial. Is there any way to spread the tips? No. The only available algorithm for unrooted trees in ape is the equal-angle one. It seems that for your purpose type=r might be not bad, even though the center of the circle is the root. Best, Emmanuel I saw this in some other tree drawing programs. Kind regards, Mathias ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] unrooted tree, spread tips
Walter, Mathias wrote on 18/05/2012 18:16: Hi, if I use type=r, then quite a lot of lines cross. This should not happen. May be a bug. You have an example of this? E. I tried optimal tree ordering but as soon as I convert my hclust object to a phylo object (via as.phylo), the ordering is lost. Another advantage of the unrooted tree is, that you can often clearly distinguish main clusters. That is not possible with any other kind of tree drawing. -- Kind regards, Mathias 2012/5/18 Emmanuel Paradisemmanuel.para...@ird.fr: Hi, Walter, Mathias wrote on 18/05/2012 16:42: Hi, I use ape to plot my phylogenetic trees. Often I have to plot unrooted trees with few clades, but a lot of tips per clade. The tip labels are hard to read, even if I use lab4ut=axial. Is there any way to spread the tips? No. The only available algorithm for unrooted trees in ape is the equal-angle one. It seems that for your purpose type=r might be not bad, even though the center of the circle is the root. Best, Emmanuel I saw this in some other tree drawing programs. Kind regards, Mathias ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo -- Emmanuel Paradis IRD, Jakarta, Indonesia http://ape.mpl.ird.fr/ ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] length of attribute (names) when calculating independent contrasts
Hi Agus, It seems the problem comes when pic() tries to create names to the contrasts. Probably your tree has badly formatted node labels. What do the following commands return? tree$node.label tree$Nnode Cheers, Emmanuel Agus Camacho wrote on 25/05/2012 01:58: Hi all, When calculating independent contrasts of a variable, I get the following message: ContrastWeight- pic(Weight,tree)Error en names(contr)- lbls : el atributo 'names' [10] debe tener la misma longitud que el vector [9] The attribute names [10] must have the same length than the vector[9] (this was translated by me) After checking the help on the command, i made the following observations, length(Weight)[1] 10 class(Weight)[1] numeric tree$tip.label [1] C.leiolepis C.nicterus C.sinebrachiatus [4] S.catimbau N.ablephara P.erythrocercus [7] P.tetradactylus V.rubricauda M.maximiliani [10] P.paeminosus class(tree)[1] phylo Weight C.leiolepis C.nicterus C.sinebrachiatus S.catimbau 0.9020.7160.53705260.5887500 N.ablephara P.erythrocercus P.tetradactylus V.rubricauda 0.5460.41272730.3970.4725294 M.maximiliani P.paeminosus 0.4830.4677692 However, to me, everything seems to be right about the names and the vector's length. Any hint? Thanks to all. I am new to these analyses and appreciate much the help from the mail list. Agus ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo -- Emmanuel Paradis IRD, Jakarta, Indonesia http://ape.mpl.ird.fr/ ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] as.phylo.hclust
Hi Dave, That's because hclust() computes 'height' as the distance between the two sets that are agglomerated. Consider a case with n=2: (d - dist(rnorm(2))) 1 2 2.562737 h - hclust(d) h$height [1] 2.562737 So if we make a phylo object from h, it will have two edges whose lengths must sum to 2.56: t - as.phylo(h) t$edge.length [1] 1.281369 1.281369 Of course the cophenetic distances must be the same: cophenetic(t) 12 1 0.00 2.562737 2 2.562737 0.00 cophenetic(h) 1 2 2.562737 Cheers, Emmanuel --Original Message-- From: David Bapst Sender: dwba...@gmail.com To: R Sig Phylo Listserv Cc: Emmanuel Paradis Subject: as.phylo.hclust Sent: 21 Jun 2012 05:08 Hello all, I was having some problems with getting the information I wanted from an hclust object so decided I would make it a tree and use all the tools I already know so well. Unfortunately, the distance structure in the hclust structure does not seem to be preserved by as.phylo.hclust. Instead, the tree's total depth/height seems to be halved during conversion. See the example below. set.seed(444) x-dist(rnorm(10)) hc-hclust(x) tr-as.phylo.hclust(hc) #see graphically layout(1:2) plot(hc) plot(tr);axisPhylo() #test depths max(hc$height)==max(dist.nodes(tr)[Ntip(tr)+1,]) #FALSE It looks like what's happening is that the edges are half as long as they should be. tr1-tr tr1$edge.length-tr1$edge.length*2 max(hc$height)==(max(dist.nodes(tr1)[Ntip(tr1)+1,]))#TRUE I am running ape 3.0.4 with R 2.15.0. I notice in the change-log for ape that there used to be an error (around version 2.5) where the edges of as.phylo.hclust output were multiplied by 2. Maybe the reverse bug has crept in? -Dave -- David Bapst Dept of Geophysical Sciences University of Chicago 5734 S. Ellis Chicago, IL 60637 http://home.uchicago.edu/~dwbapst/ http://cran.r-project.org/web/packages/paleotree/index.html ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] using ace() with fixed internal node - singularity error
Dear Annemarie, I have added a better error-catching mechanism in ace (it was already there for type = discrete). You can find the new version here: https://svn.mpl.ird.fr/ape/dev/ape/R/ace.R This will give you the estimated ancestral states but you won't be able to compute SEs of the parameters. BTW, for continuous traits you may also try method = REML which should give you a better estimate of the rate of BM evolution. Cheers, Emmanuel Annemarie Verkerk wrote on 22/06/2012 22:05: Dear R-sig-phylo members, I am trying to use ace() with a fixed internal node as described here: https://stat.ethz.ch/pipermail/r-sig-phylo/2011-June/001465.html My aim is to provide some confidence in the ancestral state estimation of the root node by setting the root node to have specific values that have been proposed in the literature and comparing the log likelihoods of the estimated value and the values proposed in the literature. To that end, a node with branch length zero is added to root node, and the proposed value is given to that newly added node as described in the archived message above. However, I get the following error message when I use ace() with 'method=ML': Error in solve.default(out$hessian) : Lapack routine dgesv: system is exactly singular In addition: There were 50 or more warnings (use warnings() to see the first 50) warnings() Warning messages: 1: In nlm(function(p) dev.BM(p), p = c(1, rep(mean(x), nb.node)), ... : NA/Inf replaced by maximum positive value 2: In nlm(function(p) dev.BM(p), p = c(1, rep(mean(x), nb.node)), ... : NA/Inf replaced by maximum positive value 3: In nlm(function(p) dev.BM(p), p = c(1, rep(mean(x), nb.node)), ... : NA/Inf replaced by maximum positive value I think this is because the newly added edge has zero-length, as mentioned here: http://comments.gmane.org/gmane.comp.lang.r.phylo/1342 When I use 'method=pic', I do not get this error message and it completes succesfully. (The root state is not exactly the same as the value for the newly added node, but is very close.) However, I really do need to work with the ML method, as I need the log likelihoods for my comparisons of the analyses with different values that have been used for the newly added node... As far as I can tell, neither method=pic nor method=PGLS give log likelihoods in their output. Is there any way around this problem? Many thanks, as always, Annemarie -- Emmanuel Paradis IRD, Jakarta, Indonesia http://ape.mpl.ird.fr/ ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] R package for detecting positive selection?
Hi Anna, See the package phangorn: recent versions implement codon-based models. Cheers, Emmanuel Anna Syme wrote on 28/06/2012 08:42: Hi all, I was wondering if there are any packages in R that can detect positive selection, comparable to programs such as PAML. I have two gene copies (each copy in multiple species), and want to see whether either copy has been subject to positive selection since divergence. Thanks, Anna *Dr Anna Syme* Molecular Systematist, National Herbarium of Victoria, Royal Botanic Gardens Melbourne, Private Bag 2000, Birdwood Avenue, South Yarra, 3141, Australia. Phone (03) 9252 2328, Fax (03) 9252 2413. anna.s...@rbg.vic.gov.au mailto:anna.s...@rbg.vic.gov.au www.rbg.vic.gov.au http://www.rbg.vic.gov.au/ -- This email and any attachments may contain information that is personal, confidential, legally privileged and/or copyright. No part of it should be reproduced, adapted or communicated without the prior written consent of the sender and/or copyright owner. It is the responsibility of the recipient to check for and remove viruses. If you have received this email in error, please notify the sender by return email, delete it from your system and destroy any copies. You are not authorised to use, communicate or rely on the information contained in this email. Please consider the environment before printing this email. This email was Anti Virus checked by RBG Astaro Mail Gateway. ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo -- Emmanuel Paradis IRD, Jakarta, Indonesia http://ape.mpl.ird.fr/ ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] Ape: dist.dna, P-distance calculation
Hi, Try the option pairwise.deletion = TRUE. If you still find abnormal values please give some example data. Emmanuel -Original Message- From: Jessica Sabo sabo.j...@gmail.com Sender: r-sig-phylo-boun...@r-project.org Date: Sun, 29 Jul 2012 14:42:17 To: r-sig-phylo@r-project.org Subject: [R-sig-phylo] Ape: dist.dna, P-distance calculation Hi all, I am calculating P-distances using dist.dna (pdistance - dist.dna(convertdata, model = raw). WHen I did this, I got abnormally high p-distances in the .9 region. I then calculated the p distance using Paup and I got more normal values. Ultimately, I would like to use R to calculate p-distances and not Paup. Does anyone know what is going on here? Thanks for your help, Jessica Sabo PhD Student University of Florida ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] sort / select trees by backbone constraint?
Dear Annemarie, Annemarie Verkerk wrote on 30/07/2012 16:48: Dear all, I have a follow-up question about how to sort or select trees from a tree sample as originally asked here: https://stat.ethz.ch/pipermail/r-sig-phylo/2011-March/001194.html I would like to select a set of trees from a sample of multiple trees (i.e., BEAST output) to return all trees that match a constraint backbone topology. Neither of the two answers given to the original question work for me, because they seem to be able to only constrain the tree sample using full trees, not with partial backbone trees. My aim is as follows. If I have a tree sample such as the following: tree1: (t1, (t2, (t3, (t4, t5 tree2: (t1, (t2, (t4, (t3, t5 tree3: (t1, (t3, (t4, (t2, t5 tree4: (t5, (t3, (t4, (t2, t1 tree5: (t1, (t3, (t5, (t2, t4 I would like to select from this tree sample those trees that have: backbone1: (t1, (t2, (XXX))) i.e. those trees that have that topology of t1 and t2, while the internal structure of the rest of the tree does not matter. tree1 and tree2 should be selected, but not tree3, tree4, or tree5; It seems equivalent to test whether the tips excluding t1 and t2 are monophyletic. So something like: is.monophyletic(phy, paste(t, 3:5, sep = )) should tell you whether to select 'phy' or not. You may also think whether to consider your trees as rooted or not (see the option 'reroot' of is.monophyletic). and those trees that have: backbone2: (XXX (t3, (t5, XXX))) The same reasoning might be applied where you first test whether t5 and other tips excluding t3 are monophyletic, and then test whether t3, t5 and the same other tips are monophyletic. This is of course more complicated than the 1st case. Maybe prop.part is more appropriate here (and faster): it will return the list of all monophyletic groups where the 1st element includes all tips (so it's trivially monophyletic). So you could look for the clades (excluding the 1st one) where t5 is present and t3 absent. Then test whether a clade with the same clade plus t3 is present in the list. i.e. those trees in which t5 is nested in a clade that is sister to t3, while the internal structure of the rest of the tree does not matter. tree1, tree3 and tree5 should be selected, but not tree2 or tree4. Of course it would be possible to do this by making the selection with all possible full trees that have this partial topology, but I think this would be time-demanding with trees that have a large number of taxa. Not only time-demanding but the number of possible trees will certainly be too large. Suppose XXX in backbone1 has 50 tips: ape::howmanytrees(50) [1] 2.752921e+76 which is close to the number of atoms in the universe. Best, Emmanuel Does anybody have any ideas on how to do this? Many thanks, as always, Annemarie ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] plotting a coloured tree
Agus, Whether the background appears transparent or not depends on the device where you plot the tree. On Linux, the default device (X11) is transparent: par(bg) [1] transparent When plotting in a file, the default depends on the file format: pdf() has transparent background, but png() has a white one. For the latter you may do: png(tree.png) par(bg = transparent) plot(tree, ... etc... dev.off() The way the background is rendered may depend on the application you use to open this file: on my system the transparent background of a PNG image is shown as a grey checkerboard, but when I open a PDF file with a transparent background, this one appears white. But the difference between a transparent and a white background should be clear if you include such a file in a document or in slides. Best, Emmanuel Marcio Pie wrote on 20/08/2012 07:09: Hi Agustin, The problem seems to be the space between the genus and the species names. Try this: require(ape) require(geiger) tree-read.tree(text= Calyptommatus_leiolepis:0.0168, Calyptommatus_nicterus:0.0099):0.0242 ,Calyptommatus_sinebrachiatus:0.038):0.0724 ,(Scriptosaura_catimbau:0.022,Nothobachia_ablephara:0.0309):0.0648):0.0659 ,Procellosaurinus_erythrocercus:0.0235, Procellosaurinus_tetradactylus:0.0259):0.0462 ,Vanzosaura_rubricauda:0.072):0.142 ,Micrablepharus_maximiliani:0.863):0.0225 ,Psilopthalmus_paeminosus:0.193):0.0131);) rgb(red=0, green=0, blue=0, alpha=100, max=255) plot(tree,use.edge.length = T, font=4,tip.col=black, bg=transparent, cex=1.5,edge.color = 'cyan',edge.width = 3) Marcio On Sun, Aug 19, 2012 at 8:19 PM, Agus Camachoagus.cama...@gmail.com wrote: Dear list, Im trying to plot a coloured tree with transparent background, to do that, im using the following script: require(ape) require(geiger) tree- read.tree(text= Calyptommatus leiolepis:0.0168, Calyptommatus nicterus:0.0099):0.0242 ,Calyptommatus sinebrachiatus:0.038):0.0724 ,(Scriptosaura catimbau:0.022,Nothobachia ablephara:0.0309):0.0648):0.0659 ,Procellosaurinus erythrocercus:0.0235,Procellosaurinus tetradactylus:0.0259):0.0462 ,Vanzosaura rubricauda:0.072):0.142 ,Micrablepharus maximiliani:0.863):0.0225 ,Psilopthalmus paeminosus:0.193):0.0131);) rgb(red=0, green=0, blue=0, alpha=100, max=255) plot(tree,use.edge.length = T, font=4,tip.col=black, bg=transparent, cex=1.5,edge.color = 'cyan',edge.width = 3) However, when I plot that, it does not appear transparent background and it also plots together the species name and the genus. Anybody knows how to do that? Thanks in advance! Agus -- Agustín Camacho Guerrero. Doutorando em Zoologia. Laboratório de Herpetologia, Departamento de Zoologia, Instituto de Biociências, USP. Rua do Matão, trav. 14, nº 321, Cidade Universitária, São Paulo - SP, CEP: 05508-090, Brasil. [[alternative HTML version deleted]] ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo -- Emmanuel Paradis IRD, Jakarta, Indonesia http://ape.mpl.ird.fr/ ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] ladderize + drop.tip = shuffled node labels
Hi, This was a bug in drop.tip(). It is fixed and will be in the next release of ape (the fixed version is in ape's SVN). Best, Emmanuel Rebecca Best wrote on 02/10/2012 13:26: Hi all I have been plotting some pruned trees recently, and have run into a problem using drop.tip() after ladderize(). If you ladderize() and then drop tips from the ladderized tree, then at least in my case the node labels are no longer correct. This may be an unlikely sequence of commands, but I thought I'd post this in case it is an easy fix, or it helps anyone else avoid issues. Thanks! Rebecca ## require(ape) #read tree mytree-read.tree() ((D,(E,G)1)1,((H,J)0.8,(K,(((L,M)0.5,(N,O)0.6)1,(P,(Q,R)1)1)0.7)1)1); #ladderize tree mytree.lad-ladderize(mytree) #node labels display on both trees correctly plot(mytree,show.node.label=TRUE) plot(mytree.lad,show.node.label=TRUE) #drop tips from both trees drop.mytree-drop.tip(mytree,c(L,D,G)) drop.mytree.lad-drop.tip(mytree.lad,c(L,D,G)) #plot both trees, node labels are incorrect for ladderized tree plot(drop.mytree,show.node.label=TRUE) plot(drop.mytree.lad,show.node.label=TRUE) ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo -- Emmanuel Paradis IRD, Jakarta, Indonesia http://ape.mpl.ird.fr/ ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] Scale parameter in compar.gee()
Hi Carlos, Carlos A. Botero wrote on 10/10/2012 03:49: Dear R-sig-phylo users, I am using compare.gee() to run a phylogenetically informed analysis with Poisson error. The scale parameter in my model is estimated to be close to 4, indicating that the data are over-dispersed. It is unclear to me whether compare.gee() uses the scale parameter to inflate the covariance matrix of the parameter estimates (as I understand it is done in SAS) In the GEE formulation, the scale is used to multiply the var-cov matrix of the observations. Its estimated value is then used to multiply the SEs of the parameter estimates by sqrt(scale). or whether it simply reports this statistic as a diagnostic of potential over-dispersion. In other words, do the results of a poisson regression in compare.gee() already account for over dispersion or are they invalid when the scale parameter is greater than one? The former. You may repeat your analyses setting 'scale.fix = TRUE' to compare the results. HTH Best, Emmanuel Guidance would be greatly appreciated. Best, Carlos ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo -- Emmanuel Paradis IRD, Jakarta, Indonesia http://ape.mpl.ird.fr/ ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] breaking up phylogeny figure
Hi Ben, You should be able to play with extract.clade() and drop.tip(): they both have an option 'interactive' so you may click on the plotted backbon tree to build your subtrees. Best, Emmanuel Ben Weinstein wrote on 03/10/2012 23:45: Hi all, I'm wondering if there is a way to plot a phylogeny in R that is broken up and displayed side by side to condense space. I've made a backbone in R that i'd like to make into a figure, but it is too tall. If this isn't the write setup, could someone suggest what program (and format for me to write.tree) from R. I am using windows, in terms of potential third part programs. Thanks, ben ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] Reverting or setting axis label in plottet phylo tree
Hi Jana, Jarecki, Jana wrote on 24/10/2012 20:01: Hi everybody, just a quick question: -- is it possible to revert the scale on the axis added to a tree plotted with plot(phyloobject)? No, but instead of axisPhylo() you may use axis(1). -- is it possible to manually set the axis tick locations and labels? No, but it's fairly easy to do fix(axisPhylo) and modify the code to your need. Or you can use axis(1, at = ). Best, Thanks a lot, best Jana [[alternative HTML version deleted]] ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo -- Emmanuel Paradis IRD, Jakarta, Indonesia http://ape.mpl.ird.fr/ ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] drop.tip weird handling of node labels
Dear Tristan, Thanks alot for spotting this. This is fixed now. The new version is available here: https://svn.mpl.ird.fr/ape/dev/ape/R/drop.tip.R More tests will be welcome. Cheers, Emmanuel Tristan Lefebure wrote on 15/11/2012 18:09: Dear all, I am puzzled with the following dropt.tip behaviour: ## require(ape) tr- read.tree(text='(((t1:1.047859182,t2:1.047859182):0.2756716989,t3:1.323530881):0.9374413868,(t4:1.004540609,(t5:0.825851211,t6:0.825851211):0.1786893979):0.2456960053,((t7:1.056756599,t8:1.056756599):0.06431900218,t9:1.121075601):0.1291610131):0.3349743758,t10:1.58521099):0.2527177518,((t11:0.7045266296,t12:0.7045266296):0.9569230276,(t13:1.434588187,t14:1.434588187):0.2268614706):0.1764790846):0.08226490869,t15:1.279204276,((t16:1.114074687,t17:1.114074687):0.1122507672,t18:1.226325454):0.05287882165):0.1013794601,t19:0.9955287763,t20:0.9955287763):0.1328862159,(((t21:0.5523867955,t22:0.5523867955):0.2108511243,t23:0.7632379198):0.1651743381,((t24:0.6573230631,t25:0.6573230631):0.1870302034,(t26:0.7107704271,t27:0.7107704271):0.1335828394):0.08405899131):0.227343):0.1056949204,(t28:1.153531955,t29:1.153531955):0.08057795766):0.05156567335,((t30:0.7944490533,t31:0.7944490533):0.3247998092,(((t32:0.6657996502,t33:0.6657996502):0.1778661928,((t34:0.1423836508! ,t! 35:0.1423836508):0.5359910966,t36:0.6783747474):0.1652910955):0.08026277837,t37:0.9239286214):0.1953202411):0.1664267234):0.0949081502):0.06478712445,(((t38:0.7909217241,(t39:0.404267298,t40:0.404267298):0.3866544261):0.4221751735,((t41:0.9390399711,t42:0.9390399711):0.1900443404,(t43:0.7331382087,(t44:0.5724991151,(t45:0.4288431567,t46:0.4288431567):0.1436559584):0.1606390935):0.0802190135,t47:0.813357):0.09505692675,t48:0.9084141489):0.1203866568,(((t49:0.748033,t50:0.748033):0.081110064,(t51:0.4635806844,t52:0.4635806844):0.3575301829):0.09861132464,t53:0.9197221919):0.1090786138):0.03016346022,(t54:0.7468362228,t55:0.7468362228):0.3121280432):0.07012004561):0.08401258605):0.1109195687,t56:1.324016466):0.1213543942):0.1153375546,((t57:1.136226544,t58:1.136226544):0.2488130821,(t59:0.493435228,t60:0.493435228):0.8916043976):0.1756687895):0.3594852354):0.3407786169);') torm- paste('t', c(2,3,4,6,10,11,12,17,18,26,29,33,37,38,41,45,46,51,52,53,55), sep='') tr$node.label- paste('n', 1:Nnode(tr), sep='') x11(); plot(tr, cex=0.7); nodelabels(tr$node.label, cex=0.7); tr2- drop.tip(tr, torm) x11(); plot(tr2, cex=0.7); nodelabels(tr2$node.label, cex=0.7); # Just look at the top clade made of t60, t59, t57 and t57. This clade is not impacted by the tree pruning; but look at the node labels, they have been badly reassigned... I've tried to replicate the problem on a smaller data set, but did not succeed. Thanks! -- Tristan Lefebure [[alternative HTML version deleted]] ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo -- Emmanuel Paradis IRD, Jakarta, Indonesia http://ape.mpl.ird.fr/ ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] Why no branch lengths on consensus trees?
Hi Scott, The reason for implementing only the consensus on the topology was after reading the appropriate chapter in Inferring Phylogenies. So I'm glad that Joe himself stepped in. I have added this reference in the help page (it was already cited in my book with respect to this issue). I think that the usage of branch lengths on (or from) a consensus tree depends on your perspective on phylogenetic estimation. If you are a frequentist, the goal of resampling the data is to give an idea of the accuracy of the estimates. So you might not use the bootstrap trees to estimate branch lengths; you already did that by ML (or maybe least squares) with the full data. If you are Bayesian, the trees sampled from an MCMC are here for estimation including of the branch lengths, so you use them to compute some sort of consensus topology as well as its branch lengths. So it makes sense that MrBayes can do a consensus tree with branch lengths. Cheers, Emmanuel Scott Chamberlain wrote on 22/11/2012 09:41: Dear Joe, Thanks for your feedback on this question. I will go read those pages you mentioned. Scott On Wed, Nov 21, 2012 at 5:19 PM, Joe Felsensteinj...@gs.washington.eduwrote: Daniel Barker wrote: What should branch lengths on a consensus tree represent? Scott Chamberlain had written: When making a consensus tree using ape::consensus the branch lengths are lost. Is there a way to not lose the branch lengths? Or to add them somehow to the consensus tree after making it. The issue of what branch lengths ought to be on a consensus tree is not simple. If we have three rooted trees: ((A:1,B:1):1,C:2); ((A:1,B:1):1,C:2); (A:2,(B:1,C:1):1); the consensus tree should be the first tree, but what branch length should be used for (say) the branch ancestral to the AB clade? 1? 0.667? The minute you open this can of worms it becomes clear that the answer depends on what you want that number to convey and what interpretations your audience will tend to draw from the number. There is no obvious answer. So this is not a mere technical computing question. By the way, in my 2004 book, you will find me agonizing about this on page 526, coming down on the side of 0.667, but not overwhelmingly convincingly. You could argue that a branch length should be set 0 when the branch is not there, and all the resulting values averaged, or you could argue that the average should only be taken over those trees for which that branch is present. One possible way to solve the problem is to take the consensus tree as if it were a user-defined tree, use your whole data set, and infer branch lengths on that tree. Daniel has already expressed his legitimate concerns in such a case as to whether it takes (for example) trifurcations as if they were real rather than an expression of our uncertainty. J.F. Joe Felsenstein j...@gs.washington.edu Department of Genome Sciences and Department of Biology, University of Washington, Box 355065, Seattle, WA 98195-5065 USA (from 1 October 2012 to 10 December 2012 on sabbatical leave at) Department of Statistics, University of California, Berkeley, 367 Evans Hall, Berkeley, CA 94710 [[alternative HTML version deleted]] ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo -- Emmanuel Paradis IRD, Jakarta, Indonesia http://ape.mpl.ird.fr/ ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] Modification of figures produced using cophyloplot
Hi David, David Hembry wrote on 03/12/2012 13:12: Dear all, I am plotting associated insect-plant phylogenies with links between tips using the function cophyloplot, and have two questions about how to modify the resulting plot. First, having initially plotted the data successfully, I want to modify the font size of the tip labels on both the insect and plant phylogenies. Using the following command: cophyloplot(glo.tree, epi.tree, file, space = 100, gap = 5, length.line = 5, lty = 2, cex = 0.5); where glo.tree and epi.tree are the two phylogenies, and file is the file indicating associations between tips, I was unable to adjust the font size from the default. I also tried to change the font size in each of the input trees before using cophyloplot, to use the tiplabels() function immediately following cophyloplot(), and to combine both input trees into a single file using write.tree(append = TRUE). None of these attempts were successful. cophyloplot() uses its own engine because it must draw two trees on the same plotting region together with the links, so most graphic functions from ape (eg, tiplabels) will not work with it. Also the ... argument is unused (as specified in the help page). The 'cex' parameter is hard-coded inside the internal function plotCophylo2 so you may edit it, but it's a bit tricky because if you do that the edited version will be in your workspace and it will not be used by cophyloplot() because of ape's namespace. The trick is like this (in case you want to try it, and this might be used in other situations). First make a copy of the internal function: titi - plotCophylo2 then do fix(titi) and modify what you want (here look for calls to text(... cex = ...), save and close. Do fix(cophyloplot) and change plotCophylo2(... by titi(..., save and close. Second, I also tried using the command cophyloplot(use.edge.length = TRUE) in order to show branch length information (i.e., non-time-calibrated trees that result from a likelihood analysis) for each of the phylogenies, but this had the effect of collapsing the trees at either side of the figure into trees with miniscule branch lengths, which were not possible to adjust significantly using the arguments space, gap, and length.line. Is there a way to show non-calibrated trees with the function cophyloplot? Yes, rescale your branch lengths beforehand, eg: glo.tree$edge.length - 100 * glo.tree$edge.length epi.tree$edge.length - 100 * epi.tree$edge.length where you have to find the value that best suits your data. Cheers, Emmanuel I would be grateful for any suggestions. Thank you very much in advance. David Hembry -- Emmanuel Paradis IRD, Jakarta, Indonesia http://ape.mpl.ird.fr/ ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] read.nexus error message
Hi Simon, I have tried with the first 2500 species and it worked fine. Are you sure to have the last version of ape? The file output by this server is a NEXUS file, so use read.nexus, not read.tree. Best, Emmanuel Simon Ducatez wrote on 07/12/2012 01:06: Dear all, I am having troubles importing a nexus file into R. The file contains 100 different trees (containing over 2000 species) from a pseudo-posterior distribution (imported from the website http://birdtree.org/), and I got the same error message whether using read.nexus or read.tree: Error in if (tp[3] != ) obj$node.label- tp[3] : missing value where TRUE/FALSE needed The same error was mentioned before on the web, but I couldn’t find any solution, and I do not understand the message. The different trees seem OK on Mesquite. Note that when using the same website to extract 100 trees, but this time for a sample of 50 species, the importation works without problem. Any help would be greatly appreciated! Thank you very much in advance Best Simon [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ -- Emmanuel Paradis IRD, Jakarta Visiting Professor, Agricultural University of Bogor http://ape.mpl.ird.fr/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] plotting a time axis on a circular phylogram
Hi Marcel, Here's what you can do: fix(plot.phylo) Find the line 104 which should be: xx - seq(0, 2 * pi * (1 - 1/Ntip), 2 * pi/Ntip) and replace it by: open.angle - pi/4 xx - seq(0, 2 * pi * (1 - 1/Ntip) - open.angle, length.out = Ntip) where 'open.angle' is the angle you want to leave blank (in radians). Save and close. Alternatively (and better), you may add 'open.angle' as an option in the function (in which case, of course, do not add the first line above). Don't forget to play with rotate.tree too. This will not draw the scale but will leave space for it so you can add it yourself. BTW, add.scale.bar() should draw a scale bar correctly. axisPhylo() currently fails with circular trees. I'll try to fix that. HTH Emmanuel Marcel Cardillo wrote on 13/12/2012 09:37: Hello all I have been searching for a way to plot a time axis on a circular phylogram created using plot.phylo(type=fan). I have seen such plots published but wasn't sure if these were done by hand, or if there is an automated way. One problem with doing it by hand (eg in Illustrator) is that there doesn't seem to be a large enough gap between the first and last tips in which an axis could be squeezed (my tree has 170 tips). Any advice gratefully accepted. cheers Marcel Cardillo -- Emmanuel Paradis - IRD Visiting Professor, Agricultural University of Bogor http://ape.mpl.ird.fr/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] plotting a time axis on a circular phylogram
This has been included in plot.phylo(). Also axisPhylo() will work correctly with circular trees (an error occurs until now) and a sensible error message is now issued with radial and unrooted trees. These all will be in ape 3.0-7 and are already available on ape's SVN. Emmanuel Marcel Cardillo wrote on 19/12/2012 07:52: Thanks Emmanuel. It works nicely with 'open.angle' as an argument in the function call. Marcel On 18/12/2012 11:06 PM, Emmanuel Paradis wrote: Hi Marcel, Here's what you can do: fix(plot.phylo) Find the line 104 which should be: xx - seq(0, 2 * pi * (1 - 1/Ntip), 2 * pi/Ntip) and replace it by: open.angle - pi/4 xx - seq(0, 2 * pi * (1 - 1/Ntip) - open.angle, length.out = Ntip) where 'open.angle' is the angle you want to leave blank (in radians). Save and close. Alternatively (and better), you may add 'open.angle' as an option in the function (in which case, of course, do not add the first line above). Don't forget to play with rotate.tree too. This will not draw the scale but will leave space for it so you can add it yourself. BTW, add.scale.bar() should draw a scale bar correctly. axisPhylo() currently fails with circular trees. I'll try to fix that. HTH Emmanuel Marcel Cardillo wrote on 13/12/2012 09:37: Hello all I have been searching for a way to plot a time axis on a circular phylogram created using plot.phylo(type=fan). I have seen such plots published but wasn't sure if these were done by hand, or if there is an automated way. One problem with doing it by hand (eg in Illustrator) is that there doesn't seem to be a large enough gap between the first and last tips in which an axis could be squeezed (my tree has 170 tips). Any advice gratefully accepted. cheers Marcel Cardillo -- Emmanuel Paradis - IRD Visiting Professor, Agricultural University of Bogor http://ape.mpl.ird.fr/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] using gamma parameter with dist.dna
Hi Peter, You are right: it's a bug. Peter Chi wrote on 04/01/2013 12:30: Hello list, I am currently trying to estimate distances on data that I've simulated under the JC69+\Gamma model. It appears that I can provide the gamma parameter to the dist.dna function in ape. However, I'm wondering if there may be a bug in the code, because no matter what value I provide, it gives the same estimated distance matrix (though it is different from the resulting distance matrix when gamma=FALSE as it is by default) I found these lines in the dist.dna code: if (!gamma) gamma- alpha- 0 else alpha- gamma- 1 This last line should be 'else alpha - gamma', or: else { alpha - gamma gamma - 1 } You can fix it easily with fix(dist.dna). Thanks for the report. It'll be fixed in the next version of ape which will be released soon. Cheers, Emmanuel So it looks like if gamma=FALSE, then it will set gamma and alpha variables equal to 0. If gamma is anything else (e.g. TRUE, or any numerical value), then it sets the alpha and gamma variables both to 1. I'm not sure if this is what was intended. Looking at the next line of code (where a C function is called), my guess is that the intent is that if gamma=x, then set alpha=x and then set gamma=1. But I can't be sure, because I couldn't locate the C dist_dna function within the ape binaries. I tried some simulations where I modified the code according to my guess of what it should be, and it then seems to behave as I might expect (the estimated distances are in fact closer to the truth on average when I give dist.dna the true gamma parameter than when I leave gamma=FALSE). But if anyone has any experience with this, your input is appreciated. Thanks! Peter ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ -- Emmanuel Paradis - IRD Visiting Professor Agricultural University of Bogor http://ape.mpl.ird.fr/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
[R-sig-phylo] Pre-release of ape 3.0-7
Dear all, ape 3.0-7 is now ready for release on CRAN. It includes a much faster code for reading FASTA files, a new function for estimating chronograms by PL and ML (which will eventually replace chronopl), and various improvements and bug fixes already announced here. The sources are available for testing as well as a compiled version for Windows (thanks to Uwe Ligges's server) on ape's web site, or directly: http://ape.mpl.ird.fr/ape_3.0-7.tar.gz http://ape.mpl.ird.fr/ape_3.0-7.zip Tests are welcome, especially the new FASTA parser which should behave slightly differently on Windows than on other OSs. The release date on CRAN is planned around mid-January. Best, Emmanuel -- Emmanuel Paradis - IRD Visiting Professor Agricultural University of Bogor http://ape.mpl.ird.fr/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] read.dna line endings in ape 3.0-7
Hi Rupert, The Mac version of ape on CRAN is not correct and has a few bugs particularly in read.dna(). Another problem is that G is not read (but g is). This does not affect the Windows version nor the sources. A solution for Mac users is re-install from the sources. Another one is to use the code of read.dna() from a previous release of ape. I'm waiting for other tests and feed-back before releasing ape 3.0-8 (within one or two weeks). Best, Emmanuel Rupert Collins wrote on 31/01/2013 02:20: I have noticed that in ape 3.0-7, read.dna on Linux does not parse fasta files with Windows line endings (i.e. CRLF \r\n). It results in the addition of \r to the taxon label, e.g. taxonA\r rather than the behaviour of 3.0-6, which was taxonA. Hope this makes sense, and thanks for any explanation. Rupert Here's an MWE. ## zz- file(test.fas, w) cat(taxonA, actg, taxonB, actt, taxonC, aatt\r, file = zz, sep = \r\n) close(zz) library(ape)#3.0-7 dat- read.dna(file=test.fas, format=fasta) labels(dat) ## [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ -- Emmanuel Paradis - IRD Visiting Professor Agricultural University of Bogor http://ape.mpl.ird.fr/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] Ancestral state estimates of continuous traits
Alejandro and Liam, You can also try method=REML which gives much better estimates of sigma^2 than ML. I think I'll make it the default for continuous characters. Best, Emmanuel -Original Message- From: Liam J. Revell liam.rev...@umb.edu Sender: r-sig-phylo-boun...@r-project.org Date: Fri, 15 Mar 2013 14:23:20 To: Alejandro Gonzalezalejandro.gonza...@ebd.csic.es Cc: R-phylo Mailing-listr-sig-phylo@r-project.org Subject: Re: [R-sig-phylo] Ancestral state estimates of continuous traits Hi Alejandro. That must be a bug in ace(...,method=GLS). I would also suggest you check out fastAnc in the phytools package. It uses ace(...,method=pic) internally to take advantage of the fact that the contrasts estimate at the root and the MLE assuming Brownian evolution are the same. It re-roots the tree at all internal nodes which sounds computationally intensive but is actually much faster than getting the MLEs via numerical optimization. fastAnc also uses equation (6) from Rohlf (2001) to get the correct 95% interval on the estimates. My analysis (http://blog.phytools.org/2013/02/new-version-of-fastanc-new-build-of.html) suggests that the 95% CIs from ace(...,method=ML) are too small. (This is probably because they rely on asymptotic properties of likelihood that are not satisfied for the relatively small size of most phylogenetic datasets.) All the best, Liam Liam J. Revell, Assistant Professor of Biology University of Massachusetts Boston web: http://faculty.umb.edu/liam.revell/ email: liam.rev...@umb.edu blog: http://blog.phytools.org On 3/15/2013 8:18 AM, Alejandro Gonzalez wrote: Hello, I am using ape to obtain ancestral state estimates for continuous traits. Two options are available, either maximum likelihood or a GLS method. I am comparing the results of both methods and one difference between the two puzzles me, I hope someone can enlighten me. Under GLS the ancestral state estimate at the root has virtually identical values for the point estimate and 95% confidence intervals, here is one example: 100 1.7848301 1.78483005 1.7848301 (100 is the root node number, then the ancestral state estimate and 95% CIs, respectively). However, when I use maximum likelihood for the same ancestral sate estimate, with the same data and tree I get the following result for the root : 100 1.6570119 0.86612615 2.4478977 (numbers in the same order as above). Any ideas as to why GLS gives such narrow confidence intervals for the ancestral state estimate at the basal node? Cheers Alejandro __ Alejandro Gonzalez Voyer Post-doc Estaci�n Biol�gica de Do�ana Consejo Superior de Investigaciones Cient�ficas (CSIC) Av Am�rico Vespucio s/n 41092 Sevilla Spain Tel: + 34 - 954 466700, ext 1749 E-mail: alejandro.gonza...@ebd.csic.es Web site (Under construction): Personal page: http://consevol.org/members/alejandro_combo.html Group page: http://consevol.org/people.html For PDF copies of papers see: http://csic.academia.edu/AlejandroGonzalezVoyer [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] Input Format for nuc.div
Hi Mark, On Fri, 22 Mar 2013 10:24:36 -0400 (EDT) markharriso...@aol.com wrote: Hello, I'd like to measure nuc.div for a list of about 20,000 loci across 170 individuals. What is the best way to format the input data? Is it possible to integrate the whole data within one file or would I need a separate file for each gene locus? Both are possible and you should use the solution that best suits your goals. Generally, R is flexible enough so you don't have to change your data files. Additionally, is it possible to use . or any other arbitrary character to denote an invariable base or does each character have to be a G, C, T or an A? The latter (can be in lowercase too). Best, Emmanuel Thanks for you help, Mark ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] read.dna
Hi again, It's a bug already identified: https://stat.ethz.ch/pipermail/r-sig-phylo/2013-February/002514.html The release of ape 3.0-8 will be done next week. Best, Emmanuel -Original Message- From: markharriso...@aol.com Sender: r-sig-phylo-boun...@r-project.org Date: Fri, 22 Mar 2013 14:10:23 To: r-sig-phylo@r-project.org Subject: [R-sig-phylo] read.dna Hello, I've tried to read a fasta file containing 7 sequences of equal length (5733 bases) and get the following output: df-read.dna(AT1G01040.fasta, format=fasta) df 7 DNA sequences in binary format stored in a list. Mean sequence length: 4224.571 Shortest sequence: 4223 Longest sequence: 4226 Labels: Aa_0 Abd_0 Ag_0 Ak_1 Altai_5 Amel_1 ... Base composition: a c g t 0.393 0.237 0.000 0.370 It appears that Gs are not being read which would also account for the shorter and uneven sequence lengths. Can anyone imagine what has happened here? Thanks for any help, Mark [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] ML ancestral state reconstruction using different softwares
Hi, You cannot really say that in both ace and Mesquite, it slightly favors the asymmetrical model: the increase in log-likelihood is less than 1 which is far from being significant with one additional parameter. So it seems that all three pieces of software agree well on the estimate of rate for the symmetrical model. You should fit also this model with BayeTraits to do the full comparison. Cheers, Emmanuel -Original Message- From: Jingchun Li jingc...@umich.edu Sender: r-sig-phylo-boun...@r-project.org Date: Thu, 4 Apr 2013 10:15:05 To: r-sig-phylo@r-project.org Subject: [R-sig-phylo] ML ancestral state reconstruction using different softwares Hi all, I am exploring different methods for ancestral state reconstruction for a small phylogeny I have (26 OTUs). I have one binary trait coded as 0s and 1s for all the taxa. And I am more interested in the transition rates between the two states. If I understand correctly, a maximum likelihood ancestral state reconstruction using a MK1 model or an asymmetrical rates model should give me the optimized estimated transition rates between the two states. The formal will assume the two rates are the same, the later will assume independent rates. So I firstly did this using the ape function in ace. ERreconstruction - ace(traits_data, tree, type=discrete, model=ER) ERreconstruction$rates ERreconstruction$loglik ARDreconstruction - ace(traits_data, tree, type=discrete, model=ARD) ARDreconstruction$rates ARDreconstruction$loglik Then I tried using Mesquite, using MK1 model and the Asymmetrical 2-param. Markov-k Model Then I tried using BayesTraits, with the MultiState - Maximum Likelihood option, with no restrictions on q10 and q01. What puzzled me is that I'm getting different results from the three softwares. -- aceER: transition rate: 0.0059, likelihood: -7.00 aceARD: transition rate: 0.019 and 0.00, likelihood: -6.11 Mesquite MK1: transition rate: 0.0059, likelihood: -7.70 Mesquite Asymmetrical: transition rate: 0.015 and 0.005, likelihood: -7.30 BayesTraits no restriction: transition rate: 0.0057 and 0.0057, likelihood: -14.09 I can see that in both ace and Mesquite, it slightly favors the asymmetrical model with two different rates, and the rates and likelihoods are more of less comparable. But BayesTraits seems to think the two rates should be equal, and it has a different likelihood. Does this has something to do with different searching algorithms? Or am I missing something here? I am aware that the scaled likelihoods in ace are the scaled conditional likelihoods, not the joint or marginal reconstructions. But this should not affect the estimated transition rates? Thank you very much! [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] ML ancestral state reconstruction using different softwares
Jingchun, Thanks to Daniel for giving a detailed answer to your first question. You can use ace() to fit a model with irreversible change. You need to build a custom model like this: Q - matrix(0, 2, 2) Q[2, 1] - 1 Q [,1] [,2] [1,]00 [2,]10 In this case there's one rate to be estimated and the other is fixed to zero (the 0's on the diagonal are ignored). This can be generalized to more complicated models (there's an example in my book). Then call ace with: ace(traits_data, tree, type = d, model = Q) This time you cannot compare this model with the ER one with an LRT: compare directly the log-liks since they have the same number of parameters or use AIC(). BTW, there is an anova methods to compare nested models fitted with ace: this avoids the pain to compute correctly df's. Cheers, Emmanuel -Original Message- From: Jingchun Li jingc...@umich.edu Date: Thu, 4 Apr 2013 22:25:09 To: emmanuel.para...@ird.fr; r-sig-phylo@r-project.org Subject: Re: [R-sig-phylo] ML ancestral state reconstruction using different softwares Sorry, I forgot to include the list. there we go. -- *From: * Jingchun Li jingc...@umich.edu *Date: *Thu, 4 Apr 2013 21:58:56 -0400 *To: *emmanuel.para...@ird.fr *Subject: *Re: [R-sig-phylo] ML ancestral state reconstruction using different softwares Thank you very much, Dr.Paradis. You are right that the two parameter model does not fit the data significantly better. I guess I'm more puzzled about why the three softwares gave me different likelihoods. I did fit the symmetrical model using BayesTraits later and it gave me about the same transition rates as the asymmetrical model and a similar likelihood value (-14.08 for symmetrical and -14.00 for asymmetrical). However I do have biological evidence that transition from 0 to 1 might be extremely hard. So maybe I can also try set q01 to 0 (which is actually what ace gave me under the asymmetrical model)? Cheers, Jingchun On Thu, Apr 4, 2013 at 9:21 PM, Emmanuel Paradis emmanuel.para...@ird.frwrote: Hi, You cannot really say that in both ace and Mesquite, it slightly favors the asymmetrical model: the increase in log-likelihood is less than 1 which is far from being significant with one additional parameter. So it seems that all three pieces of software agree well on the estimate of rate for the symmetrical model. You should fit also this model with BayeTraits to do the full comparison. Cheers, Emmanuel -Original Message- From: Jingchun Li jingc...@umich.edu Sender: r-sig-phylo-boun...@r-project.org Date: Thu, 4 Apr 2013 10:15:05 To: r-sig-phylo@r-project.org Subject: [R-sig-phylo] ML ancestral state reconstruction using different softwares Hi all, I am exploring different methods for ancestral state reconstruction for a small phylogeny I have (26 OTUs). I have one binary trait coded as 0s and 1s for all the taxa. And I am more interested in the transition rates between the two states. If I understand correctly, a maximum likelihood ancestral state reconstruction using a MK1 model or an asymmetrical rates model should give me the optimized estimated transition rates between the two states. The formal will assume the two rates are the same, the later will assume independent rates. So I firstly did this using the ape function in ace. ERreconstruction - ace(traits_data, tree, type=discrete, model=ER) ERreconstruction$rates ERreconstruction$loglik ARDreconstruction - ace(traits_data, tree, type=discrete, model=ARD) ARDreconstruction$rates ARDreconstruction$loglik Then I tried using Mesquite, using MK1 model and the Asymmetrical 2-param. Markov-k Model Then I tried using BayesTraits, with the MultiState - Maximum Likelihood option, with no restrictions on q10 and q01. What puzzled me is that I'm getting different results from the three softwares. -- aceER: transition rate: 0.0059, likelihood: -7.00 aceARD: transition rate: 0.019 and 0.00, likelihood: -6.11 Mesquite MK1: transition rate: 0.0059, likelihood: -7.70 Mesquite Asymmetrical: transition rate: 0.015 and 0.005, likelihood: -7.30 BayesTraits no restriction: transition rate: 0.0057 and 0.0057, likelihood: -14.09 I can see that in both ace and Mesquite, it slightly favors the asymmetrical model with two different rates, and the rates and likelihoods are more of less comparable. But BayesTraits seems to think the two rates should be equal, and it has a different likelihood. Does this has something to do with different searching algorithms? Or am I missing something here? I am aware that the scaled likelihoods in ace are the scaled conditional likelihoods, not the joint or marginal reconstructions. But this should not affect the estimated transition rates? Thank you very much! [[alternative HTML version deleted
Re: [R-sig-phylo] Seeking list of nucleotide substitution models
Hi Daniel, Have you looked at the help page of dist.dna? The list of references there is focused on calculating distances but some are general. There are a few more references in my book too. I suggest you also look at Inferring Phylogenies. Best, Emmanuel -Original Message- From: Daniel Barker d...@st-andrews.ac.uk Sender: r-sig-phylo-boun...@r-project.org Date: Fri, 5 Apr 2013 16:23:36 To: r-sig-phylo@r-project.orgr-sig-phylo@r-project.org Subject: [R-sig-phylo] Seeking list of nucleotide substitution models Hello, Is there a review or list of ~every specific nucleotide substitution model that has been proposed or used in the literature (with references)? I'm interested in reversible models. The most exhaustive list I have is from the jModeltest documentation, https://code.google.com/p/jmodeltest2/wiki/TheoreticalBackground#Models_of_ nucleotide_substitution I realise an enormous number of models is possible (and may have been used in model averaging). I'm keen to know which have some precedent in the literature, along the lines of those listed above. I'm finding it difficult to get beyond the standard sources. Thank you in advance, Daniel -- Daniel Barker http://bio.st-andrews.ac.uk/staff/db60.htm The University of St Andrews is a charity registered in Scotland : No SC013532 ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] ML ancestral state reconstruction using different softwares
Hi Jingchun, Yes in this case I'd use the true model. The LRT may be not very powerful with your data. Cheers, Emmanuel -Original Message- From: Jingchun Li jingc...@umich.edu Date: Fri, 5 Apr 2013 10:25:57 To: emmanuel.para...@ird.fr Cc: r-sig-phylo@r-project.org Subject: Re: [R-sig-phylo] ML ancestral state reconstruction using different softwares Thank you very much Daniel and Emmanuel! Now things start to make sense. I guess my last question is, if there is biological evidence that the transition rates between the two traits are different, is it even meaningful to compare the two models? Should I simply avoid using the equal rates model? Cheers, Jingchun On Fri, Apr 5, 2013 at 9:12 AM, Emmanuel Paradis emmanuel.para...@ird.frwrote: ** Jingchun, Thanks to Daniel for giving a detailed answer to your first question. You can use ace() to fit a model with irreversible change. You need to build a custom model like this: Q - matrix(0, 2, 2) Q[2, 1] - 1 Q [,1] [,2] [1,] 0 0 [2,] 1 0 In this case there's one rate to be estimated and the other is fixed to zero (the 0's on the diagonal are ignored). This can be generalized to more complicated models (there's an example in my book). Then call ace with: ace(traits_data, tree, type = d, model = Q) This time you cannot compare this model with the ER one with an LRT: compare directly the log-liks since they have the same number of parameters or use AIC(). BTW, there is an anova methods to compare nested models fitted with ace: this avoids the pain to compute correctly df's. Cheers, Emmanuel -- *From: * Jingchun Li jingc...@umich.edu *Date: *Thu, 4 Apr 2013 22:25:09 -0400 *To: *emmanuel.para...@ird.fr; r-sig-phylo@r-project.org *Subject: *Re: [R-sig-phylo] ML ancestral state reconstruction using different softwares Sorry, I forgot to include the list. there we go. -- *From: * Jingchun Li jingc...@umich.edu *Date: *Thu, 4 Apr 2013 21:58:56 -0400 *To: *emmanuel.para...@ird.fr *Subject: *Re: [R-sig-phylo] ML ancestral state reconstruction using different softwares Thank you very much, Dr.Paradis. You are right that the two parameter model does not fit the data significantly better. I guess I'm more puzzled about why the three softwares gave me different likelihoods. I did fit the symmetrical model using BayesTraits later and it gave me about the same transition rates as the asymmetrical model and a similar likelihood value (-14.08 for symmetrical and -14.00 for asymmetrical). However I do have biological evidence that transition from 0 to 1 might be extremely hard. So maybe I can also try set q01 to 0 (which is actually what ace gave me under the asymmetrical model)? Cheers, Jingchun On Thu, Apr 4, 2013 at 9:21 PM, Emmanuel Paradis emmanuel.para...@ird.fr wrote: Hi, You cannot really say that in both ace and Mesquite, it slightly favors the asymmetrical model: the increase in log-likelihood is less than 1 which is far from being significant with one additional parameter. So it seems that all three pieces of software agree well on the estimate of rate for the symmetrical model. You should fit also this model with BayeTraits to do the full comparison. Cheers, Emmanuel -Original Message- From: Jingchun Li jingc...@umich.edu Sender: r-sig-phylo-boun...@r-project.org Date: Thu, 4 Apr 2013 10:15:05 To: r-sig-phylo@r-project.org Subject: [R-sig-phylo] ML ancestral state reconstruction using different softwares Hi all, I am exploring different methods for ancestral state reconstruction for a small phylogeny I have (26 OTUs). I have one binary trait coded as 0s and 1s for all the taxa. And I am more interested in the transition rates between the two states. If I understand correctly, a maximum likelihood ancestral state reconstruction using a MK1 model or an asymmetrical rates model should give me the optimized estimated transition rates between the two states. The formal will assume the two rates are the same, the later will assume independent rates. So I firstly did this using the ape function in ace. ERreconstruction - ace(traits_data, tree, type=discrete, model=ER) ERreconstruction$rates ERreconstruction$loglik ARDreconstruction - ace(traits_data, tree, type=discrete, model=ARD) ARDreconstruction$rates ARDreconstruction$loglik Then I tried using Mesquite, using MK1 model and the Asymmetrical 2-param. Markov-k Model Then I tried using BayesTraits, with the MultiState - Maximum Likelihood option, with no restrictions on q10 and q01. What puzzled me is that I'm getting different results from the three softwares. -- aceER: transition rate: 0.0059, likelihood: -7.00 aceARD: transition rate: 0.019 and 0.00, likelihood: -6.11 Mesquite MK1: transition rate: 0.0059, likelihood: -7.70 Mesquite Asymmetrical: transition rate: 0.015 and 0.005
Re: [R-sig-phylo] NaN rates and error running ace()
Hi Zhenjiang, This is because some branch lengths are zero. You can change them for a small value, eg: tree$edge.length[tree$edge.length == 0] - 1e-7 I have added a check of this in ace(). Cheers, Emmanuel Fri, 12 Apr 2013 12:22:53 -0600 zhenjiang xu zhenjiang...@gmail.com: Hi all, I am using the ace() from ape package with my phylogenetic tree (downloadable at https://www.dropbox.com/s/ab8b5q45134llti/foo.tre). But I met an error as shown here: tree = read.tree(foo.tre) is.ultrametric(tree) [1] TRUE is.binary.tree(tree) [1] TRUE ## I tested on a lot of data sets which show the same error. ## here I just use one random data for convenience. a=sample(c(0,1), 2858, replace=T) ## I used fix(ace) modify ace() code to just print the obj variable ## to show the cause of the error ace(a, tree, type='d') $loglik [1] -5e+49 $rates [1] NaN Error in nlm(function(p) dev(p), p = obj$rates, iterlim = 1, stepmax = 0, : missing value in parameter At first, I thought this might be due to the large size of the tree, which somehow cause the numerical accuracy problem. But later I tested on a random tree, which ran perfectly: tree2 = rtree(2858) ace(a, tre, type='d') $loglik [1] -1980.203 $rates [1] 14.51092 Ancestral Character Estimation Call: ace(x = a, phy = tre, type = d) Log-likelihood: -1980.203 Rate index matrix: 0 1 0 . 1 1 1 . Parameter estimates: rate index estimate std-err 1 14.5109 10.9866 Scaled likelihoods at the root (type '...$lik.anc' to get them for all nodes): 0 1 0.5 0.5 So Does anyone have idea of what might be wrong? I think it might be the problem of my tree, but I tested it - it is binary and ultrametric... Any idea or help would be appreciated. Thanks! Best, Zhenjiang [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] Find common ancestor of multiple taxa
Hi Klaus and others, I have now documented getMRCA() and modified it so that it accepts a vector of labels and does some checks. This will be in the next release of ape. The SVN repository is not yet updated: there's a technical problem there. I'll fix that next week. Cheers, Emmanuel -Original Message- From: Klaus Schliep klaus.schl...@gmail.com Sender: r-sig-phylo-boun...@r-project.org Date: Thu, 9 May 2013 18:53:24 To: Nicholas Crouchncro...@uic.edu Cc: r-sig-phylo@r-project.org Subject: Re: [R-sig-phylo] Find common ancestor of multiple taxa Hi Nick, there is a functions getMRCA in ape and and mrca.phylo in phangorn, which do what you are looking for. And both functions are reasonably fast: getMRCA(tree, mytips) mrca.phylo(tree, match(mytips, tree$tip)) Regards, Klaus On 5/9/13, Liam J. Revell liam.rev...@umb.edu wrote: This is exactly what findMRCA in phytools does, but I wrote it a while ago so I'm not sure how fast it is - Liam Liam J. Revell, Assistant Professor of Biology University of Massachusetts Boston web: http://faculty.umb.edu/liam.revell/ email: liam.rev...@umb.edu blog: http://blog.phytools.org On 5/9/2013 11:24 AM, Nicholas Crouch wrote: Hi All, The 'mrca' function in Ape provides a method for finding the common ancestor of a pair of taxa. What I would like have not found is a function where you can pass a (long) list of tips from a phylogeny and find the node which represents the common ancestor to all these tips. Does such a function exist? Or could the 'mrca' function be expanded to loop over all the taxa within a list a somehow boil down to what is the common ancestor that way? Thanks, Nick [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ -- Klaus Schliep Phylogenomics Lab at the University of Vigo, Spain ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] nlm error when using ace for discrete characters
Hi Graeme, The reason of the error is because this custom model does fit your data, so the calculation of the SEs fails. I have modified ace() so that the error is caught and a better message is printed. For these data, you should try to fit the models that are pre-programmed in ace (ER, SYM, ARD) -- maybe you did already. It seems ER is the best one. You can also try other custom models though. Best, Emmanuel -Original Message- From: Graeme Lloyd graemetll...@gmail.com Sender: r-sig-phylo-boun...@r-project.org Date: Sun, 12 May 2013 16:35:30 To: r-sig-phylo@r-project.org Subject: [R-sig-phylo] nlm error when using ace for discrete characters Hi Everyone, I'm trying to use ace to do some discrete ancestral character estimation, but came upon the following error: Error in nlm(function(p) dev(p), p = obj$rates, iterlim = 1, stepmax = 0, : non-finite value supplied by 'nlm' Any ideas? I am using the latest versions of R and ape and my code is given below. Cheers, Graeme # Load library: library(ape) # Load tree with branch lengths: tree-read.tree(text=(Gracilisuchus_stipanicicorum:1,((Terrestrisuchus_gracilis:1,Dibothrosuchus_elaphros:1):1,((Orthosuchus_stormbergi:1,(Protosuchus_richardsoni:2,Edentosuchus_tienshanensis:2):1):1,(Zaraasuchus_shepardi:2,(Sichuanosuchus_shuhanensis:3,(Hsisosuchus_chungkingensis:1,(Fruitachampsa_callisoni:1,(((Araripesuchus_wegeneri:1,(Araripesuchus_tsangatsangana:1,(((Malawisuchus_mwakasyungutiensis:1,(Notosuchus_terrestris:2,(Iberosuchus_macrodon:4,Chimaerasuchus_paradoxus:2):1):1):1,Simosuchus_clarki:2):2,Araripesuchus_gomesii:3):1):1):1,Mahajangasuchus_insignis:3):2,((Theriosuchus_pusillus:1,Alligatorium:1):1,((Pachycheilosuchus_trinquei:1,(Sunosuchus_junggarensis:4,(Bernissartia_fagesii:1,(Shamosuchus_djadochtaensis:3,(Pristichampsus_vorax:2,(Borealosuchus_formidabilis:2,((Eothoracosaurus_mississippiensi:1,Gavialis_gangeticus:1):3,(Crocodylus_niloticus:1,(Diplocynodon_hantoniensis:1,Alligator_mississippiensis:1):1):1):1):2):1):1):1):2):1,((Steneosaurus_bollensis:1,(P! elagosaurus_typus:1,(Metriorhynchus_superciliosus:1,Geosaurus_araucanensis:3):2):1):1,(Sarcosuchus_imperator:1,Terminonaris_robusta:1):4):1):1):2):1):1):1):1):1):1):1);) # Load tip values: tipvals-c(0,0,0,0,1,4,1,0,0,0,3,3,1,0,4,1,3,2,4,0,0,2,2,2,3,3,3,1,3,3,3,3,0,0,0,0,0,0) # Create model: charmodel-rbind(c(0,1,2,3,4),c(1,0,1,2,3),c(2,1,0,1,2),c(3,2,1,0,1),c(4,3,2,1,0)) # Find ancestral states: ace(tipvals,tree,type=discrete,model=charmodel) ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] Extracting Independent Subclades
Hi Glenn, Of course %in% is pairwise. I thought you'd figure out the rest of it ;) I can see two ways to solve your problem, though I don't know if they work. The 1st one: 1) Select the clades of appropriate size(s). 2) Build a square symmetric matrix where the rows and columns are the clades selected in 1), and put 0 in the matrix if the two clades are independent, 1 if not. 3) Reorder the rows and columns of this matrix so that the 1's are concentrated near the diagonal. I'm not sure how to do this step. There are algorithms to do that (matrix profile reduction) but I'm not sure whether they are in R. Others may have better advice. 4) Step 3) should result in a matrix with 0's concentrated in the top right and bottom left corners of the matrix. Select these clades. The 2nd one: do a minimum spanning tree with the matrix from step 2) above (using preferrably mst() in pegas). The resulting MST should cluster together clades with links of size 0: select the biggest cluster. Here also I'm not sure about the practical details. There could be other solutions I guess. Best, Emmanuel -Original Message- From: Glenn Seeholzer seeholzer.gl...@gmail.com Date: Tue, 21 May 2013 11:14:12 To: emmanuel.para...@ird.fr Cc: r-sig-phylo-boun...@r-project.org; r-sig-phylo@r-project.org Subject: Re: [R-sig-phylo] Extracting Independent Subclades Hi Emmanuel, Thanks for the suggestion to use prop.part(tree), its definitely faster the what I had come up with. I might have missed something, but I don't think using %in% alone will solve my problem. Essentially, I'm interested in finding an objective way to determine sets of independent subclades (not just two). Ultimately, I will be comparing rates of trait evolution and diversification rates among these subclades using PGLS regressions. The tree I'm working with is ~300 species. My initial subclade partitioning scheme sampled the major genera within the group, resulting in 13 independent subclades. I'd like a more objective way to sample subclades, that will allow me run the analysis across many (if not all) possible independent subclade partitions to examine the effect subclade partition choice has on the outcome of the analysis. While %in% would allow me to determine if two clades are independent, I think I'm looking for an algorithm that will find all possible subclade partitions where each subclade has greater than 10 taxa and all are independent. Variations of this problem seem to be common in combinatorial math i.e. find all disjoint (non-overlapping) sets from a set of sets, but the algorithms proposed in the math forums are difficult for me to understand, let alone implement in R. Has anyone dealt with this issue before? Is there some simple solution I'm missing? Cheers, Glenn On Tue, May 21, 2013 at 12:11 AM, Emmanuel Paradis emmanuel.para...@ird.frwrote: Hi Glenn, An alternative is to use prop.part(tree) which returns the composition of all clades in tree. You can then easily select those of appropriate size, eg: pp - prop.part(tree) which(sapply(pp, length) = 10) Then using match() or %in% you can test whether two sets are independent. Because the list returned by prop.part is ordered along the node numbers of the tree, it is easy to use extract.clade() for your final operation. HTH Best, Emmanuel -Original Message- From: Glenn Seeholzer seeholzer.gl...@gmail.com Sender: r-sig-phylo-boun...@r-project.org Date: Mon, 20 May 2013 23:08:51 To: r-sig-phylo@r-project.org Subject: [R-sig-phylo] Extracting Independent Subclades Hi all, I'd like to sample independent subclades of greater than N tips from a phylogenetic tree much greater than N, say (number of tips = 10*N). I can extract all subclades of greater than N taxa, but sampling independent subclades seems to be a harder problem. I believe I am looking for all disjoint (non-overlapping) sets of species name sets from this list of subclades that maximizes the number of subclades sampled. For example, the following code produces a list of subclades (subclades) for all subclades with greater than 10 tips from a simulated tree with 100 tips. tips.subclades is a list of tip labels for these trees that may be useful. Essentially, I'd like to determine all the sets of subclades in this list of monophyletic trees that are independent, i.e. no shared taxa. Many of the trees formed by combining these sets of monophyletic subclades will be paraphyletic, that is OK. From this list of sets of subclades, I can then sample the sets of independent subclades with the appropriate number of subclades for the analyses I am performing. Thanks in advance for any suggestions. Glenn ### library(ape) library(TreeSim) n.taxa - 100 min.subclade.n.tip - 10 tree - sim.bd.taxa(n.taxa,1,lambda=.5,mu=0)[[1]][[1]] tmp - subtrees(tree) testNtip - function(x){ if(x$Ntip min.subclade.n.tip) x else NA } tmp2 - lapply(tmp, testNtip
Re: [R-sig-phylo] extracting order of tips as in plotted ladderized tree
Hi, Your question is not very specific but I guess you want something like: i - which(tr$edge[, 2] = Ntip(tr)) tr$edge[i, 2] Best, Emmanuel -Original Message- From: Ivalu ivalu.ca...@gmail.com Sender: r-sig-phylo-boun...@r-project.org Date: Thu, 23 May 2013 09:42:32 To: r-sig-phylo@r-project.org Subject: [R-sig-phylo] extracting order of tips as in plotted ladderized tree Hi All, Any advice as to how to extract the tips of a tree ordered as in a ladderized tree? Thanks! -Ivalu ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] Calling Clustal in R the DNAbin format for storing nucleotides
Hi Ben, Clustal accepts labels up to 30-character long. You can solve your problem by changing the labels using the function makeLabel (see its help page for some details). I'll change clustal() to avoid this problem in the future. You can find details on DNAbin here: http://ape.mpl.ird.fr/ape_development.html Note that ape provides a number of high-level functions in R, so that it's not always needed to work at the C level. You can find more on these R functions at the following help pages (and some links therein): ?DNAbin ?as.DNAbin ?base.freq ?seg.sites ?where Best, Emmanuel -Original Message- From: Ben Ward (TSL) ben.w...@sainsbury-laboratory.ac.uk Sender: r-sig-phylo-boun...@r-project.org Date: Mon, 27 May 2013 08:21:41 To: r-sig-phylo@r-project.orgr-sig-phylo@r-project.org Subject: [R-sig-phylo] Calling Clustal in R the DNAbin format for storing nucleotides Hi all, I've recently turned to R for scripting an analysis consisting of many boring jobs repeated with many files I could automate. I'm learning perl and had a go with BioPerl but had trouble getting PhyML to get called from within it using the run modules - any who I'm a much longer R user since I began my first degree with more classic stats and model fitting, so I've returned to it for my bioinformatics needs! I want to run the Clustal program from my R script using the function provided in the ape package by Paradis et al, but I kept getting an error: Error in `[.default`(res, , FXCNDTJ02R756X, drop = FALSE) : subscript out of bounds So naturally I called up the function name and had a nosy at what it does – the error occurs after Clustal has successfully run at this line(s): if (original.ordering) res - res[labels(x), ] res Which is where it re-orders sequences according to the original alignment fed in. Herin lies the problem: If I do labels(x) to get the original labels of the sequences (sequences read in using the read.dna() function of ape in fasta format), the labels are like this: FXCNDTJ02RW9CX length=278 xy=3136_3087 region=2 run=R_2009_06_11_14_18_55_ However the labels of the aligned sequences (labels(res)) are like this: FXCNDTJ02RW9CX So the indexing of the matrix containing the aligned sequences is failing because the labels are changed at some point in the function, possibly when the input sequences are written to a tmpdir for the system command to run clustal to find. Has anyone else experience issues similar to this? Of course if I set the original.ordering variable to false this part would not cause the error since the if statement would not evaluate as true. Does anybody have any suggestions? I have a second question: In addition, is there a source as for how nucleotides data are stored in DNAbin and how/why storing them as binary vectors (I've read the brief mention in Paradis' Springer book) speeds up operations?. If I try to look at the structure of a DNAbin object I see in the matrix raw [1:257] 88 88 18 18 … I'm writing a package that implements some stuff my colleagues have used and I figure I want as much compatibility with existing bioinformatics packages for R as possible: up to now what I've written just uses characters and strings of characters a,t,g,c, but if I can learn the ins and outs of the DNAbin format more than the overview given in the book I have it would be very useful. Thanks in advance and best, Ben W. The Sainsbury Lab and UEA(ENV) [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] dist.dna returning NaN's for 3 alignments
Hi Ben, See the option pairwise.deletion of dist.dna. You can use image() also to have a view on the distribution of the -. Cheers, Emmanuel -Original Message- From: Ben Ward (TSL) ben.w...@sainsbury-laboratory.ac.uk Sender: r-sig-phylo-boun...@r-project.org Date: Tue, 28 May 2013 15:30:50 To: r-sig-phylo@r-project.orgr-sig-phylo@r-project.org Subject: [R-sig-phylo] dist.dna returning NaN's for 3 alignments ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
[R-sig-phylo] new ape's web site (and singleton nodes)
Hi all, ape's web site has been moved to this URI: http://ape-package.ird.fr/ (Currently, the redirection from the old URI does not work correctly. I hope it will be fixed soon.) Regarding the issue of singleton nodes and phylomatic, this is mentioned in ape's FAQ (FAQ #15) but in a different way than originally asked by Kellen. I have added a link to Megan's excellent post. Supporting singleton nodes is one feature that might be included in ape in the future (among many others). Best, Emmanuel Fri, 7 Jun 2013 00:32:58 -0400 Liam J. Revell liam.rev...@umb.edu: Hi Kellen. I'm not sure if this is still relevant, but I just posted source code (http://www.phytools.org/read.newick/v0.4/read.newick.R) for a function that will read this type of Newick string (i.e., with node labels singletons) into R. To create the attached plot I had to first collapse singles, i.e., require(phytools) source(read.newick.R) tree-read.newick(file=output.txt) tree-collapse.singles(tree) plotTree(tree,type=fan) All the best, Liam Liam J. Revell, Assistant Professor of Biology University of Massachusetts Boston web: http://faculty.umb.edu/liam.revell/ email: liam.rev...@umb.edu blog: http://blog.phytools.org On 6/6/2013 4:25 PM, Megan Bartlett wrote: Dear Kellen, It looks to me like the problem is that you have genera labels containing only one species, like (Sabatia_angularis:46.296295)sabatia. The frustrating thing about this notation is that Phylocom is fine with this, but R can't read it, even though it's technically legal in the Newick format. I don't think collapse.single would solve your problem, since R isn't even able to input the file to perform a function on it. For troubleshooting, I would try running clean.phylo on this tree in Phylocom, and seeing if that solves your problem. If so, I would run clean.phylo on your undated tree before you run BLADJ. (This would be your Phylomatic output if you assembled your undated tree with Phylomatic). Because having node ages for singleton taxa will not affect your branch lengths for that taxa (for example, Sabatia angularis will have the same branch length whether or not you have an age for the genus Sabatia), you should get the same results. I believe the Phylocom manual discusses this in the Phylomatic section. Best, Megan [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] compare.gee issues with multiple predictor models
Hi Sandra, Le 01/07/2013 22:03, sandra goutte a écrit : Hello all, I am sorry if the question has already been answered, i have not found it int the archive. I am using compar.gee to look at possible correlations between behavioral traits and ecological variables. I have two problems: 1) if i try a model with more than 3 predictors, the function cannot compute the p-values. Is that a problem of df? Yes. In other words, your species are too closely related to give enough information to compute the P-values, but this is for GEEs only. Since you use a gaussian family for the response, you could use gls() instead: the estimated coefficients will be the same and you'll get P-values. 2) this one is more problematic to me: if i add predictors in my model, it actually changes the values estimated for the first one. to be clear here is an example of my output: That's expected and this is for any kind of regression models (and most estimation problems). Try with lm() if you want to make sure. Best, Emmanuel # with only canopy as a predictor Call: compar.gee(formula = DF ~ canopy, phy = tree2) Number of observations: 18 Model: Link: identity Variance to Mean Relation: gaussian QIC: 225.2231 Summary of Residuals: Min 1Q Median 3QMax -1.3742940 -0.6158815 1.1275603 2.6748027 10.2016164 Coefficients: EstimateS.E.t Pr(T |t|) (Intercept) 2.21716210 2.366080952 0.937061 0.43327224 canopy 0.02050857 0.006936446 2.956640 0.07879305 Estimated Scale Parameter: 12.71506 Phylogenetic df (dfP): 4.395587 # with canopy and splfrog as predictors Call: compar.gee(formula = DF ~ splfrog * canopy, phy = tree2) Number of observations: 18 Model: Link: identity Variance to Mean Relation: gaussian QIC: 197.5136 Summary of Residuals: Min 1Q Median 3QMax -1.4887445 -0.4669401 1.0553884 2.1243035 9.7526188 Coefficients: Estimate S.E. t Pr(T |t|) (Intercept)-1.2294629060 3.8542626753 -0.3189878 0.8481567 splfrog 0.0638736816 0.0542900060 1.1765274 0.6056463 canopy 0.0308125063 0.0507169033 0.6075392 0.7408100 splfrog:canopy -0.0002361583 0.0007856787 -0.3005787 0.8561095 Estimated Scale Parameter: 12.32929 Phylogenetic df (dfP): 4.395587 The values ar really different! Am i missing something obvious? Thank you all in advance for your help! Sandra. ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] Problems with write.nexus of the package ape
Hi Agus, The problem is that spaces are ignored in Newick and NEXUS files. Maybe you do not need to replace _ for ? Best, Emmanuel Le 29/06/2013 15:15, Agus Camacho a écrit : Dear all, After correcting the names of a otherwise perfectly functional tree using gsub, tree$tip.label=gsub(_, , tree$tip.label the new tree looks nice, but when i save it using write.nexus and reopen it, i get: wtree=read.nexus(tree_corrected_names.nexus)Warning message:In matrix(x, ncol = 2, byrow = TRUE) : data length [105] is not a sub-multiple or multiple of the number of rows [53] wtree$tip.label [1] Tupinambis 2ocellifer Kentropyx [5] 4atriventris Anotosaura 6 [9] percarinatum Arthrosaura 8reticulata [13] Bachia 10 panoplia Bachia [17] 12 bicarinatus Placosoma14 [21] glabellumNeusticurus 16 quadrilineata [25] Cercosaura 18 ocelatta Cercosaura [29] 20 brachylepis Heterodactylus 22 [33] modesta Iphisa 24 paeminosus [37] Gymnophthalmus 26 ablepharaCalyptommatus [41] 28 leiolepisCalyptommatus30 [45] atticolusMicrablepharus 32 agilis [49] Vanzosaura 34 erythrocercusProcellosaurinus [53] 1 thanks in advance Agus ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
[R-sig-phylo] workshop on phylogeography
Hi all, I'm looking for information on workshops, trainings or courses that include phylogeography. That could be with R of course, but not absolutely necessarily. If you organize, or know about, such a course or workshop, at the PhD-candidate level and/or above, and in English preferably, please let me know. Thanks in advance. Best, Emmanuel ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] PGLS vs lm
Hi Tom, In an OLS regression, the residuals from both regressions (varA ~ varB and varB ~ varA) are different but their distributions are (more or less) symmetric. So, because the residuals are independent (ie, their covariance is null), the residual standard error will be the same (or very close in practice). In GLS, the residuals are not independent, so this difference in the distribution of the residuals affects the estimation of the residual standard errors (because we need to estimate the covaraince of the residuals), and consequently the associated tests. Best, Emmanuel Le 11/07/2013 11:03, Tom Schoenemann a écrit : Hi all, I ran a PGLS with two variables, call them VarA and VarB, using a phylogenetic tree and corPagel. When I try to predict VarA from VarB, I get a significant coefficient for VarB. However, if I invert this and try to predict VarB from VarA, I do NOT get a significant coefficient for VarA. Shouldn't these be both significant, or both insignificant (the actual outputs and calls are pasted below)? If I do a simple lm for these, I get the same significance level for the coefficients either way (i.e., lm(VarA ~ VarB) vs. lm(VarB ~ VarA), though the values of the coefficients of course differ. Can someone help me understand why the PGLS would not necessarily be symmetric in this same way? Thanks, -Tom outTree_group_by_brain_LambdaEst_redo1 - gls(log_group_size_data ~ log_brain_weight_data, correlation = bm.t.100species_lamEst_redo1,data = DF.brain.repertoire.group, method= ML) summary(outTree_group_by_brain_LambdaEst_redo1) Generalized least squares fit by maximum likelihood Model: log_group_size_data ~ log_brain_weight_data Data: DF.brain.repertoire.group AIC BIClogLik 89.45152 99.8722 -40.72576 Correlation Structure: corPagel Formula: ~1 Parameter estimate(s): lambda 0.7522738 Coefficients: Value Std.Error t-value p-value (Intercept) -0.0077276 0.2628264 -0.029402 0.9766 log_brain_weight_data 0.4636859 0.1355499 3.420778 0.0009 Correlation: (Intr) log_brain_weight_data -0.637 Standardized residuals: Min Q1Med Q3Max -1.7225003 -0.1696079 0.5753531 1.0705308 3.0685637 Residual standard error: 0.5250319 Degrees of freedom: 100 total; 98 residual Here is the inverse: outTree_brain_by_group_LambdaEst_redo1 - gls(log_brain_weight_data ~ log_group_size_data, correlation = bm.t.100species_lamEst_redo1,data = DF.brain.repertoire.group, method= ML) summary(outTree_brain_by_group_LambdaEst_redo1) Generalized least squares fit by maximum likelihood Model: log_brain_weight_data ~ log_group_size_data Data: DF.brain.repertoire.group AIC BIC logLik -39.45804 -29.03736 23.72902 Correlation Structure: corPagel Formula: ~1 Parameter estimate(s): lambda 1.010277 Coefficients: Value Std.Error t-value p-value (Intercept) 1.2244133 0.20948634 5.844836 0. log_group_size_data -0.0234525 0.03723828 -0.629796 0.5303 Correlation: (Intr) log_group_size_data -0.095 Standardized residuals: Min Q1Med Q3Max -2.0682836 -0.3859688 1.1515176 1.5908565 3.1163377 Residual standard error: 0.4830596 Degrees of freedom: 100 total; 98 residual _ P. Thomas Schoenemann Associate Professor Department of Anthropology Cognitive Science Program Indiana University Bloomington, IN 47405 Phone: 812-855-8800 E-mail: t...@indiana.edu Open Research Scan Archive (ORSA) Co-Director Consulting Scholar Museum of Archaeology and Anthropology University of Pennsylvania http://www.indiana.edu/~brainevo [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] PGLS vs lm
Hi all, I would like to react a bit on this issue. Probably one problem is that the distinction correlation vs. regression is not the same for independent data and for phylogenetic data. Consider the case of independent observations first. Suppose we are interested in the relationship y = b x + a, where x is an environmental variable, say latitude. We can get estimates of b and a by moving to 10 well-chosen locations, sampling 10 observations of y (they are independent) and analyse the 100 data points with OLS. Here we cannot say anything about the correlation between x and y because we controlled the distribution of x. In practice, even if x is not controlled, this approach is still valid as long as the observations are independent. With phylogenetic data, x is not controlled if it is measured on the species -- in other words it's an evolving trait (or intrinsic variable). x may be controlled if it is measured outside the species (extrinsic variable) such as latitude. So the case of using regression or correlation is not the same than above. Combining intrinsic and extinsic variables has generated a lot of debate in the literature. I don't think it's a problem of using a method and not another, but rather to use a method keeping in mind what it does (and its assumptions). Apparently, Hansen and Bartoszek consider a range of models including regression models where, by contrast to GLS, the evolution of the predictors is modelled explicitly. If we want to progress in our knowledge on how evolution works, I think we have to not limit ourselves to assess whether there is a relationship, but to test more complex models. The case presented by Tom is particularly relevant here (at least to me): testing whether group size affects brain size or the opposite (or both) is an important question. There's been also a lot of debate whether comparative data can answer this question. Maybe what we need here is an approach based on simultaneous equations (aka structural equation models), but I'm not aware whether this exists in a phylogenetic framework. The approach by Hansen and Bartoszek could be a step in this direction. Best, Emmanuel Le 13/07/2013 02:59, Joe Felsenstein a écrit : Tom Schoenemann asked me: With respect to your crankiness, is this the paper by Hansen that you are referring to?: Bartoszek, K., Pienaar, J., Mostad, P., Andersson, S., Hansen, T. F. (2012). A phylogenetic comparative method for studying multivariate adaptation. Journal of Theoretical Biology, 314(0), 204-215. I wrote Bartoszek to see if I could get his R code to try the method mentioned in there. If I can figure out how to apply it to my data, that will be great. I agree that it is clearly a mistake to assume one variable is responding evolutionarily only to the current value of the other (predictor variables). I'm glad to hear that *somebody* here thinks it is a mistake (because it really is). I keep mentioning it here, and Hansen has published extensively on it, but everyone keeps saying Well, my friend used it, and he got tenure, so it must be OK. The paper I saw was this one: Hansen, Thomas F Bartoszek, Krzysztof (2012). Interpreting the evolutionary regression: The interplay between observational and biological errors in phylogenetic comparative studies. Systematic Biology 61 (3): 413-425. ISSN 1063-5157. J.F. Joe Felsenstein j...@gs.washington.edu Department of Genome Sciences and Department of Biology, University of Washington, Box 355065, Seattle, WA 98195-5065 USA ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] number of fitted parameters in lm and pgls in R
Hi Sereina, You will get the same AIC with lm and gls, for a given regression model, if: 1. the GLS model is fitted assuming independence of observations (this is the default in nlme::gls), 2. the GLS model is fitted by ML (this is not the default). Some correlation structures require to estimate some additional parameters (eg, corPagel) and others don't (eg, corBrownian, or if you set fixed = TRUE when building the correlation structure). To check that, use the generic function logLik which returns the number of free parameters (as 'df'). Best, Emmanuel Le 30/07/2013 20:59, Sereina Graber a écrit : Hi everybody, I was comparing a simple lm model and a pgls model in R and realised that the AICs are not identical. I found out that for the pgls there is one fitted parameter (k) less than in the lm. Can anybody explain me the difference and why in a simple lm there is one more fitted parameter in the model compared to a PGLS? Imporant to note is that of course in both models I included the extact same covariates and response variable and further all the estimates, std. errors etc are identical except the AIC. For any help I am very grateful. Best ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] problem with corMartins (ape)
Hi, Xavier: if I understand correctly what you describe, this can be done with: V - vcv(phy, corr = TRUE) diag(V) - 0 vcv() returns the expected variance-covariance matrix under Brownian motion for a given tree. If this tree is not ultrametric (I guess this is what you mean with where the tips do not have the same lengths) then it is correct that the variances are not the same. Note that if a tree is rescaled (say by multiplying its branch lengths) then the result of vcv() will be different but vcv(, corr = TRUE) will return the same (correlation) matrix. Anyway, if you really found some wrong results, please report this. Sandra: thanks for solving your problem by yourself _and_ giving some feedback on the list. I guess the information you needed can be found in the vignette delivered with ape. Best, Emmanuel Le 01/08/2013 22:13, Xavier Prudent a écrit : Salut Sandra, I have experienced that the matrix given by vcv can lead to wrong results if taken as a correlation matrix in the case of trees where the tips do not have the same lengths. I then built my own correlation, using the common branch length shared by two tips, normalized by this same length plus the lengths of the last branches of these tips. The diagonals being forced to zero. If you are interested in it, I can send you the code, cheers, Xavier 2013/8/1 sandra goutte gou...@mnhn.fr Ok so i found out what i was doing wrong and i just answer my own post so nobody will waste time answering me: i was not making a proper weight matrix using vcv(), so i fixed the problem by using 1/distance matrix from this variance-covariance matrix and then fixing the sum of each row to 1. Now the Moran.I behaves as expected with random data for both OU and BM models. Sandra. 2013/7/29 sandra goutte gou...@mnhn.fr Dear list, I am trying to run a gls with an OU correlation structure using corMartins. I first wanted to check whether i had a phylogenetic autocorrelation in my trait, so i used the Moran's I index (Moran.I): vcovm-vcv(corMartins(1,tree, fixed=T)) Moran.I(trait, vcovm, alt=greater) And i obtained a p-value equal to zero. Changing the alpha parameter, i kept getting p-values=0. For the Brownian motion however, i got a normal p-value. I then tried with random values for my trait, and when for a brownian correlation structure i get what i expect, that is most of the time non-signiifcant autocorrelation, with the cor.Martins structure i always get significant p-values (although not equal to zero). In addition, if i run the GLS with the corMartins structure like this: corm-corMartins(1,phy=tree, fixed=T) glsp- gls(trait~var1+var2+var3, correlation=corm, data=data) i get better and better AIC for increasing the alpha parameter (indefinitely!), which would mean that the stronger the constraint on the trait, the better is the fit. Does anyone know why the corMartins is giving me these wierd results? Am I using this function in the wrong way or is it a bugg? Thank you in advance for your help, Sandra. -- PhD Student Muséum National d'Histoire Naturelle Département Systématique et Évolution USM 601 / UMR 7205 Origine, Structure et Évolution de la Biodiversité Reptiles Amphibiens - Case Postale 30 25 rue Cuvier F-75005 Paris Tel : +33 (0) 1 40 79 34 90 Mobile: +33 (0) 6 79 20 29 99 -- PhD Student Muséum National d'Histoire Naturelle Département Systématique et Évolution USM 601 / UMR 7205 Origine, Structure et Évolution de la Biodiversité Reptiles Amphibiens - Case Postale 30 25 rue Cuvier F-75005 Paris Tel : +33 (0) 1 40 79 34 90 Mobile: +33 (0) 6 79 20 29 99 [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] rooting multiple trees
Hi Silvia, Check the help page of root(): it says the tree is unrooted before before being rooted. So your 2nd output is as expected. See the options of this function to resolve the root. For the 1st method, I think you should not do unclass(). best, Emmanuel -Original Message- From: SILVIA CALVO ARANDA silvit...@hotmail.com Sender: r-sig-phylo-boun...@r-project.org Date: Wed, 7 Aug 2013 06:08:18 To: Liam J. Revellliam.rev...@umb.edu Cc: r-sig-phylo@r-project.orgr-sig-phylo@r-project.org Subject: Re: [R-sig-phylo] rooting multiple trees ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] Substitute for functions on the laser package
Hi Mariana, You may try the function bd.time in ape. The companion paper gives examples similar to what laser does. Best, Emmanuel -Original Message- From: Mariana Vasconcellos marian...@utexas.edu Sender: r-sig-phylo-boun...@r-project.org Date: Tue, 3 Sep 2013 23:20:17 To: Liam J. Revellliam.rev...@umb.edu Cc: r-sig-phylo@r-project.org Subject: Re: [R-sig-phylo] Substitute for functions on the laser package Thanks, Liam! But, unfortunately I don't know how to build from source. I downloaded Xcode and the laser_2.3.tar.gz. But, I have no idea of how to install from source. If someone could help with any advise, that would be great! Also, does anyone have another option to calculate decreasing speciation rate, increasing extinction rate or both using a different package? Thank you very much! -- Mariana Vasconcellos Ecology, Evolution Behavior Integrative Biology The University of Texas at Austin On Sep 3, 2013, at 10:05 PM, Liam J. Revell liam.rev...@umb.edu wrote: Hi Mariana. You can download old versions of laser from the CRAN archive (http://cran.r-project.org/src/contrib/Archive/laser/); however you will have to build from source. This is easy if you have Xcode (for Mac OS) or a gcc compiler installed. If you do not, and cannot figure out how to install from source, then respond to the list I'm sure someone will help out post a package binary for you. - Liam Liam J. Revell, Assistant Professor of Biology University of Massachusetts Boston web: http://faculty.umb.edu/liam.revell/ email: liam.rev...@umb.edu blog: http://blog.phytools.org On 9/3/2013 10:39 PM, Mariana Vasconcellos wrote: Dear all: I just performed a test of the gamma-statistic on my tree and I found that diversification is not constant in time. So, I would like to perform the function fitSPVAR, fitEXVAR and fitBOTHVAR using the laser package. But, I just saw that the laser package was removed from the CRAN repository. Could anyone tell me what other alternative package I could use to develop and test models of decreasing speciation rate, increasing extinction rate or both? Does MEDUSA do this sort of analyses? Thank you for your help! Mariana [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] selecting sequences from an alignment
Hi, What about this: match(selected, rownames(woodmouse)) ? If this is not what you want yet, please give an example of the output you wish to get. Best, Emmanuel -Original Message- From: john d dobzhan...@gmail.com Sender: r-sig-phylo-boun...@r-project.org Date: Sat, 5 Oct 2013 08:20:34 To: mailman, r-sig-phylor-sig-phylo@r-project.org Subject: [R-sig-phylo] selecting sequences from an alignment Hi David, Thanks for your input. However, I guess I should have been more specific in my question. I'd need to extract part of the alignment based on a list of taxon names. Is there a simple way to know which lines in the DNAbin object correspond to a list of taxon names? Thanks! On Fri, Oct 4, 2013 at 11:35 PM, David Winter david.win...@gmail.com wrote: Hey Carlos, You can use the normal x[ i , j ] style to extract subsections of a DNAbin object: sub_alignment - woodmouse[selected,] To see what's going on check out ?DNAbin and dim(woodmouse) dim(sub_alignment) David [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] most efficient way to read trees from huge text file
Hi, Le 10/10/2013 21:07, Juan Antonio Balbuena a écrit : Hello, I need to handle in R 10M trees produced with the program evolver of PAML. With a smaller number of trees one could create a multi-phylo object with ape as tree.m - read.tree(filename.tre) but this is impractical in this case because it takes all the RAM memory. I've tried something like library(ape) con = file(evolver.out, open=r) # where evolver.out contains 10M trees lin = readLines(con) for (i in 1:length(lin)) { + tree - read.tree(lin[i]) Replace the line above by: tree - read.tree(text = lin[i]) This may work if PAML has actually written one tree per line (ie, each tree ends with a '\n'). HTH, Emmanuel + # my additional syntax mrca analysis etc. here + } But it doesn't work: Error in file(file, r) : cannot open the connection In addition: Warning messages: 1: closing unused connection 3 (evolver.out) 2: In file(file, r) : cannot open file 'S6: 0.33544, (S1: 0.02314, S10: 0.02314): 0.31230): 0.53388, S5: 0.86932): 0.04830, (S4: 0.51901, (S7: 0.05901, S2: 0.05901): 0.46000): 0.39861): 0.08238, (S3: 0.30236, (S8: 0.13236, S9: 0.13236): 0.17000): 0.69764);': No such file or directory I would very much appreciate your help. Note that computing efficiency is key here. Thank you very much for your attention. Juan A. Balbuena ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] Problems with bootstraps of NJ tree from SSRs data
Le 16/10/2013 00:20, Vojtěch Zeisek a écrit : Hi, Emmanuel, thank You very much! Dne Út 15. října 2013 21:45:12, Emmanuel Paradis napsal(a): Hi Vojtěch, Le 14/10/2013 23:14, Vojtěch Zeisek a écrit : Hello, I know this is FAQ, but I still haven't found solution, which would work for me... :-( I'd like to get unrooted bootstraped NJ tree. I was looking through discussions (incl. on this list), but proposed solutions seem very complicated or not suitable for my data/method. Function boot.phylo() from ape package looks promising, but it causes R to crash... This is my workflow: # Read data mydata.loci - read.loci(mydata.txt, header=TRUE, loci.sep=\t, allele.sep=/, col.pop=2, col.loci=4:17, row.names=1) # Convert to genind object mydata.genind - loci2genind(mydata.loci) # Compute genetic distances mydata.dist - dist(mydata.genind, method=euclidean, diag=TRUE, upper=TRUE) # Do the NJ - the last command, which works mydata.nj - nj(mydata.dist) # Bootstrap the tree - I see progress bar running towards 100% and when it reaches that value, R session crashes... mydata.boot - boot.phylo(mydata.nj, mydata.genind$tab, function(xx) nj(dist(xx)), B=100) With this command, you resample the columns of the matrix of alleles. I am not sure this is the best thing to do. I'd rather resample the matrix of loci, like this: mydata.boot - boot.phylo(mydata.nj, mydata.loci, function(xx) nj(dist(loci2genind(xx))), B = 100) Perfect, it works! I suppose the problem with your previous implementation (resampling the matrix of alleles) is that most alleles are at low frequencies (as common with micro-sat data), so most bootstrap samples were not meaningful. And what if I'd like to make the bootstrap of population tree? I then can't use mydata.loci, as the matrix has different size. Or if I'd like to bootstrap tree based on another method like UPGMA? If I make tree by hclust, the object is of class hclust, not phylo (which is required by boot.phylo). For the latter, it's easy: you have to incorporate as.phylo in the estimation procedure. The recommended way is to build a function that makes the phylo object from the original data, something like: f - function(x) as.phylo(hclust(dist(loci2genind(x Then build the tree directly using f: mytree - f(mydata.loci) and do the bootstrap: myboot - boot.phylo(mytree, mydata.loci, f) This way, you can modifiy f() as you want and rerun the bootstrap accordingly. Best, Emmanuel Note that if you have many individuals, computing the bootstrap values may take some time (I'm working on a new code that will be faster). HTH. Emmanuel Thank You have a nice day! Vojtěch # Plot the tree (this works, of course) plot(mydata.nj) # Add bootstrap values (mydata.boot doesn't exist) nodelabels(mydata.boot) What do I do wrong? Is there any better way, how to calculate bootstraps for microsatellite data? Sincerely, Vojtěch ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] match two tree edges
Hi, Have a look at the function makeNodeLabel with the option method = md5sum: it will create a label to each node depending on the tips descending from it. You can then use them to match nodes and edges of different trees. Best, Emmanuel -Original Message- From: Jingchun Li jingc...@umich.edu Sender: r-sig-phylo-boun...@r-project.org Date: Mon, 11 Nov 2013 19:06:08 To: Liam J. Revellliam.rev...@umb.edu Cc: r-sig-phylo@r-project.org Subject: Re: [R-sig-phylo] match two tree edges Thanks Liam. Unfortunately for now, the second tree is read in separately because it is generated from a different software. So the nodes numbers do not match. But I get the general idea. I'll have to think more about it. Cheers, Jingchun On Mon, Nov 11, 2013 at 6:02 PM, Liam J. Revell liam.rev...@umb.edu wrote: Hi Jingchun. The function ladderize in ape does not change the node numbers of the tree in phy$edge, only the order of the edges in the matrix. To verify this, you can use the phytools function matchNodes which matches nodes between two trees. Note that if a tree that is ladderized using ladderize is written to then read from file or a text string, the node numbers *will* change. That being said, since the order of the rows in phy$edge is different between the two trees, you will need to get the new order (in terms of the original tree) to plot your edge lengths. This might look something like this: ii-sapply(B$edge[,2],function(x,y) which(y==x),y=A$edge[,2]) plot(B,no.margin=TRUE) edgelabels(A$edge.length[ii]) If your node numbers *do not* match, then this gets a little more complicated - but the same general logic can be used. All the best, Liam Liam J. Revell, Assistant Professor of Biology University of Massachusetts Boston web: http://faculty.umb.edu/liam.revell/ email: liam.rev...@umb.edu blog: http://blog.phytools.org On 11/11/2013 5:00 PM, Jingchun Li wrote: Dear all, I am trying to make a tree figure and I can not figure out the right way to do it. I have two trees, A and B. They have the same set of taxa but are in different orders. A is in the default order when read by read.tree(). B is ladderized. For my purpose, I want to plot A and label edges of A using the corresponding edge length of B. I struggled for a while and can't find a way to find matching edges between A and B. Is there a function out there that already does this or can anyone point out me a way? Thank you very much! Sorry I can't just ladderize A as well and make them match for some other reasons. Best, Jingchun [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r- sig-ph...@r-project.org/ [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] About function .getSEs
Hi, From R: ape:::.getSEs You can also get the code from the sources on CRAN. Best, Emmanuel --Original Message-- From: Tane Kim Sender: r-sig-phylo-boun...@r-project.org To: r-sig-phylo@r-project.org Subject: [R-sig-phylo] About function .getSEs Sent: 16 Nov 2013 02:54 Hi,I have used the function ace in ape packages, and I am curious how the standard errors are calculated.I realized that .getSEs function is responsible for it, but I cannot access to its code.Would there be any way to see the code? Thanks. [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] Mismatch distribution (pegas): expected versus empirical and other questions
Hi David, Thanks for this. I have included it into pegas with an option to select the types of lines, and the legend is moved to the top of the plot. I'll release this new version by the end of the year. I'm taking this opportunity to advertise the last version of coalescentMCMC (0.4) which is now stable. I made a few comparisons with LAMARC on real and simulated data and the results agree overall (but faster to get with coalescentMCMC). Four different time-dependent models are available (LAMARC has only one). The package comes with a vignette that explains how to run several chains sequentially, in parallel, or in interaction, and how to compare models (LRT, DIC, ...) For the moment, coalescentMCMC is focused on analysis of temporal dynamics with DNA sequences. If this fits your need, it is worth a try. Best, Emmanuel -Original Message- From: David Winter david.win...@gmail.com Sender: r-sig-phylo-boun...@r-project.org Date: Tue, 10 Dec 2013 12:47:39 To: Alastair Pottspott...@gmail.com Cc: mailman, r-sig-phylor-sig-phylo@r-project.org Subject: Re: [R-sig-phylo] Mismatch distribution (pegas): expected versus empirical and other questions Hi Alistair, I wrote a function to make the standard expectation under constant population size quite a while ago. It's up here https://github.com/dwinter/Pegas/blob/new_functions/R/MMD.stable.R It's been a little while since I've had to think about mismatch distributions, so, though I'm sure I would checked everything out at the time, you should probably confirm this function behaves as expected. (BTW, Emmanuel, if you want to include any of that function in PEGAS you are most welcome to) David Winter On Tue, Dec 10, 2013 at 6:27 AM, Alastair Potts pott...@gmail.com wrote: Hi all, The mismatch distribution function in pegas (MMD) produces the familiar histogram and then also a line that I always took as the theoretical expected distribution under a scenario of constant population size. I had a quick look at the MMD function and noticed that this line is not calculated using the theoretical expectations based on theta, but rather is some sort of moving-average based on the density() function. Does anyone know how to implement the correct theoretical expectation? I have dipped into the literature and am now completely lost. Also, there are a range of statistics (such as the raggedness index) and Arlequin has somehow managed to calculate a significance value for these statistics. Has anyone implemented (or can easily implement) these statistics and significance tests in R? Thanks in advance for any comments, thoughts and solutions. Cheers, Alastair Potts [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] plotting support value
Hi Thomas, Have you tried prop.clades? Best, Emmanuel -Original Message- From: Thomas Manke ma...@ie-freiburg.mpg.de Sender: r-sig-phylo-boun...@r-project.org Date: Mon, 16 Dec 2013 11:52:33 To: r-sig-phylo@r-project.org Subject: [R-sig-phylo] plotting support value Dear Forum, I have a list containing sampled trees for the same taxa and would like to plot their consensus tree with the support values (absolute numbers or frequencies) for the various branches. (as done by the Consense program of the Phylip suite) I understand that boot.phylo() returns an object bs that contains such information as bs$BP. My scenario is similar, but the sampled trees are not based on sequence information and boot.phylo() is not applicable here. I assume the support values can be obtained from prop.part(), but I could not figure how. Thank you for any help or suggestions, Thomas #tr is list of trees ct-consensus(tr, p=0.5) # majority consensus pt-prop.part(tr) pc-prop.part(ct) # calculate support values sv - f(pc, pt) = ??? plot(ct); nodelabels(sv) ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] Using the function mst in APE to produce coloured graphs
Hi, Try mst() in pegas. It returns an object of class haploNet with a more flexible plot method. Best, Emmanuel -Original Message- From: Stefani stef...@irsa1.irsa.cnr.it Sender: r-sig-phylo-boun...@r-project.org Date: Wed, 18 Dec 2013 17:35:26 To: r-sig-phylo@r-project.org Subject: [R-sig-phylo] Using the function mst in APE to produce coloured graphs Dear All, I'm a new user of R in phylogenetics and population genetics. So, please, apologize me in advance if what I'm asking is a basic and banal question. Briefly, I used the function mst of the package APE to produce a minimun spanning tree startin from a distance matrix. I used the default argument graph (nsca or cirle), or I plotted the samples by using externally produced coordinates (i.e. PCA coordinates). In all the cases, I didn't manage to apply colours to a set of defined categories of samples. I tried by defining a categorical vector and using the argument col of the function plot, like in this example: colour-as.character(PCAcor[,3]) plot(MSTtre, x1 = PCAcor[,1], x2 = PCAcor[,2], col=colour) Or I directly use the argument col of the function plot, like this (one among the various tempts): plot(MSTtre, graph=nsca, col=blue, black). In any case, I didn't get back any error message, but the plot was always black and white. May you suggest me the right approach? Many thanks! Regards, Fabrizio ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] Warning messages if using family=poisson in compar.gee, and changing link function
Hi Nina, Don't confound warning with error. If you have the former, then R certainly didn't crash. These warnings seem to come from the fact that the model fitting did not converge after the default number of iterations. Unfortunately, this cannot be controlled from compar.gee, so you can edit the function with fix(compar.gee), then find the line starting with: geemod - do.call(gee, list(. and insert in the list this item: maxiter = 100 (or more; don't forget to separate with a comma). Save and close the editor. For the second problem you report (which is really an error), this is a bug which you can fix again with fix(compar.gee) and find this line: Qlik - switch(fname, gaussian . and replace 'fname' by 'fname$family'. BTW, your formula can be simply: FlowerNumber ~ sex * PollinationSystem The * codes for main effects + interactions, which is equivalent to (: codes for the interaction): FlowerNumber ~ sex + PollinationSystem + sex:PollinationSystem HTH Emmanuel Le 17/12/2013 21:46, Nina Hobbhahn a écrit : Dear fellow list users, we would like to analyze three variables that are count data (no zeros, no integers) as responses to two categorical predictor variables using compar.gee. An example would be compar.gee(FlowerNumber ~ sex + PollinationSystem + sex*PollinationSystem). One of our variables is normally distributed, the other two can be normalized by log-transformation. So far we have modelled our data using family=gaussian, but we would prefer to model untransformed data using a log-link function, i.e. family=poisson. However, several of our models with family=poisson crash R or result in the following warning messages: Warning messages: 1: In gee(noFH ~ poll * sex, c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, : Maximum number of iterations consumed 2: In gee(noFH ~ poll * sex, c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, : Convergence not achieved; results suspect 3: In gee(noFH ~ poll * sex, c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, : Cgee had an error (code= 104). Results suspect. Can any of you explain why are we getting this warning message? Is there a fix to any of this? Alternatively, if family=poisson really doesn't work for our data, is it possible to change the link function associated with family=gaussian from link=identity to link=log? We tried this by writing compar.gee(model, data, phy, family=gaussian(link=log)) and get the following error message: Error in switch(fname, gaussian = -sum((Y - MU)^2)/2, binomial = sum(Y * : EXPR must be a length 1 vector In addition: Warning message: In if (fname == binomial) W - summary(glm(formula, family = quasibinomial, : the condition has length 1 and only the first element will be used Is it at all possible to change the link function in compar.gee? Any input would be greatly appreciated. Nina Hobbhahn and Megan Welsford ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] ape package gives 'Error in FI[i]:LA[i] : NA/NaN argument'
Hi Matt, This is because this accession number does not point to the sequence. For this particular one, you could use: seq1 - read.GenBank(U00096.3) Best, Emmanuel Le 27/02/2014 01:35, Matt Curcio a écrit : Greetings all,, I received this error while using R version 2.15.1 and ape 3.0.11. Error in FI[i]:LA[i] : NA/NaN argument I have tried several approaches: ref = NC_000913.3 seq1 = read.GenBank(ref) Error in FI[i]:LA[i] : NA/NaN argument ref = read.csv(file=test2.csv, head=T) ref = str(ref) # Where test2.csv is in two rows 'data.frame': 1 obs. of 1 variable: $ seq: Factor w/ 1 level NC+AF8-000913: 1 seq2 = read.GenBank(ref) Error in FI[i]:LA[i] : NA/NaN argument I was wondering if someone could lend a hand based on their own experience. ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] Phylogenetic regression available in R (Grafen, 1989)
Hi Gustavo, Grafen's method is partially implemented in ape. The function corGrafen defines a correlation structure according to Grafen's method (see ?corClasses for all corStruct defined in ape). When used with nlme::gls this makes possible to estimate the parameter of the branch length transformation. There is no correction of the number of degrees of freedom in ape. So I guess that the results from ape::corGrafen and phyreg should be the same if the phylogeny is assumed to be known without error. Best, Emmanuel Le 05/03/2014 14:30, Gustavo Paterno a écrit : Hello Xavier, Thanks for your answer. I also might be wrong, but as far as I know, Ape has the function - compute.brlen which can calculate branch lengths throght Grafen method but can not do phylogenetic regression sensu (Grafen, 1989). Grafen method uses a different approach to control phylogeny (hanging on a Tree”, see paper for details). PGLS uses a covariance matriz to correct for phylogeny, so your degree of fredon remains the same. In Grafen method you can lose degree of freedom depending on your species relatedness. In the other hand, Brunch and crunch functions can not deal with complex models (eg. with continuous and categorigal variables together as explanatory variables) while Grafen method can deal complex models and also work if you do not know your full phylogeny (when if you have politomies) Grafen method was only available in GLIM and SAS until feb 2014. In the way I see, his new package for R is the only one that can really run his full method - phylogenetic regression. Any one has more information about it? Best wishes, On Mar 5, 2014, at 10:04 AM, Xavier Prudent prudentxav...@gmail.com wrote: Hi Gustavo, Thanks for that notification, I may be wrong, but was'nt that method already implemented in the CAPER package by David Orme (function pgls, brunch, crunch)? If yes, then which new functionalities does phyreg bring? Regards, Xavier 2014-03-05 13:46 GMT+01:00 Gustavo Paterno aspessoasmu...@gmail.com: Hello all, I just got the confirmation that the Grafen method for phylogenetic regression (Grafen, 1989) was implemented in R ! The package “phyreg” has lots of details in help. Best wishes for all. Gustavo Paterno ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ -- --- Xavier Prudent Computational biology and evolutionary genomics Guest scientist at the Max-Planck-Institut für Physik komplexer Systeme (MPI-PKS) Noethnitzer Str. 38 01187 Dresden Max Planck-Institute for Molecular Cell Biology and Genetics (MPI-CBG) Pfotenhauerstraße 108 01307 Dresden Phone: +49 351 210-2621 Mail: prudent [ at ] mpi-cbg.de --- [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] Difference running Rscript vs source with ape library
Hi Carlos, I tested your script in both ways and it may work or fail depending on the tree which is generated. However, it seems that failure is more frequent with source (about 70%) than with Rscript (about 40%). One possible explanation for this may be that the random number generator (RNG) is initialized at each call of Rscript whereas it is generated only once for all calls of source; however, why this may make a difference is not clear to me. Anyway, this error should not happen and seems to be due to the small size of the simulated tree. I continue to work on improving the underlying code so that, hopefully, this will be fixed soon. Best, Emmanuel Le 02/05/2014 18:03, Carlos Anderson a écrit : I have a very simple script using bd.time from the ape library that runs OK within the R environment or when running the script using Rscript, but it fails when calling it with source. I have posted my problem in detail on stackoverflow.com: http://stackoverflow.com/questions/23416763/difference-running-rscript-vs-source-with-ape-library Thanks in advance. Carlos Anderson Postdoctoral Research Fellow Department of Ecology and Evolutionary Biology University of Michigan carlo...@umich.edu [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] What order are the tips of a tree plotted in?
Hi Henry, I suppose the two trees do not have the same topology (otherwise there would be no difficulty). Have you considered using rotateConstr() on the second tree? That'd be something like: tree2 - rotateConstr(tree2, tree1$tip.label) The node numbering won't be changed, so you can still nodelabels() without trouble. Best, Emmanuel Le 06/05/2014 11:25, Ferguson-Gow, Henry a écrit : Dear List I am trying to plot two trees facing each other with pie charts at the nodes indicating marginal likelihoods of an ancestral state. I have broken the plotting frame into 3 sections, with one in the centre for the tip labels. I am using text(rep()) to plot the tip labels into this section, but they do not match up with the tips plotted by the ladderised tree. It seems like the plot() function is not plotting the tips in the edge order, and so when the tip labels are written in 1:n order, they do not match up. How can I reorder the tip labels to match the order in which the tips are plotted by plot()? I am happy with the tip order on the plotted tree, so I would rather work out how to reorder the tip labels in the text() command, rather than reorder the tips on the tree. I hope that makes sense! Henry Ferguson-Gow [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] adjacency matrix / edgelist from igraph into a phylo class
Hi Wesley, You are right: this definition is out-dated. Have a look at: http://ape-package.ird.fr/ in the 'Development' section, there is a document explaining the structure of the class phylo. Also the second edition of APER has been updated on this point. Best, Emmanuel Le 12/05/2014 10:10, Wesley Goi a écrit : Hi all, I've search quite around for methods of converting from a edgelist (from igraph) to a phylo class taken from the 2006: Analysis of Phylogenetics and Evolution with R the instructions for building an object of the phylo class seems to have been changed. Please correct me if I'm mistaken. An object of class phylo is a list with the following components. edge a two-column matrix where each row represents a branch (or edge) of the tree; the nodes and the tips are symbolized with numbers; the nodes are represented with negative numbers (the root being -1), and the tips are represented with positive numbers. For each row, the first column gives the ancestor. This representation allows an easy manipulation of the tree. ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] plot bars across tips of a circular phylogeny
Hi Matt, See ?phydataplot in ape. This has a couple of examples. Best, Emmanuel Le 14/05/2014 16:44, Matthew Helmus a écrit : Hi all, Does anyone know of R code (or perhaps another program) to plot bars across the tips of a radial/fan phylogeny? Specifically, I have a large phylogeny and a corresponding vector of continuous trait values for the tips, and while I could use those values to vary the color and size of the tip labels, or also to plot points of varying color or size at those tips, I think a better depiction of the data would be to plot bars that vary in height. Thanks!! Matt [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] Distances from nods to the root.
Alexey, See ?dist.nodes in ape. If your tree is named tr, you'll get exactly what you want with: dist.nodes(tr)[, Ntip(tr) + 1] which extracts one column of the matrix returned by dist.nodes. best, Emmanuel Le 21/05/2014 06:56, Alexey Fomin a écrit : Which package allow calculate all distances from all nods to the root? Alexey. [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] cophyloplot: how to marks links differently
Hi Juan, cophyloplot() has the options col, lwd, and lty which specify the aspects of the association lines. Le 27/05/2014 16:09, Juan Antonio Balbuena a écrit : Hello This is a simple question and hope that there is a simple answer. Plotting a tanglegram, I would like to write a function where congruent tips on both trees are marked, say, with continuous lines, whereas incongruent tips are marked differently (for instance with stippled lines). A starting point could be using cophyloplot() specifying assoc=NULL and then use segment() to plot the lines accordingly, but how can one get the x, y coordinates for segment()? This is not possible (at least easily) because this function uses a different system than plot.phylo(). HTH Best, Emmanuel Any thoughts will be much appreciated. Juan A. balbuena -- Dr. Juan A. Balbuena Cavanilles Institute of Biodiversity and Evolutionary Biology University of Valencia http://www.uv.es/~balbuena http://www.uv.es/%7Ebalbuena P.O. Box 22085 http://www.uv.es/cophylpaco http://www.uv.es/cavanilles/zoomarin/index.htm 46071 Valencia, Spain e-mail: j.a.balbu...@uv.es mailto:j.a.balbu...@uv.estel. +34 963 543 658fax +34 963 543 733 *NOTE!*For shipments by EXPRESS COURIER use the following street address: C/ Catedrático José Beltrán 2, 46980 Paterna (Valencia), Spain. ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] Pegas Tajimas Test
Hi Peters, This test cannot be used with 3 sequences. I've added a warning message about this. Best, Emmanuel Le 25/06/2014 12:17, Peters, Stuart a écrit : Hi, I am trying to run the Tajimas.test function from the r package 'Pegas' on a small sample of 3 DNA sequences and I get this error: Error in if (p 0.5) 2 * p else 2 * (1 - p) : missing value where TRUE/FALSE needed ? Any ideas? Kind regards, Stuart [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] convergence in compar.gee (ape)
Dear Sereina, The number of iterations is not output by default. You can modify the function with fix(compar.gee), find this line: geemod - do.call(gee, list(formula, and insert this command after it: cat(number of iterations:, geemod$iterations, \n) save and close. HTH Best, Emmanuel Le 12/08/2014 10:57, Sereina Graber a écrit : Dear list members, I have a question concerning the compar.gee function in ape. Often, the models don`t converge, and now I would like to safe whether a model converged or not. Is there a way to somehow get the number of iterations reached the max number, or is there something like a convergence attribute which tells you whether the model converged or not? I couldn`t find anything when looking at the attributes of the model output. Thank you best wishes, Sereina ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] Trying to read a tree with numerical node labels, but without edge length
Hi Liam and others, read.nexus() can read NEXUS files that have no TRANSLATE block. There are two problems with this file: the extra spaces between begin.../end and the semicolon, and the linebreak between the tree declaration and the Newick string. So the following file can be read by read.nexus: #NEXUS begin trees; tree tagged_tree = [U] (A,(B,(C,D)60)100); end; Best, Emmanuel Le 29/08/2014 23:25, Liam J. Revell a écrit : Hi Daniel. I think the problem is actually that read.nexus expects a translation table, and has nothing to do with the labels. You could use this simplified reading function which does not permit a translation table uses phytools read.newick internally: library(phytools) readNexus-function(file){ obj-readLines(file) obj-strsplit(obj[grep(=,obj)], ) obj-sapply(obj,function(x) x[length(x)]) if(length(obj)1) trees-read.newick(text=obj) else { trees-lapply(obj,read.newick) class(trees)-multiPhylo } trees } Let me know if this works. All the best, Liam Liam J. Revell, Assistant Professor of Biology University of Massachusetts Boston web: http://faculty.umb.edu/liam.revell/ email: liam.rev...@umb.edu blog: http://blog.phytools.org On 8/29/2014 4:42 PM, Daniel Rafael Miranda-Esquivel wrote: Dear all, I tried to read a tree like this: #NEXUS begin trees ; tree tagged_tree = [U] (A,(B,(C,D)60)100); end ; using read.nexus; while figtree reads it, ape refuses, telling that there is an error: Error in start:end : NA/NaN argument I tried readNexus in phylobase, but there is a problem reading the labels (in the real tree, not in this example), as the program complains that a label is duplicated. I will be grateful for any tip to solve this problem, All the best, Daniel ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] problem with write.nexus.data
Hi Liam and Nicholas. write.nexus.data() accepts only lists as specified in the help page, but it's a bit of an anomaly since the same help page says the sequences must be aligned. I have modified this function so that it now accepts both lists and matrices. Best, Emmanuel Le 18/09/2014 07:22, Liam J. Revell a écrit : Hi Nicholas. I think this is a bug. Try the following to circumvent: write.nexus.data(as.list(data), file=test.nex, interleaved=TRUE, charsperline=100) Since I don't have your dataset, I can't check it; but it fixed the problem in another dataset with which I was able to reproduce the error. All the best, Liam Liam J. Revell, Assistant Professor of Biology University of Massachusetts Boston web: http://faculty.umb.edu/liam.revell/ email: liam.rev...@umb.edu blog: http://blog.phytools.org On 9/17/2014 10:46 PM, Nicholas Crouch wrote: Hi, I am having a problem with write.nexus.data, such that the file generated is nonsense. I have a very, very large data set, but have been working with a subset trying to solve this problem. My data is in .fasta format, and I am looking to convert it into nexus format. I load the data: library(ape) data - read.dna(concat.fasta, format=fasta, as.matrix=TRUE) This gives the following: data 5 DNA sequences in binary format stored in a matrix All sequences of same length: 5023 Labels: Species etc. Base composition: a c g t Also: class(data) DNAbin This is exactly the same as the woodmouse example data provided in the package ape, which I have been following when trying to solve this issue (see ?write.nexus.data). When I export the data: write.nexus.data(data, file=test.nex, interleaved=TRUE, charsperline=100) a file is produced which begins: BEGIN DATA; DIMENSIONS NTAX=135165 NCHAR=1; FORMAT DATATYPE=DNA MISSING=? GAP=- INTERLEAVE=YES; MATRIX 1 040 etc. I haven't been able to find other posts on this, any help is greatly appreciated. Sincerely, Nick ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] Random sampling of branch lengths
Hi, compute.brtime() can use a set of pre-calculated branching times (the default is indeed a set of coalescent times), for instance: tr - read.tree(text = (A,(((B,C),D),E));) compute.brtime(tr, method = 4:1/4) will produce a tree with root time = 1 and regular inter-node times. Best, Emmanuel Le 25/09/2014 13:43, Arbuckle, Kevin a écrit : Hi, I have a problem that I am not entirely sure how to code appropriately, and any help here would be gratefully appreciated. Imagine we have a tree consisting only of a topology, for example this newick format tree: '(A,(((B,C),D),E));'. Now, consider that we wish to explore how variation in branch lengths on this topology influences the results of a particular comparative method. One way to do this would be to generate lots of trees with randomly sampled branch lengths and look to see whether the conclusions drawn are strongly influenced by the branch lengths chosen. In a simple case where ultrametric trees are unnecessary or at least not considered important then we could just randomly sample edge lengths from a given distribution, but this poses some problems when ultrametric trees are desired. I've attached a figure of the example tree given above for ease of explanation. If the total tree height is some arbitrary value (let's say 1), then the edge length of the branch leading to A is always going to be 1. If there was a sister species to A (let's call it A') along this branch then it would also be easy to do - simply randomly sample a value between 0 and 1 (call this x), assign that to both branches leading the common ancestor (A,A') to A and A', and then assign the edge length from the root to (A,A') as 1-x. But my problem is thinking how to code a general case for the whole tree, incorporating two constraints. Firstly, that the sum of all branches leading from each species to root is 1. Secondly, that branch lengths are dependent on those of their sister taxon. To illustrate the last point consider that the edges of the branches leading from common ancestor (B,C) to each of B and C is 0.2. This automatically constrains the branch leading to D (call it y) to be greater than 0.2 and also that the length of the branch leading from (B,C,D) to (B,C) must equal y-0.2. These dependencies make me unsure how to code such a function to 'randomly' sample branch lengths to generate an ultrametric tree. As an additional complication, though not an absolutely necessary one, consider that we have information about some node ages. For instance, say we know that the age of node (B,C,D,E) is 0.7. How could I incorporate these additional (user-specified) constraints into such a function as I'm looking for? I am currently using the compute.brtime function in ape to obtain branch lengths from a coalescent model, but these are often biased towards 'tippy' trees and so may not be a representative sample of the possible tree space. Am I missing something in the form of an existing function, or does anyone have any code that could do what I'm asking? I hope I have explained myself clearly enough. Cheers, Kev ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
[R-sig-phylo] new version system for ape
Dear all, There is a new system of package version for ape. A package repository has been set up at ape-package.ird.fr hosting testing versions as sources and Windows binaries (also for pegas and coalescentMCMC). The version numbers of the testing packages are always higher than the current versions on CRAN, and lower than the upcoming versions on CRAN. So, new versions can be checked for with the usual R commands. Testing versions should be useful for maintainers who develop packages depending on ape so that they can test the compatibilities before a new version of ape is on CRAN. They can also be used by anyone who wants to try the last new features. One can check if a new testing version is available with: old.packages(repos = http://ape-package.ird.fr/;) The release of a new (stable) version on CRAN is checked with the usual: old.packages() All practical details can be found on ape's web site: http://ape-package.ird.fr/ menu Installation - Stable and Testing Versions. Best, Emmanuel ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] Potential glitch in phymltest
Hi Luiz, Thanks for the fix. It will be in the next release. Cheers, Emmanuel Le 07/11/2014 20:42, Luiz Max Carvalho a écrit : Hello Phylofolks, I've been using ape::phymltest() for a while now on a project and I just realised there is a potential glitch in the function. The reason is that the order of the models in the 'mdl' object inside the function is different from the order in .phymltest.model, what causes the cmd object to mess up. The following MWE should do the job of showing my point: ### begin MWE library(ape) .phymltest.model - c(JC69, JC69+I, JC69+G, JC69+I+G, K80, K80+I, K80+G, K80+I+G, F81, F81+I, F81+G, F81+I+G, F84, F84+I, F84+G, F84+I+G, HKY85, HKY85+I, HKY85+G, HKY85+I+G, TN93, TN93+I, TN93+G, TN93+I+G, GTR, GTR+I, GTR+G, GTR+I+G) phymltest.test - function(seqfile, format = interleaved, itree = NULL, exclude = NULL, execname = NULL, append = TRUE) { N - length(.phymltest.model) format - match.arg(format, c(interleaved, sequential)) fmt - rep(, N) if (format != interleaved) fmt[] - -q boot - rep(-b 0, N) # to avoid any testing mdl - paste(-m, rep(c(JC69, K80, F81,HKY85,F84, TN93, GTR), each = 4)) tstv - rep(-t e, N) # ignored by PhyML with JC69 or F81 inv - rep(c(, -v e), length.out = N) ## no need to use the -c option of PhyML (4 categories by default if '-a e' is set): alpha - rep(rep(c(-c 1, -a e), each = 2), length.out = N) tree - rep(, N) cmd - paste(execname, -i, seqfile, fmt, boot, mdl, tstv, inv, alpha, tree, --append ) imod - 1:N if (!is.null(exclude)) imod - imod[!.phymltest.model %in% exclude] for (i in imod) print(cmd[i]) } phymltest.test(seqfile = seqname, exclude = c(K80, K80+I, K80+G, K80+I+G, F81, F81+I, F81+G, F81+I+G, F84, F84+I, F84+G, F84+I+G, TN93, TN93+I, TN93+G, TN93+I+G), execname = execname) 1] execname -i seqname -b 0 -m JC69 -t e -c 1 --append [1] execname -i seqname -b 0 -m JC69 -t e -v e -c 1 --append [1] execname -i seqname -b 0 -m JC69 -t e -a e --append [1] execname -i seqname -b 0 -m JC69 -t e -v e -a e --append [1] execname -i seqname -b 0 -m F84 -t e -c 1 --append [1] execname -i seqname -b 0 -m F84 -t e -v e -c 1 --append [1] execname -i seqname -b 0 -m F84 -t e -a e --append [1] execname -i seqname -b 0 -m F84 -t e -v e -a e --append [1] execname -i seqname -b 0 -m GTR -t e -c 1 --append [1] execname -i seqname -b 0 -m GTR -t e -v e -c 1 --append [1] execname -i seqname -b 0 -m GTR -t e -a e --append [1] execname -i seqname -b 0 -m GTR -t e -v e -a e --append ## end MWE I think it's easily solved by replacing mdl with paste(-m, rep(c(JC69, K80, F81, F84HKY85, TN93, GTR), each = 4)) Anyways, hope to clarify this minor issue. Cheers, Luiz ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] overall p-value from multiple PGLS regression
Hi Pascal, See ?anova.gls. HTH Emmanuel Le 08/11/2014 17:49, Pascal Title a écrit : Hi all, I am running a number of PGLS regressions, some of which are multiple regressions. I am using the nlme package with the corBrownian error structure. If I build a model M via multiple regression, then I can get a summary of the model that includes p values as well as a number of other statistics. But how can I get an overall p-value for this model? In the same vein, as value, std. error, t-value and p-value are all returned for each of the traits separately, what does one report for multiple regression models? I have been unable to find a paper that actually reported this type of information for such a model. Here is a toy example: require(nlme) require(phytools) tree - pbtree(n=20) dat - as.data.frame(fastBM(tree, nsim=4)) colnames(dat) - paste('trait', 1:4, sep='') M - gls(trait1 ~ trait2 + trait3 + trait4, data=dat, correlation=corBrownian(1, tree)) summary(M) Any help/advice would be greatly appreciated. Thanks! -Pascal ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] advice about diversification analyses
Hi Karla, You may consider using the methods we developed in this paper: Paradis E., Tedesco P. A. Hugueny B. 2013. Quantifying variation in speciation and extinction rates with clade data. Evolution 67: 3617–3627. The associated R code is available on-line as SI of the paper. Best, Emmanuel Le 08/01/2015 13:50, Karla Shikev a écrit : Dear R-sig-phyloers, I have a backbone phylogeny (~200 terminals) and diversity data (~2 species) and a multistate character scored for each tip (including a few polymorphisms). I'd like to estimate both possible effects of different states of this character on lineage diversification and the transition probabilities between states. My impression is that MuSSE wouldn't be able to handle such high level of missing data, and other methods that I've tried will either do one analysis or the other. What methods would you suggest? Thanks! Karla [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/