Re: [R-sig-phylo] Simulation function from Discrete

2011-12-06 Thread Emmanuel Paradis
Hi Ekaphan,

See the function rTraitMult. You have to program your model though.

Cheers,

Emmanuel
-Original Message-
From: Ekaphan Kraichak ekraic...@gmail.com
Sender: r-sig-phylo-boun...@r-project.org
Date: Tue, 6 Dec 2011 09:58:34 
To: r-sig-phylo@r-project.org
Subject: [R-sig-phylo] Simulation function from Discrete

Hi all, 

I am currently doing analyses on phylogenetic correlations of multiple discrete 
traits (based on Pagel 1994 paper). For the most traits, there were only a few 
transitions on the tree, making it difficult to use likelihood ratio tests for 
hypothesis testing. I was wondering if there is a way to simulate character 
data under Pagel's Independent Model in R. This function used to be in the 
Discrete software, which is no longer distributed. 

I am aware of rTraitDisc and sim.char in ape and geiger, but not sure how to 
make it simulate data for correlated traits. Any help on this would greatly be 
appreciated. 

Thank you, 

Ekaphan Kraichak
___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] taxonomic tables and trees

2011-12-13 Thread Emmanuel Paradis
Hi David,

It's in ape:

?as.phylo.formula

Cheers,

Emmanuel
-Original Message-
From: David Buckley dbuck...@mncn.csic.es
Sender: r-sig-phylo-boun...@r-project.org
Date: Tue, 13 Dec 2011 12:56:18 
To: r-sig-phylo@r-project.org
Subject: [R-sig-phylo] taxonomic tables and trees

Hi all,
I remember seeing an R script/application to transform a taxon table (say an 
excell table containing taxonomic information for a group organized as orders, 
families, genera, etc. in columns) onto a 'phylogenetic/taxonomic' tree (groups 
defined by the nested taxonomic ranks), but I can't remember where was that. 
Any hint?

best and thanks

david


David Buckley
Dpt. Biodiversidad y Biología Evolutiva
Museo Nacional de Ciencias Naturales, CSIC
c/José Gutiérrez Abascal 2
28006-Madrid
Spain
Phone: +34 91 411 13 28 ext. 1124
dbuck...@mncn.csic.es

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] Meaning of Transformed Phylogenetically Dependent Variables in GLS?

2012-01-11 Thread Emmanuel Paradis

Hi Nicolai,

Nicolai Cryer wrote on 06/01/2012 01:14:

Hi all,

I am under the impression that, given two continuous variables that can be
shown (Pagel's Lambda or Gittleman  Kot's autocorrelation) to be dependent
on phylogeny, calling gls(variable1 ~ variable 2, correlation = corStruct)
returns a regression on derived variables that have been transformed given
the known correlation structure (under some evolutionary model).


Yes, your impression is correct: these are the phylogenetically 
independent contrasts.



My question is: is there any point to plotting a regression on this
internally transformed data?


Yes, see ?pic.


I have some continuous variables that are quite meaningful, but when I
manually perform the transformation I get values that are meaninglessly
high (like a forearm length of 1,2 meters, compared to 6,5 centimeters
originally).

I might be doing the manual transformation the wrong way, in which case I'd
appreciate a comment.


I guess you want to ask a quite different question that is: given the 
phenotypes for species 1, species 2, ..., what would be the phenotype of 
species 2 if it were /not/ evolved from a common ancestor with species 
1? I think it's a more complicated issue that cannot be solved with a 
simple back-transformation.


Best,

Emmanuel


For variables X and Y

varmatrix- vcv.phylo(phy = tree, correlation = corGrafen(1, tree))
transmatrix- chol(varmatrix)

X.transformed- transmatrix %*% X
Y.transformed- transmatrix %*% Y
mylm- lm(Y ~ X)
plot(Y ~ X)
abline(coef(mylm))

The resulting estimates are different than from a normal gls, gls(y ~ x,
correlation = corGrafen(1, tree)), so I would be happy to know what it is
I'm looking for.

Thanks a lot,

Best,

NC

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo



--
Emmanuel Paradis
IRD, Jakarta, Indonesia
http://ape.mpl.ird.fr/

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] 2 root edges in phylocom tree, but suggested fix in ape FAQ doesn't help

2012-01-12 Thread Emmanuel Paradis

Hi Ben  Megan,

It's not the labelling of internal nodes which is the problem here: the 
'extra' right parentheses imply the presence of nodes of degree two (ie, 
non-bifurcating).


read.tree(text=(((abc),(ghi)),(mno));)

also fails (for sure, the error message is not very explicit). It's been 
some time that this issue is around: I think it's time to create a new 
class derived from phylo to solve it. This will involve some recoding 
of course. Currently, I'm working on finalizing the release of ape 3.0, 
so this will have to wait a couple of months.


Best,

Emmanuel

Ben Bolker wrote on 13/01/2012 02:10:

On 12-01-12 01:54 PM, Megan Bartlett wrote:

Hi,

I'm trying to use R package ape to manipulate the phylocom megatree,
version R20091110. However, whenever I try to read in the tree with:

read.tree(R20091110.new.txt)

I get the error message:

Error in if (sum(obj[[i]]$edge[, 1] == ROOT) == 1  dim(obj[[i]]$edge)[1]

  :

   missing value where TRUE/FALSE needed

I know that the suggested fix is to delete ( and )euphyllophyte from
the beginning and end of the file, but this doesn't solve the problem.
Trying to read the new tree gives me exactly the same error message. What
should I be doing instead?

Thanks so much for your help!

Best,

Megan Bartlett

[[alternative HTML version deleted]]



   I started to respond to this on r-h...@r-project.org ... it's best not
to cross-post if you can help it, or at least to say that you're
crossposting (and why).

   Based on this small experiment:

library(ape)
r-
read.tree(url(http://svn.phylodiversity.net/tot/megatrees/R20091110.new;))

r- read.tree(text=(((abc)def,(ghi)jkl),(mno)pqr);)
r- read.tree(text=((def,jkl),pqr);)

   it looks like ape's read.tree() doesn't like the internal-node
labeling convention that phylocom is using.  Someone else will probably
respond more helpfully ...

   Ben Bolker


--
Emmanuel Paradis
IRD, Jakarta, Indonesia
http://ape.mpl.ird.fr/

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] Add new leaves to a tree

2012-01-12 Thread Emmanuel Paradis

Hi Brian,

See bind.tree.

Best,

Emmanuel

Brian Pellerin wrote on 10/01/2012 21:36:

Hello R users,

How do I grow a whole new set of leaves onto an existing tree?

### example with random tree
library(ape)
nleaf-1000  #Number of leaves on a tree
br-sample(0:20,2*(nleaf-1),replace=T)#Branch lengths for tree
tr-rtree(nleaf,br=br)   #Tree crown


Thanks for your help.

Sincerely,
Brian

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo



--
Emmanuel Paradis
IRD, Jakarta, Indonesia
http://ape.mpl.ird.fr/

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] randomizing clades within a node

2012-01-18 Thread Emmanuel Paradis

Hi Juan,

Juan Antonio Balbuena wrote on 13/01/2012 21:57:

  Hello

I would very much appreciate if someone can help me with this one. I
wish to compare two additive trees that are identical for some but not
all the clades.

So I use rtree of the ape package with, say, 10 tips to generate tree
A. Then I wish to modify A to get tree B by randomizing the clades
and patristic distances in a given number of nodes. If the number of
nodes is, for instance, three, the 3 nodes furthest apart from the
origin will be randomized. Then A and B will be compared. The whole
procedure needs to be carried out a large number of times (10,000).


The functions you need are extract.clade and bind.tree. The both have 
options to allow you to flexibly handle eventual root edges, so have a 
look at their help pages.


HTH

Emmanuel


Any help would be much appreciated.

Juan A. Balbuena

--
Dr. Juan A. Balbuena
Marine Zoology Unit
Cavanilles Institute of Biodiversity and Evolutionary Biology
University of Valencia http://www.uv.es/~balbuena
http://www.uv.es/%7Ebalbuena
P.O. Box 22085 http://www.uv.es/cavanilles/zoomarin/index.htm
46071 Valencia, Spain http://cetus.uv.es/mullpardb/index.html
e-mail: j.a.balbu...@uv.es mailto:j.a.balbu...@uv.es tel. +34 963 543
658 fax +34 963 543 733

NOTE! For shipments by EXPRESS COURIER use the following street address:
C/ Catedrático José Beltrán 2, 46980 Paterna (Valencia), Spain.




___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


--
Emmanuel Paradis
IRD, Jakarta, Indonesia
http://ape.mpl.ird.fr/

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] plotting character states on a circular cladogram

2012-02-02 Thread Emmanuel Paradis

Hi Rafael,

Rafael Rubio de Casas wrote on 02/02/2012 06:24:

Hi Nicolai,
Thanks for your message. I guess I wasn't completely clear in my
previous message.
tiplabels() does do a similar thing to what I want. In fact, it is what
I ahve been using. The problem is that I think that with tiplabels you
cannot place the colored dots at a particular position relative to the
tips on a circular cladogram.


You are right.


I thought that points() would be easier to adapt for that, but maybe not.


tiplabels() actually calls points() so you have to tell the function how 
to offset the coordinates which is not obvious because they are 
circular, -- though plot.phylo() does it when label.offset is used.


I guess what you want is to prevent the labels to hide the terminal 
edges. Here's a suggestion for this:


plot(phy, type=fan, label.offset=2, plot=FALSE)
tiplabels(pch=21, cex=1.5, bg=label[phy$tip.label])
par(new=TRUE)
plot(phy, type=fan, label.offset=2)

The idea is to first draw the tip labels, and then the tree, so that the 
terminal edges are not hidden by the labels. The first line sets the 
drawing region but nothing is plotted so the labels can be drawn. The 
trick is to call par(new=TRUE) so that plot() does not refresh the graph.


HTH

Emmanuel


Thanks for your help, anyways.
Cheers,
Rafa
On 2/1/2012 5:25 PM, Nicolai Cryer wrote:

Hey Rafael,

(By the way, thanks for the easily reproducible code).

I think you're looking for the tiplabels() function, so instead of
using points, try:

tiplabels(pch=21,cex=1.5, bg=label[phy$tip.label])

This should amount to the same kind of color coding, but the layout
doesn't break when plotting a radial tree.

Hope that helps,

Best,

Nicolai

On 2/1/2012 2:40 PM, Rafael Rubio de Casas wrote:

Dear R-Sig-Phylo users,
I have what I believe should be an easy problem:
I want to plot character states on the tips of a fan cladogram produced
by ape.
What I am looking for, I believe, is something similar to what you can
do with points on a rectangular phylogram. That is, assign points to
tips colored according to their trait value.
For instance, the following code generates what I want for a rectangular
phylogram:
library(ape); library(geiger)
phy=rcoal(25, tip.label=c(letters[1:25]))
phy=rescaleTree(phy,25)
trait=as.vector(sample(0:1,25,replace=T))
names(trait)=phy$tip.label
label=character(length(phy$tip.label))
names(label)=names(trait)
label[trait==0]=dodgerblue
label[trait==1]=firebrick
plot(phy, show.tip.label=T, label.offset=1.5,no.margin=TRUE)
points(rep(26, length(phy$tip.label)), 1:length(phy$tip.label), pch=21,
cex=1.5, bg=label[phy$tip.label])
My problem is that I don't know how to get the same sort of mapping of
trait states at the tips when using
plot(phy, type=fan)

Thanks in advance,
Rafa





--
Emmanuel Paradis
IRD, Jakarta, Indonesia
http://ape.mpl.ird.fr/

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] Estimating MRCA dates

2012-02-02 Thread Emmanuel Paradis

Hi Dan,

This is indeed an issue. I had the opportunity a few months ago to get 
the outputs from r8s and chronopl on the same data: this showed some 
difficulties with r8s, namely the starting and final values of the PL 
were identical.


I'm currently working on chronopl because it fails to find decent 
starting values when several nodes are defined within intervals (it's 
looping endlessly). I've more or less solved this problem and this 
should be fixed in ape 3.0.


I admit that chronopl has some weaknesses. I can see three explanations 
(all non-exclusive):


1) The implemantation in chronopl is flawed in which case it can be fixed.

2) The model of correlated rates is not appropriate in most cases. I 
remember to have seen a paper stating that the relaxed clock model 
(without correlation of rates) better fits real data.


3) The problem of optimizing a function of so many parameters is too 
difficult for the currently used algorithm in which case a Bayesian 
approach might be more appropriate, or using a better optimization 
algorithm.


Cheers,

Emmanuel

Dan Rabosky wrote on 02/02/2012 22:55:


Hi Brian-

It has been awhile since I played with chronopl(), but at the time, I convinced 
myself that results were substantially different from penalized likelihood as 
implemented in r8s. I haven't explored this exhaustively, but at the time, I 
was very much convinced that the results of chronopl were potentially 
problematic. Other folks have mentioned to me that they've observed similar 
differences between r8s and chronopl and that they've recovered bizarre 
smoothed trees from chronopl. Perhaps someone with more recent chronopl 
experience can weigh in on this...

~Dan



On Feb 2, 2012, at 6:06 AM, Brian O'Meara wrote:


Ape can do this, function chronopl. In general, one good way to find out
which package can do something is by looking at the task view for
phylogenetics: http://cran.r-project.org/web/views/Phylogenetics.html .
This will also allow you to install all the phylogenetics packages (at
least those on CRAN) with just a couple of commands. An updated version
should be coming out later today (and if someone notices other things
missing, please let me know).

Brian

___
Brian O'Meara
Assistant Professor
Dept. of Ecology  Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Funding wanted: Want to collaborate on a grant?




___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo



--
Emmanuel Paradis
IRD, Jakarta, Indonesia
http://ape.mpl.ird.fr/

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] plotting character states on a circular cladogram

2012-02-02 Thread Emmanuel Paradis

Rafael Rubio de Casas wrote on 02/02/2012 22:02:

  Hi,
Thanks, Emmanuel. I did try to play around with tip.labels (by the by, I
did not know it called points, although I guess it makes a lot of sense).
The problem I run into is that, for some reason, label.offset does not
seem to work with the graphical labels. For instance, I would expect two
concentric circles of colored dots with:


It is indeed possible to do what you want (the code is in plot.phylo), 
but since you are working on a textbook example, my humble suggestion is 
to use a linear tree for this; it'll be also much easier to plot 
information on several traits. You'll get the same information with a 
fan tree but more difficult to decrypt by the human eye -- though 
others might have different opinions on this.



plot(phy, type=fan, label.offset=5, plot=FALSE)
tiplabels(pch=21, cex=1.5, bg=label[phy$tip.label])
par(new=TRUE)
plot(phy, type=fan, label.offset=2, plot=FALSE)
tiplabels(pch=21, cex=1.5, bg=label[phy$tip.label])

However only one is produced, and those dots are placed right on the
tips. You can see that if you then follow with

par(new=TRUE)
plot(phy, type=fan, label.offset=2)

I guess this is mostly irrelevant, But I am working on a figure for a
book and I would like it to be a textbook example
Also, when node states are drawn with nodelabels(), if if the terminal
branches are very short, the plot of the nodes get confused with those
of the tips
For instance, if you try:

ace.ER=ace(trait,phy,type=discrete)
nodelabels(pie=ace.ER$lik.anc, piecol=c(dodgerblue,firebrick),
cex=0.5,lwd=0.5)

You can see that the last node plots are at the same level of the tip
plots. If tips were denser (i.e., if the tree had more terminals) that
would make the plot virtually unreadable.


You might prefer to plot your tree with:

plot(phy, c, FALSE)

You might want to try 'thermo' instead of 'pie'.

Best,

Emmanuel


Cheers,
Rafa

On 2/2/2012 7:03 AM, Emmanuel Paradis wrote:

Hi Rafael,

Rafael Rubio de Casas wrote on 02/02/2012 06:24:

Hi Nicolai,
Thanks for your message. I guess I wasn't completely clear in my
previous message.
tiplabels() does do a similar thing to what I want. In fact, it is what
I ahve been using. The problem is that I think that with tiplabels you
cannot place the colored dots at a particular position relative to the
tips on a circular cladogram.


You are right.


I thought that points() would be easier to adapt for that, but maybe
not.


tiplabels() actually calls points() so you have to tell the function
how to offset the coordinates which is not obvious because they are
circular, -- though plot.phylo() does it when label.offset is used.

I guess what you want is to prevent the labels to hide the terminal
edges. Here's a suggestion for this:

plot(phy, type=fan, label.offset=2, plot=FALSE)
tiplabels(pch=21, cex=1.5, bg=label[phy$tip.label])
par(new=TRUE)
plot(phy, type=fan, label.offset=2)

The idea is to first draw the tip labels, and then the tree, so that
the terminal edges are not hidden by the labels. The first line sets
the drawing region but nothing is plotted so the labels can be drawn.
The trick is to call par(new=TRUE) so that plot() does not refresh the
graph.

HTH

Emmanuel


Thanks for your help, anyways.
Cheers,
Rafa
On 2/1/2012 5:25 PM, Nicolai Cryer wrote:

Hey Rafael,

(By the way, thanks for the easily reproducible code).

I think you're looking for the tiplabels() function, so instead of
using points, try:

tiplabels(pch=21,cex=1.5, bg=label[phy$tip.label])

This should amount to the same kind of color coding, but the layout
doesn't break when plotting a radial tree.

Hope that helps,

Best,

Nicolai

On 2/1/2012 2:40 PM, Rafael Rubio de Casas wrote:

Dear R-Sig-Phylo users,
I have what I believe should be an easy problem:
I want to plot character states on the tips of a fan cladogram produced
by ape.
What I am looking for, I believe, is something similar to what you can
do with points on a rectangular phylogram. That is, assign points to
tips colored according to their trait value.
For instance, the following code generates what I want for a
rectangular
phylogram:
library(ape); library(geiger)
phy=rcoal(25, tip.label=c(letters[1:25]))
phy=rescaleTree(phy,25)
trait=as.vector(sample(0:1,25,replace=T))
names(trait)=phy$tip.label
label=character(length(phy$tip.label))
names(label)=names(trait)
label[trait==0]=dodgerblue
label[trait==1]=firebrick
plot(phy, show.tip.label=T, label.offset=1.5,no.margin=TRUE)
points(rep(26, length(phy$tip.label)), 1:length(phy$tip.label), pch=21,
cex=1.5, bg=label[phy$tip.label])
My problem is that I don't know how to get the same sort of mapping of
trait states at the tips when using
plot(phy, type=fan)

Thanks in advance,
Rafa








--
National Evolutionary Synthesis Center
*NESCent http://www.nescent.org/*
2024 W. Main Street, Suite A200
Durham, NC27705
r...@nescent.org mailto:r...@duke.edu
919.668.9107


--
Emmanuel Paradis
IRD, Jakarta, Indonesia
http://ape.mpl.ird.fr

[R-sig-phylo] query about the option 'original.data' in write.nexus()

2012-02-05 Thread Emmanuel Paradis

Hi all,

I am thinking to remove the option 'original.data' in write.nexus(). The 
reasons are:


1. It doesn't seem useful because most NEXUS files read by this function 
store only trees (at least those reported to me). I believe most users 
keep their data (seqs, morpho, ...) and trees in separate files.


2. If some tips are deleted/removed from the tree, keeping track with 
the original file might be tricky (currently nothing is done to check this).


3. phylobase has better support for NEXUS files, so trying to fix the 
above in ape will be redundant.


4. Removing this option will make write.tree() slightly more efficient.

Any advice/comment welcome.

Cheers,

Emmanuel
--
Emmanuel Paradis
IRD, Jakarta, Indonesia
http://ape.mpl.ird.fr/

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] collapse nodes based on node labels

2012-02-16 Thread Emmanuel Paradis

Hello Ben,

Ben Weinstein wrote on 13/02/2012 00:36:

Hello all,

Is there an analogous function to ape's di2multi which
collapses descendant branches based on an input node? Instead of minimum
branch tolerance to create polytomies, what would the best strategy to
collapse all descendant branches of node X. I've played around with
node.leaves, but haven't had great luck. I'm considering pruning the tree,
collapsing the entire subtree and regrafting it on.


which.edge() might be useful here. You may need to combine it with 
prop.part() though, and a way to identify internal edges, eg:


phy$edge[, 2]  Ntip(phy)


Similarly to this, is there a way to maintain the original node numbering,
even after setting nodelabels? For example if i set my nodelabels to my
support values, but still want to highlight the first node in the tree. The
support values are non unique (multiple 99s), so it becomes harder to
specify commands to specific nodes.


I don't follow. nodelabels() only draws labels on a plotted tree and 
doesn't modify the node numbering. See the examples to see how to call 
this function several times on the same tree.


Best,

Emmanuel


I appreciate the help,

Ben Weinstein



--
Emmanuel Paradis
IRD, Jakarta, Indonesia
http://ape.mpl.ird.fr/

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] adding a new taxa to an existing clade in a tree set

2012-02-16 Thread Emmanuel Paradis

Hi Annemarie,

Annemarie Verkerk wrote on 16/02/2012 09:57:

Hi all,

I have a question regarding the manual addition of a leaf to an existing
clade in a set of trees. The reason I ask is because I want to test the
influence of adding ancestral data to an ancestral state estimation. I
have some ancestral data on the feature I want to investigate, and I
know from other sources where the leaf is situated on the trees, but I
don't have any data that I can use to actually build new trees which
also incorporate this leaf at the moment. So I'm hoping to add it
manually and see whether it makes a difference.

I know it is possible to add a leaf to any of the external edges (I
mean, edges leading to the leaves) of a tree using bind.tree() in ape.
However, if I wanted to add a leaf to a lower position in the tree, say
on an branch that splits and leads to two other leaves, it becomes
difficult: not all the trees in the sample might feature the grouping of
those two leaves, and even if they do, the relevant edge to which I
would want to add the new leaf will have a different number in each tree.

So I guess I have two questions:
1. Is there an easy way to select the trees that have a certain clade,
so I can try to add a new leaf to only those trees? (I suspect a lot of
error messages if I try to do it for the complete tree sample, as there
will be trees that do not have the relevant clade.)


see ?is.monophyletic


2. Is there a way (for those trees that have the clade I want to add the
new leaf to) to define the relevant edge using only the leaf numbers and
not the internal nodes/edge numbers? I.e., I want to add the leaf to the
internal edge that leads to the leaves with number 5, 6, and 7, for
instance, is there a way identify that internal edge?


Yes, you can use mrca() like this:

m - mrca(phy)
phy$edge[, 2] == m[tip1, tip2]

A more general way (ie, with clades made of 2 or more tips) is to use 
prop.part(phy): it will return a list of the tips descendant from each 
node (in other words, the clades of the tree). This allows you also to 
combine both steps:


list_of_tips - sort(tip1, tip2, ...etc

pp - prop.part(phy)
node - NULL
for (i in seq_along(phy)) {
if (isTRUE(all.equal(pp[[i]], list_of_tips))) {
 node - i + Ntip(phy)
 break
}
}
if (is.null(node)) ## go to next tree
else {
phy$edge[, 2] ==  node
## etc...

HTH

Emmanuel


Many thanks for any suggestions,
Annemarie


___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] How to replace a clade with a single node?

2012-03-01 Thread Emmanuel Paradis
Hello Ania,

See the functions drop.tip and bind.tree. They both have an 'interactive' 
option. If you want to use node labels, if available, you can find a node 
number from its label, say xyz, with:

grep(xyz, phy$node.label) + Ntip(phy)

Best,

Emmanuel
-Original Message-
From: Ania Tassinari anna.krentow...@gmail.com
Sender: r-sig-phylo-boun...@r-project.org
Date: Thu, 1 Mar 2012 13:58:23 
To: r-sig-phylo@r-project.org
Subject: [R-sig-phylo] How to replace a clade with a single node?

Hello everyone,

I am working with a large tree with ~450 tips. Some leaves are
redundant as the organisms come from same Genus. I would like to be
able to replace the redundant clades with a single leaf representative
of the members of the clade.

For example, if I have:

library(ape)
myNewick = (A,(B,(C1,C2)));
myTree  = read.tree(text=myNewick)
plot(myTree)

I would like to be able to replace the C1 and C2 clade with a single C
leaf, so that:

myNewick2 = (A,(B,C));

I would like to not rely on the edge.length if possible but instead on
the node label. Although, for some reason myTree$node.label isn't
always an available option.

Thank you very much in advance for any help!

Best,
Ania

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] 3. partial correlation with gls residuals? (Tom Schoenemann)

2012-03-14 Thread Emmanuel Paradis
 of Anthropology
Indiana University
Bloomington, IN  47405
Phone: 812-855-8800
E-mail: t...@indiana.edu

Open Research Scan Archive (ORSA) Co-Director
Consulting Scholar
Museum of Archaeology and Anthropology
University of Pennsylvania

Homepage: http://mypage.iu.edu/~toms/










[[alternative HTML version deleted]]

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


__

Alejandro Gonzalez Voyer

Post-doc

Estación Biológica de Doñana
Consejo Superior de Investigaciones Científicas (CSIC)
Av Américo Vespucio s/n
41092 Sevilla
Spain

Tel: + 34 - 954 466700, ext 1749

E-mail: alejandro.gonza...@ebd.csic.es

Web site (Under construction):

Personal page: http://consevol.org/members/alejandro.html

Group page: http://consevol.org/index.html

For PDF copies of papers see:

http://csic.academia.edu/AlejandroGonzalezVoyer



[[alternative HTML version deleted]]




___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


--
Emmanuel Paradis
IRD, Jakarta, Indonesia
http://ape.mpl.ird.fr/

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] plot.phylo x.lim/y.lim problem

2012-03-21 Thread Emmanuel Paradis
Hi Alex,

To fix this, do fix(plot.phylo) and look for this bit (around line #259):

if (phyloORclado  direction == downwards)
yy - y.lim[2] - yy

Change the second line for:

yy - max(yy) - yy

This will be in the next release. Thank you for the report.

Cheers,

Emmanuel
-Original Message-
From: Alex Perkins taperk...@ucdavis.edu
Sender: r-sig-phylo-boun...@r-project.org
Date: Wed, 21 Mar 2012 15:51:50 
To: r-sig-phylo@r-project.org
Subject: [R-sig-phylo] plot.phylo x.lim/y.lim problem

I am trying to plot a phylogram along a specified time axis. For
example, if a phylogeny spans 30 time units, I would like to be able
to set the time axis to span 50 time units. This can be achieved with
the x.lim or y.lim options, depending on the direction of the tree.
However, this always adds the extra time on the node side, whereas I
need to add it on the root side. In other words, the tree always gets
placed on the top of the plot in the code below, but I need to have it
on the bottom. Any help would be greatly appreciated!

library(ape)
data(bird.orders)
# As you can see below, it doesn't matter what I set y.lim to. I get
the same thing!
# I'm sure this is a rookie mistake, but I don't know what else to do.
plot(bird.orders, show.tip.label = FALSE, direction = 'downwards',
y.lim = c(-70, 30))
plot(bird.orders, show.tip.label = FALSE, direction = 'downwards',
y.lim = c(0, 100))
plot(bird.orders, show.tip.label = FALSE, direction = 'downwards', y.lim = 100)

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] behavior of is.monophyletic()

2012-03-23 Thread Emmanuel Paradis
Hi Bret,

This is fixed and will be in the next release. In the meantime, you can edit 
the function and change the last line to:

all.equal(tips, desc)

Thanks for the report. 

Cheers,

Emmanuel
-Original Message-
From: Bret Larget lar...@stat.wisc.edu
Sender: r-sig-phylo-boun...@r-project.org
Date: Thu, 22 Mar 2012 18:05:24 
To: jnylan...@users.sourceforge.net
Cc: r-sig-phylo@r-project.org
Subject: [R-sig-phylo] behavior of is.monophyletic()


Please note the following discrepency in the behavior of 
is.monophyletic().

--
 foo = read.tree(text=((1:2.0,(2:1.0,3:1.0):1.0):1.0,4:3.0);)
 is.monophyletic(foo,tips=2:3)
[1] TRUE
 is.monophyletic(foo,tips=c(2,3))
[1] FALSE
-

I am aware that the function works properly when tips are specified as 
characters.  However, the documentation states that tips may be specified 
as numeric.  The difference between c(2,3) and 2:3 is that the former is 
class numeric and the latter is class integer.  I cannot think of any good 
reason for why this distinction should result in different behavior of the 
function.  It would be best if the function worked correctly in all 
situations, but in particular when the input matches the usage specified 
by the documentation.

--
Usage

is.monophyletic(phy, tips, reroot = !is.rooted(phy), plot = FALSE, ...)

Arguments

phy  a phylogenetic tree description of class phylo.
tips a vector of mode numeric or character specifying the tips to be
   ^^^
  tested.
reroot   a logical. If FALSE, then the input tree is not unrooted before
  the test.
plot a logical. If TRUE, then the tree is plotted with the specified
  group tips highlighted.
...  further arguments passed to plot.
-


I hope that this can be corrected in a future version of ape.

-Bret Larget

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] Error with PGLS under OU model using corMartins

2012-03-23 Thread Emmanuel Paradis
Hi Nick,

You can fix the value of alpha in corMartins:

corMartins(1, tree, fixed=TRUE)

You can try with different values to see which one gives the best fit.

Best,

Emmanuel
-Original Message-
From: Nicholas Mason nicholas.albert.ma...@gmail.com
Sender: r-sig-phylo-boun...@r-project.org
Date: Thu, 22 Mar 2012 12:57:59 
To: r-sig-phylo@r-project.org
Subject: [R-sig-phylo] Error with PGLS under OU model using corMartins

Dear Rsigphylo users,
I am having issues running PGLS (with nlme function gls()) under an OU model 
using the corMartins command from ape. I've included links to a reproducible 
example via my dropbox account below. 

Previous posts have been made about the same error I've encounter, but neither 
thread seems to have relevant replies or have resolved the issue. Apologies if 
I've missed something else from previous rsigphylo threads.
https://stat.ethz.ch/pipermail/r-sig-phylo/2011-January/000875.html
http://www.mail-archive.com/r-sig-phylo@r-project.org/msg00577.html

In short, the data set I've linked to will run gls() under a correlation 
structure as defined by corBrownian and corPagel but not corMartins. The 
following error is encountered with corMartins:

Error in recalc.corStruct(object[[i]], conLin) : 
  NA/NaN/Inf in foreign function call (arg 4)

Has anyone else experienced this issue or know how to resolve it? What does 
this message mean? This happens with each variable I examine, not just the 
example included here.

Thanks in advance for any help or answers provided.

Cheers,
Nick Mason

The R object below contains a list called sampledata. sampledata[[1]] is a data 
frame with tip values and sampledata[[2]] is the corresponding tree.

Link to R objects: http://db.tt/ablyDBt1
Link to Rscript via public dropbox: http://db.tt/tJ8jg0JQ

require(ape)
require(nlme)

load(sampledata.Rdata)

sampledata[[1]]-tipdata
sampledata[[2]]-tree

gls(TraitA~TraitB,data=tipdata,correlation=corMartins(1,tree))
gls(TraitA~TraitB,data=tipdata,correlation=corBrownian(1,tree))
gls(TraitA~TraitB,data=tipdata,correlation=corPagel(1,tree))
-
Nicholas Albert Mason
MS Candidate, Burns Lab
http://kevinburnslab.com/
Department of Biology - EB Program
San Diego State University
5500 Campanile Drive
San Diego, CA 92182-4614
845-240-0649 (cell)
[[alternative HTML version deleted]]

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] trouble with subtreeplot returning a tree object on exit

2012-04-12 Thread Emmanuel Paradis

Hi Casey,

It works for me under Linux. I guess it must be a specific problem to 
Windows.


Have you tried with the trex() function?

Cheers,

Emmanuel

Casey Bergman wrote on 12/04/2012 13:21:

Hello -

I have a question about the use of the subtreeplot function in APE.  The example code 
generates an interactive tree as expected, but I do not get a tree2 object returned when 
I click the close button on the window or use cmd+w.  I am wondering if I am 
exit-ing improperly or if you can help make sense of the following error 
message:

Error in identify.default(coor[, 1], coor[, 2], labels = labs, n = 1) :
  No graphics device is active

Full transcript and platform details are below. Many thanks in advance for your 
help with this query.

Best regards,
Casey Bergman, Ph.D.
Faculty of Life Sciences
University of Manchester
Michael Smith Building
Oxford Road, M13 9PT
Manchester, UK

Tel: +44-(0)161-275-1713
Lab: +44-(0)161-275-5980
Fax: +44-(0)161-275-5082
skype: caseymbergman

Email: casey.berg...@manchester.ac.uk
Web: http://bergmanlab.smith.man.ac.uk/
Twitter: http://twitter.com/bergmanlab

***

R version 2.15.0 (2012-03-30)
Copyright (C) 2012 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)


tree1-rtree(50) #random tree with 50 leaves
tree2-subtreeplot(tree1, wait=TRUE) # on exit, tree2 will be a subtree of 
tree1.

wait... Node 1 out of 49 treated
wait... Node 2 out of 49 treated
wait... Node 3 out of 49 treated
wait... Node 4 out of 49 treated
wait... Node 5 out of 49 treated
wait... Node 6 out of 49 treated
wait... Node 7 out of 49 treated
wait... Node 8 out of 49 treated
wait... Node 9 out of 49 treated
wait... Node 10 out of 49 treated
wait... Node 11 out of 49 treated
wait... Node 12 out of 49 treated
wait... Node 13 out of 49 treated
wait... Node 14 out of 49 treated
wait... Node 15 out of 49 treated
wait... Node 16 out of 49 treated
wait... Node 17 out of 49 treated
wait... Node 18 out of 49 treated
wait... Node 19 out of 49 treated
wait... Node 20 out of 49 treated
wait... Node 21 out of 49 treated
wait... Node 22 out of 49 treated
wait... Node 23 out of 49 treated
wait... Node 24 out of 49 treated
wait... Node 25 out of 49 treated
wait... Node 26 out of 49 treated
wait... Node 27 out of 49 treated
wait... Node 28 out of 49 treated
wait... Node 29 out of 49 treated
wait... Node 30 out of 49 treated
wait... Node 31 out of 49 treated
wait... Node 32 out of 49 treated
wait... Node 33 out of 49 treated
wait... Node 34 out of 49 treated
wait... Node 35 out of 49 treated
wait... Node 36 out of 49 treated
wait... Node 37 out of 49 treated
wait... Node 38 out of 49 treated
wait... Node 39 out of 49 treated
wait... Node 40 out of 49 treated
wait... Node 41 out of 49 treated
wait... Node 42 out of 49 treated
wait... Node 43 out of 49 treated
wait... Node 44 out of 49 treated
wait... Node 45 out of 49 treated
wait... Node 46 out of 49 treated
wait... Node 47 out of 49 treated
wait... Node 48 out of 49 treated
wait... Node 49 out of 49 treated
Error in identify.default(coor[, 1], coor[, 2], labels = labs, n = 1) :
  No graphics device is active

ls()

[1] tree1

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo



--
Emmanuel Paradis
IRD, Jakarta, Indonesia
http://ape.mpl.ird.fr/

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] birthdeath {ape} fails every time.

2012-04-19 Thread Emmanuel Paradis

Hi Simon,

This is fixed and you can find the new version on ape's SVN. Another 
thing I fixed: it could happen that the calculation of the Hessian-based 
SEs fails: the error is now caught and the MLEs are returned with SEs 
set to NA (the profile-likelihood CIs are still computed of course).


Matt is right that the distribution of branching times is weird with a 
coalescent tree, but an answer is still expected from birthdeath().


Regarding the transformation on mu and lambda, the present version 
imposes r (= lambda - mu)  0 and/or a (= mu/lambda)  1 because the 
likelihood computation uses log(r) and log(1 - a). This also avoids most 
warnings now. diversitree uses log(abs(... which, at first sight, is 
strange to me. But there may be some justification for this. Rich may 
comment on it?


Best,

Emmanuel

Simon Greenhill wrote on 20/04/2012 06:52:

Morning all,

Whenever I try to use ape's (v 3.0-2) birthdeath function e.g.


birthdeath(rcoal(sample(10:100, 1)))


I get the following error:

Error in while (foo(up)  0) up- up + inc :
   missing value where TRUE/FALSE needed

The warnings all appear to be this sort of thing:

1: In log(1 - a) : NaNs produced
2: In log(exp(r * x[2:N]) - a) : NaNs produced
3: In nlm(function(p) dev(p[1], p[2]), c(0.1, 0.2), hessian = TRUE) :
   NA/Inf replaced by maximum positive value

and seem to be generated from this line:
 out- nlm(function(p) dev(p[1], p[2]), c(0.1, 0.2), hessian = TRUE)


Is there a fix/workaround for this?

Kind regards,

Simon

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo



--
Emmanuel Paradis
IRD, Jakarta, Indonesia
http://ape.mpl.ird.fr/

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] problems with assign(), paste(), and data.frame() for folders containing trees

2012-04-24 Thread Emmanuel Paradis
Hi John,

You seem to transform a (relatively) simple problem into a complicated one. 
First, you can get all the tree file names in one command, such as (I moved to 
a directory with one subdir with trees estimated by ML and another one by NJ; 
this is slightly arranged):

 f - grep(\\.tre, list.files(recursive = TRUE), value = TRUE)
 f
[1] ML/trees_55species.tre  ML/treesNADH2_45species.tre
[3] ML/TR.ML.Dloop_55sp.tre ML/TR.ML.NADH2_45sp.tre
[5] nj/trees_55species.tre  nj/treesNADH2_45species.tre
[7] nj/TR.NJ.Dloop_55sp.tre nj/TR.NJ.NADH2_45sp.tre

Because in R file paths are resolved relatively, there is no need to navigate 
with setwd().

Second, I think you should use a list instead of a data frame because (I 
presume) you may have files with different numbers of trees (if this is not the 
case, you can transform the list in a data frame later).

You may have commands like this, eg, if you want to get the mean branch length 
of each tree:

ntree - length(f)
L - list()
for (i in 1:ntree) {
tr - read.tree(f[i])
if (class(tr) == phylo) L[[i]] - mean(tr$edge.length)
if (class(tr) == multiPhylo) L[[i]] - sapply(tr, function(x) 
mean(x$edge.length))
}

Finally, you may name your list with the file names:

names(L) - f

This has the advantage that you can select some of the results, eg, the trees 
that were estimated by NJ:

 grep(nj/, names(L))
[1] 5 6 7 8

or those from D-loop:

 grep(Dloop, names(L))
[1] 3 7

You get the number of trees in each file with sapply(L, length).

There can be many variations around this scheme. For instance, if you want to 
extract the branch lengths as in your example, the two lines above with mean 
would become:

if (class(tr) == phylo) L[[i]] - tr$edge.length
if (class(tr) == multiPhylo) L[[i]] - lapply(tr, $, edge.length)

Best,

Emmanuel
-Original Message-
From: John Denton jden...@amnh.org
Sender: r-sig-phylo-boun...@r-project.org
Date: Tue, 24 Apr 2012 20:30:26 
To: R Sig Phylo Listservr-sig-phylo@r-project.org
Subject: [R-sig-phylo] problems with assign(), paste(),
 and data.frame() for folders containing trees

Hi folks,

I am trying to recurse through several numbered subfolders in a directory. Each 
folder has many trees that I want to display summary values for. I have been 
expanding data frames using code with the structure name - rbind(name, 
newvals) to produce a data frame with n rows equal to the number of files in 
one of the folders, and n column equal to the number of values in the file.

I can loop over the values within a single subdirectory fine with, for example,

library(ape)

trees - list.files(pattern=*.tre)
iters=length(trees)

branchdata.5 - data.frame()

iterations - as.character(c(1:length(trees)))

for (i in 1:iters) {

tree - read.tree(trees[i])
iteration.edges.5 - as.numeric(tree$edge.length)

branchdata.5 - rbind(branchdata.5, iteration.edges.5)

}

The problem comes when I want to iterate through the numbered subdirectories 
while also iterating through the files in a given directory. I want to 
recursively assign these data frames as well, with something like

f - list.dirs(path = /.../.../etc, full.names = FALSE, recursive = FALSE)

for (j in 1:length(f)) {

setwd(paste(/.../.../.,j,sep=))

assign( paste(branchdata.5,j,sep=), data.frame() )

iterations - as.character(c(1:length(trees)))

for (i in 1:iters) {

tree - read.tree(trees[i])
assign(paste(iteration.edges.5,j,sep=), as.numeric(tree$edge.length) )

paste(branchdata.5,j,sep=) - rbind(paste(branchdata.5,j,sep=), 
paste(iteration.edges.5,j,sep=))

}

names(iterations) - NULL
boxplot(t(paste(branchdata.5,j,sep=)) , horizontal=TRUE , names=iterations 
, ylim=c(0,2), xlab=Branch Lengths , ylab=Iterations , main = )

}

The problem seems to be in the rbind() when using values with assign() and 
paste(). I would love some help on this! 


John S. S. Denton
Ph.D. Candidate
Department of Ichthyology and Richard Gilder Graduate School
American Museum of Natural History
www.johnssdenton.com
___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] read.dna warnings and pitfalls

2012-04-26 Thread Emmanuel Paradis
Hi Dan,

The reason for this implementation (searching the first 10 IUPAC-coded bases) 
is because the exact formatting is not inconsistent among different programs. 
Some files have:

0123456789acgt.

that is a 10-character name and the sequence starting on the 11th position. I 
think this is typical for Phylip. Other software (e.g., PhyML) accepts longer 
taxa names and require a space before the start of the sequence.

About your example: it depends on the order of the data. The following file can 
be read:

2 10
x AA
madagascarAA

But if you invert the two sequence lines, it fails.

It is the first time I hear about this problem in 9 years, maybe because it 
requires a particular combination of circumstances. Another drawback of this 
implementation is that files with less than 10 bases cannot be read.

How to solve this? If it were left only to me, I would deprecate the 
interleaved and sequential formats. FASTA is more flexible, more widespread, 
easier to parse, can store exactly the same information, and labels are only 
constrained to be on a single line (but can contain any characters including 
\n, \t, ...) But I guess many programs use the Phylip formats, so I'd be glad 
to read other suggestions.

As for your 2nd problem, it is now fixed in ape.

Best,

Emmanuel
-Original Message-
From: Dan Rabosky drabo...@umich.edu
Sender: r-sig-phylo-boun...@r-project.org
Date: Wed, 25 Apr 2012 17:51:35 
To: r-sig-phylo@r-project.org
Subject: [R-sig-phylo] read.dna warnings and pitfalls


Hi All-

I have spent an inordinate and embarrassing amount of time tracking down an 
excruciatingly cryptic issue with read.dna, which I rarely use. Here are two 
key problems:

1) The function automatically assumes it is reading DNA sequences when it 
encounters a string of 10 continuous DNA-like characters. This includes all 
characters in the set (ACGTUMRWSYKVHDBN-). This function, unlike the phylip 
original, does not have limits on taxon name lengths. Hence, I had - in the 
middle of a large alignment - a species whose name included the string 
MADAGASCAR, which caused a failure.  To be fair, the documentation warns of 
this, but I think this is extremely easy to overlook, and - moreover - it seems 
unfortunate to have to parse all your taxon names for a potential IUPAC match 
before trying to use the function. Presumably, most users who specify 
sequential spacing will be using whitespace to separate taxon names from DNA 
sequences, and perhaps it is better to exploit this rather than IUPAC matching. 

2) The function is whitespace-sensitive. if you tab-separate the numbers on the 
first line (numbers of taxa, numbers of sites), you'll receive an errror with 
the message: the first line of the file must contain the dimensions of the 
data. It appears that spaces are OK, however. 

Hopefully this post will be useful to somewhere in the future with a similar 
issue. Perhaps these can be addressed in a future update to ape? 

-Dan Rabosky

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


[R-sig-phylo] ape's web site

2012-05-02 Thread Emmanuel Paradis

Hi all,

ape.mpl.ird.fr has been refreshed. Besides the new look, two things 
worth noting:


There is a column section entitled EVENTS on the welcome page: it is 
intended to give information on forthcoming meetings, workshops, 
courses, seminars, ... If you organize such an event or would like to 
let others know about it, just tell me.


The page on APER has been updated with supplementary materials for the 
second edition: you can get the data files used in the book and the R 
scripts of the case studies. There are some additional coloured figures 
and some animations made from the figures plotted with rgl.


Best,

Emmanuel
--
Emmanuel Paradis
IRD, Jakarta, Indonesia
http://ape.mpl.ird.fr/

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] read.dna warnings and pitfalls

2012-05-03 Thread Emmanuel Paradis
I made some changes in read.dna which, I hope, solve the problems. The 
taxa names can be of any length and must be separated from the sequences 
by at least one space (or tabulation). write.dna() now follows the same 
rule. Files with less than 10 nucleotides can now be read by read.dna 
(bug fixed).


I removed the option 'seq.names' of read.dna since it doesn't seem 
particularly useful and this helped to clarify the code.


The new versions are now on ape's SVN:

https://svn.mpl.ird.fr/ape/dev/ape/R/read.dna.R
https://svn.mpl.ird.fr/ape/dev/ape/R/write.dna.R

Tests welcome!

Best,

Emmanuel

Dan Rabosky wrote on 26/04/2012 22:01:


Hi Emmanuel-

Thanks for fixing the whitespace issue. I think this fix will be useful to many 
users.

On the issue of recognizing 10 IUPAC characters: I think this is a real 
problem, and may come up again in short order. Maybe it is just that use of 
this function has been limited? In the single dataset with a modest number of 
sequences that caused me problems yesterday, I had the following species and/or 
genus names - all of which constitute 10 character strings drawn from the set 
of IUPAC codes:

brachyurus (x 2)
savannarum
graduacauda
caudacutus
Camarhynchus (x 3)
madagascariensis

I don't suggest deprecating the phylip sequential, but rather, using something that is compatible 
with raxml (surely one of the most widely used phylogenetics programs today). I think raxml uses a 
relaxed sequential version of the phylip format with whitespace delimitation. I could read the same 
alignment in raxml with no problems, but I had multiple issues when reading the same file with 
read.dna (including the whitespace character on the first line). My guess is that very few people 
are using the original phylip format, with its limit of 10 characters per taxon name, and with dna 
seqs beginning immediately after this. So maybe deprecate sequential phylip, but you 
could use what Stamatakis calls relaxed sequential PHYLIP, which appears to be: (1) 
taxon names cannot include spaces but can be up to 100 characters; and (2) names separated from 
sequences by whitespace character (ideally, this should recognize any number of spaces or tabs to 
prevent user confusion).

For users with tab-delimited raxml files (eg each taxon name separated from its 
dna sequence by a tab), you can use a regular-expressions enabled text editor 
(like textwrangler) to quickly find potential problems. Just search for

[ACGTUMRWSYKVHDBN]{10}.+\t

with grep matching enabled.

Cheers,
~Dan


On Apr 26, 2012, at 2:16 AM, Emmanuel Paradis wrote:


Hi Dan,

The reason for this implementation (searching the first 10 IUPAC-coded bases) 
is because the exact formatting is not inconsistent among different programs. 
Some files have:

0123456789acgt.

that is a 10-character name and the sequence starting on the 11th position. I 
think this is typical for Phylip. Other software (e.g., PhyML) accepts longer 
taxa names and require a space before the start of the sequence.

About your example: it depends on the order of the data. The following file can 
be read:

2 10
x AA
madagascarAA

But if you invert the two sequence lines, it fails.

It is the first time I hear about this problem in 9 years, maybe because it 
requires a particular combination of circumstances. Another drawback of this 
implementation is that files with less than 10 bases cannot be read.

How to solve this? If it were left only to me, I would deprecate the 
interleaved and sequential formats. FASTA is more flexible, more widespread, 
easier to parse, can store exactly the same information, and labels are only 
constrained to be on a single line (but can contain any characters including 
\n, \t, ...) But I guess many programs use the Phylip formats, so I'd be glad 
to read other suggestions.

As for your 2nd problem, it is now fixed in ape.

Best,

Emmanuel
-Original Message-
From: Dan Raboskydrabo...@umich.edu
Sender: r-sig-phylo-boun...@r-project.org
Date: Wed, 25 Apr 2012 17:51:35
To:r-sig-phylo@r-project.org
Subject: [R-sig-phylo] read.dna warnings and pitfalls


Hi All-

I have spent an inordinate and embarrassing amount of time tracking down an 
excruciatingly cryptic issue with read.dna, which I rarely use. Here are two 
key problems:

1) The function automatically assumes it is reading DNA sequences when it encounters a string of 10 
continuous DNA-like characters. This includes all characters in the set 
(ACGTUMRWSYKVHDBN-). This function, unlike the phylip original, does not have limits on taxon name 
lengths. Hence, I had - in the middle of a large alignment - a species whose name included the 
string MADAGASCAR, which caused a failure.  To be fair, the documentation warns of 
this, but I think this is extremely easy to overlook, and - moreover - it seems unfortunate to have 
to parse all your taxon names for a potential IUPAC match before trying to use the function. 
Presumably, most users who specify

Re: [R-sig-phylo] compar.gee: how to extract SE, t, and P from output?

2012-05-09 Thread Emmanuel Paradis
Maybe check that the code of foo has not been altered with a line break from 
the email (eg, by editing it with fix(foo)). If this is ok, try it with the 
examples from the help page ?compar.gee. You may also check that:

GEE$W

prints a 2 by 2 matrix.

Best,

Emmanuel
-Original Message-
From: Nina Hobbhahn n.hobbh...@gmail.com
Date: Wed, 9 May 2012 16:15:24 
To: Emmanuel Paradisemmanuel.para...@ird.fr
Cc: r-sig-phylo@r-project.org
Subject: Re: [R-sig-phylo] compar.gee: how to extract SE, t, and P from output?

Hi Emmanuel,

thanks for your reply. Works mostly, however, I get an error message I can't 
seem to figure out:

Error in coef[, 2] - sqrt(diag(x$W)) : replacement has length zero.

What does R mean or want here?

Thanks again,

Nina



On 2012-05-09, at 3:19 PM, Emmanuel Paradis wrote:

 Hi Nina,
 
 You cannot do it directly: you have to get the code from print.compar.gee; 
 from R:
 
  print.compar.gee
 
 From this you can create a function, eg:
 
 foo - function(x)
 {
nas - is.na(x$coef)
coef - x$coef[!nas]
cnames - names(coef)
coef - matrix(rep(coef, 4), ncol = 4)
dimnames(coef) - list(cnames,
   c(Estimate, S.E., t, Pr(T  |t|)))
df - x$dfP - dim(coef)[1]
coef[, 2] - sqrt(diag(x$W))
coef[, 3] - coef[, 1]/coef[, 2]
coef
 }
 
 Then do:
 
 foo(GEE)
 
 HTH.
 
 Best,
 
 Emmanuel
 
 Nina Hobbhahn wrote on 08/05/2012 18:55:
 Dear fellow list users,
 
 I would like to extract the values for SE, t, and Pr from the compar.gee 
 output, but seem to unable to do so. The command lines
 
 GEE-­‐compar.gee(trait~reward-­‐1, phy=phy2, data=DF.Disa, 
 family=gaussian)
 
 GEE
 
 return the following output:
 
 Beginning Cgee S-­‐function, @(#) geeformula.q 4.13 98/01/27 running glm 
 to get initial regression estimate
 
 rewardN rewardY
 2.689142 3.421509
 Beginning Cgee S-­‐function, @(#) geeformula.q 4.13 98/01/27 running glm 
 to get initial regression estimate
 
 rewardN rewardY
 2.689142 3.421509
 Call: compar.gee(formula = trait ~ reward -­‐ 1, data = DF.Disa,
 
 family = gaussian, phy = phy2) Number of observations: 31
 
 Model:
 Link: identity
 
 Variance to Mean Relation: gaussian
 
 
 QIC: 165.2011
 
 Summary of Residuals:
 Min 1Q Median 3Q Max
 
 3.1462585 0.2560335 1.5256031 2.2012787 4.6689764
 
 Coefficients:
 Estimate S.E. t Pr(T  |t|)
 
 rewardN 1.959375 1.091801 1.794626 0.1016430 rewardY 1.908384 1.117150 
 1.708261 0.1170666
 
 Estimated Scale Parameter: 4.862036 Phylogenetic df (dfP): 12.45332
 
 When typing
 
 GEE$coefficients
 
 I get
 
 
 rewardN rewardY
 
 1.959375 1.908384
 
 
 but I'm unable to access SE, t, and Pr(T  |t|). Please can any of you help? 
 Thank you very much,
 
 
 Nina
 
 
 
 
 Dr. Nina Hobbhahn
 Post-doctoral fellow
 Lab of Prof. S. D. Johnson
 School of Life Sciences
 University of KwaZulu-Natal
 Private Bag X01
 Scottsville, Pietermaritzburg, 3201
 South Africa
  [[alternative HTML version deleted]]
 
 
 
 
 ___
 R-sig-phylo mailing list
 R-sig-phylo@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
 
 -- 
 Emmanuel Paradis
 IRD, Jakarta, Indonesia
 http://ape.mpl.ird.fr/


[[alternative HTML version deleted]]

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] compar.gee: how to extract SE, t, and P from output?

2012-05-11 Thread Emmanuel Paradis

Hi Nina,

Indeed. Here's a more complete version:

foo - function(x)
{
nas - is.na(x$coef)
coef - x$coef[!nas]
cnames - names(coef)
coef - matrix(rep(coef, 4), ncol = 4)
dimnames(coef) - list(cnames,
c(Estimate, S.E., t, Pr(T  |t|)))
df - x$dfP - dim(coef)[1]
coef[, 2] - sqrt(diag(x$W))
coef[, 3] - coef[, 1]/coef[, 2]
if (df  0) {
warning(not enough degrees of freedom to compute P-values.)
coef[, 4] - NA
} else coef[, 4] - 2 * (1 -  pt(abs(coef[, 3]), df))
coef
}

Best,

Emmanuel

Nina Hobbhahn wrote on 10/05/2012 17:18:

Hi Emmanuel,

thanks, the problem was indeed a line break from the email. The foo
function now works, however:

If I now request foo(GEE), I get the following output:

Estimate S.E. t Pr(T  |t|)
(Intercept) 1.59575105 1.1944651 1.33595449 1.59575105
rewardY -0.02952008 0.6772938 -0.04358534 -0.02952008

It looks to me as if the P values are not accessed correctly, and
instead the Estimate column is printed in its place. Indeed, in the foo
function code


 foo - function(pGEE)

+ {
+ nas - is.na(x$coef)
+ coef - x$coef[!nas]
+ cnames - names(coef)
+ coef - matrix(rep(coef, 4), ncol = 4)
+ dimnames(coef) - list(cnames, c(Estimate, S.E., t, Pr(T  |t|)))
+ df - x$dfP - dim(coef)[1]
+ coef[, 2] - sqrt(diag(x$W))
+ coef[, 3] - coef[, 1]/coef[, 2]
+ coef
+ }

 pGEE-foo(pGEE)
 pGEE


coef[, 4] is not defined. Perhaps that's the problem?

If so, could you please advise how coef[, 4] should be defined?

Thank you very much,

Nina



On 2012-05-09, at 5:24 PM, Emmanuel Paradis wrote:


Maybe check that the code of foo has not been altered with a line
break from the email (eg, by editing it with fix(foo)). If this is ok,
try it with the examples from the help page ?compar.gee. You may also
check that:

GEE$W

prints a 2 by 2 matrix.

Best,

Emmanuel

*From: * Nina Hobbhahn n.hobbh...@gmail.com
mailto:n.hobbh...@gmail.com
*Date: *Wed, 9 May 2012 16:15:24 +0200
*To: *Emmanuel Paradisemmanuel.para...@ird.fr
mailto:emmanuel.para...@ird.fr
*Cc: *r-sig-phylo@r-project.org mailto:r-sig-phylo@r-project.org
*Subject: *Re: [R-sig-phylo] compar.gee: how to extract SE, t, and P
from output?

Hi Emmanuel,

thanks for your reply. Works mostly, however, I get an error message I
can't seem to figure out:

Error in coef[, 2] - sqrt(diag(x$W)) : replacement has length zero.

What does R mean or want here?

Thanks again,

Nina



On 2012-05-09, at 3:19 PM, Emmanuel Paradis wrote:


Hi Nina,

You cannot do it directly: you have to get the code from
print.compar.gee; from R:

 print.compar.gee

From this you can create a function, eg:

foo - function(x)
{
nas - is.na(x$coef)
coef - x$coef[!nas]
cnames - names(coef)
coef - matrix(rep(coef, 4), ncol = 4)
dimnames(coef) - list(cnames,
c(Estimate, S.E., t, Pr(T  |t|)))
df - x$dfP - dim(coef)[1]
coef[, 2] - sqrt(diag(x$W))
coef[, 3] - coef[, 1]/coef[, 2]
coef
}

Then do:

foo(GEE)

HTH.

Best,

Emmanuel

Nina Hobbhahn wrote on 08/05/2012 18:55:

Dear fellow list users,

I would like to extract the values for SE, t, and Pr from the
compar.gee output, but seem to unable to do so. The command lines

GEE-­‐compar.gee(trait~reward-­‐1, phy=phy2, data=DF.Disa,
family=gaussian)

GEE

return the following output:

Beginning Cgee S-­‐function, @(#) geeformula.q 4.13 98/01/27 running
glm to get initial regression estimate

rewardN rewardY
2.689142 3.421509
Beginning Cgee S-­‐function, @(#) geeformula.q 4.13 98/01/27 running
glm to get initial regression estimate

rewardN rewardY
2.689142 3.421509
Call: compar.gee(formula = trait ~ reward -­‐ 1, data = DF.Disa,

family = gaussian, phy = phy2) Number of observations: 31

Model:
Link: identity

Variance to Mean Relation: gaussian


QIC: 165.2011

Summary of Residuals:
Min 1Q Median 3Q Max

3.1462585 0.2560335 1.5256031 2.2012787 4.6689764

Coefficients:
Estimate S.E. t Pr(T |t|)

rewardN 1.959375 1.091801 1.794626 0.1016430 rewardY 1.908384
1.117150 1.708261 0.1170666

Estimated Scale Parameter: 4.862036 Phylogenetic df (dfP): 12.45332

When typing

GEE$coefficients

I get


rewardN rewardY

1.959375 1.908384


but I'm unable to access SE, t, and Pr(T |t|). Please can any of
you help? Thank you very much,


Nina




Dr. Nina Hobbhahn
Post-doctoral fellow
Lab of Prof. S. D. Johnson
School of Life Sciences
University of KwaZulu-Natal
Private Bag X01
Scottsville, Pietermaritzburg, 3201
South Africa
[[alternative HTML version deleted]]




___
R-sig-phylo mailing list
R-sig-phylo@r-project.org mailto:R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


--
Emmanuel Paradis
IRD, Jakarta, Indonesia
http://ape.mpl.ird.fr/




Dr. Nina Hobbhahn
Post-doctoral fellow
Lab of Prof. S. D. Johnson
School of Life Sciences
University of KwaZulu-Natal
Private Bag X01
Scottsville, Pietermaritzburg, 3201
South Africa



--
Emmanuel Paradis
IRD, Jakarta

Re: [R-sig-phylo] Problems useing query() function of package seqinr

2012-05-11 Thread Emmanuel Paradis

Hi Augusto,

I was able to reproduce some of the errors you encountered. I suggest, 
if you haven't done yet, to contact directly the authors of seqinr. They 
have their own mailinglist and their web site is:


http://pbil.univ-lyon1.fr/software/seqinr/

Best,

Emmanuel

Augusto Ribas wrote on 11/05/2012 01:13:

As i was searching endemic species for sequences, i was supposed to find 0
results.
And got stuck in this part of the loop.


while (length(res) == 0) {
 if (verbose)
 cat(... reading again.\n)
 res- readLines(socket, n = 1)
 }

I got solutions here:
http://r-br.2285057.n4.nabble.com/R-br-Duvida-sobre-a-funcao-query-do-pacote-seqinr-td4619241.html

Hope it could help if anyone else get the same problem.

Best Wishes
Augusto Ribas

2012/5/9 Augusto Ribasribas@gmail.com


If it helps



library(seqinr)
s- choosebank(genbank)
query(bb,SP=Rhinella rubescens AND K=16S RIBOSOMAL

RNA,s$socket,verbose=T)
I'm checking the arguments...
... and everything is OK up to now.
I'm checking the status of the socket connection...
... and everything is OK up to now.
I'm sending query to server...
... answer from server is: code=0lrank=2count=1type=
SQlocus=F
I'm trying to analyse answer from server...
... and everything is OK up to now.
... and the rank of the resulting list is: 2 .
... and there are 1 elements in the list.
... and the elements in the list are of type SQ .
... and there are *not* only parent sequences in the list.
I'm trying to get the infos about the elements of the list...
... and I have received 1 lines as expected.




Now with
query(bb,SP=Leptodactylus marmoratus AND K=16S RIBOSOMAL
RNA,s$socket,verbose=T)

I keep in an eternal loop saying:
... reading again.
... reading again.
... reading again.



2012/5/9 Augusto Ribasribas@gmail.com


Hello, i'm new in this list, and new in genetics field too.
I'm reading a book Analysis of Phylogenetics and Evolution with R and
in the book i saw about the function query() of the package seqinr to
search databases about specific sequences of  our interest.

#A got a list of some frog species:

lista.spp-c(Rhinella rubescens, Rhinella fernandezae, Elosia
rustica,
Crossodactylus gaudichaudii, Leptodactylus pustulatus, Leptodactylus
syphax,
Dendropsophus cachimbo, Leptodactylus marmoratus, Physalaemus
soaresi,
Hylodes nasus, Scinax fuscomarginatus, Leptodactylus petersii,
Chiasmocleis capixaba, Ceratophrys aurita, Pleurodema diplolister,
Cystignatus gigas, Scinax trilineatus, Elosis nasus, Dryadophis
bifossatus
)

#And stared looking for 16s RIBOSOMAL RNA.
#My problem is that for some species it worked great.
#For example:


library(seqinr)
s- choosebank(genbank)
query(SP,SP=Rhinella rubescens AND K=16S RIBOSOMAL RNA,s$socket)
SP$nelem

[1] 1

getName(SP)

[1] GU907196.RR1

#1 sequence and the name to get the sequence later


query(SP,SP=Leptodactylus syphax AND K=16S RIBOSOMAL RNA,s$socket)
print(SP$nelem)

[1] 2

print(getName(SP))

[1] JF789923 JF789924

#2 sequences and names to get them later.

#The problems is that for some species like

query(SP,SP=Leptodactylus marmoratus AND K=16S RIBOSOMAL

RNA,s$socket)

R stops, i thought it would return too many information, but even using
the argument virtual=T it stops the same way, and i have to turn R off.
I looked in the genbank site and this species should return no sequence
but i don't know what is wrong.
Also, in my understanding, Leptodactylus syphax should return 2
sequences, looking in genbank site, but it also stops.
But from the 19 species, only some stops, the others work, i cant figure
out what is happening.

I use windows 7 OS, a residential internet, R 2.15.0 and Tinn-R as a R-gui

Also, in the university the port query() function use is blocked, only
ports like 80 are open in the firewall, are there other functions like this
in R?
In the task view, biocondutor project was cited but i still searching
there.

Could someone enlighten me?

Best Wishes
Augusto Ribas

--
Grato
Augusto C. A. Ribas

Site Pessoal: http://augustoribas.heliohost.org
Lattes: http://lattes.cnpq.br/7355685961127056





--
Grato
Augusto C. A. Ribas

Site Pessoal: http://augustoribas.heliohost.org
Lattes: http://lattes.cnpq.br/7355685961127056







--
Emmanuel Paradis
IRD, Jakarta, Indonesia
http://ape.mpl.ird.fr/

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] nonparametric PGLS

2012-05-11 Thread Emmanuel Paradis

Hi Pascal,

The assumption of normality in GLS, as in OLS, concerns the residuals 
not the data. A recent reference on this issue is:


Alain F. Zuur, Elena N. Ieno and Chris S. Elphick: A protocol for data 
exploration to avoid common statistical problems. Methods in Ecology  
Evolution 2010, 1, 3–14. doi: 10./j.2041-210X.2009.1.x


To answer your specific question, I'm not aware of such nonparametric 
method though a Monte Carlo procedure might be feasible.


Best,

Emmanuel

Pascal Title wrote on 11/05/2012 10:46:

Hi everyone

I have character data for species on a phylogeny, but simple
transformations like log-transformations or square-root transformations are
not proving to be sufficient to get the data to be normal. If this were a
non-phylogenetic test, I would resort to something like a Spearman
correlation, which is non-parametric.
But with phylogenetic generalized least squares, is there any method that
is nonparametric?

Thanks!


-Pascal Title

--
Emmanuel Paradis
IRD, Jakarta, Indonesia
http://ape.mpl.ird.fr/

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] exporting nexus tree in APE

2012-05-15 Thread Emmanuel Paradis

Hi Lucinda,

You don't give enough information allowing us to help you efficiently. 
However, see below.


Lucinda Kirkpatrick wrote on 15/05/2012 07:38:

Dear all,

I am attempting to export a tree I have pruned from 916 taxa to the 53 taxa
I am interested. I used this script:

overlap- name.check(tree, d)
tree- drop.tip(tree, overlap$Tree.not.data)
d2- d[!(rownames(d) %in% overlap$Data.not.tree),]


plot(tree)
name.check(tree,d2)

That was successful but now I want to export the reduced tree as a nexus
file I get the following error:

  write.nexus(tree, file=./Results/Tree.nex)
Read 921 items


This suggests you use an old version of ape (which reads the original 
data from the NEXUS fil: this feature has been deprecated). I suggest 
you update ape first; if your problem persists give us more details, 
possibly with some example data.


Best,

Emmanuel


Error in 1:(start - 1) : argument of length 0

I need the reduced tree in a nexus format for another analysis - how is the
best way to do that?  I am very new to R so please make any explanation
very simple!

Thanks, Luci

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo



--
Emmanuel Paradis
IRD, Jakarta, Indonesia
http://ape.mpl.ird.fr/

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] unrooted tree, spread tips

2012-05-18 Thread Emmanuel Paradis

Hi,

Walter, Mathias wrote on 18/05/2012 16:42:

Hi,

I use ape to plot my phylogenetic trees. Often I have to plot unrooted
trees with few clades, but a lot of tips per clade. The tip labels are
hard to read, even if I use lab4ut=axial. Is there any way to spread
the tips?


No. The only available algorithm for unrooted trees in ape is the 
equal-angle one. It seems that for your purpose type=r might be not 
bad, even though the center of the circle is the root.


Best,

Emmanuel


I saw this in some other tree drawing programs.

Kind regards,
Mathias


___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] unrooted tree, spread tips

2012-05-18 Thread Emmanuel Paradis

Walter, Mathias wrote on 18/05/2012 18:16:

Hi,

if I use type=r, then quite a lot of lines cross.


This should not happen. May be a bug. You have an example of this?

E.


I tried optimal
tree ordering but as soon as I convert my hclust object to a phylo
object (via as.phylo), the ordering is lost.

Another advantage of the unrooted tree is, that you can often clearly
distinguish main clusters. That is not possible with any other kind of
tree drawing.

--
Kind regards,
Mathias

2012/5/18 Emmanuel Paradisemmanuel.para...@ird.fr:

Hi,

Walter, Mathias wrote on 18/05/2012 16:42:


Hi,

I use ape to plot my phylogenetic trees. Often I have to plot unrooted
trees with few clades, but a lot of tips per clade. The tip labels are
hard to read, even if I use lab4ut=axial. Is there any way to spread
the tips?



No. The only available algorithm for unrooted trees in ape is the
equal-angle one. It seems that for your purpose type=r might be not bad,
even though the center of the circle is the root.

Best,

Emmanuel


I saw this in some other tree drawing programs.

Kind regards,
Mathias


___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo



--
Emmanuel Paradis
IRD, Jakarta, Indonesia
http://ape.mpl.ird.fr/

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] length of attribute (names) when calculating independent contrasts

2012-05-24 Thread Emmanuel Paradis

Hi Agus,

It seems the problem comes when pic() tries to create names to the 
contrasts. Probably your tree has badly formatted node labels. What do 
the following commands return?


tree$node.label
tree$Nnode

Cheers,

Emmanuel

Agus Camacho wrote on 25/05/2012 01:58:

Hi all,

When calculating independent contrasts of a variable, I get the following
message:


ContrastWeight- pic(Weight,tree)Error en names(contr)- lbls :

   el atributo 'names' [10] debe tener la misma longitud que el vector [9]
The attribute names [10] must have the same length than the vector[9]
(this was translated by me)

After checking the help on the command, i made the following

observations,
length(Weight)[1] 10  class(Weight)[1] numeric  tree$tip.label
[1] C.leiolepis  C.nicterus   C.sinebrachiatus
  [4] S.catimbau   N.ablephara  P.erythrocercus
  [7] P.tetradactylus  V.rubricauda M.maximiliani
[10] P.paeminosus  class(tree)[1] phylo


Weight C.leiolepis   C.nicterus C.sinebrachiatus   S.catimbau

0.9020.7160.53705260.5887500
  N.ablephara  P.erythrocercus  P.tetradactylus V.rubricauda
0.5460.41272730.3970.4725294
M.maximiliani P.paeminosus
0.4830.4677692



However, to me, everything seems to be right about the names and the
vector's length. Any hint?

Thanks to all. I am new to these analyses and appreciate much the help
from the mail list.


Agus



___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


--
Emmanuel Paradis
IRD, Jakarta, Indonesia
http://ape.mpl.ird.fr/

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] as.phylo.hclust

2012-06-21 Thread Emmanuel Paradis
Hi Dave,

That's because hclust() computes 'height' as the distance between the two sets 
that are agglomerated. Consider a case with n=2:

 (d - dist(rnorm(2)))
 1
2 2.562737
 h - hclust(d)
 h$height
[1] 2.562737

So if we make a phylo object from h, it will have two edges whose lengths 
must sum to 2.56:

 t - as.phylo(h)
 t$edge.length
[1] 1.281369 1.281369

Of course the cophenetic distances must be the same:

 cophenetic(t)
 12
1 0.00 2.562737
2 2.562737 0.00
 cophenetic(h)
 1
2 2.562737


Cheers,

Emmanuel
--Original Message--
From: David Bapst
Sender: dwba...@gmail.com
To: R Sig Phylo Listserv
Cc: Emmanuel Paradis
Subject: as.phylo.hclust
Sent: 21 Jun 2012 05:08

Hello all,

I was having some problems with getting the information I wanted from
an hclust object so decided I would make it a tree and use all the
tools I already know so well. Unfortunately, the distance structure in
the hclust structure does not seem to be preserved by as.phylo.hclust.
Instead, the tree's total depth/height seems to be halved during
conversion. See the example below.

set.seed(444)
x-dist(rnorm(10))
hc-hclust(x)
tr-as.phylo.hclust(hc)

#see graphically
layout(1:2)
plot(hc)
plot(tr);axisPhylo()

#test depths
max(hc$height)==max(dist.nodes(tr)[Ntip(tr)+1,]) #FALSE

It looks like what's happening is that the edges are half as long as
they should be.

tr1-tr
tr1$edge.length-tr1$edge.length*2
max(hc$height)==(max(dist.nodes(tr1)[Ntip(tr1)+1,]))#TRUE

I am running ape 3.0.4 with R 2.15.0. I notice in the change-log for
ape that there used to be an error (around version 2.5) where the
edges of as.phylo.hclust output were multiplied by 2. Maybe the
reverse bug has crept in?
-Dave


-- 
David Bapst
Dept of Geophysical Sciences
University of Chicago
5734 S. Ellis
Chicago, IL 60637
http://home.uchicago.edu/~dwbapst/
http://cran.r-project.org/web/packages/paleotree/index.html


___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] using ace() with fixed internal node - singularity error

2012-06-25 Thread Emmanuel Paradis

Dear Annemarie,

I have added a better error-catching mechanism in ace (it was already 
there for type = discrete). You can find the new version here:


https://svn.mpl.ird.fr/ape/dev/ape/R/ace.R

This will give you the estimated ancestral states but you won't be able 
to compute SEs of the parameters.


BTW, for continuous traits you may also try method = REML which should 
give you a better estimate of the rate of BM evolution.


Cheers,

Emmanuel

Annemarie Verkerk wrote on 22/06/2012 22:05:

Dear R-sig-phylo members,

I am trying to use ace() with a fixed internal node as described here:
https://stat.ethz.ch/pipermail/r-sig-phylo/2011-June/001465.html

My aim is to provide some confidence in the ancestral state estimation
of the root node by setting the root node to have specific values that
have been proposed in the literature and comparing the log likelihoods
of the estimated value and the values proposed in the literature. To
that end, a node with branch length zero is added to root node, and the
proposed value is given to that newly added node as described in the
archived message above.

However, I get the following error message when I use ace() with
'method=ML':

Error in solve.default(out$hessian) : Lapack routine dgesv: system is
exactly singular
In addition: There were 50 or more warnings (use warnings() to see the
first 50)

warnings()
Warning messages:
1: In nlm(function(p) dev.BM(p), p = c(1, rep(mean(x), nb.node)), ... :
NA/Inf replaced by maximum positive value
2: In nlm(function(p) dev.BM(p), p = c(1, rep(mean(x), nb.node)), ... :
NA/Inf replaced by maximum positive value
3: In nlm(function(p) dev.BM(p), p = c(1, rep(mean(x), nb.node)), ... :
NA/Inf replaced by maximum positive value

I think this is because the newly added edge has zero-length, as
mentioned here: http://comments.gmane.org/gmane.comp.lang.r.phylo/1342

When I use 'method=pic', I do not get this error message and it
completes succesfully. (The root state is not exactly the same as the
value for the newly added node, but is very close.)

However, I really do need to work with the ML method, as I need the log
likelihoods for my comparisons of the analyses with different values
that have been used for the newly added node... As far as I can tell,
neither method=pic nor method=PGLS give log likelihoods in their
output.

Is there any way around this problem?

Many thanks, as always,
Annemarie



--
Emmanuel Paradis
IRD, Jakarta, Indonesia
http://ape.mpl.ird.fr/

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] R package for detecting positive selection?

2012-06-28 Thread Emmanuel Paradis

Hi Anna,

See the package phangorn: recent versions implement codon-based models.

Cheers,

Emmanuel

Anna Syme wrote on 28/06/2012 08:42:

Hi all,
I was wondering if there are any packages in R that can detect positive
selection, comparable to programs such as PAML. I have two gene copies
(each copy in multiple species), and want to see whether either copy has
been subject to positive selection since divergence.
Thanks,
Anna
*Dr Anna Syme* Molecular Systematist, National Herbarium of Victoria,
Royal Botanic Gardens Melbourne, Private Bag 2000, Birdwood Avenue,
South Yarra, 3141, Australia. Phone (03) 9252 2328, Fax (03) 9252 2413.
anna.s...@rbg.vic.gov.au mailto:anna.s...@rbg.vic.gov.au
www.rbg.vic.gov.au http://www.rbg.vic.gov.au/

--
This email and any attachments may contain information that is personal, 
confidential, legally privileged and/or copyright. No part of it should be 
reproduced, adapted or communicated without the prior written consent of the 
sender and/or copyright owner.

It is the responsibility of the recipient to check for and remove viruses.

If you have received this email in error, please notify the sender by return 
email, delete it from your system and destroy any copies. You are not 
authorised to use, communicate or rely on the information contained in this 
email.

Please consider the environment before printing this email.

This email was Anti Virus checked by RBG Astaro Mail Gateway.



___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


--
Emmanuel Paradis
IRD, Jakarta, Indonesia
http://ape.mpl.ird.fr/

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] Ape: dist.dna, P-distance calculation

2012-07-29 Thread Emmanuel Paradis
Hi,

Try the option pairwise.deletion = TRUE. If you still find abnormal values 
please give some example data.

Emmanuel
-Original Message-
From: Jessica Sabo sabo.j...@gmail.com
Sender: r-sig-phylo-boun...@r-project.org
Date: Sun, 29 Jul 2012 14:42:17 
To: r-sig-phylo@r-project.org
Subject: [R-sig-phylo] Ape: dist.dna, P-distance calculation

Hi all,

I am calculating P-distances using dist.dna (pdistance -
dist.dna(convertdata, model = raw). WHen I did this, I got
abnormally high p-distances in the .9 region. I then calculated the p
distance using Paup and I got more normal values. Ultimately, I would
like to use R to calculate p-distances and not Paup. Does anyone know
what is going on here?

Thanks for your help,

Jessica Sabo

PhD Student
University of Florida

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] sort / select trees by backbone constraint?

2012-08-02 Thread Emmanuel Paradis

Dear Annemarie,

Annemarie Verkerk wrote on 30/07/2012 16:48:

Dear all,

I have a follow-up question about how to sort or select trees from a
tree sample as originally asked here:

https://stat.ethz.ch/pipermail/r-sig-phylo/2011-March/001194.html

I would like to select a set of trees from a sample of multiple trees
(i.e., BEAST output) to return all trees that match a constraint
backbone topology.

Neither of the two answers given to the original question work for me,
because they seem to be able to only constrain the tree sample using
full trees, not with partial backbone trees. My aim is as follows.

If I have a tree sample such as the following:

tree1: (t1, (t2, (t3, (t4, t5
tree2: (t1, (t2, (t4, (t3, t5
tree3: (t1, (t3, (t4, (t2, t5
tree4: (t5, (t3, (t4, (t2, t1
tree5: (t1, (t3, (t5, (t2, t4

I would like to select from this tree sample those trees that have:

backbone1: (t1, (t2, (XXX)))

i.e. those trees that have that topology of t1 and t2, while the
internal structure of the rest of the tree does not matter. tree1 and
tree2 should be selected, but not tree3, tree4, or tree5;


It seems equivalent to test whether the tips excluding t1 and t2 are 
monophyletic. So something like:


is.monophyletic(phy, paste(t, 3:5, sep = ))

should tell you whether to select 'phy' or not. You may also think 
whether to consider your trees as rooted or not (see the option 'reroot' 
of is.monophyletic).



and those trees that have:

backbone2: (XXX (t3, (t5, XXX)))


The same reasoning might be applied where you first test whether t5 and 
other tips excluding t3 are monophyletic, and then test whether t3, t5 
and the same other tips are monophyletic. This is of course more 
complicated than the 1st case. Maybe prop.part is more appropriate here 
(and faster): it will return the list of all monophyletic groups where 
the 1st element includes all tips (so it's trivially monophyletic). So 
you could look for the clades (excluding the 1st one) where t5 is 
present and t3 absent. Then test whether a clade with the same clade 
plus t3 is present in the list.



i.e. those trees in which t5 is nested in a clade that is sister to t3,
while the internal structure of the rest of the tree does not matter.
tree1, tree3 and tree5 should be selected, but not tree2 or tree4.

Of course it would be possible to do this by making the selection with
all possible full trees that have this partial topology, but I think
this would be time-demanding with trees that have a large number of
taxa.


Not only time-demanding but the number of possible trees will certainly 
be too large. Suppose XXX in backbone1 has 50 tips:



ape::howmanytrees(50)

[1] 2.752921e+76

which is close to the number of atoms in the universe.

Best,

Emmanuel


Does anybody have any ideas on how to do this?

Many thanks, as always,

Annemarie


___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] plotting a coloured tree

2012-08-22 Thread Emmanuel Paradis

Agus,

Whether the background appears transparent or not depends on the device 
where you plot the tree. On Linux, the default device (X11) is transparent:



par(bg)

[1] transparent

When plotting in a file, the default depends on the file format: pdf() 
has transparent background, but png() has a white one. For the latter 
you may do:


png(tree.png)
par(bg = transparent)
plot(tree, ... etc...
dev.off()

The way the background is rendered may depend on the application you use 
to open this file: on my system the transparent background of a PNG 
image is shown as a grey checkerboard, but when I open a PDF file with a 
transparent background, this one appears white. But the difference 
between a transparent and a white background should be clear if you 
include such a file in a document or in slides.


Best,

Emmanuel

Marcio Pie wrote on 20/08/2012 07:09:

Hi Agustin,

The problem seems to be the space between the genus and the species
names. Try this:


require(ape)
require(geiger)

tree-read.tree(text=
Calyptommatus_leiolepis:0.0168, Calyptommatus_nicterus:0.0099):0.0242
,Calyptommatus_sinebrachiatus:0.038):0.0724
,(Scriptosaura_catimbau:0.022,Nothobachia_ablephara:0.0309):0.0648):0.0659
,Procellosaurinus_erythrocercus:0.0235,
Procellosaurinus_tetradactylus:0.0259):0.0462
,Vanzosaura_rubricauda:0.072):0.142
,Micrablepharus_maximiliani:0.863):0.0225
,Psilopthalmus_paeminosus:0.193):0.0131);)
rgb(red=0, green=0, blue=0, alpha=100, max=255)
plot(tree,use.edge.length = T, font=4,tip.col=black,  bg=transparent,
cex=1.5,edge.color = 'cyan',edge.width = 3)


Marcio

On Sun, Aug 19, 2012 at 8:19 PM, Agus Camachoagus.cama...@gmail.com  wrote:

Dear list,

Im trying to plot a coloured tree with transparent background, to do that,
im using the following script:
require(ape)
require(geiger)
tree- read.tree(text=
Calyptommatus leiolepis:0.0168, Calyptommatus nicterus:0.0099):0.0242
,Calyptommatus sinebrachiatus:0.038):0.0724
,(Scriptosaura catimbau:0.022,Nothobachia ablephara:0.0309):0.0648):0.0659
,Procellosaurinus erythrocercus:0.0235,Procellosaurinus
tetradactylus:0.0259):0.0462
,Vanzosaura rubricauda:0.072):0.142
,Micrablepharus maximiliani:0.863):0.0225
,Psilopthalmus paeminosus:0.193):0.0131);)
rgb(red=0, green=0, blue=0, alpha=100, max=255)
plot(tree,use.edge.length = T, font=4,tip.col=black,  bg=transparent,
cex=1.5,edge.color = 'cyan',edge.width = 3)

However, when I plot that, it does not appear transparent background and it
also plots together the species name and the genus.
Anybody knows how to do that?
Thanks in advance!
Agus
--
Agustín Camacho Guerrero.
Doutorando em Zoologia.
Laboratório de Herpetologia, Departamento de Zoologia, Instituto de
Biociências, USP.
Rua do Matão, trav. 14, nº 321, Cidade Universitária,
São Paulo - SP, CEP: 05508-090, Brasil.

 [[alternative HTML version deleted]]


___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo







--
Emmanuel Paradis
IRD, Jakarta, Indonesia
http://ape.mpl.ird.fr/

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] ladderize + drop.tip = shuffled node labels

2012-10-10 Thread Emmanuel Paradis

Hi,

This was a bug in drop.tip(). It is fixed and will be in the next 
release of ape (the fixed version is in ape's SVN).


Best,

Emmanuel

Rebecca Best wrote on 02/10/2012 13:26:

Hi all

I have been plotting some pruned trees recently, and have run into a
problem using drop.tip() after ladderize(). If you ladderize() and
then drop tips from the ladderized tree, then at least in my case the
node labels are no longer correct. This may be an unlikely sequence of
commands, but I thought I'd post this in case it is an easy fix, or it
helps anyone else avoid issues.
Thanks!

Rebecca

##

require(ape)

#read tree


mytree-read.tree()
((D,(E,G)1)1,((H,J)0.8,(K,(((L,M)0.5,(N,O)0.6)1,(P,(Q,R)1)1)0.7)1)1);

#ladderize tree

mytree.lad-ladderize(mytree)

#node labels display on both trees correctly

plot(mytree,show.node.label=TRUE)
plot(mytree.lad,show.node.label=TRUE)

#drop tips from both trees

drop.mytree-drop.tip(mytree,c(L,D,G))
drop.mytree.lad-drop.tip(mytree.lad,c(L,D,G))

#plot both trees, node labels are incorrect for ladderized tree

plot(drop.mytree,show.node.label=TRUE)
plot(drop.mytree.lad,show.node.label=TRUE)

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo



--
Emmanuel Paradis
IRD, Jakarta, Indonesia
http://ape.mpl.ird.fr/

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] Scale parameter in compar.gee()

2012-10-12 Thread Emmanuel Paradis

Hi Carlos,

Carlos A. Botero wrote on 10/10/2012 03:49:

Dear R-sig-phylo users,

I am using compare.gee() to run a phylogenetically informed analysis
with Poisson error. The scale parameter in my model is estimated to be
close to 4, indicating that the data are over-dispersed. It is unclear
to me whether compare.gee() uses the scale parameter to inflate the
covariance matrix of the parameter estimates (as I understand it is done
in SAS)


In the GEE formulation, the scale is used to multiply the var-cov matrix 
of the observations. Its estimated value is then used to multiply the 
SEs of the parameter estimates by sqrt(scale).



or whether it simply reports this statistic as a diagnostic of
potential over-dispersion. In other words, do the results of a poisson
regression in compare.gee() already account for over dispersion or are
they invalid when the scale parameter is greater than one?


The former. You may repeat your analyses setting 'scale.fix = TRUE' to 
compare the results.


HTH

Best,

Emmanuel


Guidance would be greatly appreciated. Best,


Carlos
___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo



--
Emmanuel Paradis
IRD, Jakarta, Indonesia
http://ape.mpl.ird.fr/

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] breaking up phylogeny figure

2012-10-12 Thread Emmanuel Paradis

Hi Ben,

You should be able to play with extract.clade() and drop.tip(): they 
both have an option 'interactive' so you may click on the plotted 
backbon tree to build your subtrees.


Best,

Emmanuel

Ben Weinstein wrote on 03/10/2012 23:45:

Hi all,

I'm wondering if there is a way to plot a phylogeny in R that is broken up
and displayed side by side to condense space. I've made a backbone in R
that i'd like to make into a figure, but it is too tall. If this isn't the
write setup, could someone suggest what program (and format for me to
write.tree) from R. I am using windows, in terms of potential third part
programs.

Thanks,

ben


___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] Reverting or setting axis label in plottet phylo tree

2012-10-24 Thread Emmanuel Paradis

Hi Jana,

Jarecki, Jana wrote on 24/10/2012 20:01:

Hi everybody,

just a quick question:
-- is it possible to revert the scale on the axis added to a tree plotted with 
plot(phyloobject)?


No, but instead of axisPhylo() you may use axis(1).


-- is it possible to manually set the axis tick locations and labels?


No, but it's fairly easy to do fix(axisPhylo) and modify the code to 
your need. Or you can use axis(1, at = ).


Best,


Thanks a lot,
best
Jana

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo



--
Emmanuel Paradis
IRD, Jakarta, Indonesia
http://ape.mpl.ird.fr/

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] drop.tip weird handling of node labels

2012-11-21 Thread Emmanuel Paradis

Dear Tristan,

Thanks alot for spotting this. This is fixed now. The new version is 
available here:


https://svn.mpl.ird.fr/ape/dev/ape/R/drop.tip.R

More tests will be welcome.

Cheers,

Emmanuel

Tristan Lefebure wrote on 15/11/2012 18:09:

Dear all,

I am puzzled with the following dropt.tip behaviour:

##
require(ape)
tr-
read.tree(text='(((t1:1.047859182,t2:1.047859182):0.2756716989,t3:1.323530881):0.9374413868,(t4:1.004540609,(t5:0.825851211,t6:0.825851211):0.1786893979):0.2456960053,((t7:1.056756599,t8:1.056756599):0.06431900218,t9:1.121075601):0.1291610131):0.3349743758,t10:1.58521099):0.2527177518,((t11:0.7045266296,t12:0.7045266296):0.9569230276,(t13:1.434588187,t14:1.434588187):0.2268614706):0.1764790846):0.08226490869,t15:1.279204276,((t16:1.114074687,t17:1.114074687):0.1122507672,t18:1.226325454):0.05287882165):0.1013794601,t19:0.9955287763,t20:0.9955287763):0.1328862159,(((t21:0.5523867955,t22:0.5523867955):0.2108511243,t23:0.7632379198):0.1651743381,((t24:0.6573230631,t25:0.6573230631):0.1870302034,(t26:0.7107704271,t27:0.7107704271):0.1335828394):0.08405899131):0.227343):0.1056949204,(t28:1.153531955,t29:1.153531955):0.08057795766):0.05156567335,((t30:0.7944490533,t31:0.7944490533):0.3247998092,(((t32:0.6657996502,t33:0.6657996502):0.1778661928,((t34:0.1423836508!

,t!

  
35:0.1423836508):0.5359910966,t36:0.6783747474):0.1652910955):0.08026277837,t37:0.9239286214):0.1953202411):0.1664267234):0.0949081502):0.06478712445,(((t38:0.7909217241,(t39:0.404267298,t40:0.404267298):0.3866544261):0.4221751735,((t41:0.9390399711,t42:0.9390399711):0.1900443404,(t43:0.7331382087,(t44:0.5724991151,(t45:0.4288431567,t46:0.4288431567):0.1436559584):0.1606390935):0.0802190135,t47:0.813357):0.09505692675,t48:0.9084141489):0.1203866568,(((t49:0.748033,t50:0.748033):0.081110064,(t51:0.4635806844,t52:0.4635806844):0.3575301829):0.09861132464,t53:0.9197221919):0.1090786138):0.03016346022,(t54:0.7468362228,t55:0.7468362228):0.3121280432):0.07012004561):0.08401258605):0.1109195687,t56:1.324016466):0.1213543942):0.1153375546,((t57:1.136226544,t58:1.136226544):0.2488130821,(t59:0.493435228,t60:0.493435228):0.8916043976):0.1756687895):0.3594852354):0.3407786169);')

torm- paste('t',
c(2,3,4,6,10,11,12,17,18,26,29,33,37,38,41,45,46,51,52,53,55), sep='')
tr$node.label- paste('n', 1:Nnode(tr), sep='')
x11(); plot(tr, cex=0.7); nodelabels(tr$node.label, cex=0.7);
tr2- drop.tip(tr, torm)
x11(); plot(tr2, cex=0.7); nodelabels(tr2$node.label, cex=0.7);
#

Just look at the top clade made of t60, t59, t57 and t57. This clade is not
impacted by the tree pruning; but look at the node labels, they have been
badly reassigned... I've tried to replicate the problem on a smaller data
set, but did not succeed.


Thanks!

--
Tristan Lefebure

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo



--
Emmanuel Paradis
IRD, Jakarta, Indonesia
http://ape.mpl.ird.fr/

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] Why no branch lengths on consensus trees?

2012-11-22 Thread Emmanuel Paradis

Hi Scott,

The reason for implementing only the consensus on the topology was after 
reading the appropriate chapter in Inferring Phylogenies. So I'm glad 
that Joe himself stepped in. I have added this reference in the help 
page (it was already cited in my book with respect to this issue).


I think that the usage of branch lengths on (or from) a consensus tree 
depends on your perspective on phylogenetic estimation.


If you are a frequentist, the goal of resampling the data is to give an 
idea of the accuracy of the estimates. So you might not use the 
bootstrap trees to estimate branch lengths; you already did that by ML 
(or maybe least squares) with the full data.


If you are Bayesian, the trees sampled from an MCMC are here for 
estimation including of the branch lengths, so you use them to compute 
some sort of consensus topology as well as its branch lengths. So it 
makes sense that MrBayes can do a consensus tree with branch lengths.


Cheers,

Emmanuel

Scott Chamberlain wrote on 22/11/2012 09:41:

Dear Joe,

Thanks for your feedback on this question.  I will go read those pages you
mentioned.

Scott


On Wed, Nov 21, 2012 at 5:19 PM, Joe Felsensteinj...@gs.washington.eduwrote:



Daniel Barker wrote:

What should branch lengths on a consensus tree represent?

Scott Chamberlain had written:


When making a consensus tree using ape::consensus the branch lengths are

lost. Is there a way to not lose the branch lengths? Or to add them

somehow

to the consensus tree after making it.


The issue of what branch lengths ought to be on a consensus tree is not
simple.  If we have three rooted trees:
((A:1,B:1):1,C:2);
((A:1,B:1):1,C:2);
(A:2,(B:1,C:1):1);

the consensus tree should be the first tree, but what branch length should
be used for (say) the branch ancestral to the AB clade?  1?   0.667?

The minute you open this can of worms it becomes clear that the answer
depends on what you want that number to convey and what interpretations
your audience will tend to draw from the number.  There is no obvious
answer. So this is not a mere technical computing question.

By the way, in my 2004 book, you will find me agonizing about this on page
526, coming down on the side of 0.667, but not overwhelmingly convincingly.
  You could argue that a branch length should be set 0 when the branch is
not there, and all the resulting values averaged, or you could argue that
the average should only be taken over those trees for which that branch is
present.

One possible way to solve the problem is to take the consensus tree as if
it were a user-defined tree, use your whole data set, and infer branch
lengths on that tree.  Daniel has already expressed his legitimate concerns
in such a case as to whether it takes (for example) trifurcations as if
they were real rather than an expression of our uncertainty.

J.F.

Joe Felsenstein j...@gs.washington.edu
Department of Genome Sciences and Department of Biology,
University of Washington, Box 355065, Seattle, WA 98195-5065 USA

(from 1 October 2012 to 10 December 2012 on sabbatical  leave at)
Department of Statistics, University of California, Berkeley, 367 Evans
Hall, Berkeley, CA  94710






[[alternative HTML version deleted]]

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo



--
Emmanuel Paradis
IRD, Jakarta, Indonesia
http://ape.mpl.ird.fr/

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] Modification of figures produced using cophyloplot

2012-12-04 Thread Emmanuel Paradis

Hi David,

David Hembry wrote on 03/12/2012 13:12:

Dear all,

I am plotting associated insect-plant phylogenies with links between
tips using the function cophyloplot, and have two questions about how to
modify the resulting plot.

First, having initially plotted the data successfully, I want to modify
the font size of the tip labels on both the insect and plant phylogenies.

Using the following command:

cophyloplot(glo.tree, epi.tree, file, space = 100, gap = 5, length.line
= 5, lty = 2, cex = 0.5);

where glo.tree and epi.tree are the two phylogenies, and file is the
file indicating associations between tips, I was unable to adjust the
font size from the default. I also tried to change the font size in each
of the input trees before using cophyloplot, to use the tiplabels()
function immediately following cophyloplot(), and to combine both input
trees into a single file using write.tree(append = TRUE). None of these
attempts were successful.


cophyloplot() uses its own engine because it must draw two trees on 
the same plotting region together with the links, so most graphic 
functions from ape (eg, tiplabels) will not work with it. Also the ... 
argument is unused (as specified in the help page). The 'cex' parameter 
is hard-coded inside the internal function plotCophylo2 so you may edit 
it, but it's a bit tricky because if you do that the edited version will 
be in your workspace and it will not be used by cophyloplot() because of 
ape's namespace. The trick is like this (in case you want to try it, and 
this might be used in other situations). First make a copy of the 
internal function:


titi - plotCophylo2

then do fix(titi) and modify what you want (here look for calls to 
text(... cex = ...), save and close. Do fix(cophyloplot) and change 
plotCophylo2(... by titi(..., save and close.



Second, I also tried using the command cophyloplot(use.edge.length =
TRUE) in order to show branch length information (i.e.,
non-time-calibrated trees that result from a likelihood analysis) for
each of the phylogenies, but this had the effect of collapsing the trees
at either side of the figure into trees with miniscule branch lengths,
which were not possible to adjust significantly using the arguments
space, gap, and length.line. Is there a way to show non-calibrated trees
with the function cophyloplot?


Yes, rescale your branch lengths beforehand, eg:

glo.tree$edge.length - 100 * glo.tree$edge.length
epi.tree$edge.length - 100 * epi.tree$edge.length

where you have to find the value that best suits your data.

Cheers,

Emmanuel


I would be grateful for any suggestions. Thank you very much in advance.

David Hembry



--
Emmanuel Paradis
IRD, Jakarta, Indonesia
http://ape.mpl.ird.fr/

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] read.nexus error message

2012-12-07 Thread Emmanuel Paradis

Hi Simon,

I have tried with the first 2500 species and it worked fine. Are you 
sure to have the last version of ape?


The file output by this server is a NEXUS file, so use read.nexus, not 
read.tree.


Best,

Emmanuel

Simon Ducatez wrote on 07/12/2012 01:06:

Dear all,

I am having troubles importing a nexus file into R. The file contains 100 
different trees (containing over 2000 species) from a pseudo-posterior 
distribution (imported from the website http://birdtree.org/), and I got the 
same error message whether using read.nexus or read.tree:
Error in if (tp[3] != ) obj$node.label- tp[3] :   missing value where 
TRUE/FALSE needed
The same error was mentioned before on the web, but I couldn’t find any 
solution, and I do not understand the message. The different trees seem OK on 
Mesquite. Note that when using the same website to extract 100 trees, but this 
time for a sample of 50 species, the importation works without problem.
Any help would be greatly appreciated! Thank you very much in advance

Best
Simon

[[alternative HTML version deleted]]




___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


--
Emmanuel Paradis
IRD, Jakarta
Visiting Professor, Agricultural University of Bogor
http://ape.mpl.ird.fr/

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] plotting a time axis on a circular phylogram

2012-12-18 Thread Emmanuel Paradis

Hi Marcel,

Here's what you can do:

fix(plot.phylo)

Find the line 104 which should be:

xx - seq(0, 2 * pi * (1 - 1/Ntip), 2 * pi/Ntip)

and replace it by:

open.angle - pi/4
xx - seq(0, 2 * pi * (1 - 1/Ntip) - open.angle, length.out = Ntip)

where 'open.angle' is the angle you want to leave blank (in radians). 
Save and close. Alternatively (and better), you may add 'open.angle' as 
an option in the function (in which case, of course, do not add the 
first line above). Don't forget to play with rotate.tree too. This will 
not draw the scale but will leave space for it so you can add it yourself.


BTW, add.scale.bar() should draw a scale bar correctly. axisPhylo() 
currently fails with circular trees. I'll try to fix that.


HTH

Emmanuel

Marcel Cardillo wrote on 13/12/2012 09:37:

Hello all

I have been searching for a way to plot a time axis on a circular
phylogram created using plot.phylo(type=fan). I have seen such plots
published but wasn't sure if these were done by hand, or if there is an
automated way. One problem with doing it by hand (eg in Illustrator) is
that there doesn't seem to be a large enough gap between the first and
last tips in which an axis could be squeezed (my tree has 170 tips). Any
advice gratefully accepted.

cheers
Marcel Cardillo



--
Emmanuel Paradis - IRD
Visiting Professor, Agricultural University of Bogor
http://ape.mpl.ird.fr/

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] plotting a time axis on a circular phylogram

2012-12-19 Thread Emmanuel Paradis
This has been included in plot.phylo(). Also axisPhylo() will work 
correctly with circular trees (an error occurs until now) and a sensible 
error message is now issued with radial and unrooted trees. These all 
will be in ape 3.0-7 and are already available on ape's SVN.


Emmanuel

Marcel Cardillo wrote on 19/12/2012 07:52:

Thanks Emmanuel. It works nicely with 'open.angle' as an argument in the
function call.
Marcel

On 18/12/2012 11:06 PM, Emmanuel Paradis wrote:

Hi Marcel,

Here's what you can do:

fix(plot.phylo)

Find the line 104 which should be:

xx - seq(0, 2 * pi * (1 - 1/Ntip), 2 * pi/Ntip)

and replace it by:

open.angle - pi/4
xx - seq(0, 2 * pi * (1 - 1/Ntip) - open.angle, length.out = Ntip)

where 'open.angle' is the angle you want to leave blank (in radians).
Save and close. Alternatively (and better), you may add 'open.angle'
as an option in the function (in which case, of course, do not add the
first line above). Don't forget to play with rotate.tree too. This
will not draw the scale but will leave space for it so you can add it
yourself.

BTW, add.scale.bar() should draw a scale bar correctly. axisPhylo()
currently fails with circular trees. I'll try to fix that.

HTH

Emmanuel

Marcel Cardillo wrote on 13/12/2012 09:37:

Hello all

I have been searching for a way to plot a time axis on a circular
phylogram created using plot.phylo(type=fan). I have seen such plots
published but wasn't sure if these were done by hand, or if there is an
automated way. One problem with doing it by hand (eg in Illustrator) is
that there doesn't seem to be a large enough gap between the first and
last tips in which an axis could be squeezed (my tree has 170 tips). Any
advice gratefully accepted.

cheers
Marcel Cardillo








--
Emmanuel Paradis - IRD
Visiting Professor, Agricultural University of Bogor
http://ape.mpl.ird.fr/

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] using gamma parameter with dist.dna

2013-01-04 Thread Emmanuel Paradis

Hi Peter,

You are right: it's a bug.

Peter Chi wrote on 04/01/2013 12:30:

Hello list,

I am currently trying to estimate distances on data that I've simulated under 
the JC69+\Gamma model.

It appears that I can provide the gamma parameter to the dist.dna function in 
ape. However, I'm wondering if there may be a bug in the code, because no 
matter what value I provide, it gives the same estimated distance matrix 
(though it is different from the resulting distance matrix when gamma=FALSE as 
it is by default)

I found these lines in the dist.dna code:
if (!gamma)
 gamma- alpha- 0
 else alpha- gamma- 1


This last line should be 'else alpha - gamma', or:

else {
alpha - gamma
gamma - 1
}

You can fix it easily with fix(dist.dna).

Thanks for the report. It'll be fixed in the next version of ape which 
will be released soon.


Cheers,

Emmanuel


So it looks like if gamma=FALSE, then it will set gamma and alpha variables 
equal to 0. If gamma is anything else (e.g. TRUE, or any numerical value), then 
it sets the alpha and gamma variables both to 1.

I'm not sure if this is what was intended. Looking at the next line of code 
(where a C function is called), my guess is that the intent is that if gamma=x, 
then set alpha=x and then set gamma=1. But I can't be sure, because I couldn't 
locate the C dist_dna function within the ape binaries.

I tried some simulations where I modified the code according to my guess of 
what it should be, and it then seems to behave as I might expect (the estimated 
distances are in fact closer to the truth on average when I give dist.dna the 
true gamma parameter than when I leave gamma=FALSE).

But if anyone has any experience with this, your input is appreciated. Thanks!


Peter

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/



--
Emmanuel Paradis - IRD
Visiting Professor
Agricultural University of Bogor
http://ape.mpl.ird.fr/

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


[R-sig-phylo] Pre-release of ape 3.0-7

2013-01-04 Thread Emmanuel Paradis

Dear all,

ape 3.0-7 is now ready for release on CRAN. It includes a much faster 
code for reading FASTA files, a new function for estimating chronograms 
by PL and ML (which will eventually replace chronopl), and various 
improvements and bug fixes already announced here. The sources are 
available for testing as well as a compiled version for Windows (thanks 
to Uwe Ligges's server) on ape's web site, or directly:


http://ape.mpl.ird.fr/ape_3.0-7.tar.gz
http://ape.mpl.ird.fr/ape_3.0-7.zip

Tests are welcome, especially the new FASTA parser which should behave 
slightly differently on Windows than on other OSs. The release date on 
CRAN is planned around mid-January.


Best,

Emmanuel
--
Emmanuel Paradis - IRD
Visiting Professor
Agricultural University of Bogor
http://ape.mpl.ird.fr/

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] read.dna line endings in ape 3.0-7

2013-02-06 Thread Emmanuel Paradis

Hi Rupert,

The Mac version of ape on CRAN is not correct and has a few bugs 
particularly in read.dna(). Another problem is that G is not read (but 
g is). This does not affect the Windows version nor the sources. A 
solution for Mac users is re-install from the sources. Another one is to 
use the code of read.dna() from a previous release of ape.


I'm waiting for other tests and feed-back before releasing ape 3.0-8 
(within one or two weeks).


Best,

Emmanuel

Rupert Collins wrote on 31/01/2013 02:20:

I have noticed that in ape 3.0-7, read.dna on Linux does not parse fasta
files with Windows line endings (i.e. CRLF \r\n). It results in the
addition of \r to the taxon label, e.g. taxonA\r rather than the
behaviour of 3.0-6, which was taxonA.

Hope this makes sense, and thanks for any explanation.

Rupert


Here's an MWE.

##
zz- file(test.fas, w)
cat(taxonA, actg, taxonB, actt, taxonC, aatt\r, file = zz,
sep = \r\n)
close(zz)

library(ape)#3.0-7
dat- read.dna(file=test.fas, format=fasta)
labels(dat)
##

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/



--
Emmanuel Paradis - IRD
Visiting Professor
Agricultural University of Bogor
http://ape.mpl.ird.fr/

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] Ancestral state estimates of continuous traits

2013-03-16 Thread Emmanuel Paradis
Alejandro and Liam,

You can also try method=REML which gives much better estimates of sigma^2 
than ML. I think I'll make it the default for continuous characters.

Best,

Emmanuel
-Original Message-
From: Liam J. Revell liam.rev...@umb.edu
Sender: r-sig-phylo-boun...@r-project.org
Date: Fri, 15 Mar 2013 14:23:20 
To: Alejandro Gonzalezalejandro.gonza...@ebd.csic.es
Cc: R-phylo Mailing-listr-sig-phylo@r-project.org
Subject: Re: [R-sig-phylo] Ancestral state estimates of continuous traits

Hi Alejandro.

That must be a bug in ace(...,method=GLS).

I would also suggest you check out fastAnc in the phytools package. It 
uses ace(...,method=pic) internally to take advantage of the fact that 
the contrasts estimate at the root and the MLE assuming Brownian 
evolution are the same. It re-roots the tree at all internal nodes which 
sounds computationally intensive but is actually much faster than 
getting the MLEs via numerical optimization.

fastAnc also uses equation (6) from Rohlf (2001) to get the correct 95% 
interval on the estimates. My analysis 
(http://blog.phytools.org/2013/02/new-version-of-fastanc-new-build-of.html) 
suggests that the 95% CIs from ace(...,method=ML) are too small. (This 
is probably because they rely on asymptotic properties of likelihood 
that are not satisfied for the relatively small size of most 
phylogenetic datasets.)

All the best, Liam

Liam J. Revell, Assistant Professor of Biology
University of Massachusetts Boston
web: http://faculty.umb.edu/liam.revell/
email: liam.rev...@umb.edu
blog: http://blog.phytools.org

On 3/15/2013 8:18 AM, Alejandro Gonzalez wrote:
 Hello,

 I am using ape to obtain ancestral state estimates for continuous
 traits. Two options are available, either maximum likelihood or a GLS
 method. I am comparing the results of both methods and one difference
 between the two puzzles me, I hope someone can enlighten me. Under GLS
 the ancestral state estimate at the root has virtually identical values
 for the point estimate and 95% confidence intervals, here is one
 example: 100 1.7848301 1.78483005 1.7848301 (100 is the root node
 number, then the ancestral state estimate and 95% CIs, respectively).
 However, when I use maximum likelihood for the same ancestral sate
 estimate, with the same data and tree I get the following result for the
 root :  100 1.6570119  0.86612615 2.4478977 (numbers in the same order
 as above).
 Any ideas as to why GLS gives such narrow confidence intervals for the
 ancestral state estimate at the basal node?

 Cheers

 Alejandro
 __

 Alejandro Gonzalez Voyer

 Post-doc

 Estaci�n Biol�gica de Do�ana
 Consejo Superior de Investigaciones Cient�ficas (CSIC)
 Av Am�rico Vespucio s/n
 41092 Sevilla
 Spain

 Tel: + 34 - 954 466700, ext 1749

 E-mail: alejandro.gonza...@ebd.csic.es

 Web site (Under construction):

 Personal page: http://consevol.org/members/alejandro_combo.html

 Group page: http://consevol.org/people.html

 For PDF copies of papers see:

 http://csic.academia.edu/AlejandroGonzalezVoyer



  [[alternative HTML version deleted]]

 ___
 R-sig-phylo mailing list - R-sig-phylo@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
 Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] Input Format for nuc.div

2013-03-23 Thread Emmanuel Paradis
Hi Mark,

On Fri, 22 Mar 2013 10:24:36 -0400 (EDT)
 markharriso...@aol.com wrote:
 Hello,
 
 
 I'd like to measure nuc.div for a list of about 20,000 loci across 
170 individuals.
 What is the best way to format the input data? Is it possible to 
integrate the whole data within one file or would I need a separate 
file for each gene locus?

Both are possible and you should use the solution that best suits your goals. 
Generally, R is flexible enough so you don't have to change your data files.

 Additionally, is it possible to use . or any other arbitrary 
character to denote an invariable base or does each character have to 
be a G, C, T or an A?

The latter (can be in lowercase too).

Best,

Emmanuel

 Thanks for you help,
 Mark

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] read.dna

2013-03-23 Thread Emmanuel Paradis
Hi again,

It's a bug already identified:

https://stat.ethz.ch/pipermail/r-sig-phylo/2013-February/002514.html

The release of ape 3.0-8 will be done next week.

Best,

Emmanuel
-Original Message-
From: markharriso...@aol.com
Sender: r-sig-phylo-boun...@r-project.org
Date: Fri, 22 Mar 2013 14:10:23 
To: r-sig-phylo@r-project.org
Subject: [R-sig-phylo] read.dna

Hello,


I've tried to read a fasta file containing 7 sequences of equal length (5733 
bases) and get the following output:



 df-read.dna(AT1G01040.fasta, format=fasta)
 df
7 DNA sequences in binary format stored in a list.


Mean sequence length: 4224.571 
   Shortest sequence: 4223 
Longest sequence: 4226 


Labels: Aa_0 Abd_0 Ag_0 Ak_1 Altai_5 Amel_1 ...


Base composition:
a c g t 
0.393 0.237 0.000 0.370 
 





It appears that Gs are not being read which would also account for the shorter 
and uneven sequence lengths.
Can anyone imagine what has happened here?


Thanks for any help,
Mark
 

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] ML ancestral state reconstruction using different softwares

2013-04-04 Thread Emmanuel Paradis
Hi,

You cannot really say that in both ace and Mesquite, it slightly favors the 
asymmetrical model: the increase in log-likelihood is less than 1 which is far 
from being significant with one additional parameter. So it seems that all 
three pieces of software agree well on the estimate of rate for the symmetrical 
model. You should fit also this model with BayeTraits to do the full comparison.

Cheers,

Emmanuel
-Original Message-
From: Jingchun Li jingc...@umich.edu
Sender: r-sig-phylo-boun...@r-project.org
Date: Thu, 4 Apr 2013 10:15:05 
To: r-sig-phylo@r-project.org
Subject: [R-sig-phylo] ML ancestral state reconstruction using different
softwares

Hi all,

I am exploring different methods for ancestral state reconstruction for a
small phylogeny I have (26 OTUs). I have one binary trait coded as 0s and
1s for all the taxa. And I am more interested in the transition rates
between the two states.

If I understand correctly, a maximum likelihood ancestral state
reconstruction using a MK1 model or an asymmetrical rates model should give
me the optimized estimated transition rates between the two states. The
formal will assume the two rates are the same, the later will
assume independent rates.

So I firstly did this using the ape function in ace.

ERreconstruction - ace(traits_data, tree, type=discrete, model=ER)
ERreconstruction$rates
ERreconstruction$loglik

ARDreconstruction - ace(traits_data, tree, type=discrete, model=ARD)
ARDreconstruction$rates
ARDreconstruction$loglik

Then I tried using Mesquite, using MK1 model and the Asymmetrical 2-param.
Markov-k Model 

Then I tried using BayesTraits, with the MultiState - Maximum Likelihood
option, with no restrictions on q10 and q01.

What puzzled me is that I'm getting different results from the three
softwares.
--
aceER:
transition rate: 0.0059, likelihood: -7.00
aceARD:
transition rate: 0.019 and 0.00, likelihood: -6.11

Mesquite MK1:
transition rate: 0.0059, likelihood: -7.70
Mesquite Asymmetrical:
transition rate: 0.015 and 0.005, likelihood: -7.30

BayesTraits no restriction:
transition rate: 0.0057 and 0.0057, likelihood: -14.09


I can see that in both ace and Mesquite,
it slightly favors the asymmetrical model with two different rates, and the
rates and likelihoods are more of less comparable. But BayesTraits seems to
think the two rates should be equal, and it has a different likelihood.

Does this has something to do with different searching algorithms? Or am I
missing something here? I am aware that the scaled likelihoods in ace are
the scaled conditional likelihoods, not the joint or marginal
reconstructions. But this should not affect the estimated transition rates?


Thank you very much!

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] ML ancestral state reconstruction using different softwares

2013-04-05 Thread Emmanuel Paradis
Jingchun,

Thanks to Daniel for giving a detailed answer to your first question.

You can use ace() to fit a model with irreversible change. You need to build a 
custom model like this:

 Q - matrix(0, 2, 2)
 Q[2, 1] - 1
 Q
 [,1] [,2]
[1,]00
[2,]10

In this case there's one rate to be estimated and the other is fixed to zero 
(the 0's on the diagonal are ignored). This can be generalized to more 
complicated models (there's an example in my book). Then call ace with:

ace(traits_data, tree, type = d, model = Q)

This time you cannot compare this model with the ER one with an LRT: compare 
directly the log-liks since they have the same number of parameters or use 
AIC(). BTW, there is an anova methods to compare nested models fitted with ace: 
this avoids the pain to compute correctly df's.

Cheers,

Emmanuel
-Original Message-
From: Jingchun Li jingc...@umich.edu
Date: Thu, 4 Apr 2013 22:25:09 
To: emmanuel.para...@ird.fr; r-sig-phylo@r-project.org
Subject: Re: [R-sig-phylo] ML ancestral state reconstruction using different 
softwares

Sorry, I forgot to include the list. there we go.

 --
 *From: * Jingchun Li jingc...@umich.edu
 *Date: *Thu, 4 Apr 2013 21:58:56 -0400
 *To: *emmanuel.para...@ird.fr
 *Subject: *Re: [R-sig-phylo] ML ancestral state reconstruction using
 different softwares

 Thank you very much, Dr.Paradis. You are right that the two parameter
 model does not fit the data significantly better. I guess I'm more puzzled
 about why the three softwares gave me different likelihoods.

 I did fit the symmetrical model using BayesTraits later and it gave me
 about the same transition rates as the asymmetrical model and a
 similar likelihood value (-14.08 for symmetrical and -14.00 for
 asymmetrical). However I do have biological evidence that transition from 0
 to 1 might be extremely hard. So maybe I can also try set q01 to 0 (which
 is actually what ace gave me under the asymmetrical model)?

 Cheers,

 Jingchun


 On Thu, Apr 4, 2013 at 9:21 PM, Emmanuel Paradis 
 emmanuel.para...@ird.frwrote:

 Hi,

 You cannot really say that in both ace and Mesquite, it slightly favors
 the asymmetrical model: the increase in log-likelihood is less than 1
 which is far from being significant with one additional parameter. So it
 seems that all three pieces of software agree well on the estimate of rate
 for the symmetrical model. You should fit also this model with BayeTraits
 to do the full comparison.

 Cheers,

 Emmanuel
 -Original Message-
 From: Jingchun Li jingc...@umich.edu
 Sender: r-sig-phylo-boun...@r-project.org
 Date: Thu, 4 Apr 2013 10:15:05
 To: r-sig-phylo@r-project.org
 Subject: [R-sig-phylo] ML ancestral state reconstruction using different
 softwares

 Hi all,

 I am exploring different methods for ancestral state reconstruction for a
 small phylogeny I have (26 OTUs). I have one binary trait coded as 0s and
 1s for all the taxa. And I am more interested in the transition rates
 between the two states.

 If I understand correctly, a maximum likelihood ancestral state
 reconstruction using a MK1 model or an asymmetrical rates model should
 give
 me the optimized estimated transition rates between the two states. The
 formal will assume the two rates are the same, the later will
 assume independent rates.

 So I firstly did this using the ape function in ace.

 ERreconstruction - ace(traits_data, tree, type=discrete, model=ER)
 ERreconstruction$rates
 ERreconstruction$loglik

 ARDreconstruction - ace(traits_data, tree, type=discrete, model=ARD)
 ARDreconstruction$rates
 ARDreconstruction$loglik

 Then I tried using Mesquite, using MK1 model and the Asymmetrical
 2-param.
 Markov-k Model 

 Then I tried using BayesTraits, with the MultiState - Maximum Likelihood
 option, with no restrictions on q10 and q01.

 What puzzled me is that I'm getting different results from the three
 softwares.
 --
 aceER:
 transition rate: 0.0059, likelihood: -7.00
 aceARD:
 transition rate: 0.019 and 0.00, likelihood: -6.11

 Mesquite MK1:
 transition rate: 0.0059, likelihood: -7.70
 Mesquite Asymmetrical:
 transition rate: 0.015 and 0.005, likelihood: -7.30

 BayesTraits no restriction:
 transition rate: 0.0057 and 0.0057, likelihood: -14.09
 

 I can see that in both ace and Mesquite,
 it slightly favors the asymmetrical model with two different rates, and
 the
 rates and likelihoods are more of less comparable. But BayesTraits seems
 to
 think the two rates should be equal, and it has a different likelihood.

 Does this has something to do with different searching algorithms? Or am I
 missing something here? I am aware that the scaled likelihoods in ace are
 the scaled conditional likelihoods, not the joint or marginal
 reconstructions. But this should not affect the estimated transition
 rates?


 Thank you very much!

 [[alternative HTML version deleted

Re: [R-sig-phylo] Seeking list of nucleotide substitution models

2013-04-06 Thread Emmanuel Paradis
Hi Daniel,

Have you looked at the help page of dist.dna? The list of references there is 
focused on calculating distances but some are general. There are a few more 
references in my book too. I suggest you also look at Inferring Phylogenies.

Best,

Emmanuel
-Original Message-
From: Daniel Barker d...@st-andrews.ac.uk
Sender: r-sig-phylo-boun...@r-project.org
Date: Fri, 5 Apr 2013 16:23:36 
To: r-sig-phylo@r-project.orgr-sig-phylo@r-project.org
Subject: [R-sig-phylo] Seeking list of nucleotide substitution models

Hello,

Is there a review or list of ~every specific nucleotide substitution model
that has been proposed or used in the literature (with references)?

I'm interested in reversible models.

The most exhaustive list I have is from the jModeltest documentation,
https://code.google.com/p/jmodeltest2/wiki/TheoreticalBackground#Models_of_
nucleotide_substitution

I realise an enormous number of models is possible (and may have been used
in model averaging). I'm keen to know which have some precedent in the
literature, along the lines of those listed above. I'm finding it
difficult to get beyond the standard sources.

Thank you in advance,

Daniel

-- 
Daniel Barker
http://bio.st-andrews.ac.uk/staff/db60.htm
The University of St Andrews is a charity registered in Scotland : No
SC013532

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] ML ancestral state reconstruction using different softwares

2013-04-06 Thread Emmanuel Paradis
Hi Jingchun,

Yes in this case I'd use the true model. The LRT may be not very powerful 
with your data.

Cheers,

Emmanuel
-Original Message-
From: Jingchun Li jingc...@umich.edu
Date: Fri, 5 Apr 2013 10:25:57 
To: emmanuel.para...@ird.fr
Cc: r-sig-phylo@r-project.org
Subject: Re: [R-sig-phylo] ML ancestral state reconstruction using different 
softwares

Thank you very much Daniel and Emmanuel! Now things start to make sense.

I guess my last question is, if there is biological evidence that the
transition rates between the two traits are different, is it even
meaningful to compare the two models? Should I simply avoid using the equal
rates model?

Cheers,

Jingchun


On Fri, Apr 5, 2013 at 9:12 AM, Emmanuel Paradis emmanuel.para...@ird.frwrote:

 **
 Jingchun,

 Thanks to Daniel for giving a detailed answer to your first question.

 You can use ace() to fit a model with irreversible change. You need to
 build a custom model like this:

  Q - matrix(0, 2, 2)
  Q[2, 1] - 1
  Q
 [,1] [,2]
 [1,] 0 0
 [2,] 1 0

 In this case there's one rate to be estimated and the other is fixed to
 zero (the 0's on the diagonal are ignored). This can be generalized to more
 complicated models (there's an example in my book). Then call ace with:

 ace(traits_data, tree, type = d, model = Q)

 This time you cannot compare this model with the ER one with an LRT:
 compare directly the log-liks since they have the same number of parameters
 or use AIC(). BTW, there is an anova methods to compare nested models
 fitted with ace: this avoids the pain to compute correctly df's.

 Cheers,

 Emmanuel
 --
 *From: * Jingchun Li jingc...@umich.edu
 *Date: *Thu, 4 Apr 2013 22:25:09 -0400
 *To: *emmanuel.para...@ird.fr; r-sig-phylo@r-project.org
 *Subject: *Re: [R-sig-phylo] ML ancestral state reconstruction using
 different softwares

 Sorry, I forgot to include the list. there we go.

 --
 *From: * Jingchun Li jingc...@umich.edu
 *Date: *Thu, 4 Apr 2013 21:58:56 -0400
 *To: *emmanuel.para...@ird.fr
 *Subject: *Re: [R-sig-phylo] ML ancestral state reconstruction using
 different softwares

 Thank you very much, Dr.Paradis. You are right that the two parameter
 model does not fit the data significantly better. I guess I'm more puzzled
 about why the three softwares gave me different likelihoods.

 I did fit the symmetrical model using BayesTraits later and it gave me
 about the same transition rates as the asymmetrical model and a
 similar likelihood value (-14.08 for symmetrical and -14.00 for
 asymmetrical). However I do have biological evidence that transition from 0
 to 1 might be extremely hard. So maybe I can also try set q01 to 0 (which
 is actually what ace gave me under the asymmetrical model)?

 Cheers,

 Jingchun


 On Thu, Apr 4, 2013 at 9:21 PM, Emmanuel Paradis emmanuel.para...@ird.fr
  wrote:

 Hi,

 You cannot really say that in both ace and Mesquite, it slightly favors
 the asymmetrical model: the increase in log-likelihood is less than 1
 which is far from being significant with one additional parameter. So it
 seems that all three pieces of software agree well on the estimate of rate
 for the symmetrical model. You should fit also this model with BayeTraits
 to do the full comparison.

 Cheers,

 Emmanuel
 -Original Message-
 From: Jingchun Li jingc...@umich.edu
 Sender: r-sig-phylo-boun...@r-project.org
 Date: Thu, 4 Apr 2013 10:15:05
 To: r-sig-phylo@r-project.org
 Subject: [R-sig-phylo] ML ancestral state reconstruction using different
 softwares

 Hi all,

 I am exploring different methods for ancestral state reconstruction for a
 small phylogeny I have (26 OTUs). I have one binary trait coded as 0s and
 1s for all the taxa. And I am more interested in the transition rates
 between the two states.

 If I understand correctly, a maximum likelihood ancestral state
 reconstruction using a MK1 model or an asymmetrical rates model should
 give
 me the optimized estimated transition rates between the two states. The
 formal will assume the two rates are the same, the later will
 assume independent rates.

 So I firstly did this using the ape function in ace.

 ERreconstruction - ace(traits_data, tree, type=discrete, model=ER)
 ERreconstruction$rates
 ERreconstruction$loglik

 ARDreconstruction - ace(traits_data, tree, type=discrete, model=ARD)
 ARDreconstruction$rates
 ARDreconstruction$loglik

 Then I tried using Mesquite, using MK1 model and the Asymmetrical
 2-param.
 Markov-k Model 

 Then I tried using BayesTraits, with the MultiState - Maximum Likelihood
 option, with no restrictions on q10 and q01.

 What puzzled me is that I'm getting different results from the three
 softwares.
 --
 aceER:
 transition rate: 0.0059, likelihood: -7.00
 aceARD:
 transition rate: 0.019 and 0.00, likelihood: -6.11

 Mesquite MK1:
 transition rate: 0.0059, likelihood: -7.70
 Mesquite Asymmetrical:
 transition rate: 0.015 and 0.005

Re: [R-sig-phylo] NaN rates and error running ace()

2013-04-14 Thread Emmanuel Paradis

Hi Zhenjiang,

This is because some branch lengths are zero. You can change them for 
a small value, eg:


tree$edge.length[tree$edge.length == 0] - 1e-7

I have added a check of this in ace().

Cheers,

Emmanuel

Fri, 12 Apr 2013 12:22:53 -0600 zhenjiang xu zhenjiang...@gmail.com:

Hi all,

I am using the ace() from ape package with my phylogenetic tree
(downloadable at https://www.dropbox.com/s/ab8b5q45134llti/foo.tre). 
But I

met an error as shown here:

tree = read.tree(foo.tre)
is.ultrametric(tree)

[1] TRUE

is.binary.tree(tree)

[1] TRUE
## I tested on a lot of data sets which show the same error.
## here I just use one random data for convenience.

a=sample(c(0,1), 2858, replace=T)

## I used fix(ace) modify ace() code to just print the obj variable
## to show the cause of the error

ace(a, tree, type='d')

$loglik
[1] -5e+49

$rates
[1] NaN

Error in nlm(function(p) dev(p), p = obj$rates, iterlim = 1, stepmax 
= 0,

:
 missing value in parameter

At first, I thought this might be due to the large size of the tree, 
which
somehow cause the numerical accuracy problem. But later I tested on 
a

random tree, which ran perfectly:

tree2 = rtree(2858)
ace(a, tre, type='d')

$loglik
[1] -1980.203

$rates
[1] 14.51092


   Ancestral Character Estimation

Call: ace(x = a, phy = tre, type = d)

   Log-likelihood: -1980.203

Rate index matrix:
 0 1
0 . 1
1 1 .

Parameter estimates:
rate index estimate std-err
 1  14.5109 10.9866

Scaled likelihoods at the root (type '...$lik.anc' to get them for 
all

nodes):
 0   1
0.5 0.5


So Does anyone have idea of what might be wrong? I think it might be 
the
problem of my tree, but I tested it - it is binary and 
ultrametric...


Any idea or help would be appreciated. Thanks!

Best,
Zhenjiang

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at 
http://www.mail-archive.com/r-sig-phylo@r-project.org/


___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] Find common ancestor of multiple taxa

2013-05-10 Thread Emmanuel Paradis
Hi Klaus and others,

I have now documented getMRCA() and modified it so that it accepts a vector of 
labels and does some checks. This will be in the next release of ape. The SVN 
repository is not yet updated: there's a technical problem there. I'll fix that 
next week.

Cheers,

Emmanuel
-Original Message-
From: Klaus Schliep klaus.schl...@gmail.com
Sender: r-sig-phylo-boun...@r-project.org
Date: Thu, 9 May 2013 18:53:24 
To: Nicholas Crouchncro...@uic.edu
Cc: r-sig-phylo@r-project.org
Subject: Re: [R-sig-phylo] Find common ancestor of multiple taxa

Hi Nick,

there is a functions getMRCA in ape and and mrca.phylo in phangorn,
which do what you are looking for. And both functions are reasonably
fast:
getMRCA(tree, mytips)
mrca.phylo(tree, match(mytips, tree$tip))

Regards,
Klaus


On 5/9/13, Liam J. Revell liam.rev...@umb.edu wrote:
 This is exactly what findMRCA in phytools does, but I wrote it a while
 ago so I'm not sure how fast it is

 - Liam

 Liam J. Revell, Assistant Professor of Biology
 University of Massachusetts Boston
 web: http://faculty.umb.edu/liam.revell/
 email: liam.rev...@umb.edu
 blog: http://blog.phytools.org

 On 5/9/2013 11:24 AM, Nicholas Crouch wrote:
 Hi All,

 The 'mrca' function in Ape provides a method for finding the common
 ancestor of a pair of taxa. What I would like have not found is a
 function
 where you can pass a (long) list of tips from a phylogeny and find the
 node
 which represents the common ancestor to all these tips.

 Does such a function exist? Or could the 'mrca' function be expanded to
 loop over all the taxa within a list a somehow boil down to what is the
 common ancestor that way?

 Thanks,

 Nick

  [[alternative HTML version deleted]]

 ___
 R-sig-phylo mailing list - R-sig-phylo@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
 Searchable archive at
 http://www.mail-archive.com/r-sig-phylo@r-project.org/


 ___
 R-sig-phylo mailing list - R-sig-phylo@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
 Searchable archive at
 http://www.mail-archive.com/r-sig-phylo@r-project.org/



-- 
Klaus Schliep
Phylogenomics Lab at the University of Vigo, Spain

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] nlm error when using ace for discrete characters

2013-05-13 Thread Emmanuel Paradis
Hi Graeme,

The reason of the error is because this custom model does fit your data, so the 
calculation of the SEs fails. I have modified ace() so that the error is caught 
and a better message is printed.

For these data, you should try to fit the models that are pre-programmed in ace 
(ER, SYM, ARD) -- maybe you did already. It seems ER is the best one. You can 
also try other custom models though.

Best,

Emmanuel
-Original Message-
From: Graeme Lloyd graemetll...@gmail.com
Sender: r-sig-phylo-boun...@r-project.org
Date: Sun, 12 May 2013 16:35:30 
To: r-sig-phylo@r-project.org
Subject: [R-sig-phylo] nlm error when using ace for discrete characters

Hi Everyone,

I'm trying to use ace to do some discrete ancestral character estimation, but 
came upon the following error:

Error in nlm(function(p) dev(p), p = obj$rates, iterlim = 1, stepmax = 0,  : 
 non-finite value supplied by 'nlm'

Any ideas? I am using the latest versions of R and ape and my code is given 
below.

Cheers,


Graeme

# Load library:
library(ape)

# Load tree with branch lengths:
tree-read.tree(text=(Gracilisuchus_stipanicicorum:1,((Terrestrisuchus_gracilis:1,Dibothrosuchus_elaphros:1):1,((Orthosuchus_stormbergi:1,(Protosuchus_richardsoni:2,Edentosuchus_tienshanensis:2):1):1,(Zaraasuchus_shepardi:2,(Sichuanosuchus_shuhanensis:3,(Hsisosuchus_chungkingensis:1,(Fruitachampsa_callisoni:1,(((Araripesuchus_wegeneri:1,(Araripesuchus_tsangatsangana:1,(((Malawisuchus_mwakasyungutiensis:1,(Notosuchus_terrestris:2,(Iberosuchus_macrodon:4,Chimaerasuchus_paradoxus:2):1):1):1,Simosuchus_clarki:2):2,Araripesuchus_gomesii:3):1):1):1,Mahajangasuchus_insignis:3):2,((Theriosuchus_pusillus:1,Alligatorium:1):1,((Pachycheilosuchus_trinquei:1,(Sunosuchus_junggarensis:4,(Bernissartia_fagesii:1,(Shamosuchus_djadochtaensis:3,(Pristichampsus_vorax:2,(Borealosuchus_formidabilis:2,((Eothoracosaurus_mississippiensi:1,Gavialis_gangeticus:1):3,(Crocodylus_niloticus:1,(Diplocynodon_hantoniensis:1,Alligator_mississippiensis:1):1):1):1):2):1):1):1):2):1,((Steneosaurus_bollensis:1,(P!
 
elagosaurus_typus:1,(Metriorhynchus_superciliosus:1,Geosaurus_araucanensis:3):2):1):1,(Sarcosuchus_imperator:1,Terminonaris_robusta:1):4):1):1):2):1):1):1):1):1):1):1);)

# Load tip values:
tipvals-c(0,0,0,0,1,4,1,0,0,0,3,3,1,0,4,1,3,2,4,0,0,2,2,2,3,3,3,1,3,3,3,3,0,0,0,0,0,0)

# Create model:
charmodel-rbind(c(0,1,2,3,4),c(1,0,1,2,3),c(2,1,0,1,2),c(3,2,1,0,1),c(4,3,2,1,0))

# Find ancestral states:
ace(tipvals,tree,type=discrete,model=charmodel)
___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] Extracting Independent Subclades

2013-05-21 Thread Emmanuel Paradis
Hi Glenn,

Of course %in% is pairwise. I thought you'd figure out the rest of it ;)

I can see two ways to solve your problem, though I don't know if they work.

The 1st one:
1) Select the clades of appropriate size(s).
2) Build a square symmetric matrix where the rows and columns are the clades 
selected in 1), and put 0 in the matrix if the two clades are independent, 1 if 
not.
3) Reorder the rows and columns of this matrix so that the 1's are concentrated 
near the diagonal. I'm not sure how to do this step. There are algorithms to do 
that (matrix profile reduction) but I'm not sure whether they are in R. Others 
may have better advice.
4) Step 3) should result in a matrix with 0's concentrated in the top right and 
bottom left corners of the matrix. Select these clades.

The 2nd one: do a minimum spanning tree with the matrix from step 2) above 
(using preferrably mst() in pegas). The resulting MST should cluster together 
clades with links of size 0: select the biggest cluster. Here also I'm not sure 
about the practical details.

There could be other solutions I guess.

Best,

Emmanuel
-Original Message-
From: Glenn Seeholzer seeholzer.gl...@gmail.com
Date: Tue, 21 May 2013 11:14:12 
To: emmanuel.para...@ird.fr
Cc: r-sig-phylo-boun...@r-project.org; r-sig-phylo@r-project.org
Subject: Re: [R-sig-phylo] Extracting Independent Subclades

Hi Emmanuel,
Thanks for the suggestion to use prop.part(tree), its definitely faster the
what I had come up with. I might have missed something, but I don't think
using %in% alone will solve my problem.

Essentially, I'm interested in finding an objective way to determine sets
of independent subclades (not just two). Ultimately, I will be comparing
rates of trait evolution and diversification rates among these subclades
using PGLS regressions. The tree I'm working with is ~300 species. My
initial subclade partitioning scheme sampled the major genera within the
group, resulting in 13 independent subclades. I'd like a more objective way
to sample subclades, that will allow me run the analysis across many (if
not all) possible independent subclade partitions to examine the effect
subclade partition choice has on the outcome of the analysis.

While %in% would allow me to determine if two clades are independent, I
think I'm looking for an algorithm that will find all possible subclade
partitions where each subclade has greater than 10 taxa and all
are independent.

Variations of this problem seem to be common in combinatorial math i.e.
find all disjoint (non-overlapping) sets from a set of sets, but the
algorithms proposed in the math forums are difficult for me to understand,
let alone implement in R. Has anyone dealt with this issue before? Is there
some simple solution I'm missing?

Cheers,
Glenn



On Tue, May 21, 2013 at 12:11 AM, Emmanuel Paradis
emmanuel.para...@ird.frwrote:

 Hi Glenn,

 An alternative is to use prop.part(tree) which returns the composition of
 all clades in tree. You can then easily select those of appropriate size,
 eg:

 pp - prop.part(tree)
 which(sapply(pp, length) = 10)

 Then using match() or %in% you can test whether two sets are independent.
 Because the list returned by prop.part is ordered along the node numbers of
 the tree, it is easy to use extract.clade() for your final operation.

 HTH

 Best,

 Emmanuel
 -Original Message-
 From: Glenn Seeholzer seeholzer.gl...@gmail.com
 Sender: r-sig-phylo-boun...@r-project.org
 Date: Mon, 20 May 2013 23:08:51
 To: r-sig-phylo@r-project.org
 Subject: [R-sig-phylo] Extracting Independent Subclades

 Hi all,

 I'd like to sample independent subclades of greater than N tips from a
 phylogenetic tree much greater than N, say (number of tips = 10*N). I can
 extract all subclades of greater than N taxa, but sampling independent
 subclades seems to be a harder problem. I believe I am looking for all
 disjoint (non-overlapping) sets of species name sets from this list of
 subclades that maximizes the number of subclades sampled.

 For example, the following code produces a list of subclades (subclades)
 for all subclades with greater than 10 tips from a simulated tree with 100
 tips.  tips.subclades is a list of tip labels for these trees that may be
 useful.

 Essentially, I'd like to determine all the sets of subclades in this list
 of monophyletic trees that are independent, i.e. no shared taxa. Many of
 the trees formed by combining these sets of monophyletic subclades will be
 paraphyletic, that is OK. From this list of sets of subclades, I can then
 sample the sets of independent subclades with the appropriate number of
 subclades for the analyses I am performing.

 Thanks in advance for any suggestions.

 Glenn

 ###

 library(ape)
 library(TreeSim)
 n.taxa - 100
 min.subclade.n.tip - 10
 tree - sim.bd.taxa(n.taxa,1,lambda=.5,mu=0)[[1]][[1]]
 tmp - subtrees(tree)

 testNtip - function(x){
 if(x$Ntip  min.subclade.n.tip) x else NA
 }

 tmp2 - lapply(tmp, testNtip

Re: [R-sig-phylo] extracting order of tips as in plotted ladderized tree

2013-05-24 Thread Emmanuel Paradis
Hi,

Your question is not very specific but I guess you want something like:

i - which(tr$edge[, 2] = Ntip(tr))
tr$edge[i, 2]

Best,

Emmanuel
-Original Message-
From: Ivalu ivalu.ca...@gmail.com
Sender: r-sig-phylo-boun...@r-project.org
Date: Thu, 23 May 2013 09:42:32 
To: r-sig-phylo@r-project.org
Subject: [R-sig-phylo] extracting order of tips as in plotted ladderized tree

Hi All,
Any advice as to how to extract the tips of a tree ordered as in a ladderized 
tree?
Thanks!
-Ivalu
___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] Calling Clustal in R the DNAbin format for storing nucleotides

2013-05-27 Thread Emmanuel Paradis
Hi Ben,

Clustal accepts labels up to 30-character long. You can solve your problem by 
changing the labels using the function makeLabel (see its help page for some 
details). I'll change clustal() to avoid this problem in the future.

You can find details on DNAbin here:

http://ape.mpl.ird.fr/ape_development.html

Note that ape provides a number of high-level functions in R, so that it's not 
always needed to work at the C level. You can find more on these R functions at 
the following help pages (and some links therein):

?DNAbin
?as.DNAbin
?base.freq
?seg.sites
?where

Best,

Emmanuel
-Original Message-
From: Ben Ward (TSL) ben.w...@sainsbury-laboratory.ac.uk
Sender: r-sig-phylo-boun...@r-project.org
Date: Mon, 27 May 2013 08:21:41 
To: r-sig-phylo@r-project.orgr-sig-phylo@r-project.org
Subject: [R-sig-phylo] Calling Clustal in R  the DNAbin format for storing
nucleotides

Hi all,

I've recently turned to R for scripting an analysis consisting of many boring 
jobs repeated with many files I could automate. I'm learning perl and had a go 
with BioPerl but had trouble getting PhyML to get called from within it using 
the run modules - any who I'm a much longer R user since I began my first 
degree with more classic stats and model fitting, so I've returned to it for my 
bioinformatics needs!

I want to run the Clustal program from my R script using the function provided 
in the ape package by Paradis et al, but I kept getting an error:

Error in `[.default`(res, , FXCNDTJ02R756X, drop = FALSE) :
  subscript out of bounds

So naturally I called up the function name and had a nosy at what it does – the 
error occurs after Clustal has successfully run at this line(s):

if (original.ordering)
res - res[labels(x), ]
res

Which is where it re-orders sequences according to the original alignment fed 
in.

Herin lies the problem:

If I do labels(x) to get the original labels of the sequences (sequences read 
in using the read.dna() function of ape in fasta format), the labels are like 
this:

FXCNDTJ02RW9CX length=278 xy=3136_3087 region=2 run=R_2009_06_11_14_18_55_

However the labels of the aligned sequences (labels(res)) are like this:
FXCNDTJ02RW9CX
 So the indexing of the matrix containing the aligned sequences is failing 
because the labels are changed at some point in the function, possibly when the 
input sequences are written to a tmpdir for the system command to run clustal 
to find.
Has anyone else experience issues similar to this? Of course if I set the 
original.ordering variable to false this part would not cause the error since 
the if statement would not evaluate as true.

Does anybody have any suggestions?

I have a second question:

In addition, is there a source as for how nucleotides data are stored in DNAbin 
and how/why storing them as binary vectors (I've read the brief mention in 
Paradis' Springer book) speeds up operations?. If I try to look at the 
structure of a DNAbin object I see in the matrix raw [1:257] 88 88 18 18 …
I'm writing a package that implements some stuff my colleagues have used and I 
figure I want as much compatibility with existing bioinformatics packages for R 
as possible: up to now what I've written just uses characters and strings of 
characters a,t,g,c, but if I can learn the ins and outs of the DNAbin format 
more than the overview given in the book I have it would be very useful.

Thanks in advance and best,

Ben W.
The Sainsbury Lab and UEA(ENV)

[[alternative HTML version deleted]]


___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] dist.dna returning NaN's for 3 alignments

2013-05-28 Thread Emmanuel Paradis
Hi Ben,

See the option pairwise.deletion of dist.dna. You can use image() also to have 
a view on the distribution of the -.

Cheers,

Emmanuel
-Original Message-
From: Ben Ward (TSL) ben.w...@sainsbury-laboratory.ac.uk
Sender: r-sig-phylo-boun...@r-project.org
Date: Tue, 28 May 2013 15:30:50 
To: r-sig-phylo@r-project.orgr-sig-phylo@r-project.org
Subject: [R-sig-phylo] dist.dna returning NaN's for 3 alignments

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


[R-sig-phylo] new ape's web site (and singleton nodes)

2013-06-07 Thread Emmanuel Paradis

Hi all,

ape's web site has been moved to this URI:

http://ape-package.ird.fr/

(Currently, the redirection from the old URI does not work correctly. 
I hope it will be fixed soon.)


Regarding the issue of singleton nodes and phylomatic, this is 
mentioned in ape's FAQ (FAQ #15) but in a different way than 
originally asked by Kellen. I have added a link to Megan's excellent 
post.


Supporting singleton nodes is one feature that might be included in 
ape in the future (among many others).


Best,

Emmanuel

Fri, 7 Jun 2013 00:32:58 -0400 Liam J. Revell liam.rev...@umb.edu:

Hi Kellen.

I'm not sure if this is still relevant, but I just posted source 
code (http://www.phytools.org/read.newick/v0.4/read.newick.R) for a 
function that will read this type of Newick string (i.e., with node 
labels  singletons) into R. To create the attached plot I had to 
first collapse singles, i.e.,


require(phytools)
source(read.newick.R)
tree-read.newick(file=output.txt)
tree-collapse.singles(tree)
plotTree(tree,type=fan)

All the best, Liam

Liam J. Revell, Assistant Professor of Biology
University of Massachusetts Boston
web: http://faculty.umb.edu/liam.revell/
email: liam.rev...@umb.edu
blog: http://blog.phytools.org

On 6/6/2013 4:25 PM, Megan Bartlett wrote:

Dear Kellen,

It looks to me like the problem is that you have genera labels 
containing

only one species, like (Sabatia_angularis:46.296295)sabatia. The
frustrating thing about this notation is that Phylocom is fine with 
this,
but R can't read it, even though it's technically legal in the 
Newick
format. I don't think collapse.single would solve your problem, 
since R

isn't even able to input the file to perform a function on it. For
troubleshooting, I would try running clean.phylo on this tree in 
Phylocom,
and seeing if that solves your problem. If so, I would run 
clean.phylo on
your undated tree before you run BLADJ. (This would be your 
Phylomatic
output if you assembled your undated tree with Phylomatic). Because 
having
node ages for singleton taxa will not affect your branch lengths for 
that
taxa (for example, Sabatia angularis will have the same branch 
length
whether or not you have an age for the genus Sabatia), you should 
get the

same results. I believe the Phylocom manual discusses this in the
Phylomatic section.

Best,

Megan

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at 
http://www.mail-archive.com/r-sig-phylo@r-project.org/




___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] compare.gee issues with multiple predictor models

2013-07-02 Thread Emmanuel Paradis

Hi Sandra,

Le 01/07/2013 22:03, sandra goutte a écrit :

Hello all,
I am sorry if the question has already been answered, i have not found it
int the archive.
I am using compar.gee to look at possible correlations between behavioral
traits and ecological variables. I have two problems:

1) if i try a model with more than 3 predictors, the function cannot
compute the p-values. Is that a problem of df?


Yes. In other words, your species are too closely related to give enough 
information to compute the P-values, but this is for GEEs only. Since 
you use a gaussian family for the response, you could use gls() instead: 
the estimated coefficients will be the same and you'll get P-values.



2) this one is more problematic to me: if i add predictors in my model, it
actually changes the values estimated for the first one. to be clear here
is an example of my output:


That's expected and this is for any kind of regression models (and most 
estimation problems). Try with lm() if you want to make sure.


Best,

Emmanuel


# with only canopy as a predictor
Call: compar.gee(formula = DF ~ canopy, phy = tree2)
Number of observations:  18
Model:
   Link: identity
  Variance to Mean Relation: gaussian

QIC: 225.2231

Summary of Residuals:
Min 1Q Median 3QMax
-1.3742940 -0.6158815  1.1275603  2.6748027 10.2016164


Coefficients:
   EstimateS.E.t Pr(T  |t|)
(Intercept) 2.21716210 2.366080952 0.937061  0.43327224
canopy  0.02050857 0.006936446 2.956640  0.07879305

Estimated Scale Parameter:  12.71506
Phylogenetic df (dfP):  4.395587


# with canopy and splfrog as predictors
Call: compar.gee(formula = DF ~ splfrog * canopy, phy = tree2)
Number of observations:  18
Model:
   Link: identity
  Variance to Mean Relation: gaussian

QIC: 197.5136

Summary of Residuals:
Min 1Q Median 3QMax
-1.4887445 -0.4669401  1.0553884  2.1243035  9.7526188


Coefficients:
 Estimate S.E.  t Pr(T  |t|)
(Intercept)-1.2294629060 3.8542626753 -0.3189878   0.8481567
splfrog 0.0638736816 0.0542900060  1.1765274   0.6056463
canopy  0.0308125063 0.0507169033  0.6075392   0.7408100
splfrog:canopy -0.0002361583 0.0007856787 -0.3005787   0.8561095

Estimated Scale Parameter:  12.32929
Phylogenetic df (dfP):  4.395587


The values ar really different! Am i missing something obvious?

Thank you all in advance for your help!
Sandra.







___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/



___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] Problems with write.nexus of the package ape

2013-07-02 Thread Emmanuel Paradis

Hi Agus,

The problem is that spaces are ignored in Newick and NEXUS files. Maybe 
you do not need to replace _ for   ?


Best,

Emmanuel

Le 29/06/2013 15:15, Agus Camacho a écrit :

Dear all,


After correcting the names of a otherwise perfectly functional tree using
gsub,


tree$tip.label=gsub(_,  , tree$tip.label



the new tree looks nice, but when i save it using write.nexus and reopen
it, i get:



wtree=read.nexus(tree_corrected_names.nexus)Warning message:In matrix(x, ncol 
= 2, byrow = TRUE) :

   data length [105] is not a sub-multiple or multiple of the number of rows 
[53]



wtree$tip.label [1] Tupinambis   2ocellifer
Kentropyx

  [5] 4atriventris  Anotosaura   6
  [9] percarinatum Arthrosaura  8reticulata
[13] Bachia   10   panoplia Bachia
[17] 12   bicarinatus  Placosoma14
[21] glabellumNeusticurus  16   quadrilineata
[25] Cercosaura   18   ocelatta Cercosaura
[29] 20   brachylepis  Heterodactylus   22
[33] modesta  Iphisa   24   paeminosus
[37] Gymnophthalmus   26   ablepharaCalyptommatus
[41] 28   leiolepisCalyptommatus30
[45] atticolusMicrablepharus   32   agilis
[49] Vanzosaura   34   erythrocercusProcellosaurinus
[53] 1



thanks in advance
Agus



___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/



___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


[R-sig-phylo] workshop on phylogeography

2013-07-03 Thread Emmanuel Paradis

Hi all,

I'm looking for information on workshops, trainings or courses that 
include phylogeography. That could be with R of course, but not 
absolutely necessarily.


If you organize, or know about, such a course or workshop, at the 
PhD-candidate level and/or above, and in English preferably, please let 
me know.


Thanks in advance.

Best,

Emmanuel

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] PGLS vs lm

2013-07-11 Thread Emmanuel Paradis

Hi Tom,

In an OLS regression, the residuals from both regressions (varA ~ varB 
and varB ~ varA) are different but their distributions are (more or 
less) symmetric. So, because the residuals are independent (ie, their 
covariance is null), the residual standard error will be the same (or 
very close in practice).


In GLS, the residuals are not independent, so this difference in the 
distribution of the residuals affects the estimation of the residual 
standard errors (because we need to estimate the covaraince of the 
residuals), and consequently the associated tests.


Best,

Emmanuel

Le 11/07/2013 11:03, Tom Schoenemann a écrit :

Hi all,

I ran a PGLS with two variables, call them VarA and VarB, using a phylogenetic 
tree and corPagel. When I try to predict VarA from VarB, I get a significant 
coefficient for VarB.  However, if I invert this and try to predict VarB from 
VarA, I do NOT get a significant coefficient for VarA. Shouldn't these be both 
significant, or both insignificant (the actual outputs and calls are pasted 
below)?

If I do a simple lm for these, I get the same significance level for the 
coefficients either way (i.e., lm(VarA ~ VarB) vs. lm(VarB ~ VarA), though the 
values of the coefficients of course differ.

Can someone help me understand why the PGLS would not necessarily be symmetric 
in this same way?

Thanks,

-Tom


outTree_group_by_brain_LambdaEst_redo1 - gls(log_group_size_data ~ 
log_brain_weight_data, correlation = bm.t.100species_lamEst_redo1,data = 
DF.brain.repertoire.group, method= ML)
summary(outTree_group_by_brain_LambdaEst_redo1)

Generalized least squares fit by maximum likelihood
   Model: log_group_size_data ~ log_brain_weight_data
   Data: DF.brain.repertoire.group
AIC BIClogLik
   89.45152 99.8722 -40.72576
Correlation Structure: corPagel
  Formula: ~1
  Parameter estimate(s):
lambda
0.7522738
Coefficients:
Value Std.Error   t-value p-value
(Intercept)   -0.0077276 0.2628264 -0.029402  0.9766
log_brain_weight_data  0.4636859 0.1355499  3.420778  0.0009

  Correlation:
   (Intr)
log_brain_weight_data -0.637
Standardized residuals:
Min Q1Med Q3Max
-1.7225003 -0.1696079  0.5753531  1.0705308  3.0685637
Residual standard error: 0.5250319
Degrees of freedom: 100 total; 98 residual


Here is the inverse:


outTree_brain_by_group_LambdaEst_redo1 - gls(log_brain_weight_data ~ 
log_group_size_data, correlation = bm.t.100species_lamEst_redo1,data = 
DF.brain.repertoire.group, method= ML)
summary(outTree_brain_by_group_LambdaEst_redo1)

Generalized least squares fit by maximum likelihood
   Model: log_brain_weight_data ~ log_group_size_data
   Data: DF.brain.repertoire.group
 AIC   BIC   logLik
   -39.45804 -29.03736 23.72902
Correlation Structure: corPagel
  Formula: ~1
  Parameter estimate(s):
   lambda
1.010277
Coefficients:
  Value  Std.Error   t-value p-value
(Intercept)  1.2244133 0.20948634  5.844836  0.
log_group_size_data -0.0234525 0.03723828 -0.629796  0.5303
  Correlation:
 (Intr)
log_group_size_data -0.095
Standardized residuals:
Min Q1Med Q3Max
-2.0682836 -0.3859688  1.1515176  1.5908565  3.1163377
Residual standard error: 0.4830596
Degrees of freedom: 100 total; 98 residual

_
P. Thomas Schoenemann

Associate Professor
Department of Anthropology
Cognitive Science Program
Indiana University
Bloomington, IN  47405
Phone: 812-855-8800
E-mail: t...@indiana.edu

Open Research Scan Archive (ORSA) Co-Director
Consulting Scholar
Museum of Archaeology and Anthropology
University of Pennsylvania

http://www.indiana.edu/~brainevo











[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/



___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] PGLS vs lm

2013-07-14 Thread Emmanuel Paradis

Hi all,

I would like to react a bit on this issue.

Probably one problem is that the distinction correlation vs. 
regression is not the same for independent data and for phylogenetic data.


Consider the case of independent observations first. Suppose we are 
interested in the relationship y = b x + a, where x is an environmental 
variable, say latitude. We can get estimates of b and a by moving to 10 
well-chosen locations, sampling 10 observations of y (they are 
independent) and analyse the 100 data points with OLS. Here we cannot 
say anything about the correlation between x and y because we controlled 
the distribution of x. In practice, even if x is not controlled, this 
approach is still valid as long as the observations are independent.


With phylogenetic data, x is not controlled if it is measured on the 
species -- in other words it's an evolving trait (or intrinsic 
variable). x may be controlled if it is measured outside the species 
(extrinsic variable) such as latitude. So the case of using regression 
or correlation is not the same than above. Combining intrinsic and 
extinsic variables has generated a lot of debate in the literature.


I don't think it's a problem of using a method and not another, but 
rather to use a method keeping in mind what it does (and its 
assumptions). Apparently, Hansen and Bartoszek consider a range of 
models including regression models where, by contrast to GLS, the 
evolution of the predictors is modelled explicitly.


If we want to progress in our knowledge on how evolution works, I think 
we have to not limit ourselves to assess whether there is a 
relationship, but to test more complex models. The case presented by Tom 
is particularly relevant here (at least to me): testing whether group 
size affects brain size or the opposite (or both) is an important 
question. There's been also a lot of debate whether comparative data can 
answer this question. Maybe what we need here is an approach based on 
simultaneous equations (aka structural equation models), but I'm not 
aware whether this exists in a phylogenetic framework. The approach by 
Hansen and Bartoszek could be a step in this direction.


Best,

Emmanuel

Le 13/07/2013 02:59, Joe Felsenstein a écrit :


Tom Schoenemann asked me:


With respect to your crankiness, is this the paper by Hansen that you are 
referring to?:

Bartoszek, K., Pienaar, J., Mostad, P., Andersson, S.,  Hansen, T. F. (2012). 
A phylogenetic comparative method for studying multivariate adaptation. Journal of 
Theoretical Biology, 314(0), 204-215.

I wrote Bartoszek to see if I could get his R code to try the method mentioned 
in there. If I can figure out how to apply it to my data, that will be great. I 
agree that it is clearly a mistake to assume one variable is responding 
evolutionarily only to the current value of the other (predictor variables).


I'm glad to hear that *somebody* here thinks it is a mistake (because it really is).  I 
keep mentioning it here, and Hansen has published extensively on it, but everyone keeps 
saying Well, my friend used it, and he got tenure, so it must be OK.

The paper I saw was this one:

Hansen, Thomas F  Bartoszek, Krzysztof (2012). Interpreting the evolutionary 
regression: The interplay between observational and biological errors in 
phylogenetic comparative studies. Systematic Biology  61 (3): 413-425.  ISSN 
1063-5157.

J.F.

Joe Felsenstein j...@gs.washington.edu
  Department of Genome Sciences and Department of Biology,
  University of Washington, Box 355065, Seattle, WA 98195-5065 USA

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/



___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] number of fitted parameters in lm and pgls in R

2013-08-01 Thread Emmanuel Paradis

Hi Sereina,

You will get the same AIC with lm and gls, for a given regression model, if:

1. the GLS model is fitted assuming independence of observations (this 
is the default in nlme::gls),


2. the GLS model is fitted by ML (this is not the default).

Some correlation structures require to estimate some additional 
parameters (eg, corPagel) and others don't (eg, corBrownian, or if you 
set fixed = TRUE when building the correlation structure). To check 
that, use the generic function logLik which returns the number of free 
parameters (as 'df').


Best,

Emmanuel

Le 30/07/2013 20:59, Sereina Graber a écrit :

Hi everybody,
I was comparing a simple lm model and a pgls model in R and realised
that the AICs are not identical. I found out that for the pgls there is
one fitted parameter (k) less than in the lm. Can anybody explain me the
difference and why in a simple lm there is one more fitted parameter in
the model compared to a PGLS? Imporant to note is that of course in both
models I included the extact same covariates and response variable and
further all the estimates, std. errors etc are identical except the AIC.
For any help I am very grateful.
Best


___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/



___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] problem with corMartins (ape)

2013-08-01 Thread Emmanuel Paradis

Hi,

Xavier: if I understand correctly what you describe, this can be done with:

V - vcv(phy, corr = TRUE)
diag(V) - 0

vcv() returns the expected variance-covariance matrix under Brownian 
motion for a given tree. If this tree is not ultrametric (I guess this 
is what you mean with where the tips do not have the same lengths) 
then it is correct that the variances are not the same. Note that if a 
tree is rescaled (say by multiplying its branch lengths) then the result 
of vcv() will be different but vcv(, corr = TRUE) will return the same 
(correlation) matrix.


Anyway, if you really found some wrong results, please report this.

Sandra: thanks for solving your problem by yourself _and_ giving some 
feedback on the list. I guess the information you needed can be found in 
the vignette delivered with ape.


Best,

Emmanuel

Le 01/08/2013 22:13, Xavier Prudent a écrit :

Salut Sandra,

I have experienced that the matrix given by vcv can lead to wrong results
if taken as a correlation matrix in the case of trees where the tips do not
have the same lengths.
I then built my own correlation, using the common branch length shared by
two tips, normalized by this same length plus the lengths of the last
branches of these tips.
The diagonals being forced to zero.

If you are interested in it, I can send you the code,
cheers,
Xavier


2013/8/1 sandra goutte gou...@mnhn.fr


Ok so i found out what i was doing wrong and i just answer my own post so
nobody will waste time answering me: i was not making a proper weight
matrix using vcv(), so i fixed the problem by using 1/distance matrix from
this variance-covariance matrix and then fixing the sum of each row to 1.
Now the Moran.I behaves as expected with random data for both OU and BM
models.

Sandra.



2013/7/29 sandra goutte gou...@mnhn.fr


Dear list,

I am trying to run a gls with an OU correlation structure using
corMartins. I first wanted to check whether i had a phylogenetic
autocorrelation in my trait, so i used the Moran's I index (Moran.I):

vcovm-vcv(corMartins(1,tree, fixed=T))
Moran.I(trait, vcovm, alt=greater)

And i obtained a p-value equal to zero. Changing the alpha parameter, i
kept getting p-values=0. For the Brownian motion however, i got a

normal

p-value. I then tried with random values for my trait, and when for a
brownian correlation structure i get what i expect, that is most of the
time non-signiifcant autocorrelation, with the cor.Martins structure i
always get significant p-values (although not equal to zero).

In addition, if i run the GLS with the corMartins structure like this:

corm-corMartins(1,phy=tree, fixed=T)
glsp- gls(trait~var1+var2+var3, correlation=corm, data=data)

i get better and better AIC for increasing the alpha parameter
(indefinitely!), which would mean that the stronger the constraint on the
trait, the better is the fit.

Does anyone know why the corMartins is giving me these wierd results? Am

I

using this function in the wrong way or is it a bugg?

Thank you in advance for your help,
Sandra.



--
PhD Student
Muséum National d'Histoire Naturelle
Département Systématique et Évolution
USM 601 / UMR 7205 Origine, Structure et Évolution de la Biodiversité
Reptiles  Amphibiens - Case Postale 30
25 rue Cuvier
F-75005 Paris

Tel :  +33 (0) 1 40 79 34 90
Mobile: +33 (0) 6 79 20 29 99





--
PhD Student
Muséum National d'Histoire Naturelle
Département Systématique et Évolution
USM 601 / UMR 7205 Origine, Structure et Évolution de la Biodiversité
Reptiles  Amphibiens - Case Postale 30
25 rue Cuvier
F-75005 Paris

Tel :  +33 (0) 1 40 79 34 90
Mobile: +33 (0) 6 79 20 29 99

 [[alternative HTML version deleted]]


___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at
http://www.mail-archive.com/r-sig-phylo@r-project.org/







___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/



___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] rooting multiple trees

2013-08-07 Thread Emmanuel Paradis
Hi Silvia,

Check the help page of root(): it says the tree is unrooted before before being 
rooted. So your 2nd output is as expected. See the options of this function to 
resolve the root.

For the 1st method, I think you should not do unclass().

best,

Emmanuel
-Original Message-
From: SILVIA CALVO ARANDA silvit...@hotmail.com
Sender: r-sig-phylo-boun...@r-project.org
Date: Wed, 7 Aug 2013 06:08:18 
To: Liam J. Revellliam.rev...@umb.edu
Cc: r-sig-phylo@r-project.orgr-sig-phylo@r-project.org
Subject: Re: [R-sig-phylo] rooting multiple trees

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] Substitute for functions on the laser package

2013-09-04 Thread Emmanuel Paradis
Hi Mariana,

You may try the function bd.time in ape. The companion paper gives examples 
similar to what laser does.

Best,

Emmanuel
-Original Message-
From: Mariana Vasconcellos marian...@utexas.edu
Sender: r-sig-phylo-boun...@r-project.org
Date: Tue, 3 Sep 2013 23:20:17 
To: Liam J. Revellliam.rev...@umb.edu
Cc: r-sig-phylo@r-project.org
Subject: Re: [R-sig-phylo] Substitute for functions on the laser package

Thanks, Liam! But, unfortunately I don't know how to build from source. I 
downloaded Xcode and the laser_2.3.tar.gz. But, I have no idea of how to 
install from source. If someone could help with any advise, that would be 
great!  Also, does anyone have another option to calculate decreasing 
speciation rate, increasing extinction rate or both using a different package?

Thank you very much!

--
Mariana Vasconcellos
Ecology, Evolution  Behavior
Integrative Biology
The University of Texas at Austin




On Sep 3, 2013, at 10:05 PM, Liam J. Revell liam.rev...@umb.edu wrote:

 Hi Mariana.
 
 You can download old versions of laser from the CRAN archive 
 (http://cran.r-project.org/src/contrib/Archive/laser/); however you will have 
 to build from source. This is easy if you have Xcode (for Mac OS) or a gcc 
 compiler installed. If you do not, and cannot figure out how to install from 
 source, then respond to the list  I'm sure someone will help out  post a 
 package binary for you.
 
 - Liam
 
 Liam J. Revell, Assistant Professor of Biology
 University of Massachusetts Boston
 web: http://faculty.umb.edu/liam.revell/
 email: liam.rev...@umb.edu
 blog: http://blog.phytools.org
 
 On 9/3/2013 10:39 PM, Mariana Vasconcellos wrote:
 Dear all:
 I just performed a test of the gamma-statistic on my tree and I found that 
 diversification is not constant in time. So, I would like to perform the 
 function fitSPVAR, fitEXVAR and fitBOTHVAR using the laser package. But, I 
 just saw that the laser package was removed from the CRAN repository. Could 
 anyone tell me what other alternative package I could use to develop and 
 test models of decreasing speciation rate, increasing extinction rate or 
 both? Does MEDUSA do this sort of analyses?
 
 Thank you for your help!
 Mariana
 
 
 
 
 
 
  [[alternative HTML version deleted]]
 
 ___
 R-sig-phylo mailing list - R-sig-phylo@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
 Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
 


[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] selecting sequences from an alignment

2013-10-05 Thread Emmanuel Paradis
Hi,

What about this:

match(selected, rownames(woodmouse))

? If this is not what you want yet, please give an example of the output you 
wish to get.

Best,

Emmanuel
-Original Message-
From: john d dobzhan...@gmail.com
Sender: r-sig-phylo-boun...@r-project.org
Date: Sat, 5 Oct 2013 08:20:34 
To: mailman, r-sig-phylor-sig-phylo@r-project.org
Subject: [R-sig-phylo]  selecting sequences from an alignment

Hi David,

Thanks for your input. However, I guess I should have been more specific in
my question. I'd need to extract part of the alignment based on a list of
taxon names. Is there a simple way to know which lines in the DNAbin object
correspond to a list of taxon names?

Thanks!


On Fri, Oct 4, 2013 at 11:35 PM, David Winter david.win...@gmail.com
 wrote:

 Hey Carlos,

 You can use the normal x[ i , j ] style to extract subsections of a
 DNAbin object:

 sub_alignment - woodmouse[selected,]

 To see what's going on check out ?DNAbin and

 dim(woodmouse)
 dim(sub_alignment)

 David





[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] most efficient way to read trees from huge text file

2013-10-10 Thread Emmanuel Paradis

Hi,

Le 10/10/2013 21:07, Juan Antonio Balbuena a écrit :

Hello,
I need to handle in R 10M trees produced with the program evolver of
PAML. With a smaller number of trees one could create a multi-phylo
object with ape as

tree.m - read.tree(filename.tre)

but this is impractical in this case because it takes all the RAM
memory. I've tried something like


library(ape)
con = file(evolver.out, open=r) # where evolver.out contains 10M

trees

lin = readLines(con)
for (i in 1:length(lin)) {

+ tree - read.tree(lin[i])


Replace the line above by:

tree - read.tree(text = lin[i])

This may work if PAML has actually written one tree per line (ie, each 
tree ends with a '\n').


HTH,

Emmanuel


+ # my additional syntax mrca analysis etc. here
+ }

But it doesn't work:

Error in file(file, r) : cannot open the connection
In addition: Warning messages:
1: closing unused connection 3 (evolver.out)
2: In file(file, r) :
   cannot open file 'S6: 0.33544, (S1: 0.02314, S10: 0.02314):
0.31230): 0.53388, S5: 0.86932): 0.04830, (S4: 0.51901, (S7: 0.05901,
S2: 0.05901): 0.46000): 0.39861): 0.08238, (S3: 0.30236, (S8: 0.13236,
S9: 0.13236): 0.17000): 0.69764);': No such file or directory

I would very much appreciate your help. Note that computing efficiency
is key here.

Thank you very much for your attention.

Juan A. Balbuena

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/



___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] Problems with bootstraps of NJ tree from SSRs data

2013-10-16 Thread Emmanuel Paradis

Le 16/10/2013 00:20, Vojtěch Zeisek a écrit :

Hi, Emmanuel, thank You very much!

Dne Út 15. října 2013 21:45:12, Emmanuel Paradis napsal(a):

Hi Vojtěch,

Le 14/10/2013 23:14, Vojtěch Zeisek a écrit :

Hello,
I know this is FAQ, but I still haven't found solution, which would work
for me... :-( I'd like to get unrooted bootstraped NJ tree. I was looking
through discussions (incl. on this list), but proposed solutions seem
very complicated or not suitable for my data/method. Function
boot.phylo() from ape package looks promising, but it causes R to
crash...
This is my workflow:
# Read data
mydata.loci - read.loci(mydata.txt, header=TRUE, loci.sep=\t,
allele.sep=/, col.pop=2, col.loci=4:17, row.names=1)
# Convert to genind object
mydata.genind - loci2genind(mydata.loci)
# Compute genetic distances
mydata.dist - dist(mydata.genind, method=euclidean, diag=TRUE,
upper=TRUE) # Do the NJ - the last command, which works
mydata.nj - nj(mydata.dist)
# Bootstrap the tree - I see progress bar running towards 100% and when it
reaches that value, R session crashes...
mydata.boot - boot.phylo(mydata.nj, mydata.genind$tab, function(xx)
nj(dist(xx)), B=100)


With this command, you resample the columns of the matrix of alleles. I
am not sure this is the best thing to do. I'd rather resample the matrix
of loci, like this:

mydata.boot - boot.phylo(mydata.nj, mydata.loci, function(xx)
nj(dist(loci2genind(xx))), B = 100)


Perfect, it works!


I suppose the problem with your previous implementation (resampling the 
matrix of alleles) is that most alleles are at low frequencies (as 
common with micro-sat data), so most bootstrap samples were not meaningful.



And what if I'd like to make the bootstrap of population
tree? I then can't use mydata.loci, as the matrix has different size.  Or if I'd
like to bootstrap tree based on another method like UPGMA? If I make tree by
hclust, the object is of class hclust, not phylo (which is required by
boot.phylo).


For the latter, it's easy: you have to incorporate as.phylo in the 
estimation procedure. The recommended way is to build a function that 
makes the phylo object from the original data, something like:


f - function(x) as.phylo(hclust(dist(loci2genind(x

Then build the tree directly using f:

mytree - f(mydata.loci)

and do the bootstrap:

myboot - boot.phylo(mytree, mydata.loci, f)

This way, you can modifiy f() as you want and rerun the bootstrap 
accordingly.


Best,

Emmanuel


Note that if you have many individuals, computing the bootstrap values
may take some time (I'm working on a new code that will be faster).

HTH.

Emmanuel


Thank You  have a nice day!
Vojtěch


# Plot the tree (this works, of course)
plot(mydata.nj)
# Add bootstrap values (mydata.boot doesn't exist)
nodelabels(mydata.boot)
What do I do wrong? Is there any better way, how to calculate bootstraps
for microsatellite data?
Sincerely,
Vojtěch




___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/



___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] match two tree edges

2013-11-11 Thread Emmanuel Paradis
Hi,

Have a look at the function makeNodeLabel with the option method = md5sum: it 
will create a label to each node depending on the tips descending from it. You 
can then use them to match nodes and edges of different trees.

Best,

Emmanuel
-Original Message-
From: Jingchun Li jingc...@umich.edu
Sender: r-sig-phylo-boun...@r-project.org
Date: Mon, 11 Nov 2013 19:06:08 
To: Liam J. Revellliam.rev...@umb.edu
Cc: r-sig-phylo@r-project.org
Subject: Re: [R-sig-phylo] match two tree edges

Thanks Liam. Unfortunately for now, the second tree is read in separately
because it is generated from a different software. So the nodes numbers do
not match. But I get the general idea. I'll have to think more about it.

Cheers,

Jingchun


On Mon, Nov 11, 2013 at 6:02 PM, Liam J. Revell liam.rev...@umb.edu wrote:

 Hi Jingchun.

 The function ladderize in ape does not change the node numbers of the tree
 in phy$edge, only the order of the edges in the matrix. To verify this, you
 can use the phytools function matchNodes which matches nodes between two
 trees. Note that if a tree that is ladderized using ladderize is written to
  then read from file or a text string, the node numbers *will* change.

 That being said, since the order of the rows in phy$edge is different
 between the two trees, you will need to get the new order (in terms of the
 original tree) to plot your edge lengths. This might look something like
 this:

 ii-sapply(B$edge[,2],function(x,y) which(y==x),y=A$edge[,2])
 plot(B,no.margin=TRUE)
 edgelabels(A$edge.length[ii])

 If your node numbers *do not* match, then this gets a little more
 complicated - but the same general logic can be used.

 All the best, Liam

 Liam J. Revell, Assistant Professor of Biology
 University of Massachusetts Boston
 web: http://faculty.umb.edu/liam.revell/
 email: liam.rev...@umb.edu
 blog: http://blog.phytools.org


 On 11/11/2013 5:00 PM, Jingchun Li wrote:

 Dear all,

 I am trying to make a tree figure and I can not figure out the right way
 to
 do it.

 I have two trees, A and B. They have the same set of taxa but are in
 different orders. A is in the default order when read by read.tree(). B is
 ladderized. For my purpose, I want to plot A and label edges of A using
 the
 corresponding edge length of B.

 I struggled for a while and can't find a way to find matching edges
 between
 A and B. Is there a function out there that already does this or can
 anyone
 point out me a way? Thank you very much!

 Sorry I can't just ladderize A as well and make them match for some other
 reasons.

 Best,

 Jingchun

 [[alternative HTML version deleted]]

 ___
 R-sig-phylo mailing list - R-sig-phylo@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
 Searchable archive at http://www.mail-archive.com/r-
 sig-ph...@r-project.org/



[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] About function .getSEs

2013-11-15 Thread Emmanuel Paradis
Hi,

From R:

ape:::.getSEs

You can also get the code from the sources on CRAN.

Best,

Emmanuel
--Original Message--
From: Tane Kim
Sender: r-sig-phylo-boun...@r-project.org
To: r-sig-phylo@r-project.org
Subject: [R-sig-phylo] About function .getSEs
Sent: 16 Nov 2013 02:54

Hi,I have used the function ace in ape packages, and I am curious how the 
standard errors are calculated.I realized that .getSEs function is responsible 
for it, but I cannot access to its code.Would there be any way to see the code?
Thanks.
  
[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] Mismatch distribution (pegas): expected versus empirical and other questions

2013-12-11 Thread Emmanuel Paradis
Hi David,

Thanks for this. I have included it into pegas with an option to select the 
types of lines, and the legend is moved to the top of the plot. I'll release 
this new version by the end of the year.

I'm taking this opportunity to advertise the last version of coalescentMCMC 
(0.4) which is now stable. I made a few comparisons with LAMARC on real and 
simulated data and the results agree overall (but faster to get with 
coalescentMCMC). Four different time-dependent models are available (LAMARC has 
only one). The package comes with a vignette that explains how to run several 
chains sequentially, in parallel, or in interaction, and how to compare models 
(LRT, DIC, ...) For the moment, coalescentMCMC is focused on analysis of 
temporal dynamics with DNA sequences. If this fits your need, it is worth a try.

Best,

Emmanuel
-Original Message-
From: David Winter david.win...@gmail.com
Sender: r-sig-phylo-boun...@r-project.org
Date: Tue, 10 Dec 2013 12:47:39 
To: Alastair Pottspott...@gmail.com
Cc: mailman, r-sig-phylor-sig-phylo@r-project.org
Subject: Re: [R-sig-phylo] Mismatch distribution (pegas): expected versus
 empirical and other questions

Hi Alistair,

I  wrote a function to make the standard expectation under constant
population size quite a while ago. It's up here

https://github.com/dwinter/Pegas/blob/new_functions/R/MMD.stable.R

It's been a little while since I've had to think about mismatch
distributions, so, though I'm sure I would checked everything out at
the time, you should probably confirm this function behaves as
expected.

(BTW, Emmanuel, if you want to include any of that function in PEGAS
you are most welcome to)

David Winter




On Tue, Dec 10, 2013 at 6:27 AM, Alastair Potts pott...@gmail.com wrote:
 Hi all,

 The mismatch distribution function in pegas (MMD) produces the familiar
 histogram and then also a line that I always took as the theoretical
 expected distribution under a scenario of constant population size.

 I had a quick look at the MMD function and noticed that this line is not
 calculated using the theoretical expectations based on theta, but rather is
 some sort of moving-average based on the density() function.

 Does anyone know how to implement the correct theoretical expectation? I
 have dipped into the literature and am now completely lost.

 Also, there are a range of statistics (such as the raggedness index) and
 Arlequin has somehow managed to calculate a significance value for these
 statistics. Has anyone implemented (or can easily implement) these
 statistics and significance tests in R?

 Thanks in advance for any comments, thoughts and solutions.

 Cheers,
 Alastair Potts

 [[alternative HTML version deleted]]

 ___
 R-sig-phylo mailing list - R-sig-phylo@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
 Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] plotting support value

2013-12-16 Thread Emmanuel Paradis
Hi Thomas,

Have you tried prop.clades?

Best,

Emmanuel
-Original Message-
From: Thomas Manke ma...@ie-freiburg.mpg.de
Sender: r-sig-phylo-boun...@r-project.org
Date: Mon, 16 Dec 2013 11:52:33 
To: r-sig-phylo@r-project.org
Subject: [R-sig-phylo] plotting support value

Dear Forum,
I have a list containing sampled trees for the same taxa and would 
like to plot their consensus tree
with the support values (absolute numbers or frequencies) for the 
various branches.
(as done by the Consense program of the Phylip suite)

I understand that boot.phylo() returns an object bs that contains such 
information as bs$BP.
My scenario is similar, but the sampled trees are not based on sequence 
information and boot.phylo()
is not applicable here.

I assume the support values can be obtained from prop.part(), but I 
could not figure how.
Thank you for any help or suggestions,
Thomas

#tr is list of trees
ct-consensus(tr, p=0.5)  # majority consensus
pt-prop.part(tr)
pc-prop.part(ct)

# calculate support values
sv - f(pc, pt) = ???

plot(ct);  nodelabels(sv)

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] Using the function mst in APE to produce coloured graphs

2013-12-18 Thread Emmanuel Paradis
Hi,

Try mst() in pegas. It returns an object of class haploNet with a more 
flexible plot method.

Best,

Emmanuel
-Original Message-
From: Stefani stef...@irsa1.irsa.cnr.it
Sender: r-sig-phylo-boun...@r-project.org
Date: Wed, 18 Dec 2013 17:35:26 
To: r-sig-phylo@r-project.org
Subject: [R-sig-phylo] Using the function mst in APE to produce coloured
graphs

Dear All,
I'm a new user of R in phylogenetics and population genetics. So, 
please, apologize me in advance if what I'm asking is a basic and banal 
question. Briefly, I used the function mst of the package APE to produce 
a minimun spanning tree startin from a distance matrix. I used the 
default argument graph  (nsca or cirle), or I plotted the samples by 
using externally produced coordinates (i.e. PCA coordinates). In all the 
cases, I didn't manage to apply colours to a set of defined categories 
of samples. I tried by defining a categorical vector and using the 
argument col of the function plot, like in this example:

 colour-as.character(PCAcor[,3])
 plot(MSTtre, x1 = PCAcor[,1], x2 = PCAcor[,2], col=colour)

Or I directly use the argument col of the function plot, like this (one 
among the various tempts):

 plot(MSTtre, graph=nsca, col=blue, black).

In any case, I didn't get back any error message, but the plot was 
always black and white.
May you suggest me the right approach?
Many thanks!
Regards,
Fabrizio

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] Warning messages if using family=poisson in compar.gee, and changing link function

2013-12-19 Thread Emmanuel Paradis

Hi Nina,

Don't confound warning with error. If you have the former, then R 
certainly didn't crash. These warnings seem to come from the fact that 
the model fitting did not converge after the default number of 
iterations. Unfortunately, this cannot be controlled from compar.gee, so 
you can edit the function with fix(compar.gee), then find the line 
starting with:


geemod - do.call(gee, list(.

and insert in the list this item: maxiter = 100 (or more; don't forget 
to separate with a comma). Save and close the editor.


For the second problem you report (which is really an error), this is a 
bug which you can fix again with fix(compar.gee) and find this line:


Qlik - switch(fname, gaussian .

and replace 'fname' by 'fname$family'.

BTW, your formula can be simply:

FlowerNumber ~ sex * PollinationSystem

The * codes for main effects + interactions, which is equivalent to 
(: codes for the interaction):


FlowerNumber ~ sex + PollinationSystem + sex:PollinationSystem

HTH

Emmanuel



Le 17/12/2013 21:46, Nina Hobbhahn a écrit :

Dear fellow list users,

we would like to analyze three variables that are count data (no zeros, no 
integers) as responses to two categorical predictor variables using compar.gee. 
An example would be

compar.gee(FlowerNumber ~ sex + PollinationSystem + sex*PollinationSystem).

One of our variables is normally distributed, the other two can be normalized 
by log-transformation. So far we have modelled our data using family=gaussian, 
but we would prefer to model untransformed data using a log-link function, i.e. 
family=poisson.

However, several of our models with family=poisson crash R or result in the 
following warning messages:

Warning messages:
1: In gee(noFH ~ poll * sex, c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,  :
   Maximum number of iterations consumed
2: In gee(noFH ~ poll * sex, c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,  :
   Convergence not achieved; results suspect
3: In gee(noFH ~ poll * sex, c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,  :
   Cgee had an error (code= 104).  Results suspect.

Can any of you explain why are we getting this warning message? Is there a fix 
to any of this?

Alternatively, if family=poisson really doesn't work for our data, is it 
possible to change the link function associated with family=gaussian from 
link=identity to link=log? We tried this by writing

compar.gee(model, data, phy, family=gaussian(link=log))

and get the following error message:

Error in switch(fname, gaussian = -sum((Y - MU)^2)/2, binomial = sum(Y *  :
   EXPR must be a length 1 vector
In addition: Warning message:
In if (fname == binomial) W - summary(glm(formula, family = quasibinomial,  :
   the condition has length  1 and only the first element will be used

Is it at all possible to change the link function in compar.gee?

Any input would be greatly appreciated.

Nina Hobbhahn and Megan Welsford

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/



___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] ape package gives 'Error in FI[i]:LA[i] : NA/NaN argument'

2014-02-27 Thread Emmanuel Paradis

Hi Matt,

This is because this accession number does not point to the sequence. 
For this particular one, you could use:


seq1 - read.GenBank(U00096.3)

Best,

Emmanuel

Le 27/02/2014 01:35, Matt Curcio a écrit :

Greetings all,,
I received this error while using R version 2.15.1 and ape 3.0.11.
Error in FI[i]:LA[i] : NA/NaN argument

I have tried several approaches:

ref = NC_000913.3
seq1 = read.GenBank(ref)
Error in FI[i]:LA[i] : NA/NaN argument

ref = read.csv(file=test2.csv, head=T)

ref = str(ref)

# Where test2.csv is in two rows

'data.frame':   1 obs. of  1 variable:
$ seq: Factor w/ 1 level NC+AF8-000913: 1

seq2 = read.GenBank(ref)
Error in FI[i]:LA[i] : NA/NaN argument


I was wondering if someone could lend a hand based on their own experience.





___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] Phylogenetic regression available in R (Grafen, 1989)

2014-03-07 Thread Emmanuel Paradis

Hi Gustavo,

Grafen's method is partially implemented in ape. The function corGrafen
defines a correlation structure according to Grafen's method (see 
?corClasses for all corStruct defined in ape). When used with nlme::gls 
this makes possible to estimate the parameter of the branch length 
transformation. There is no correction of the number of degrees of 
freedom in ape. So I guess that the results from ape::corGrafen and 
phyreg should be the same if the phylogeny is assumed to be known 
without error.


Best,

Emmanuel

Le 05/03/2014 14:30, Gustavo Paterno a écrit :

Hello Xavier,
Thanks for your answer.

I also might be wrong, but as far as I know, Ape has the function - 
compute.brlen which can calculate branch lengths throght Grafen method but can 
not do phylogenetic regression sensu (Grafen, 1989).
Grafen method uses a different approach to control phylogeny (hanging on a 
Tree”, see paper for details).
PGLS uses a covariance matriz to correct for phylogeny, so your degree of 
fredon remains the same.
In Grafen method you can lose degree of freedom depending on your species 
relatedness.

In the other hand, Brunch and crunch functions can not deal with complex models 
(eg. with continuous and categorigal variables together as explanatory 
variables) while Grafen method can deal complex models and also work if you do 
not know your full phylogeny (when if you have politomies)

Grafen method was only available in GLIM and SAS until feb 2014. In the way I 
see, his new package for R is the only one that can really run his full method 
- phylogenetic regression.

Any one has more information about it?

Best wishes,
On Mar 5, 2014, at 10:04 AM, Xavier Prudent prudentxav...@gmail.com wrote:


Hi Gustavo,
Thanks for that notification,
I may be wrong, but was'nt that method already implemented in the CAPER package 
by David Orme (function pgls, brunch, crunch)?
If yes, then which new functionalities does phyreg bring?
Regards,
Xavier


2014-03-05 13:46 GMT+01:00 Gustavo Paterno aspessoasmu...@gmail.com:
Hello all,

I just got the confirmation that the Grafen method for phylogenetic regression 
(Grafen, 1989) was implemented in R !
The package “phyreg” has lots of details in help.

Best wishes for all.
Gustavo Paterno
___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/



--
---
Xavier Prudent

Computational biology and evolutionary genomics

Guest scientist at the Max-Planck-Institut für Physik komplexer Systeme
(MPI-PKS)
Noethnitzer Str. 38
01187 Dresden

Max Planck-Institute for Molecular Cell Biology and Genetics
(MPI-CBG)
Pfotenhauerstraße 108
01307 Dresden


Phone: +49 351 210-2621
Mail: prudent [ at ] mpi-cbg.de
---




[[alternative HTML version deleted]]



___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/



___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] Difference running Rscript vs source with ape library

2014-05-03 Thread Emmanuel Paradis

Hi Carlos,

I tested your script in both ways and it may work or fail depending on 
the tree which is generated. However, it seems that failure is more 
frequent with source (about 70%) than with Rscript (about 40%). One 
possible explanation for this may be that the random number generator 
(RNG) is initialized at each call of Rscript whereas it is generated 
only once for all calls of source; however, why this may make a 
difference is not clear to me.


Anyway, this error should not happen and seems to be due to the small 
size of the simulated tree. I continue to work on improving the 
underlying code so that, hopefully, this will be fixed soon.


Best,

Emmanuel

Le 02/05/2014 18:03, Carlos Anderson a écrit :

I have a very simple script using bd.time from the ape library that runs OK
within the R environment or when running the script using Rscript, but it
fails when calling it with source. I have posted my problem in detail on
stackoverflow.com:

http://stackoverflow.com/questions/23416763/difference-running-rscript-vs-source-with-ape-library

Thanks in advance.

Carlos Anderson
Postdoctoral Research Fellow
Department of Ecology and Evolutionary Biology
University of Michigan
carlo...@umich.edu

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/



___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] What order are the tips of a tree plotted in?

2014-05-06 Thread Emmanuel Paradis

Hi Henry,

I suppose the two trees do not have the same topology (otherwise there 
would be no difficulty). Have you considered using rotateConstr() on the 
second tree? That'd be something like:


tree2 - rotateConstr(tree2, tree1$tip.label)

The node numbering won't be changed, so you can still nodelabels() 
without trouble.


Best,

Emmanuel

Le 06/05/2014 11:25, Ferguson-Gow, Henry a écrit :

Dear List

I am trying to plot two trees facing each other with pie charts at the nodes 
indicating marginal likelihoods of an ancestral state. I have broken the 
plotting frame into 3 sections, with one in the centre for the tip labels. I am 
using text(rep()) to plot the tip labels into this section, but they do not 
match up with the tips plotted by the ladderised tree.

It seems like the plot() function is not plotting the tips in the edge order, 
and so when the tip labels are written in 1:n order, they do not match up. How 
can I reorder the tip labels to match the order in which the tips are plotted 
by plot()? I am happy with the tip order on the plotted tree, so I would rather 
work out how to reorder the tip labels in the text() command, rather than 
reorder the tips on the tree.

I hope that makes sense!

Henry Ferguson-Gow



[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/



___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] adjacency matrix / edgelist from igraph into a phylo class

2014-05-12 Thread Emmanuel Paradis

Hi Wesley,

You are right: this definition is out-dated. Have a look at:

http://ape-package.ird.fr/

in the 'Development' section, there is a document explaining the 
structure of the class phylo. Also the second edition of APER has been 
updated on this point.


Best,

Emmanuel

Le 12/05/2014 10:10, Wesley Goi a écrit :

Hi all,

I've search quite around for methods of converting from a edgelist (from
igraph) to a phylo class


taken from the 2006: Analysis of Phylogenetics and Evolution with R

the instructions for building an object of the phylo class seems to have
been changed. Please correct me if I'm mistaken.

An object of class phylo is a list with the following components.

edge
a two-column matrix where each row represents a branch (or edge) of the
tree; the nodes and the tips are symbolized with numbers; the nodes are
represented with negative numbers (the root being -1), and the tips are
represented with positive numbers. For each row, the first column gives the
ancestor. This representation allows an easy manipulation of the tree.





___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] plot bars across tips of a circular phylogeny

2014-05-14 Thread Emmanuel Paradis

Hi Matt,

See ?phydataplot in ape. This has a couple of examples.

Best,

Emmanuel

Le 14/05/2014 16:44, Matthew Helmus a écrit :

Hi all,

Does anyone know of R code (or perhaps another program) to plot bars across
the tips of a radial/fan phylogeny?

Specifically, I have a large phylogeny and a corresponding vector of
continuous trait values for the tips, and while I could use those values to
vary the color and size of the tip labels, or also to plot points of
varying color or size at those tips, I think a better depiction of the data
would be to plot bars that vary in height.

Thanks!!
Matt

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/



___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] Distances from nods to the root.

2014-05-21 Thread Emmanuel Paradis

Alexey,

See ?dist.nodes in ape. If your tree is named tr, you'll get exactly 
what you want with:


dist.nodes(tr)[, Ntip(tr) + 1]

which extracts one column of the matrix returned by dist.nodes.

best,

Emmanuel

Le 21/05/2014 06:56, Alexey Fomin a écrit :

Which package allow calculate all distances from all nods to the root?

Alexey.

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/



___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] cophyloplot: how to marks links differently

2014-05-28 Thread Emmanuel Paradis

Hi Juan,

cophyloplot() has the options col, lwd, and lty which specify the 
aspects of the association lines.


Le 27/05/2014 16:09, Juan Antonio Balbuena a écrit :

Hello
This is a simple question and hope that there is a simple answer.
Plotting a tanglegram, I would like to write a function where congruent
tips on both trees are marked, say, with continuous lines, whereas
incongruent tips are marked differently (for instance with stippled
lines). A starting point could be using cophyloplot() specifying
assoc=NULL and then use segment() to plot the lines accordingly, but how
can one get the x, y coordinates for segment()?


This is not possible (at least easily) because this function uses a 
different system than plot.phylo().


HTH

Best,

Emmanuel


Any thoughts will be much appreciated.

Juan A. balbuena

--

Dr. Juan A. Balbuena
Cavanilles Institute of Biodiversity and Evolutionary Biology
University of Valencia http://www.uv.es/~balbuena
http://www.uv.es/%7Ebalbuena
P.O. Box 22085 http://www.uv.es/cophylpaco
http://www.uv.es/cavanilles/zoomarin/index.htm
46071 Valencia, Spain
e-mail: j.a.balbu...@uv.es mailto:j.a.balbu...@uv.estel. +34 963 543
658fax +34 963 543 733

*NOTE!*For shipments by EXPRESS COURIER use the following street address:
C/ Catedrático José Beltrán 2, 46980 Paterna (Valencia), Spain.




___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/



___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] Pegas Tajimas Test

2014-06-26 Thread Emmanuel Paradis

Hi Peters,

This test cannot be used with 3 sequences. I've added a warning message 
about this.


Best,

Emmanuel

Le 25/06/2014 12:17, Peters, Stuart a écrit :

Hi,


I am trying to run the Tajimas.test function from the r package 'Pegas' on a 
small sample of 3 DNA sequences and I get this error:

Error in if (p  0.5) 2 * p else 2 * (1 - p) :  missing value where TRUE/FALSE 
needed ?

Any ideas?

Kind regards,

Stuart




[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/



___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] convergence in compar.gee (ape)

2014-08-14 Thread Emmanuel Paradis

Dear Sereina,

The number of iterations is not output by default. You can modify the 
function with fix(compar.gee), find this line:


geemod - do.call(gee, list(formula, 

and insert this command after it:

cat(number of iterations:, geemod$iterations, \n)

save and close.

HTH

Best,

Emmanuel

Le 12/08/2014 10:57, Sereina Graber a écrit :

Dear list members,
I have a question concerning the compar.gee function in ape. Often, the
models don`t converge, and now I would like to safe whether a model
converged or not. Is there a way to somehow get the number of iterations
reached the max number, or is there something like a convergence
attribute which tells you whether the model converged or not? I couldn`t
find anything when looking at the attributes of the model output.
Thank you  best wishes,
Sereina


___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/



___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] Trying to read a tree with numerical node labels, but without edge length

2014-08-30 Thread Emmanuel Paradis

Hi Liam and others,

read.nexus() can read NEXUS files that have no TRANSLATE block. There 
are two problems with this file: the extra spaces between begin.../end 
and the semicolon, and the linebreak between the tree declaration and 
the Newick string. So the following file can be read by read.nexus:


#NEXUS
begin trees;
tree tagged_tree = [U] (A,(B,(C,D)60)100);
end;

Best,

Emmanuel

Le 29/08/2014 23:25, Liam J. Revell a écrit :

Hi Daniel.

I think the problem is actually that read.nexus expects a translation
table, and has nothing to do with the labels.

You could use this simplified reading function which does not permit a
translation table  uses phytools read.newick internally:

library(phytools)

readNexus-function(file){
 obj-readLines(file)
 obj-strsplit(obj[grep(=,obj)], )
 obj-sapply(obj,function(x) x[length(x)])
 if(length(obj)1) trees-read.newick(text=obj)
 else {
 trees-lapply(obj,read.newick)
 class(trees)-multiPhylo
 }
 trees
}

Let me know if this works.

All the best, Liam

Liam J. Revell, Assistant Professor of Biology
University of Massachusetts Boston
web: http://faculty.umb.edu/liam.revell/
email: liam.rev...@umb.edu
blog: http://blog.phytools.org

On 8/29/2014 4:42 PM, Daniel Rafael Miranda-Esquivel wrote:

Dear all,

I tried to read a tree like this:

#NEXUS
begin trees ;
tree tagged_tree = [U]
(A,(B,(C,D)60)100);

end ;

using read.nexus; while figtree reads it, ape refuses, telling that
there is an error:

Error in start:end : NA/NaN argument

I tried readNexus in phylobase, but there is a problem reading the
labels (in the real tree, not in this example), as the program
complains that a label is duplicated.


I will be grateful for any tip to solve this problem,


All the best,


Daniel

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at
http://www.mail-archive.com/r-sig-phylo@r-project.org/



___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at
http://www.mail-archive.com/r-sig-phylo@r-project.org/



___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] problem with write.nexus.data

2014-09-24 Thread Emmanuel Paradis

Hi Liam and Nicholas.

write.nexus.data() accepts only lists as specified in the help page, but 
it's a bit of an anomaly since the same help page says the sequences 
must be aligned. I have modified this function so that it now accepts 
both lists and matrices.


Best,

Emmanuel

Le 18/09/2014 07:22, Liam J. Revell a écrit :

Hi Nicholas.

I think this is a bug. Try the following to circumvent:

write.nexus.data(as.list(data), file=test.nex, interleaved=TRUE,
 charsperline=100)

Since I don't have your dataset, I can't check it; but it fixed the
problem in another dataset with which I was able to reproduce the error.

All the best, Liam

Liam J. Revell, Assistant Professor of Biology
University of Massachusetts Boston
web: http://faculty.umb.edu/liam.revell/
email: liam.rev...@umb.edu
blog: http://blog.phytools.org

On 9/17/2014 10:46 PM, Nicholas Crouch wrote:

Hi,

I am having a problem with write.nexus.data, such that the file generated
is nonsense.

I have a very, very large data set, but have been working with a subset
trying to solve this problem. My data is in .fasta format, and I am
looking
to convert it into nexus format.

I load the data:


library(ape)
data - read.dna(concat.fasta, format=fasta, as.matrix=TRUE)


This gives the following:


data
5 DNA sequences in binary format stored in a matrix


 All sequences of same length: 5023

 Labels: Species etc.

 Base composition: a c g t

Also:


class(data)
DNAbin


This is exactly the same as the woodmouse example data provided in the
package ape, which I have been following when trying to solve this issue
(see ?write.nexus.data). When I export the data:


write.nexus.data(data, file=test.nex, interleaved=TRUE,

charsperline=100)

a file is produced which begins:

BEGIN DATA;
   DIMENSIONS NTAX=135165 NCHAR=1;
   FORMAT DATATYPE=DNA MISSING=? GAP=- INTERLEAVE=YES;
   MATRIX
 1   040 etc.

I haven't been able to find other posts on this, any help is greatly
appreciated.

Sincerely,

Nick




___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at
http://www.mail-archive.com/r-sig-phylo@r-project.org/



___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] Random sampling of branch lengths

2014-09-25 Thread Emmanuel Paradis

Hi,

compute.brtime() can use a set of pre-calculated branching times (the 
default is indeed a set of coalescent times), for instance:


tr - read.tree(text = (A,(((B,C),D),E));)
compute.brtime(tr, method = 4:1/4)

will produce a tree with root time = 1 and regular inter-node times.

Best,

Emmanuel

Le 25/09/2014 13:43, Arbuckle, Kevin a écrit :

Hi,

I have a problem that I am not entirely sure how to code appropriately,
and any help here would be gratefully appreciated.

Imagine we have a tree consisting only of a topology, for example this
newick format tree: '(A,(((B,C),D),E));'. Now, consider that we wish to
explore how variation in branch lengths on this topology influences the
results of a particular comparative method. One way to do this would be
to generate lots of trees with randomly sampled branch lengths and look
to see whether the conclusions drawn are strongly influenced by the
branch lengths chosen.

In a simple case where ultrametric trees are unnecessary or at least not
considered important then we could just randomly sample edge lengths
from a given distribution, but this poses some problems when ultrametric
trees are desired. I've attached a figure of the example tree given
above for ease of explanation. If the total tree height is some
arbitrary value (let's say 1), then the edge length of the branch
leading to A is always going to be 1. If there was a sister species to A
(let's call it A') along this branch then it would also be easy to do -
simply randomly sample a value between 0 and 1 (call this x), assign
that to both branches leading the common ancestor (A,A') to A and A',
and then assign the edge length from the root to (A,A') as 1-x.

But my problem is thinking how to code a general case for the whole
tree, incorporating two constraints. Firstly, that the sum of all
branches leading from each species to root is 1. Secondly, that branch
lengths are dependent on those of their sister taxon. To illustrate the
last point consider that the edges of the branches leading from common
ancestor (B,C) to each of B and C is 0.2. This automatically constrains
the branch leading to D (call it y) to be greater than 0.2 and also that
the length of the branch leading from (B,C,D) to (B,C) must equal y-0.2.
These dependencies make me unsure how to code such a function to
'randomly' sample branch lengths to generate an ultrametric tree.

As an additional complication, though not an absolutely necessary one,
consider that we have information about some node ages. For instance,
say we know that the age of node (B,C,D,E) is 0.7. How could I
incorporate these additional (user-specified) constraints into such a
function as I'm looking for?

I am currently using the compute.brtime function in ape to obtain branch
lengths from a coalescent model, but these are often biased towards
'tippy' trees and so may not be a representative sample of the possible
tree space.

Am I missing something in the form of an existing function, or does
anyone have any code that could do what I'm asking?

I hope I have explained myself clearly enough.

Cheers,

Kev







___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/



___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


[R-sig-phylo] new version system for ape

2014-10-24 Thread Emmanuel Paradis

Dear all,

There is a new system of package version for ape. A package repository 
has been set up at ape-package.ird.fr hosting testing versions as 
sources and Windows binaries (also for pegas and coalescentMCMC). The 
version numbers of the testing packages are always higher than the 
current versions on CRAN, and lower than the upcoming versions on CRAN. 
So, new versions can be checked for with the usual R commands. Testing 
versions should be useful for maintainers who develop packages depending 
on ape so that they can test the compatibilities before a new version of 
ape is on CRAN. They can also be used by anyone who wants to try the 
last new features.


One can check if a new testing version is available with:

old.packages(repos = http://ape-package.ird.fr/;)

The release of a new (stable) version on CRAN is checked with the usual:

old.packages()

All practical details can be found on ape's web site:

http://ape-package.ird.fr/

menu Installation - Stable and Testing Versions.

Best,

Emmanuel

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] Potential glitch in phymltest

2014-11-08 Thread Emmanuel Paradis

Hi Luiz,

Thanks for the fix. It will be in the next release.

Cheers,

Emmanuel

Le 07/11/2014 20:42, Luiz Max Carvalho a écrit :

Hello Phylofolks,

I've been using ape::phymltest() for a while now on a project and I just
realised there is a potential glitch in the function. The reason is that
the order of the models in the 'mdl' object inside the function is
different from the order in .phymltest.model, what causes the cmd object to
mess up.
  The following MWE should do the job of showing my point:
### begin MWE
library(ape)

.phymltest.model -
   c(JC69, JC69+I, JC69+G, JC69+I+G,
 K80, K80+I, K80+G, K80+I+G,
 F81, F81+I, F81+G, F81+I+G,
 F84, F84+I, F84+G, F84+I+G,
 HKY85, HKY85+I, HKY85+G, HKY85+I+G,
 TN93, TN93+I, TN93+G, TN93+I+G,
 GTR, GTR+I, GTR+G, GTR+I+G)

phymltest.test - function(seqfile, format = interleaved, itree = NULL,
   exclude = NULL, execname = NULL, append = TRUE)
{
   N - length(.phymltest.model)
   format - match.arg(format, c(interleaved, sequential))
   fmt - rep(, N)
   if (format != interleaved) fmt[] - -q
   boot - rep(-b 0, N) # to avoid any testing
   mdl - paste(-m, rep(c(JC69, K80, F81,HKY85,F84, TN93,
GTR), each = 4))
   tstv - rep(-t e, N) # ignored by PhyML with JC69 or F81
   inv - rep(c(, -v e), length.out = N)
   ## no need to use the -c option of PhyML (4 categories by default if '-a
e' is set):
   alpha - rep(rep(c(-c 1, -a e), each = 2), length.out = N)
   tree - rep(, N)

   cmd - paste(execname, -i, seqfile, fmt, boot, mdl, tstv, inv, alpha,
tree, --append )
   imod - 1:N
   if (!is.null(exclude)) imod - imod[!.phymltest.model %in% exclude]
   for (i in imod) print(cmd[i])
}

phymltest.test(seqfile = seqname,
   exclude = c(K80, K80+I, K80+G, K80+I+G,
   F81, F81+I, F81+G, F81+I+G,
   F84, F84+I, F84+G, F84+I+G,

   TN93, TN93+I, TN93+G, TN93+I+G),
   execname = execname)
1] execname -i seqname  -b 0 -m JC69 -t e  -c 1  --append 
[1] execname -i seqname  -b 0 -m JC69 -t e -v e -c 1  --append 
[1] execname -i seqname  -b 0 -m JC69 -t e  -a e  --append 
[1] execname -i seqname  -b 0 -m JC69 -t e -v e -a e  --append 
[1] execname -i seqname  -b 0 -m F84 -t e  -c 1  --append 
[1] execname -i seqname  -b 0 -m F84 -t e -v e -c 1  --append 
[1] execname -i seqname  -b 0 -m F84 -t e  -a e  --append 
[1] execname -i seqname  -b 0 -m F84 -t e -v e -a e  --append 
[1] execname -i seqname  -b 0 -m GTR -t e  -c 1  --append 
[1] execname -i seqname  -b 0 -m GTR -t e -v e -c 1  --append 
[1] execname -i seqname  -b 0 -m GTR -t e  -a e  --append 
[1] execname -i seqname  -b 0 -m GTR -t e -v e -a e  --append 

## end MWE

I think it's easily solved by replacing mdl with  paste(-m, rep(c(JC69,
K80, F81, F84HKY85, TN93, GTR), each = 4))

Anyways, hope to clarify this minor issue.

Cheers,

Luiz



___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] overall p-value from multiple PGLS regression

2014-11-10 Thread Emmanuel Paradis

Hi Pascal,

See ?anova.gls.

HTH

Emmanuel

Le 08/11/2014 17:49, Pascal Title a écrit :

Hi all,

I am running a number of PGLS regressions, some of which are multiple
regressions. I am using the nlme package with the corBrownian error
structure. If I build a model M via multiple regression, then I can get a
summary of the model that includes p values as well as a number of other
statistics. But how can I get an overall p-value for this model?
In the same vein, as value, std. error, t-value and p-value are all
returned for each of the traits separately, what does one report for
multiple regression models? I have been unable to find a paper that
actually reported this type of information for such a model.

Here is a toy example:

require(nlme)
require(phytools)

tree - pbtree(n=20)
dat - as.data.frame(fastBM(tree, nsim=4))
colnames(dat) - paste('trait', 1:4, sep='')

M - gls(trait1 ~ trait2 + trait3 + trait4, data=dat,
correlation=corBrownian(1, tree))
summary(M)

Any help/advice would be greatly appreciated. Thanks!

-Pascal



___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] advice about diversification analyses

2015-01-09 Thread Emmanuel Paradis

Hi Karla,

You may consider using the methods we developed in this paper:

Paradis E., Tedesco P. A.  Hugueny B. 2013. Quantifying variation in 
speciation and extinction rates with clade data. Evolution 67: 3617–3627.


The associated R code is available on-line as SI of the paper.

Best,

Emmanuel

Le 08/01/2015 13:50, Karla Shikev a écrit :

Dear R-sig-phyloers,

I have a backbone phylogeny (~200 terminals) and diversity data (~2
species) and a multistate character scored for each tip (including a few
polymorphisms). I'd like to estimate both possible effects of different
states of this character on lineage diversification and the transition
probabilities between states. My impression is that MuSSE wouldn't be able
to handle such high level of missing data, and other methods that I've
tried will either do one analysis or the other.

What methods would you suggest?

Thanks!

Karla

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/




___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

<    1   2   3   4   >