Re: [R-sig-phylo] Simulate node ages given known topology
Gene Hunt said: Brian Sidlauskas did this in an Evolution paper in 2008 (based on Yang and Rannala 1997): Sidlauskas, Brian L. 2008. Continuous and arrested morphological diversification in sister clades of characiform fishes: a phylomorphospace approach. Evolution 62(12): 3135-3156. He had a morphological tree without fossil or molecular branch lengths. The branch lengths were based on rates of speciation, extinction and sampling, and so it would be a different approach than the coalescent theory, I believe. I think Brian has R code for it; you may want to contact him directly. In my 2004 book, on pages 570-571 I note the use of a time transformation that makes simulation of birth-death processes easy. This is based on Bruce Rannala's 1997 work. If you want to start out with the first fork in the tree at time T ago, you want to take your N tip species and choose uniformly from 2, 3, ..., N-1 to choose the number of those tips that are descended from the left branch of the fork. Say this is called K. Then you need to simulate a birth-death process for K tips (the subtree descended from the left fork), and another for N-K tips for the right fork. The result will be a fairly quick algorithm for simulating a BD process that has its oldest fork T units of time ago and leads to exactly N tips. If instead you just want the process to start T units of time ago, and have the oldest fork occur some time after that (determined by the BD process), then that is even easier using the time- transformation trick. I do not know the R packages well enough to know whether one of them implements this, but I am sure someone will comment on that. Joe Joe Felsenstein, j...@gs.washington.edu Dept. of Genome Sciences, Univ. of Washington Box 355065, Seattle, WA 98195-5065 USA ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] Simulate node ages given known topology
Hey Andy, Simulating the bifurcation times in a tree with n extant species and age t (where t is either mrca or the origin of the first species), assuming constant speciation and extinction rates (and possibly a sampling probability rho), can be done within the R package TreeSim using the function: sim.bd.taxa.age Note: the output is a tree with branch length. Therefore, use branching.times(tree) for obtaining the bifurcation times. These bifurcation times can be put on your constrained tree topology in the following: 1) sort (or rank) the interior nodes in your tree where each sorting is equally likely (the resulting tree is called ranked tree, see eg SempleSteel Phylogenetics 2003. For sorting the nodes uniformly at random, you can use the shuffle idea introduced in Ford, Matsen, Stadler 2009 Syst Biol). 2) Then the i-th largest branching time is put on the i-th node in your tree. 1-2) works as ranked tree and branching time distribution are independent under the birth-death process and each ranked tree has equal probability. Simulating 1) is fairly easy with this shuffle concept, if you have questions or need help, let me know! Best Tanja Stadler ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] Simulate node ages given known topology
Hi Emmanuel, Many thanks for your suggestions. I think a group of us at Berkeley would be very interested in working out different ways to simulate these processes and then share our results. We'll keep you posted. I think starting with rcoal and compute.brlen would be a great jumping off place. Thanks again, Andy On Mon, Apr 11, 2011 at 4:36 AM, Emmanuel Paradis emmanuel.para...@ird.frwrote: Andy, Andy Rominger wrote on 09/04/2011 08:37: Hello, I'm working with a friend on a problem where we'd like to simulate many possible trees that all have the same topology, but differ in branch lengths (or equivalently in node ages). We have a known age for the root, and all tips correspond to extant taxa. So the start and end times are known. The code to do this is in rcoal() except that a random topology is generated. It seems several users are interested in assigning random branching times to a fixed topology, so I'll add this. compute.brlen() can generate random branch lengths on a fixed topology. What you want is a function that does the opposite operation of branching.times(). I can see three possible ways to implement this: 1) add a new option to compute.brlen() 2) add a new option to rcoal() 3) write a new function, eg, compute.brtime() BTW, rcoal() can be used with fixed branching intervals (ie, computed beforehand) where a random coalescent topology is generated; so option 2 makes sense to me. Any suggestion? I'm wondering if a function to produce random branch lengths under topological constraints already exists? If not, I'm trying to dredge up my foggy knowledge of stochastic processes to write the simulation myself. I'm thinking that maybe I remember under a Yule process sojourn times are distributed exponential so event times should be uniformly distributed??? But then I got pointed toward Yule priors which I still don't satisfactorily understand (but maybe eventually will) after a few web searches. Once you'll have the above tool (soon I hope), then you'll be able to explore this issue more freely. Be careful to distinguish branching times and branching intervals. (I'm just seeing that the help page of rcoal is not clear on this point: I'll clarify this.) If you simulate the latter, you get the former with cumsum(). ape has an internal function that computes the theoretical CDF of branching times under any birth-death process. eg: CDF.birth.death(.1, 0, Tmax = 10, x = 0:100/10, case = 1) will return the cumulative probability of the branching times 0, 0.1, 0.2, ..., 10 for lambda = 0.1 and mu = 0 (thus a Yule process). From this you can simulate a set of random branching times with the inverse method (generate runif(n) and back transform from the CDF). The literature reference is in ?bd.time. Of course, you can use another formula and compare the results with standard plots in R. As a reminder, rcoal() generates coalescent intervals with: 2 * rexp(n - 1)/(as.double(n:2) * as.double((n - 1):1)) You may also have a look at the package TreeSim which has algorithms contributed by Tanja Stadler. Cheers, Emmanuel Any insights would be greatly appreciated. Thanks in advance-- Andy [[alternative HTML version deleted]] ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo -- Emmanuel Paradis IRD, Jakarta, Indonesia http://ape.mpl.ird.fr/ [[alternative HTML version deleted]] ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] Simulate node ages given known topology
Hi Andy, Brian Sidlauskas did this in an Evolution paper in 2008 (based on Yang and Rannala 1997): Sidlauskas, Brian L. 2008. Continuous and arrested morphological diversification in sister clades of characiform fishes: a phylomorphospace approach. Evolution 62(12): 3135-3156. He had a morphological tree without fossil or molecular branch lengths. The branch lengths were based on rates of speciation, extinction and sampling, and so it would be a different approach than the coalescent theory, I believe. I think Brian has R code for it; you may want to contact him directly. Best, Gene On 4/11/11 12:04 PM, Andy Rominger ajromin...@gmail.com wrote: Hi Emmanuel, Many thanks for your suggestions. I think a group of us at Berkeley would be very interested in working out different ways to simulate these processes and then share our results. We'll keep you posted. I think starting with rcoal and compute.brlen would be a great jumping off place. Thanks again, Andy On Mon, Apr 11, 2011 at 4:36 AM, Emmanuel Paradis emmanuel.para...@ird.frwrote: Andy, Andy Rominger wrote on 09/04/2011 08:37: Hello, I'm working with a friend on a problem where we'd like to simulate many possible trees that all have the same topology, but differ in branch lengths (or equivalently in node ages). We have a known age for the root, and all tips correspond to extant taxa. So the start and end times are known. The code to do this is in rcoal() except that a random topology is generated. It seems several users are interested in assigning random branching times to a fixed topology, so I'll add this. compute.brlen() can generate random branch lengths on a fixed topology. What you want is a function that does the opposite operation of branching.times(). I can see three possible ways to implement this: 1) add a new option to compute.brlen() 2) add a new option to rcoal() 3) write a new function, eg, compute.brtime() BTW, rcoal() can be used with fixed branching intervals (ie, computed beforehand) where a random coalescent topology is generated; so option 2 makes sense to me. Any suggestion? I'm wondering if a function to produce random branch lengths under topological constraints already exists? If not, I'm trying to dredge up my foggy knowledge of stochastic processes to write the simulation myself. I'm thinking that maybe I remember under a Yule process sojourn times are distributed exponential so event times should be uniformly distributed??? But then I got pointed toward Yule priors which I still don't satisfactorily understand (but maybe eventually will) after a few web searches. Once you'll have the above tool (soon I hope), then you'll be able to explore this issue more freely. Be careful to distinguish branching times and branching intervals. (I'm just seeing that the help page of rcoal is not clear on this point: I'll clarify this.) If you simulate the latter, you get the former with cumsum(). ape has an internal function that computes the theoretical CDF of branching times under any birth-death process. eg: CDF.birth.death(.1, 0, Tmax = 10, x = 0:100/10, case = 1) will return the cumulative probability of the branching times 0, 0.1, 0.2, ..., 10 for lambda = 0.1 and mu = 0 (thus a Yule process). From this you can simulate a set of random branching times with the inverse method (generate runif(n) and back transform from the CDF). The literature reference is in ?bd.time. Of course, you can use another formula and compare the results with standard plots in R. As a reminder, rcoal() generates coalescent intervals with: 2 * rexp(n - 1)/(as.double(n:2) * as.double((n - 1):1)) You may also have a look at the package TreeSim which has algorithms contributed by Tanja Stadler. Cheers, Emmanuel Any insights would be greatly appreciated. Thanks in advance-- Andy [[alternative HTML version deleted]] ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo -- Emmanuel Paradis IRD, Jakarta, Indonesia http://ape.mpl.ird.fr/ [[alternative HTML version deleted]] ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo -- Gene Hunt Curator, Department of Paleobiology National Museum of Natural History Smithsonian Institution [NHB, MRC 121] P.O. Box 37012 Washington DC 20013-7012 Phone: 202-633-1331 Fax: 202-786-2832 http://paleobiology.si.edu/staff/individuals/hunt.cfm ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] Simulate node ages given known topology
Hi Andy, Andy Rominger wrote on 11/04/2011 23:04: Hi Emmanuel, Many thanks for your suggestions. I think a group of us at Berkeley would be very interested in working out different ways to simulate these processes and then share our results. We'll keep you posted. I think starting with rcoal and compute.brlen would be a great jumping off place. Another possibility is to have an assignment version of branching.times; such as: branching.times(phy) - 1:10 Quite intuitive is you are used to R's syntax. Emmanuel Thanks again, Andy On Mon, Apr 11, 2011 at 4:36 AM, Emmanuel Paradis emmanuel.para...@ird.fr mailto:emmanuel.para...@ird.fr wrote: Andy, Andy Rominger wrote on 09/04/2011 08:37: Hello, I'm working with a friend on a problem where we'd like to simulate many possible trees that all have the same topology, but differ in branch lengths (or equivalently in node ages). We have a known age for the root, and all tips correspond to extant taxa. So the start and end times are known. The code to do this is in rcoal() except that a random topology is generated. It seems several users are interested in assigning random branching times to a fixed topology, so I'll add this. compute.brlen() can generate random branch lengths on a fixed topology. What you want is a function that does the opposite operation of branching.times(). I can see three possible ways to implement this: 1) add a new option to compute.brlen() 2) add a new option to rcoal() 3) write a new function, eg, compute.brtime() BTW, rcoal() can be used with fixed branching intervals (ie, computed beforehand) where a random coalescent topology is generated; so option 2 makes sense to me. Any suggestion? I'm wondering if a function to produce random branch lengths under topological constraints already exists? If not, I'm trying to dredge up my foggy knowledge of stochastic processes to write the simulation myself. I'm thinking that maybe I remember under a Yule process sojourn times are distributed exponential so event times should be uniformly distributed??? But then I got pointed toward Yule priors which I still don't satisfactorily understand (but maybe eventually will) after a few web searches. Once you'll have the above tool (soon I hope), then you'll be able to explore this issue more freely. Be careful to distinguish branching times and branching intervals. (I'm just seeing that the help page of rcoal is not clear on this point: I'll clarify this.) If you simulate the latter, you get the former with cumsum(). ape has an internal function that computes the theoretical CDF of branching times under any birth-death process. eg: CDF.birth.death(.1, 0, Tmax = 10, x = 0:100/10, case = 1) will return the cumulative probability of the branching times 0, 0.1, 0.2, ..., 10 for lambda = 0.1 and mu = 0 (thus a Yule process). From this you can simulate a set of random branching times with the inverse method (generate runif(n) and back transform from the CDF). The literature reference is in ?bd.time. Of course, you can use another formula and compare the results with standard plots in R. As a reminder, rcoal() generates coalescent intervals with: 2 * rexp(n - 1)/(as.double(n:2) * as.double((n - 1):1)) You may also have a look at the package TreeSim which has algorithms contributed by Tanja Stadler. Cheers, Emmanuel Any insights would be greatly appreciated. Thanks in advance-- Andy [[alternative HTML version deleted]] ___ R-sig-phylo mailing list R-sig-phylo@r-project.org mailto:R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo -- Emmanuel Paradis IRD, Jakarta, Indonesia http://ape.mpl.ird.fr/ -- Emmanuel Paradis IRD, Jakarta, Indonesia http://ape.mpl.ird.fr/ ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] Simulate node ages given known topology
Andy, Andy Rominger wrote on 09/04/2011 08:37: Hello, I'm working with a friend on a problem where we'd like to simulate many possible trees that all have the same topology, but differ in branch lengths (or equivalently in node ages). We have a known age for the root, and all tips correspond to extant taxa. So the start and end times are known. The code to do this is in rcoal() except that a random topology is generated. It seems several users are interested in assigning random branching times to a fixed topology, so I'll add this. compute.brlen() can generate random branch lengths on a fixed topology. What you want is a function that does the opposite operation of branching.times(). I can see three possible ways to implement this: 1) add a new option to compute.brlen() 2) add a new option to rcoal() 3) write a new function, eg, compute.brtime() BTW, rcoal() can be used with fixed branching intervals (ie, computed beforehand) where a random coalescent topology is generated; so option 2 makes sense to me. Any suggestion? I'm wondering if a function to produce random branch lengths under topological constraints already exists? If not, I'm trying to dredge up my foggy knowledge of stochastic processes to write the simulation myself. I'm thinking that maybe I remember under a Yule process sojourn times are distributed exponential so event times should be uniformly distributed??? But then I got pointed toward Yule priors which I still don't satisfactorily understand (but maybe eventually will) after a few web searches. Once you'll have the above tool (soon I hope), then you'll be able to explore this issue more freely. Be careful to distinguish branching times and branching intervals. (I'm just seeing that the help page of rcoal is not clear on this point: I'll clarify this.) If you simulate the latter, you get the former with cumsum(). ape has an internal function that computes the theoretical CDF of branching times under any birth-death process. eg: CDF.birth.death(.1, 0, Tmax = 10, x = 0:100/10, case = 1) will return the cumulative probability of the branching times 0, 0.1, 0.2, ..., 10 for lambda = 0.1 and mu = 0 (thus a Yule process). From this you can simulate a set of random branching times with the inverse method (generate runif(n) and back transform from the CDF). The literature reference is in ?bd.time. Of course, you can use another formula and compare the results with standard plots in R. As a reminder, rcoal() generates coalescent intervals with: 2 * rexp(n - 1)/(as.double(n:2) * as.double((n - 1):1)) You may also have a look at the package TreeSim which has algorithms contributed by Tanja Stadler. Cheers, Emmanuel Any insights would be greatly appreciated. Thanks in advance-- Andy [[alternative HTML version deleted]] ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo -- Emmanuel Paradis IRD, Jakarta, Indonesia http://ape.mpl.ird.fr/ ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
[R-sig-phylo] Simulate node ages given known topology
Hello, I'm working with a friend on a problem where we'd like to simulate many possible trees that all have the same topology, but differ in branch lengths (or equivalently in node ages). We have a known age for the root, and all tips correspond to extant taxa. So the start and end times are known. I'm wondering if a function to produce random branch lengths under topological constraints already exists? If not, I'm trying to dredge up my foggy knowledge of stochastic processes to write the simulation myself. I'm thinking that maybe I remember under a Yule process sojourn times are distributed exponential so event times should be uniformly distributed??? But then I got pointed toward Yule priors which I still don't satisfactorily understand (but maybe eventually will) after a few web searches. Any insights would be greatly appreciated. Thanks in advance-- Andy [[alternative HTML version deleted]] ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo