[R-sig-phylo] Random sampling of branch lengths
Hi, I have a problem that I am not entirely sure how to code appropriately, and any help here would be gratefully appreciated. Imagine we have a tree consisting only of a topology, for example this newick format tree: '(A,(((B,C),D),E));'. Now, consider that we wish to explore how variation in branch lengths on this topology influences the results of a particular comparative method. One way to do this would be to generate lots of trees with randomly sampled branch lengths and look to see whether the conclusions drawn are strongly influenced by the branch lengths chosen. In a simple case where ultrametric trees are unnecessary or at least not considered important then we could just randomly sample edge lengths from a given distribution, but this poses some problems when ultrametric trees are desired. I've attached a figure of the example tree given above for ease of explanation. If the total tree height is some arbitrary value (let's say 1), then the edge length of the branch leading to A is always going to be 1. If there was a sister species to A (let's call it A') along this branch then it would also be easy to do - simply randomly sample a value between 0 and 1 (call this x), assign that to both branches leading the common ancestor (A,A') to A and A', and then assign the edge length from the root to (A,A') as 1-x. But my problem is thinking how to code a general case for the whole tree, incorporating two constraints. Firstly, that the sum of all branches leading from each species to root is 1. Secondly, that branch lengths are dependent on those of their sister taxon. To illustrate the last point consider that the edges of the branches leading from common ancestor (B,C) to each of B and C is 0.2. This automatically constrains the branch leading to D (call it y) to be greater than 0.2 and also that the length of the branch leading from (B,C,D) to (B,C) must equal y-0.2. These dependencies make me unsure how to code such a function to 'randomly' sample branch lengths to generate an ultrametric tree. As an additional complication, though not an absolutely necessary one, consider that we have information about some node ages. For instance, say we know that the age of node (B,C,D,E) is 0.7. How could I incorporate these additional (user-specified) constraints into such a function as I'm looking for? I am currently using the compute.brtime function in ape to obtain branch lengths from a coalescent model, but these are often biased towards 'tippy' trees and so may not be a representative sample of the possible tree space. Am I missing something in the form of an existing function, or does anyone have any code that could do what I'm asking? I hope I have explained myself clearly enough. Cheers, Kev ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] Random sampling of branch lengths
Hi, compute.brtime() can use a set of pre-calculated branching times (the default is indeed a set of coalescent times), for instance: tr - read.tree(text = (A,(((B,C),D),E));) compute.brtime(tr, method = 4:1/4) will produce a tree with root time = 1 and regular inter-node times. Best, Emmanuel Le 25/09/2014 13:43, Arbuckle, Kevin a écrit : Hi, I have a problem that I am not entirely sure how to code appropriately, and any help here would be gratefully appreciated. Imagine we have a tree consisting only of a topology, for example this newick format tree: '(A,(((B,C),D),E));'. Now, consider that we wish to explore how variation in branch lengths on this topology influences the results of a particular comparative method. One way to do this would be to generate lots of trees with randomly sampled branch lengths and look to see whether the conclusions drawn are strongly influenced by the branch lengths chosen. In a simple case where ultrametric trees are unnecessary or at least not considered important then we could just randomly sample edge lengths from a given distribution, but this poses some problems when ultrametric trees are desired. I've attached a figure of the example tree given above for ease of explanation. If the total tree height is some arbitrary value (let's say 1), then the edge length of the branch leading to A is always going to be 1. If there was a sister species to A (let's call it A') along this branch then it would also be easy to do - simply randomly sample a value between 0 and 1 (call this x), assign that to both branches leading the common ancestor (A,A') to A and A', and then assign the edge length from the root to (A,A') as 1-x. But my problem is thinking how to code a general case for the whole tree, incorporating two constraints. Firstly, that the sum of all branches leading from each species to root is 1. Secondly, that branch lengths are dependent on those of their sister taxon. To illustrate the last point consider that the edges of the branches leading from common ancestor (B,C) to each of B and C is 0.2. This automatically constrains the branch leading to D (call it y) to be greater than 0.2 and also that the length of the branch leading from (B,C,D) to (B,C) must equal y-0.2. These dependencies make me unsure how to code such a function to 'randomly' sample branch lengths to generate an ultrametric tree. As an additional complication, though not an absolutely necessary one, consider that we have information about some node ages. For instance, say we know that the age of node (B,C,D,E) is 0.7. How could I incorporate these additional (user-specified) constraints into such a function as I'm looking for? I am currently using the compute.brtime function in ape to obtain branch lengths from a coalescent model, but these are often biased towards 'tippy' trees and so may not be a representative sample of the possible tree space. Am I missing something in the form of an existing function, or does anyone have any code that could do what I'm asking? I hope I have explained myself clearly enough. Cheers, Kev ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] Random sampling of branch lengths
Just to follow up on Emmanuel's point, something like this: tr - read.tree(text = (A,(((B,C),D),E));) bt-rexp(n=tr$Nnode,rate=Ntip(tr):2) tree-compute.brtime(tr,method=cumsum(bt)[length(bt):1]) plotTree(tree) (I believe) would give you internode times as if the tree arose under a pure-birth (i.e., Yule) process All the best, Liam Liam J. Revell, Assistant Professor of Biology University of Massachusetts Boston web: http://faculty.umb.edu/liam.revell/ email: liam.rev...@umb.edu blog: http://blog.phytools.org On 9/26/2014 12:51 AM, Emmanuel Paradis wrote: Hi, compute.brtime() can use a set of pre-calculated branching times (the default is indeed a set of coalescent times), for instance: tr - read.tree(text = (A,(((B,C),D),E));) compute.brtime(tr, method = 4:1/4) will produce a tree with root time = 1 and regular inter-node times. Best, Emmanuel Le 25/09/2014 13:43, Arbuckle, Kevin a écrit : Hi, I have a problem that I am not entirely sure how to code appropriately, and any help here would be gratefully appreciated. Imagine we have a tree consisting only of a topology, for example this newick format tree: '(A,(((B,C),D),E));'. Now, consider that we wish to explore how variation in branch lengths on this topology influences the results of a particular comparative method. One way to do this would be to generate lots of trees with randomly sampled branch lengths and look to see whether the conclusions drawn are strongly influenced by the branch lengths chosen. In a simple case where ultrametric trees are unnecessary or at least not considered important then we could just randomly sample edge lengths from a given distribution, but this poses some problems when ultrametric trees are desired. I've attached a figure of the example tree given above for ease of explanation. If the total tree height is some arbitrary value (let's say 1), then the edge length of the branch leading to A is always going to be 1. If there was a sister species to A (let's call it A') along this branch then it would also be easy to do - simply randomly sample a value between 0 and 1 (call this x), assign that to both branches leading the common ancestor (A,A') to A and A', and then assign the edge length from the root to (A,A') as 1-x. But my problem is thinking how to code a general case for the whole tree, incorporating two constraints. Firstly, that the sum of all branches leading from each species to root is 1. Secondly, that branch lengths are dependent on those of their sister taxon. To illustrate the last point consider that the edges of the branches leading from common ancestor (B,C) to each of B and C is 0.2. This automatically constrains the branch leading to D (call it y) to be greater than 0.2 and also that the length of the branch leading from (B,C,D) to (B,C) must equal y-0.2. These dependencies make me unsure how to code such a function to 'randomly' sample branch lengths to generate an ultrametric tree. As an additional complication, though not an absolutely necessary one, consider that we have information about some node ages. For instance, say we know that the age of node (B,C,D,E) is 0.7. How could I incorporate these additional (user-specified) constraints into such a function as I'm looking for? I am currently using the compute.brtime function in ape to obtain branch lengths from a coalescent model, but these are often biased towards 'tippy' trees and so may not be a representative sample of the possible tree space. Am I missing something in the form of an existing function, or does anyone have any code that could do what I'm asking? I hope I have explained myself clearly enough. Cheers, Kev ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/