[R-sig-phylo] ML tree search using continuous characters

2018-01-11 Thread William Gelnaw
Hi everyone,
   I was wondering if anyone knows a good R function or package for doing a
maximum likelihood search for a phylogenetic tree using continuous
characters.  I'm aware of Rcontml, but I'm looking for something that I can
change and modify the individual steps in the search process rather than
calling out to another program.  I am currently using the script below,
which I modified from Paradis' ML search using molecular data.  The problem
that I'm running into, is that lots of topologies are getting higher
likelihood scores than the tree that I'm using to simulate the characters.
I'm using rtree to generate a random tree and using fastBM to simulate the
characters with a Brownian Motion model.  Using ace, the topology that I
used to generate the data doesn't even come close to being the most likely
topology.  If anyone has suggestions, I am keen to hear them.
   Best regards,
  Will Gelnaw

Here's the script I've been using:

tree<-rtree(n=40)
dat<-fastBM(tree, nsim=20)


## calculates the sum of the log likelihoods of each of the characters
sumloglik<-function(tree,dat,dattype="continuous"){
loglik<-vector()
nchar<-ncol(dat)
for(i in 1:nchar)
loglik[i]<-ace(dat[,i],tree,type=dattype)$resloglik
sumlik<-sum(loglik)
return(sumlik)}

## uses the above sumloglik function to do the tree search
bigtreesearch<-function(dat,dattype="continuous",logreps=200){
#search breadth is the fraction of the tree that will be changed during
NNI transformations
#0.05 means that 5% of the branches are moved during the NNI shift
test_tree<-rtree(n=nrow(dat),tip.label=row.names(dat))
 ntips<-nrow(dat)
 print("iteration 1")
 write.tree(test_tree,"tree_log.tre", append=TRUE)
#appends the current tree to the tree log and creates a new log if none
exists
 loglik<-vector();
 loglik[1]<-sumloglik(test_tree, dat,dattype="continuous");
 print(loglik[1])
 i<-2  ## sets up the number of rounds of improvements
 m<-1  ## sets up the number of iterations that it tries to do a search
 while (i<=logreps){
 iteration<-paste("iteration", i)
 print(iteration)
 TR2<-rNNI(test_tree,moves=1,n=1)
 d=0.1  ## sets up the amount of random noise added to the branch lengths
 if (m>14) d=0.2  ## as the analysis tests out alternatives, it makes
progressively larger and larger changes
 if (m>29) d=0.3
 if (m>34) d=0.4
 TR2$edge.length<-TR2$edge.length+rnorm(2*ntips-2,sd=d)
 TR2$edge.length[TR2$edge.length<=0]<-0.001
   #above 5 lines add noise to the branch lengths
   #negative branch lengths are eliminated by changing 0-length or negative
branch lengths equal to a small size.
 loglik[i]<-sumloglik(TR2,dat,"continuous")
 print(loglik[i])
 accept<- {if((loglik[i])>(loglik[i-1])) TRUE
 else FALSE}  ##tests for improvement from previous test
  if(accept){
  test_tree<-TR2
  write.tree(tree,"tree_log.tre", append=TRUE)
 m<-1  ## resets m back to 1
 i<-i+1 } else {  ## advances to the next round of testing
   m<-m+1
   if (m>49) break}  ##if the process goes 19 iterations without
improvement, it quits
   }
 best_tree<-list()
 best_tree$tree<-test_tree
 char<-backcalc(test_tree,dat)
 best_tree$loglik<-loglik[i-1]
 return(best_tree)
 }

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


[R-sig-phylo] rayDISC model not running in corHMM

2018-01-11 Thread Arbuckle Kevin .
Hi all,



I hope someone can help me out here as I've been wracking my brain trying to 
figure out what the problem is. In short, I'm trying to fit a bunch of pathway 
models with the rayDISC function in corHMM, which I've done before with no 
issues. However, on this occasion the model just isn't running and returning 
the following error message:-



Error in 1:nchar(as.character(string)) : NA/NaN argument



This is not the most helpful message, particularly as no part of it seems to 
appear in the code for rayDISC so I assume it's an error from something that 
the function is calling itself. I have loaded the example data provided with 
the package and ran the same code I was using on the tree and an equivalent 
trait from the example data and this runs fine. I have tried comparing the 
structures of both the tree and data from the example to my own and I can't 
find a difference. There are a few NAs in my dataset but the function can 
handle this (a statement I verified by manually introducing NAs into the 
example data and it still runs). The tree has no polytomies or zero-length 
branches.



I've attached my tree and a highly simplified version of the dataset and code 
I've been using (trimmed down to leave a simple example that reproduces the 
error message), I'd be very grateful if someone can figure out what's going on 
here.



Best wishes,

Kev



Dr Kevin Arbuckle
Lecturer in Biosciences (Evolutionary Biology)
College of Science
Swansea University
Singleton Park
Swansea
SA2 8PP


data example.csv
Description: data example.csv


reproducible example.R
Description: reproducible example.R


tree.nwk
Description: tree.nwk
___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/