Thanks, Klaus. I was unaware of node.depth.edgelength; I believe I use
dist.nodes to calculate root-to-tip distances in a number of functions
in paleotree, so this could mean improvement potentially to a large
number of functions. I'll have to do a global search for dist.nodes.

Cheers,
-Dave

On Tue, Oct 20, 2015 at 1:32 PM, Klaus Schliep <[email protected]> wrote:
> Hi Gustavo & David,
> a tiny improvement I missed another sapply.
> You can do the same trick replacing dist.nodes in dropExtinct if needed.
> Klaus
>
> On Tue, Oct 20, 2015 at 3:21 PM, Gustavo Burin Ferreira
> <[email protected]> wrote:
>>
>> Hi Klaus,
>>
>> the function works perfectly! Thank you very much!
>>
>> Best,
>>
>> Gustavo Burin Ferreira, Msc.
>> Instituto de Biociências
>> Universidade de São Paulo
>> Tel: (11) 98525-8948
>>
>> On Tue, Oct 20, 2015 at 5:15 PM, Klaus Schliep <[email protected]>
>> wrote:
>>>
>>> Hi Gustavo & David,
>>>
>>> I attached a file that contains a function timeSliceTree2, which is a
>>> replacement for timeSliceTree.
>>> I replaced
>>> dist.nodes(tree)[, Ntip(tree) + 1]
>>> with
>>> node.depth.edgelength(tree)
>>> dist.nodes computes a matrix (roughly nTips^2 * 4 * 8 bytes which can get
>>> very large for many taxa)
>>> whereas node.depth.edgelength just the vector the function needs. This
>>> caused the memory problem.
>>> I also simplified a unnecessary sapply and put prop.part out of a loop.
>>> So it needs less memory and is also much faster for large trees. The
>>> results should be exactly the same.
>>>
>>> Cheers,
>>> Klaus
>>>
>>>
>>>
>>> > source("timeSliceTree.R")
>>> > library(paleotree)
>>> > set.seed(123)
>>> > tree = rtree(2000)
>>> system.time(tree1 <-
>>> timeSliceTree(tree,sliceTime=5,plot=FALSE,drop.extinct=FALSE))
>>> Warning: no ttree$root.time! Assuming latest tip is at present (time=0)
>>>    user  system elapsed
>>>   7.416   0.033   7.454
>>> > system.time(tree2 <-
>>> > timeSliceTree2(tree,sliceTime=5,plot=FALSE,drop.extinct=FALSE))
>>> Warning: no ttree$root.time! Assuming latest tip is at present (time=0)
>>>    user  system elapsed
>>>   0.147   0.003   0.151
>>>
>>>
>>>
>>>
>>> On Tue, Oct 20, 2015 at 11:49 AM, Gustavo Burin Ferreira
>>> <[email protected]> wrote:
>>>>
>>>> Hey David and Nick,
>>>>
>>>> thanks a lot for the quick responses! I think I wasn't very clear in the
>>>> first e-mail. What I get is actually an error from within dist.nodes,
>>>> not
>>>> when calling it.
>>>>
>>>> I've tried to use chainsaw2 and in the beginning it appeared to be
>>>> working
>>>> quite well. However after some running time, I get the same (original)
>>>> error that motivated me writing to the list:
>>>>
>>>>
>>>> > *Error in double(nm * nm) : vector size cannot be NA*
>>>> > *In addition: Warning message:**In nm * nm : NAs produced by integer
>>>> > overflow*
>>>>
>>>>
>>>> Digging into the functions called within chainsaw2, I found that at some
>>>> point it uses the function get_max_height_tree, that calls dist.nodes
>>>> and
>>>> that's where I think the problem lies. The error I got now is almost
>>>> exactly the same as I got from timeSliceTree (because both cases use
>>>> dist.nodes):
>>>>
>>>>
>>>> > *dist.nodes*
>>>> > *function (x) *
>>>> > *{*
>>>> > *    x <- reorder(x)*
>>>> > *    n <- Ntip(x)*
>>>> > *    m <- x$Nnode*
>>>> > *    nm <- n + m*
>>>> > *    d <- .C(dist_nodes, as.integer(n), as.integer(m),
>>>> > as.integer(x$edge[, *
>>>> > *        1] - 1L), as.integer(x$edge[, 2] - 1L),
>>>> > as.double(x$edge.length), *
>>>> > *        as.integer(Nedge(x)), double(nm * nm), NAOK = TRUE)[[7]]*
>>>> > *    dim(d) <- c(nm, nm)*
>>>> > *    dimnames(d) <- list(1:nm, 1:nm)*
>>>> > *    d**}*
>>>>
>>>>
>>>> I tried changing the highlighted part to something like
>>>> double(as.numeric(nm) * as.numeric(nm)), and when I try running it, I
>>>> get
>>>> the error I wrote on the first e-mail:
>>>>
>>>>
>>>> > *Error in dist.nodes(tree) (from #7) : **  long vectors (argument 7)
>>>> > are
>>>> > not supported in .Fortran*
>>>>
>>>>
>>>> Thus, I think that to solve this problem some tweak in the C/Fortran
>>>> code
>>>> that is called within dist.nodes (from ape) might be required, but I
>>>> have
>>>> no expertise on that. So if someone can help me with that, I'll
>>>> appreciate
>>>> it!
>>>>
>>>> Thanks again for the help so far!
>>>>
>>>> Best,
>>>>
>>>>
>>>> *Gustavo Burin Ferreira, **Msc.*
>>>>
>>>> Instituto de Biociências
>>>> Universidade de São Paulo
>>>> Tel: (11) 98525-8948
>>>>
>>>> On Fri, Oct 16, 2015 at 5:06 PM, Nick Matzke <[email protected]> wrote:
>>>>
>>>> > Hi!  I re-did chainsaw at some point, now there is chainsaw2.
>>>> > However,
>>>> > googling that gets you horror movies, so here is a link with example
>>>> > code:
>>>> >
>>>> > https://groups.google.com/d/msg/biogeobears/Jy9uYckOL7s/XuNZ0B3jAwAJ
>>>> >
>>>> > (the discussion there points out a rare case where this crashes, but
>>>> > for
>>>> > most trees it should work fine)
>>>> >
>>>> > Cheers, Nick
>>>> >
>>>> > On Fri, Oct 16, 2015 at 2:17 PM, David Bapst <[email protected]>
>>>> > wrote:
>>>> >
>>>> > > Hi Gustavo,
>>>> > >
>>>> > > I'm paleotree's author and maintainer. Just to be clear that I
>>>> > > understand your problem, I believe you are saying that when you use
>>>> > > timeSliceTree, you are getting an error that the internal call to
>>>> > > dist.nodes is failing? Is that right?
>>>> > >
>>>> > > The first thought I have is that maybe the solution here is to avoid
>>>> > > dist.nodes, as it is somewhat overkill. I use dist.nodes in that
>>>> > > code,
>>>> > > which I wrote in 2011, to get the distance of tips and nodes from
>>>> > > the
>>>> > > root. A better solution may now exist in another R package. I'd have
>>>> > > to investigate (although maybe someone on the list can suggest one).
>>>> > >
>>>> > > The second thought I have is that there might be alternative
>>>> > > functions
>>>> > > that do something lie timeSliceTree in another R package. Off the
>>>> > > top
>>>> > > of my head, I recall that Nick Matzke had a similar, 'chainsaw'
>>>> > > function, which you can find here and appears not to call
>>>> > > dist.nodes:
>>>> > >
>>>> > > https://stat.ethz.ch/pipermail/r-sig-phylo/2011-July/001483.html
>>>> > >
>>>> > > Again, maybe someone on the list knows of a good alternative
>>>> > > function.
>>>> > >
>>>> > > I'll try to give this more thought, but for now, maybe see if you
>>>> > > can
>>>> > > use Nick's function succesfully. Overall though, I've discovered the
>>>> > > use of truly gigantic trees can often run into unexpected problems.
>>>> > >
>>>> > > Cheers,
>>>> > > -Dave
>>>> > >
>>>> > >
>>>> > >
>>>> > > On Fri, Oct 16, 2015 at 12:47 PM, Gustavo Burin Ferreira
>>>> > > <[email protected]> wrote:
>>>> > > > Dear list,
>>>> > > >
>>>> > > > I'm trying to perform a time travel in simulated phylogenies with
>>>> > > > both
>>>> > > > extant and extinct species using the timeSliceTree function form
>>>> > > > the
>>>> > > > paleotree package. My aim is to have the molecular phylogenies
>>>> > > > derived
>>>> > > from
>>>> > > > the complete phylogeny (attached) in different points in time.
>>>> > > >
>>>> > > > However, when I try that with big trees (bigger than 20000 tips
>>>> > total), I
>>>> > > > get an error of integer overflow coming from the dist.nodes
>>>> > > > function.
>>>> > > After
>>>> > > > slightly tweaking the dist.nodes function (changing nm from
>>>> > > > integer to
>>>> > > > numeric/double), I get the following message:
>>>> > > >
>>>> > > > Error in dist.nodes(tree) (from #7) :
>>>> > > >   long vectors (argument 7) are not supported in .Fortran
>>>> > > >
>>>> > > > Since I don't know much about C or Fortran, I couldn't find a way
>>>> > > > of
>>>> > > solving
>>>> > > > this by myself, so any help will be greatly appreciated.
>>>> > > >
>>>> > > > I'm sending one tree attached for example.
>>>> > > >
>>>> > > > Thank you very much in advance!
>>>> > > >
>>>> > > > Best,
>>>> > > >
>>>> > > > Gustavo Burin Ferreira, Msc.
>>>> > > > Instituto de Biociências
>>>> > > > Universidade de São Paulo
>>>> > > > Tel: (11) 98525-8948
>>>> > > >
>>>> > > > _______________________________________________
>>>> > > > R-sig-phylo mailing list - [email protected]
>>>> > > > https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
>>>> > > > Searchable archive at
>>>> > > http://www.mail-archive.com/[email protected]/
>>>> > >
>>>> > >
>>>> > >
>>>> > > --
>>>> > > David W. Bapst, PhD
>>>> > > Adjunct Asst. Professor, Geology and Geol. Eng.
>>>> > > South Dakota School of Mines and Technology
>>>> > > 501 E. St. Joseph
>>>> > > Rapid City, SD 57701
>>>> > >
>>>> > > http://webpages.sdsmt.edu/~dbapst/
>>>> > > http://cran.r-project.org/web/packages/paleotree/index.html
>>>> > >
>>>> > > _______________________________________________
>>>> > > R-sig-phylo mailing list - [email protected]
>>>> > > https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
>>>> > > Searchable archive at
>>>> > > http://www.mail-archive.com/[email protected]/
>>>> >
>>>> >         [[alternative HTML version deleted]]
>>>> >
>>>> > _______________________________________________
>>>> > R-sig-phylo mailing list - [email protected]
>>>> > https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
>>>> > Searchable archive at
>>>> > http://www.mail-archive.com/[email protected]/
>>>> >
>>>>
>>>>         [[alternative HTML version deleted]]
>>>>
>>>> _______________________________________________
>>>> R-sig-phylo mailing list - [email protected]
>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
>>>> Searchable archive at
>>>> http://www.mail-archive.com/[email protected]/
>>>
>>>
>>>
>>>
>>> --
>>> Klaus Schliep
>>> Postdoctoral Fellow
>>> Revell Lab, University of Massachusetts Boston
>>>
>>
>
>
>
> --
> Klaus Schliep
> Postdoctoral Fellow
> Revell Lab, University of Massachusetts Boston
>



-- 
David W. Bapst, PhD
Adjunct Asst. Professor, Geology and Geol. Eng.
South Dakota School of Mines and Technology
501 E. St. Joseph
Rapid City, SD 57701

http://webpages.sdsmt.edu/~dbapst/
http://cran.r-project.org/web/packages/paleotree/index.html

_______________________________________________
R-sig-phylo mailing list - [email protected]
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/[email protected]/

Reply via email to