Yes, I agree. How does one conceptually achieve polymorphic behavior without instantiating hundreds of thousands of instances? Perhaps one way around this is to represent the data in an efficient R way -- i.e. a data.frame -- and create a set of re-usable singleton instances, one per node type. To perform a polymorphic operation on a node, the singleton for that node's type gets assigned to that node in the tree. But behavior such as node$parent() or node$child(1) will require a small pool of these singletons, since the caller ends up holding more than one node at a time. Doable, I think.
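A rough sketch of what I have in mind, not a worked-out design. The node table, its columns (type, parent, value), the class names, the weight() method and the at() helper are all made up for illustration; only setRefClass() itself is real API.

```r
library(methods)

# Hypothetical tree: one row per node; 'parent' is a row index (NA for the root).
nodes <- data.frame(
  type   = c("branch", "leaf", "leaf"),
  parent = c(NA, 1L, 1L),
  value  = c(0, 2.5, 4),
  stringsAsFactors = FALSE
)

# Base handler: its only state is the row index it currently stands in for.
NodeHandler <- setRefClass("NodeHandler",
  fields  = list(idx = "integer"),
  methods = list(
    children = function(nodes) which(nodes$parent == idx)
  )
)

# Type-specific behavior lives in subclasses (assumed node types).
LeafHandler <- setRefClass("LeafHandler", contains = "NodeHandler",
  methods = list(
    weight = function(nodes) nodes$value[idx]
  )
)
BranchHandler <- setRefClass("BranchHandler", contains = "NodeHandler",
  methods = list(
    # Sum over direct children only, to keep the sketch short.
    weight = function(nodes) sum(nodes$value[nodes$parent %in% idx])
  )
)

# One singleton per node type, created once and re-used for every row.
handlers <- list(leaf = LeafHandler$new(), branch = BranchHandler$new())

# Point the singleton for row i's type at row i and hand it back.
at <- function(i) {
  h <- handlers[[nodes$type[i]]]
  h$idx <- as.integer(i)
  h
}

at(1)$weight(nodes)  # dispatches to BranchHandler
at(2)$weight(nodes)  # dispatches to LeafHandler
```

The catch is visible right away: a <- at(2); b <- at(3) leaves a and b as the same object, both pointing at row 3. That is why anything that returns a second node while the first is still in use needs a small pool rather than a single instance per type.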
PS. FWIW, I found another strike against the "massive tree of refClass instances". It's save(). save() appears to unnecessarily expand/duplicate refClass structures. Write time becomes prohibitive and loading in the data structure again results in a far greater memory usage.

On May 3, 2013, at 9:47 AM, Jeff Newmiller wrote:

> Interesting conclusion. Alternatively, that representation of your object
> model may not be computationally effective. This discrepancy may be less
> exaggerated in C++, but you may still find that large numbers of objects are
> less efficient in their use of memory or cpu time than vector processing even
> there. I would read the point of Martin's response as "Don't confuse your
> mental model of the solution with its implementation".
> ---------------------------------------------------------------------------
> Jeff Newmiller                        The     .....       .....  Go Live...
> DCN:<jdnew...@dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
>                                       Live:   OO#.. Dead: OO#..  Playing
> Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
> /Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
> ---------------------------------------------------------------------------
> Sent from my phone. Please excuse my brevity.
>
> David Kulp <dk...@fiksu.com> wrote:
>
>> Good tip. Thanks Morgan.
>> I agree that a different structure might (necessarily) be in order. I
>> wanted to create a tree where nodes in a tree were of different derived
>> sub-classes -- possibly holding more data and behaving polymorphically.
>> OO programming seemed ideal for this: lots of small things with
>> specialized behavior -- but this isn't R's strength.
>>
>> On May 2, 2013, at 4:57 PM, Martin Morgan wrote:
>>
>>> On 05/01/2013 11:20 AM, David Kulp wrote:
>>>> I'm using refClass for a complex multi-directional tree structure with
>>>> possibly 100,000s of nodes.
>>>> The refClass design is very impressive and I'd love to use it, but
>>>> I've found that the size of refClass instances are very large and
>>>> creation time is slow. For example, below is a RefClass and normal S4
>>>> class. The RefClass requires about 4KB per instance vs 500B for the S4
>>>> class -- based on adding the Ncells and Vcells of used memory reported
>>>> by gc(). And instantiation is more than twice as slow for a RefClass.
>>>> (R 2.14.2)
>>>>
>>>> Anyone have thoughts on this and whether there's any hope for
>>>> improving resources on either front?
>>>
>>> Hi David -- not necessarily helpful but creating a few large objects is
>>> always better than creating many small in R, so perhaps re-conceptualize
>>> your data structure? As a rough analogy, instead of constructing a graph
>>> as a large number of 'Node' instances each pointing to one another, a
>>> graph could be represented as a data.frame containing columns of 'from'
>>> and 'to' indexes (neighbour-edge list, a few large objects) or as an
>>> adjacency matrix. One would also implement creation and update of the
>>> few large objects in an R-friendly (vectorized) way.
>>>
>>> Perhaps there are existing packages that already model the data you're
>>> interested in? If your multi-directional tree can be represented as a
>>> graph, then perhaps
>>>
>>> http://bioconductor.org/packages/release/bioc/html/graph.html
>>>
>>> including facilities in the Boost graph library (RBGL, on the
>>> Bioconductor web site, too) or the igraph package can be put to use.
>>>
>>> Martin
>>>
>>>>
>>>> I wonder what others are doing. I've been thinking about lightweight
>>>> alternative implementations, but nothing particularly elegant has come
>>>> to mind, yet!
>>>>
>>>> Thanks!
>>>>
>>>>
>>>> simple <- setRefClass('simple', fields = list(a = "character", b="numeric"))
>>>> gc()
>>>> system.time(simple.list <- lapply(1:100000, function(i) {
>>>>   simple$new(a='foo',b=i) }))
>>>> gc()
>>>>
>>>> setClass('simple2', representation(a="character",b="numeric"))
>>>> setMethod("initialize", "simple2", function(.Object, a, b) {
>>>>   .Object@a <- a
>>>>   .Object@b <- b
>>>>   .Object })
>>>>
>>>> gc()
>>>> system.time(simple2.list <- lapply(1:100000, function(i) {
>>>>   new('simple2',a='foo',b=i) }))
>>>> gc()
>>>>
>>>> ______________________________________________
>>>> R-help@r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>
>>> --
>>> Computational Biology / Fred Hutchinson Cancer Research Center
>>> 1100 Fairview Ave. N.
>>> PO Box 19024 Seattle, WA 98109
>>>
>>> Location: Arnold Building M1 B861
>>> Phone: (206) 667-2793
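
For reference, Martin's 'from'/'to' suggestion quoted above could look roughly like the following. This is only a sketch on a made-up six-node tree; the names edges, children and parent are mine, not from any package.

```r
# Hypothetical tree stored as two parallel integer columns, built in one call
# rather than node by node.
edges <- data.frame(from = c(1L, 1L, 2L, 2L, 3L),
                    to   = c(2L, 3L, 4L, 5L, 6L))

# Children of every node at once: split the 'to' column by 'from'.
# The result is a list keyed by the parent's index as a string.
children <- split(edges$to, edges$from)

# Parent of every node as a single integer vector (NA for the root).
n <- max(edges$from, edges$to)
parent <- rep(NA_integer_, n)
parent[edges$to] <- edges$from

children[["2"]]  # children of node 2
parent[6]        # parent of node 6
```

Both lookups are plain vector indexing, so the whole tree stays in a handful of large objects no matter how many nodes it has.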