I do lots of analyses on large microarray data sets, so memory use and speed are both important issues for me. I have been trying to estimate the overheads associated with using formal S4 data objects instead of ordinary lists for large data objects. In some simple experiments (using R 1.7.1 in Windows 2000) with large but simple objects, it seems that giving a data object a formal class definition and using extractor and assignment functions may increase both the memory usage and the time taken by simple numeric operations several-fold.

Here is a test function which uses a list representation to add 1 to the elements of a long numeric vector:

addlist <- function(len,iter) {
   object <- list(x=rnorm(len))
   for (i in 1:iter) object$x <- object$x+1
   object
}

Typical times on my machine are:

> system.time(a <- addlist(10^6,10))
[1] 2.91 0.00 2.96   NA   NA
> system.time(addlist(10^7,10))
[1] 28.03  0.44 28.65    NA    NA

Here is a test function doing the same operation with a formal S4 data representation:

addS4 <- function(len,iter) {
  object <- new("MyClass",x=rnorm(len))
  for (i in 1:iter) x(object) <- x(object)+1
  object
}

The timing with len=10^6 increases to

> system.time(a <- addS4(10^6,10))
[1] 6.79 0.06 6.90   NA   NA

With len=10^7 the operation fails altogether due to insufficient memory after thrashing around with virtual memory for a very long time.

I guess I'm not surprised by the performance penalty with S4. My question is: is the performance penalty likely to be an ongoing feature of using S4 methods or will it likely go away in future versions of R?

Thanks
Gordon

Here are my S4 definitions:

setClass("MyClass",representation(x="numeric"))
setGeneric("x",function(object) standardGeneric("x"))
setMethod("x","MyClass",function(object) object@x)
setGeneric("x<-", function(object, value) standardGeneric("x<-"))
setReplaceMethod("x","MyClass",function(object,value) {object@x <- value; return(object)})
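
For anyone wanting to repeat the comparison, the following is a minimal self-contained sketch rather than the exact script used for the timings above. It assumes a reasonably current version of R. The names addSlot and peak_mb are hypothetical additions that are not part of the original tests: addSlot uses direct slot access (object@x) so that the cost of the generic accessor functions can be separated from the cost of the S4 object itself, and peak_mb estimates peak vector-heap usage from the "max used" column of gc(), assuming 8 bytes per Vcell.

```r
library(methods)  # provides setClass, setGeneric, new, etc.

setClass("MyClass", representation(x = "numeric"))
setGeneric("x", function(object) standardGeneric("x"))
setMethod("x", "MyClass", function(object) object@x)
setGeneric("x<-", function(object, value) standardGeneric("x<-"))
setReplaceMethod("x", "MyClass", function(object, value) {
  object@x <- value
  object
})

# List representation, as in the first test function
addlist <- function(len, iter) {
  object <- list(x = rnorm(len))
  for (i in seq_len(iter)) object$x <- object$x + 1
  object
}

# S4 representation using the extractor and replacement generics
addS4 <- function(len, iter) {
  object <- new("MyClass", x = rnorm(len))
  for (i in seq_len(iter)) x(object) <- x(object) + 1
  object
}

# Hypothetical variant: same S4 class, but direct slot access
addSlot <- function(len, iter) {
  object <- new("MyClass", x = rnorm(len))
  for (i in seq_len(iter)) object@x <- object@x + 1
  object
}

# Hypothetical helper: growth in peak vector-heap usage (in Mb) while f runs.
# gc(reset = TRUE) resets the "max used" statistics to the current usage.
peak_mb <- function(f, ...) {
  base <- gc(reset = TRUE)["Vcells", "used"]
  f(...)
  (gc()["Vcells", "max used"] - base) * 8 / 2^20
}

len <- 10^5   # kept small here; increase to 10^6 or 10^7 to match the tests above
iter <- 10
funs <- list(list = addlist, S4 = addS4, slot = addSlot)

# Elapsed seconds and approximate peak memory growth for each variant
elapsed <- sapply(funs, function(f) system.time(f(len, iter))[["elapsed"]])
peak    <- sapply(funs, function(f) peak_mb(f, len, iter))

print(rbind(elapsed = elapsed, peak_mb = peak))
```

If the "slot" row is close to the "list" row while the "S4" row is not, that would suggest the overhead comes mainly from the accessor generics rather than from the formal class. The figures will depend heavily on the R version and the machine.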


> version
            _
platform i386-pc-mingw32
arch     i386
os       mingw32
system   i386, mingw32
status
major    1
minor    7.1
year     2003
month    06
day      16
language R

______________________________________________
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-devel
