My code seems to be spending most of its time in assignment statements,
in some cases simple assignment of a model frame or model matrix.

Can anyone provide any insights into what's going on, or how to speed
things up?

For starters, is it possible that the reports are not accurate, or that
I am misreading them?  In R 3.0.1 (running under ESS):
 > Rprof(line.profiling=TRUE)
 > system.time(r <- totalEffect(dodata[[1]], dodata[[2]], 1:3, 4))
    user  system elapsed
  21.629   0.756  22.469
 > Rprof(NULL)
 > summaryRprof(lines="both")
 $by.self
                            self.time self.pct total.time total.pct
 box.R#158                       6.74    29.56      13.06     57.28
 simulator.multinomial.R#64      2.92    12.81       2.96     12.98
 simulator.multinomial.R#63      2.76    12.11       2.76     12.11
 box.R#171                       2.54    11.14       5.08     22.28
 simulator.d1.R#70               0.98     4.30       0.98      4.30
 simulator.d1.R#71               0.98     4.30       0.98      4.30
 densMap.R#42                    0.72     3.16       0.86      3.77
 "standardGeneric"               0.52     2.28      11.30     49.56
......

Here's some of the code, with comments at the line numbers
box.R:
                sp <- merge(sexpartner, data, by="studyidx")
                sp$y <- numFactor(sp$pEthnic)  #I think y is not used but must be present
                data(sims.c1[[k]]) <- sp    ###<<<<< line 158
                sp0 <- sp
                sp <- sim(sims.c1[[k]], i)
                ctable[[k]] <- update.c1(ctable[[k]], sp)
                if (is.null(i.c1.in)) {
                    i.c1.in <- match("pEthnic", colnames(sp0))
                    i.c1.out <- match(c("studyidx", "n", "pEthnic"), colnames(sp))
                }
                sp0 <- merge(sp0[,-i.c1.in], sp[,i.c1.out], by=c("studyidx", "n"))
                # d1
                sp0 <- sp0[sp0$pIsMale == 1,]
                # avoid lots of conversion warnings
                sp0$pEthnic <- factor(sp0$pEthnic, levels=partRaceLevels)
                data(sims.d1[[k]]) <- sp0    ###<<<<< line 171
                sp <- sim(sims.d1[[k]], i)
                dtable[[k]] <- update.d1(dtable[[k]], sp)
                rngstate[[k]] <- .Random.seed

The timing seems odd, since those 2 lines appear to do nothing except
invoke data<-. If data<- itself is slow, I would expect the time to be
charged to the data<- function (in a different file) and not to the
calling line.

In fact the other big time items are inside the data<- functions.
simulator.multinomial.R:

   setMethod("data<-", c("simulator.multinomial", "data.frame"),
          function(obj, value) {
    mf <- model.frame(obj@dataFormula, data=value)
    mf$iCluster <- fromOrig(obj@idmap, as.character(mf$studyidx))
    if (any(is.na(mf$iCluster)))
        stop("New studyidx--need to draw from meta distn")
    mm <- model.matrix(obj@modelFormula, data=mf)
    obj@data <- mf  ##<<< line 63
    obj@mm <- mm    ##<<< line 64
    return(obj)
})

The mm and data slots have type restrictions, but no other validation
tests.
setClass("simulator.multinomial",
         representation(fit="stanfit", idmap="sIDMap",
                        modelFormula="formula",
                        categories="ANY",  # could be factor or character
                        # categories should be in the order of their numeric codes in y
                        # cached results
                        coef="list",
                        data="data.frame",
                        dataFormula="formula",
                        mm="matrix"))
Does it matter that, e.g., a model frame is more than a vanilla data frame?

I thought assignment, given R's lazy copying behavior, was essentially
resetting a pointer, and so should be fast.
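
One way to test that assumption would be tracemem(), which reports when
a traced object is duplicated. The following is only a minimal sketch:
the class "Holder" and the objects "big" and "h" are hypothetical
stand-ins, not the real simulator.multinomial objects, and tracemem()
only works if R was built with memory profiling enabled.

```r
library(methods)

# Hypothetical stand-in class; NOT the real simulator.multinomial.
setClass("Holder", representation(data = "data.frame"))

big <- data.frame(x = runif(1e5))
h <- new("Holder", data = data.frame(x = numeric(0)))

# tracemem() is only available when R was built with memory profiling.
can_trace <- capabilities("profmem")
if (can_trace) tracemem(big)

# Slot assignment: a "tracemem[...]" line is printed here only if
# 'big' itself gets duplicated as part of the assignment.
h@data <- big

# Modifying 'big' afterwards must not change the value stored in the
# slot, so R has to copy at this point at the latest.
big$x[1] <- 0

if (can_trace) untracemem(big)
```

If no tracemem line shows up at the slot assignment, the value is being
shared rather than copied there, and the time must be going somewhere
else.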

Or maybe the time is going to garbage collecting the previous contents
of the slots?
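
A rough way to check the garbage collection theory would be to compare
gc.time() before and after the slow call. This is only a sketch: the
loop below is a made-up workload standing in for the totalEffect() call,
which is not reproduced here.

```r
# gc.time() returns cumulative time spent in garbage collection:
# user, system, elapsed, plus two entries for child processes.
gc_before <- gc.time()

# Hypothetical workload standing in for the real totalEffect() call:
# repeatedly replace a large object so the old contents become garbage.
x <- NULL
for (i in 1:50) {
    x <- data.frame(a = runif(1e5))
}

gc_spent <- gc.time() - gc_before
print(gc_spent[1:3])  # user, system, elapsed seconds spent in GC
```

If the GC time is small relative to the total from system.time(), then
garbage collection is probably not the explanation.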

Ross Boylan

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
