For purposes of clarity only...

On Mon, Apr 16, 2012 at 12:40 PM, David A Vavra <dava...@verizon.net> wrote:
> Bert,
>
> My apologies on the name.
>
> I haven't kept any data on loop times. I don't know why lapply seems faster
> but the difference is quite noticeable. It has struck me as odd. I would
> have thought lapply would be slower. It has taken an effort to change my
> thinking to force fit solutions to it but I've gotten used to it. As of now
> I reserve loops to times when there are only a few iterations (as in 10) and
> to solutions that require passing large amounts of information among
> iterations. lapply is particularly handy when constructing lists.
>
> As for vectorizing, see the code below.

No. Despite the name, this is **not** what I mean by vectorization. What I
mean is pushing the loops down to the C level rather than doing them at the
interpreted level, which is where your code below still leaves you.

-- Bert

> Note that it uses mapply but that
> simply may have made implementation easier. However, if vectorizing gives an
> improvement over looping, the mapply may be the reason.
>
> > f<-function(x,y,z) catn("do something")
> > Vectorize(f,c('x','y'))
> function (x, y, z)
> {
>     args <- lapply(as.list(match.call())[-1L], eval, parent.frame())
>     names <- if (is.null(names(args)))
>         character(length(args))
>     else names(args)
>     dovec <- names %in% vectorize.args
>     do.call("mapply", c(FUN = FUN, args[dovec], MoreArgs = list(args[!dovec]),
>         SIMPLIFY = SIMPLIFY, USE.NAMES = USE.NAMES))
> }
> <environment: 0x7fb3442553c8>
>
> DAV
>
>
> -----Original Message-----
> From: Bert Gunter [mailto:gunter.ber...@gene.com]
> Sent: Monday, April 16, 2012 3:07 PM
> To: David A Vavra
> Cc: r-help@r-project.org
> Subject: Re: [R] Effeciently sum 3d table
>
> David:
>
> 1. My first name is Bert.
>
> 2. "It never occurred to me that there would be a question."
> Indeed. But in fact you got solutions for two different
> interpretations (Greg's is what you wanted). That is what I meant when
> I said that clarity in asking the question is important.
>
> 3.
>> I have gotten the impression that a for loop is very inefficient. Whenever I
>> change them to lapply calls there is a noticeable improvement in run time
>> for whatever reason.
> I'd like to see your data on this. My experience is that they are
> typically comparable. Chambers in his "Software for Data Analysis"
> book says (pp 213): (with apply type functions rather than explicit
> loops), "The computation should run faster... However, none of the
> apply mechanisms changes the number of times the supplied functions is
> called, so serious improvements will be limited to iterating simple
> calculations many times."
>
> 4. You can get serious improvements by vectorizing; and you can do
> that here, if I understand correctly, because all your arrays have
> identical dim = d. Here's how:
>
> ## assume your list of arrays is in listoftables
>
> alldat <- do.call(cbind, listoftables) ## this might be the slow part
> ans <- array(rowSums(alldat), dim = d)
>
> See ?rowSums for explanations and caveats, especially with NA's.
>
> Cheers,
> Bert
>
> On Mon, Apr 16, 2012 at 11:35 AM, David A Vavra <dava...@verizon.net> wrote:
>> Thanks Gunter,
>>
>> I mean what I think is the normal definition of 'sum' as in:
>> T1 + T2 + T3 + ...
>> It never occurred to me that there would be a question.
>>
>> I have gotten the impression that a for loop is very inefficient. Whenever I
>> change them to lapply calls there is a noticeable improvement in run time
>> for whatever reason. The problem with lapply here is that I effectively need
>> a global table to hold the final sum. lapply also wants to return a value.
>>
>> You may be correct that in the long run, the loop is the best. There's a lot
>> of extraneous memory wastage holding all of the tables in a list as well as
>> the return 'values'.
>>
>> As an alternate and given a pre-existing list of tables, I was thinking of
>> creating a temporary environment to hold the final result so it could be
>> passed globally to each lapply execution level but that seems clunky and
>> wasteful as well.
>>
>> Example in partial code:
>>
>> env <- CreatEnv() # my own function
>> assign('final',T1-T1,envir=env)
>> L<-listOfTables
>>
>> lapply(L,function(t) {
>>   final <- get('final',envir=env) + t
>>   assign('final',final,envir=env)
>>   NULL
>> })
>>
>> But I was hoping for a more elegant and hopefully more efficient solution.
>> Greg's suggestion for using reduce seems in order but as yet I'm unfamiliar
>> with the function.
>>
>> DAV
>>
>>
>> -----Original Message-----
>> From: Bert Gunter [mailto:gunter.ber...@gene.com]
>> Sent: Monday, April 16, 2012 12:42 PM
>> To: Greg Snow
>> Cc: David A Vavra; r-help@r-project.org
>> Subject: Re: [R] Effeciently sum 3d table
>>
>> Define "sum". Do you mean you want to get a single sum for each
>> array? -- get marginal sums for each array? -- get a single array in
>> which each value is the sum of all the individual values at the
>> position?
>>
>> Due thought and consideration for those trying to help by formulating
>> your query carefully and concisely vastly increases the chance of
>> getting a useful answer. See the posting guide -- this is a skill that
>> needs to be learned and the guide is quite helpful. And I must
>> acknowledge that it is a skill that I also have not yet mastered.
>>
>> Concerning your query, I would only note that the two responses from
>> Greg and Petr that you received are unlikely to be significantly
>> faster than just using loops, since both are still essentially looping
>> at the interpreted level. Whether either give you what you want, I do
>> not know.
>>
>> -- Bert
>>
>> On Mon, Apr 16, 2012 at 8:53 AM, Greg Snow <538...@gmail.com> wrote:
>>> Look at the Reduce function.
>>>
>>> On Mon, Apr 16, 2012 at 8:28 AM, David A Vavra <dava...@verizon.net> wrote:
>>>> I have a large number of 3d tables that I wish to sum
>>>> Is there an efficient way to do this? Or perhaps a function I can call?
>>>>
>>>> I tried using do.call("sum",listoftables) but that returns a single value.
>>>>
>>>> So far, it seems only a loop will do the job.
>>>>
>>>> TIA,
>>>> DAV

--
Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
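
The three approaches discussed in this thread (an explicit loop, Reduce, and the cbind/rowSums vectorization) can be put side by side. What follows is only a minimal sketch under stated assumptions: the example data are made up, the names listoftables and d follow the thread, and loop_sum, reduce_sum, and vec_sum are illustrative names that do not appear in the original posts. It assumes every array in the list has the same dim and contains no NA values.

```r
# Hypothetical example data: three 2 x 3 x 4 arrays with identical dimensions.
d <- c(2, 3, 4)
set.seed(1)
listoftables <- lapply(1:3, function(i) {
  array(sample(0:9, prod(d), replace = TRUE), dim = d)
})

# 1. Explicit loop: accumulate the element-wise sum table by table.
loop_sum <- array(0, dim = d)
for (tab in listoftables) {
  loop_sum <- loop_sum + tab
}

# 2. Reduce: same arithmetic; `+` is still called once per table
#    at the interpreted level.
reduce_sum <- Reduce(`+`, listoftables)

# 3. Vectorized: flatten each array to one column, then let rowSums
#    do the summation in compiled code. as.vector() makes the
#    flattening explicit before cbind builds the matrix.
alldat <- do.call(cbind, lapply(listoftables, as.vector))
vec_sum <- array(rowSums(alldat), dim = d)

# All three should agree element by element.
stopifnot(isTRUE(all.equal(loop_sum, reduce_sum)),
          isTRUE(all.equal(loop_sum, vec_sum)))
```

Which of the three is fastest will depend on the number and size of the tables; wrapping each approach in system.time() on realistic data is the only reliable way to tell. Note that the vectorized version builds a matrix holding a copy of every table, so it trades memory for speed.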