Wacek Kusnierczyk wrote:
Patrick Burns wrote:
If the goal is to "look" professional, then
'replicate' probably suits.  If the goal is to
compute as fast as possible, then that isn't
the case because 'replicate' is really a 'for'
loop in disguise and there are other ways.

Here's one other way:

function (size, replicates, distfun, ...)
{

       colMeans(array(distfun(size * replicates, ...),
                      c(size, replicates)))
}
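For instance, this can be used with any random-number generator that takes the sample size as its first argument. (The name 'colmeans.sim' below is purely illustrative; the original function was anonymous.)

```r
## Illustrative name for the anonymous function above:
colmeans.sim <- function(size, replicates, distfun, ...)
{
       colMeans(array(distfun(size * replicates, ...),
                      c(size, replicates)))
}

## 100 replicates of the mean of 50 uniform draws on [0, 1];
## each column of the array is one replicate of 'size' draws.
x <- colmeans.sim(50, 100, runif)
length(x)  # 100
```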

a naive benchmark:

f.rep = function(n, m) replicate(n, rnorm(m))
f.pat = function(n, m) colMeans(array(rnorm(n*m), c(n, m)))

system.time(f.pat(1000, 1000))
system.time(f.rep(1000, 1000))

makes me believe that there is no significant difference in efficiency
between the 'professional-looking' replicate-based solution and the
'as fast as possible' array-based solution.

I think Wacek is largely correct.  First off, a correction:
the dimensions of the array in 'f.pat' should be c(m, n)
rather than c(n, m), so that the columns correspond to the
replicates produced by 'f.rep'.
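Concretely, the corrected version looks like this (the name 'f.pat2' is just for illustration):

```r
## Corrected array trick: dimensions c(m, n), so each of the
## n columns holds one replicate of m draws, matching the
## m-by-n matrix that replicate(n, rnorm(m)) produces.
f.pat2 <- function(n, m) colMeans(array(rnorm(n * m), c(m, n)))

## One mean per replicate:
length(f.pat2(1000, 1000))  # 1000
```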

What I'm seeing on my machine is that the array trick always
seems to be a bit faster, but is only substantially faster when 'm'
(that is, the number of values being averaged per replicate) is smallish.

That makes sense: loops are "slow" because of the per-call
overhead.  When each individual call does a lot of work,
that overhead becomes insignificant.
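A quick sketch of that effect (timings will of course vary by machine; note I've added colMeans to the replicate version as well, so both functions compute the same thing):

```r
f.rep <- function(n, m) colMeans(replicate(n, rnorm(m)))
f.pat <- function(n, m) colMeans(array(rnorm(n * m), c(m, n)))

## Small m: many cheap calls, so replicate's per-call
## overhead dominates and the array trick should win clearly.
system.time(f.rep(10000, 10))
system.time(f.pat(10000, 10))

## Large m: each call does substantial work, so the per-call
## overhead is negligible and the two should be comparable.
system.time(f.rep(10, 100000))
system.time(f.pat(10, 100000))
```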


Patrick Burns
patr...@burns-stat.com
+44 (0)20 8525 0696
http://www.burns-stat.com
(home of "The R Inferno" and "A Guide for the Unwilling S User")

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
