On 2/9/07, Prof Brian Ripley <[EMAIL PROTECTED]> wrote:
> The other reason why pmin/pmax are preferable to your functions is that
> they are fully generic.  It is not easy to write C code which takes into
> account that <, [, [<- and is.na are all generic.  That is not to say that
> it is not worth having faster restricted alternatives, as indeed we do
> with rep.int and seq.int.
>
> Anything that uses arithmetic is making strong assumptions about the
> inputs.  It ought to be possible to write a fast C version that worked for
> atomic vectors (logical, integer, real and character), but is there
> any evidence of profiled real problems where speed is an issue?

Yes.  I don't have the profiled timings available now, and one would need
to go back to earlier versions of R to reproduce them, but I did encounter
a situation where the bottleneck in a practical computation was pmin/pmax.
The binomial and poisson families for generalized linear models used pmin
and pmax to avoid boundary conditions when evaluating the inverse link and
other functions.  When I profiled the execution of some generalized linear
model and, more importantly for me, generalized linear mixed model fits,
these calls to pmin and pmax were the bottleneck.  That is why I moved some
of the calculations for the binomial and poisson families in the stats
package to compiled code.  In that case I didn't rewrite the general form
of pmin and pmax; I replaced specific calls in the compiled code.

>
> On Fri, 9 Feb 2007, Martin Maechler wrote:
>
> >>>>>> "Ravi" == Ravi Varadhan <[EMAIL PROTECTED]>
> >>>>>>     on Thu, 8 Feb 2007 18:41:38 -0500 writes:
> >
> >    Ravi> Hi,
> >    Ravi> "greaterOf" is indeed an interesting function.  It is much faster than the
> >    Ravi> equivalent R function, "pmax", because pmax does a lot of checking for
> >    Ravi> missing data and for recycling.  Tom Lumley suggested a simple function to
> >    Ravi> replace pmax, without these checks, that is analogous to greaterOf, which I
> >    Ravi> call fast.pmax.
> >
> >    Ravi>     fast.pmax <- function(x,y) {i<- x<y; x[i]<-y[i]; x}
> >
> >    Ravi> Interestingly, greaterOf is even faster than fast.pmax, although you have to
> >    Ravi> be dealing with very large vectors (O(10^6)) to see any real difference.
> >
> > Yes.  Indeed, I have a file, first version dated from 1992
> > where I explore the "slowness" of pmin() and pmax() (in S-plus
> > 3.2 then).  I had since added quite a few experiments and versions to that
> > file in the past.
> >
> > As a consequence, in the robustbase CRAN package (which is only a bit
> > more than a year old though), there's a file, available as
> > https://svn.r-project.org/R-packages/robustbase/R/Auxiliaries.R
> > with the very simple content {note line 3 !}:
> >
> > -------------------------------------------------------------------------
> > ### Fast versions of pmin() and pmax() for 2 arguments only:
> >
> > ### FIXME: should rather add these to R
> > pmin2 <- function(k,x) (x+k - abs(x-k))/2
> > pmax2 <- function(k,x) (x+k + abs(x-k))/2
> > -------------------------------------------------------------------------
> >
> > {the "funny" argument name 'k' comes from the use of these to
> >  compute Huber's psi() fast :
> >
> >   psiHuber <- function(x,k)  pmin2(k, pmax2(- k, x))
> >   curve(psiHuber(x, 1.35), -3,3, asp = 1)
> > }
> >
> > One point *is* that I think proper function names would be pmin2() and
> > pmax2() since they work with exactly 2 arguments,
> > whereas IIRC the feature to work with '...' is exactly the
> > reason that pmax() and pmin() are so much slower.
> >
> > I haven't checked if Gabor's
> >      pmax2.G <- function(x,y) {z <- x > y; z * (x-y) + y}
> > is even faster than the one using abs().
> > It may have the advantage of giving *identical* results (to the
> > last bit!) to pmax(), which my version does not --- IIRC the
> > only reason I did not follow my own 'FIXME' above.
> >
> > I had then planned to implement pmin2() and pmax2() in C code, trivially,
> > and hence get identical (to the last bit!) behavior as
> > pmin()/pmax(); but I now tend to think that the proper approach is to
> > code pmin() and pmax() via .Internal() and hence C code ...
> >
> > [Not before DSC and my vacations though!!]
> >
> > Martin Maechler, ETH Zurich
> >
> > ______________________________________________
> > R-help@stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
> --
> Brian D. Ripley,                  [EMAIL PROTECTED]
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
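
For anyone who wants to try the two-argument variants quoted in this thread,
here is a minimal R sketch, not a definitive benchmark.  The four function
definitions are copied from the posts above; the seed, the vector length,
the test inputs and the helper names (ref, same.fast, close.abs, close.G,
na.fast, chr.abs) are assumptions made purely for illustration.  It compares
each variant against pmax() on NA-free doubles and then probes the kinds of
input where arithmetic-based versions make strong assumptions: missing
values, character vectors and very large magnitudes.

```r
## Two-argument variants, copied from the thread
fast.pmax <- function(x, y) { i <- x < y; x[i] <- y[i]; x }   # via Ravi / Tom Lumley
pmin2     <- function(k, x) (x + k - abs(x - k)) / 2          # robustbase
pmax2     <- function(k, x) (x + k + abs(x - k)) / 2          # robustbase
pmax2.G   <- function(x, y) { z <- x > y; z * (x - y) + y }   # Gabor's version

## Illustrative data (assumed: seed and length are arbitrary)
set.seed(1)
n <- 1e5
x <- rnorm(n)
y <- rnorm(n)
ref <- pmax(x, y)

## 1. Agreement with pmax() on NA-free doubles
same.fast <- identical(fast.pmax(x, y), ref)  # pure selection: bit-for-bit
close.abs <- all.equal(pmax2(x, y), ref)      # arithmetic: only up to rounding
close.G   <- all.equal(pmax2.G(x, y), ref)    # arithmetic: only up to rounding

## 2. Missing values: pmax() propagates NA, and so does the arithmetic
##    version, but the subassignment in fast.pmax() fails on an NA index
pmax(c(1, NA), c(2, 3))
pmax2(c(1, NA), c(2, 3))
na.fast <- tryCatch(fast.pmax(c(1, NA), c(2, 3)), error = function(e) e)

## 3. Character vectors: fine for pmax(), an error for anything using + and abs()
pmax(c("a", "z"), c("b", "y"))
chr.abs <- tryCatch(pmax2(c("a", "z"), c("b", "y")), error = function(e) e)

## 4. Large magnitudes: x + k overflows to Inf before the division by 2
pmax(1e308, 1e308)
pmax2(1e308, 1e308)
```

Timings are left out of the sketch on purpose: as noted in the reply above,
they depend on the R version, and can be explored with system.time() on
vectors of the size Ravi mentions.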