On Fri, 1 Sep 2006, [EMAIL PROTECTED] wrote:

> Prof Brian Ripley wrote:
> > I would not have expected glm to be more than say 5x slower than lm if
> > CPU cycles and not memory were the limiting factor.  In that case more
> > RAM might be all you need.
> 
> The ratio between glm and lm might well be about 5x, but that's still a 
> big difference for us.   

You said lm was 'very fast', so I did not expect 5x 'very fast' to be 'too 
slow'.

> I am pretty sure that RAM is not the main problem; according to the
> Windows Task Manager the computer is at close to 100% CPU usage, and
> swapping is not going on.   Of course L1/L2 caches may still be
> something one can work on, but I'm not sure whether glm has enough
> repeated access to the same data for that to help.   (I don't know how
> glm works, but I guess it does a lot of scans through the whole data
> set, and that the amount of working memory it needs during these scans
> is basically a function of the number of parameters, not the number of
> observations, is that right?)

Not so.  Because glm does weighted fits, it needs to access the whole data 
matrix at each iteration (to re-weight).
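
To illustrate the re-weighting, here is a minimal sketch of iteratively
reweighted least squares for a one-predictor logistic model, written in
plain Python. It is not R's glm code: the function name, the toy data and
the use of 2x2 normal equations are assumptions made for brevity (R's
glm.fit solves the weighted problem by QR decomposition instead). What it
does show is that the weights depend on the current fit, so all n weighted
rows are rebuilt at every iteration.

```python
import math

def irls_logistic(x, y, max_iter=25, tol=1e-10):
    """Fit logit(p) = b0 + b1*x by iteratively reweighted least squares."""
    b0 = b1 = 0.0
    passes = 0  # number of full passes over the data
    for _ in range(max_iter):
        # The weights depend on the current coefficients, so every
        # iteration rebuilds all n weighted rows (n-sized working storage).
        rows, rhs = [], []
        for xi, yi in zip(x, y):
            eta = b0 + b1 * xi                  # linear predictor
            mu = 1.0 / (1.0 + math.exp(-eta))   # fitted probability
            w = mu * (1.0 - mu)                 # working weight
            z = eta + (yi - mu) / w             # working response
            sw = math.sqrt(w)
            rows.append((sw, sw * xi))          # weighted (intercept, x) row
            rhs.append(sw * z)
        passes += 1
        # Weighted least squares via the 2x2 normal equations
        # (an illustrative shortcut, not what glm.fit does).
        a11 = sum(r[0] * r[0] for r in rows)
        a12 = sum(r[0] * r[1] for r in rows)
        a22 = sum(r[1] * r[1] for r in rows)
        c1 = sum(r[0] * t for r, t in zip(rows, rhs))
        c2 = sum(r[1] * t for r, t in zip(rows, rhs))
        det = a11 * a22 - a12 * a12
        new_b0 = (a22 * c1 - a12 * c2) / det
        new_b1 = (a11 * c2 - a12 * c1) / det
        converged = abs(new_b0 - b0) + abs(new_b1 - b1) < tol
        b0, b1 = new_b0, new_b1
        if converged:
            break
    return b0, b1, passes

# Hypothetical toy data, chosen so the classes are not perfectly separated.
x = [0, 1, 2, 3, 4, 5, 6, 7]
y = [0, 0, 1, 0, 1, 0, 1, 1]
b0, b1, passes = irls_logistic(x, y)
print("intercept, slope:", b0, b1, "full passes over the data:", passes)
```

An lm-style fit corresponds to a single such pass with constant weights,
which is one way to see why a glm fit costs a small multiple of an lm fit.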

> Many thanks for your observations about subset selection by the way, they 
> are a lot of help.   Would a good approach be, say, to use some stricter 
> criteria like BIC for choosing a model, and then use non-statistical 
> methods to improve the plausibility of the chosen parameters?

The latter entirely I would say.  All statistics can say is that a 
variable improves the fit measurably more than one that is unrelated to 
the response: whether it improves it enough to be worthwhile in your 
application is non-statistical. The point here is that all but the most 
useless variables will measurably improve the fit in large problems with
few variables.
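
A small worked calculation may make this concrete. The sketch below
assumes a Gaussian linear model, where adding one variable whose partial
correlation with the response is r shrinks the residual sum of squares by
a factor (1 - r^2), so BIC changes by n*log(1 - r^2) + log(n). The
function name and the value r = 0.01 are illustrative assumptions, not
anything from the thread.

```python
import math

def delta_bic(n, partial_r):
    """Change in BIC from adding one variable to a Gaussian linear model.

    Assumes the added variable has partial correlation `partial_r` with
    the response, so RSS shrinks by a factor (1 - partial_r**2).
    A negative value means BIC prefers the larger model.
    """
    return n * math.log(1.0 - partial_r ** 2) + math.log(n)

# A nearly useless variable (r = 0.01) at increasing sample sizes.
for n in (1_000, 100_000, 1_000_000):
    print(n, round(delta_bic(n, 0.01), 2))
```

With r = 0.01 the change is positive at n = 1,000 but roughly -86 at
n = 1,000,000, so even a strict criterion like BIC admits the variable
once the data set is large enough. Whether a variable explaining 0.01% of
the residual variance is worth having is the non-statistical question.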

-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
