Hello all,
I need to tap into the collective wisdom of the group re an issue of
efficiency.
A sketch of the situation:
Let's say I have 4000 observations of the variables Y, X1, X2, X3 and X4.
I would like to feed various combinations of this expression
Y ~ X1+X2+X3+X4 + I(X1^2)+I(X2^2)+I(X3^2)+I(X4^2) + X1*X2 + X1*X3 + X1*X4 +
X2*X3 + X2*X4 + X3*X4
repeatedly to glm(). (I really have little knowledge of how R or
glm() works internally.)
Let's say I call glm() 200 times with various combinations. Does it
make sense to compute these various terms from X1 .. X4 once,
store them in a file along with the original data, and then use
that file for the glm() calls? Or will the overhead of computing
these terms be so small that it's not worth computing the
values ahead of time and storing them in a file?
This is a simplified example, I actually have 20 original variables
rather than the 4 I show above. I hope this made some sense.
Thanks,
Esmail
ps: If it does make sense to preprocess X1, X2, X3 and X4 and generate a new
file that contains the values for
X1, X2, X3, X4, I(X1^2), I(X2^2), I(X3^2), I(X4^2), X1*X2, X1*X3, X1*X4,
X2*X3, X2*X4, X3*X4
is there an easy way to take the expression at the top of the message,
compute these values from the original data frame, and write them out
to a new file?
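
For illustration, here is a minimal sketch of what such preprocessing might
look like, assuming model.matrix() from the stats package is an acceptable
way to expand the formula into numeric columns. The data frame `dat`, its
random contents, and the temporary output file are made up for the example
and are not my real data.

```r
# Hypothetical stand-in for the real data (20 rows instead of 4000)
set.seed(1)
n <- 20
dat <- data.frame(Y  = rbinom(n, 1, 0.5),
                  X1 = rnorm(n), X2 = rnorm(n),
                  X3 = rnorm(n), X4 = rnorm(n))

# The full expression from the top of the message
f <- Y ~ X1 + X2 + X3 + X4 +
  I(X1^2) + I(X2^2) + I(X3^2) + I(X4^2) +
  X1*X2 + X1*X3 + X1*X4 + X2*X3 + X2*X4 + X3*X4

# Expand the formula into one numeric column per term
mm <- model.matrix(f, data = dat)

# Drop the intercept column and put the response back in front
expanded <- data.frame(Y = dat$Y, mm[, -1], check.names = FALSE)

# Write the expanded data to a file (a temporary file in this sketch)
out <- tempfile(fileext = ".csv")
write.csv(expanded, out, row.names = FALSE)
```

Note that in an R formula X1*X2 expands to X1 + X2 + X1:X2, so the product
column shows up under the name X1:X2 rather than X1*X2.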