On Jan 10, 2011, at 5:28 PM, efreeman wrote:

I'm looking for a formula for memory usage in standard regression; that
is, if I have X rows with Y predictors, how much memory is needed? I'm
speccing out a system, and I'd like to be able to get enough memory
that we can do some fairly large regressions.

Figure 10-12 bytes times X * Y as the size of the matrix or data frame, and you will need 4-5 times that amount of memory to do useful work.
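
For a quick back-of-the-envelope calculation (a sketch only; the X and Y values here are made up, and the 10-12 bytes and 4-5x factors are just the estimates above):

    # rough memory estimate for a regression data set
    X <- 5e6                        # rows (hypothetical)
    Y <- 150                        # columns/predictors (hypothetical)
    bytes_per_cell <- 11            # midpoint of the 10-12 byte guess
    data_gb <- X * Y * bytes_per_cell / 2^30
    work_gb <- 4.5 * data_gb        # 4-5x the data size for fitting
    round(c(data_gb = data_gb, workspace_gb = work_gb), 1)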

You can check my guesstimate on one of my objects:

> object.size(set1HLI)
5907427736 bytes
> nrow(set1HLI)
[1] 5325006
> length(set1HLI)
[1] 166

> 5907427736/5325006
[1] 1109.375
> 1109.375/166
[1] 6.682982

So I might have been a bit on the high side with my estimate of the number of bytes per cell. I have a number of constructed factor variables that take only 4 bytes per "cell": the byte-to-cell ratio is 8 for "numeric" variables and 4 for "factor" or "integer" variables, plus variable amounts for character variables and "overhead". With my other computer activities I end up needing about 24 GB, which can hold probably 10 regression models. Each model needs space for vectors of predicted values and residuals that are as long as the input, and they typically run around 300-500 MB.
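
You can confirm the per-type figures directly in a clean session (a sketch; the vector length is arbitrary, and each vector also carries a small fixed header, so the per-element costs are approximate):

    n <- 1e6
    object.size(numeric(n))    # ~8 bytes per element
    object.size(integer(n))    # ~4 bytes per element
    object.size(factor(sample(letters, n, replace = TRUE)))  # ~4 bytes per element, plus levels

    # an lm fit stores residuals, fitted values, the qr decomposition, and
    # (by default) the model frame, all scaling with the number of rows --
    # which is why fitted model objects run to hundreds of MB
    df  <- data.frame(y = rnorm(n), x = rnorm(n))
    fit <- lm(y ~ x, data = df)
    object.size(fit)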


==Ed Freeman



David Winsemius, MD
West Hartford, CT
