Aimin Yan wrote:
> I have a data set like this.
> I want to do glm, but I get this error:
>
> Error in model.matrix.default(mt, mf, contrasts) :
>         cannot allocate vector of length 932889958
>
> I am wondering if my data set is too large or I did something wrong.
>
> Is there some limitation for data size for R?
>
> thanks,
>
> Aimin
>
>
> > p1982<- read.csv("p_1982_aa.csv")
> > names(p1982)
> [1] "p"   "aa"  "as"  "ms"  "cur" "sc"
> > str(p1982)
> 'data.frame':   465979 obs. of  6 variables:
>  $ p  : Factor w/ 1982 levels "154l_aa","1A0P_aa",..: 1 1 1 1 1 1 1 1 1 1 ...
>  $ aa : Factor w/ 19 levels "ALA","ARG","ASN",..: 2 16 4 5 18 3 19 3 2 9 ...
>  $ as : num  152.0 15.9 65.1 57.2 28.9 ...
>  $ ms : num  108.8 28.3 59.2 49.9 31.8 ...
>  $ cur: num  -0.1020 0.2564 0.0312 -0.0550 0.0526 ...
>  $ sc : num  92.10 103.67 7.27 72.98 96.12 ...
> > attach(p1982)
> > m<-glm(sc~p+aa+as+cur,data=p1982)
> Error in model.matrix.default(mt, mf, contrasts) :
>         cannot allocate vector of length 932889958
>

Your "p" is a factor with many levels, so the design matrix for your
model is roughly 500000 x 2000. That gives 1 billion (US) entries of 8
bytes, so you need roughly 8 GB just to store the design matrix. So either
you don't want "p" in the model or you have indeed exceeded your capacity.

> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - ([EMAIL PROTECTED])                  FAX: (+45) 35327907
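
The rounded estimate in the reply can be checked against the numbers in the quoted str() output. The following is only a sketch: it assumes R's default treatment contrasts (a factor with k levels contributes k - 1 columns) plus an intercept. The variable names and the small "toy" data frame are made up for illustration and are not from the original post.

```r
# Figures taken from the str() output quoted in the original message.
n_obs     <- 465979  # rows in p1982
levels_p  <- 1982    # levels of factor "p"
levels_aa <- 19      # levels of factor "aa"
n_numeric <- 2       # numeric predictors "as" and "cur"

# Assumption: default treatment contrasts, so a k-level factor adds k - 1
# columns, and the intercept adds one more.
n_cols  <- 1 + (levels_p - 1) + (levels_aa - 1) + n_numeric
n_cells <- n_obs * n_cols        # doubles stored in the design matrix
n_bytes <- n_cells * 8           # 8 bytes per double

cat("columns:", n_cols, "\n")
cat("cells:  ", format(n_cells, scientific = FALSE), "\n")
cat("bytes:  ", format(n_bytes, scientific = FALSE), "\n")

# Toy check of the column-count rule on a small, made-up data frame.
toy <- data.frame(
  y = c(1, 2, 3, 4, 5, 6),
  f = factor(c("a", "b", "c", "a", "b", "c")),
  x = c(0.5, 1.5, 2.5, 3.5, 4.5, 5.5)
)
mm <- model.matrix(y ~ f + x, data = toy)
dim(mm)  # 6 rows; intercept + 2 dummy columns for "f" + "x" = 4 columns
```

Under these assumptions the cell count comes out at 465979 x 2002 = 932889958, which is exactly the vector length in the error message, so the failed allocation is the design matrix itself (about 7.5 GB of doubles). Fitting the model would likely need further working copies on top of that.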