Thanks, Duncan, for this clarification. A double-precision matrix with 2e11 elements (as the OP wanted) would need about 1.5 TB of memory (2e11 elements at 8 bytes each), which is more than a standard (64-bit Windows) computer can handle.
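As a rough check of that figure, here is a minimal sketch in R. The variable names are made up for illustration, and it assumes 8 bytes per double while ignoring R's small per-object overhead, so treat the result as a lower bound rather than an exact number:

```r
# Back-of-the-envelope memory estimate for a dense double-precision matrix.
# Assumption: 8 bytes per element, no object header or copies counted.
n_elements       <- 2e11   # element count quoted in the message
bytes_per_double <- 8      # R numeric values are 8-byte doubles

total_bytes <- n_elements * bytes_per_double
total_tb    <- total_bytes / 1e12   # decimal terabytes
total_tib   <- total_bytes / 2^40   # binary tebibytes

# Would the object also have exceeded the pre-3.0.0 limit of 2^31 - 1 elements?
exceeds_old_limit <- n_elements > .Machine$integer.max

cat(sprintf("%.2f TB (%.2f TiB), above old element limit: %s\n",
            total_tb, total_tib, exceeds_old_limit))
```

Depending on whether one counts in decimal or binary units, this comes out at roughly 1.5 to 1.6 TB, and any intermediate copies made during model fitting would come on top of that.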
Cheers.

On 12.12.2013 13:00, Duncan Murdoch wrote:
> On 13-12-12 6:51 AM, Eik Vettorazzi wrote:
>> I thought so (with all the limitations due to collinearity and so on),
>> but actually there is a limit for the maximum size of an array which is
>> independent of your memory size and is due to the way arrays are
>> indexed. You can't create an object with more than 2^31-1 = 2147483647
>> elements.
>>
>> https://stat.ethz.ch/pipermail/r-help/2007-June/133238.html
>
> That post is from 2007. The limits were raised considerably when R
> 3.0.0 was released, and it is now 2^48 for disk-based operations, 2^52
> for working in memory.
>
> Duncan Murdoch
>
>>
>> cheers
>>
>> On 12.12.2013 12:34, Romeo Kienzler wrote:
>>> ok, so 200K predictors and 10M observations would work?
>>>
>>>
>>> On 12/12/2013 12:12 PM, Eik Vettorazzi wrote:
>>>> it is simply because you can't do a regression with more predictors
>>>> than observations.
>>>>
>>>> Cheers.
>>>>
>>>> On 12.12.2013 09:00, Romeo Kienzler wrote:
>>>>> Dear List,
>>>>>
>>>>> I'm quite new to R and want to do logistic regression with a 200K
>>>>> feature data set (around 150 training examples).
>>>>>
>>>>> I'm aware that I should use Naive Bayes but I have a more general
>>>>> question about the capability of R handling very high dimensional
>>>>> data.
>>>>>
>>>>> Please consider the following R code where "mygenestrain.tab" is a 150
>>>>> by 200000 matrix:
>>>>>
>>>>> traindata <- read.table('mygenestrain.tab');
>>>>> mylogit <- glm(V1 ~ ., data = traindata, family = "binomial");
>>>>>
>>>>> When executing this code I get the following error:
>>>>>
>>>>> Error in terms.formula(formula, data = data) :
>>>>> allocMatrix: too many elements specified
>>>>> Calls: glm ... model.frame -> model.frame.default -> terms ->
>>>>> terms.formula
>>>>> Execution halted
>>>>>
>>>>> Is this because R can't handle 200K features or am I doing something
>>>>> completely wrong here?
>>>>>
>>>>> Thanks a lot for your help!
>>>>>
>>>>> best Regards,
>>>>>
>>>>> Romeo
>>>>>
>>>>> ______________________________________________
>>>>> R-help@r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide
>>>>> http://www.R-project.org/posting-guide.html
>>>>> and provide commented, minimal, self-contained, reproducible code.

--
Eik Vettorazzi

Department of Medical Biometry and Epidemiology
University Medical Center Hamburg-Eppendorf

Martinistr. 52
20246 Hamburg

T ++49/40/7410-58243
F ++49/40/7410-57790