On 12/12/2013 7:08 AM, Eik Vettorazzi wrote:
Thanks, Duncan, for this clarification.
A double-precision matrix with 2e11 elements (as the OP wanted) would
need about 1.5 TB of memory, which is more than a standard (Windows
64-bit) computer can handle.
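(A back-of-the-envelope check in R, assuming 8 bytes per double-precision
element:)

2e11 * 8 / 2^40    # bytes -> TiB: about 1.46, i.e. roughly 1.5 TB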

According to Microsoft's "Memory Limits" web page (currently at http://msdn.microsoft.com/en-us/library/windows/desktop/aa366778%28v=vs.85%29.aspx#memory_limits, but these things tend to move around), the limit is 8 TB for virtual memory. (The same page lists a variety of smaller physical memory limits, depending on the Windows version, but R doesn't need physical memory; virtual is good enough.)

R would be very slow if it were working with objects bigger than physical memory, but it could conceivably work.
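(On a 64-bit Windows build of R, one way to inspect or raise the
per-session cap is memory.limit(); the 16000 below is only an
illustrative figure, not a recommendation:)

memory.limit()               # current cap for the R session, in MB (Windows only)
memory.limit(size = 16000)   # request roughly 16 GB, if the OS can provide it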

Duncan Murdoch

Cheers.

On 12.12.2013 13:00, Duncan Murdoch wrote:
> On 13-12-12 6:51 AM, Eik Vettorazzi wrote:
>> I thought so (with all the limitations due to collinearity and so on),
>> but actually there is a limit on the maximum size of an array that is
>> independent of your memory size and is due to the way arrays are
>> indexed: you can't create an object with more than 2^31 - 1 = 2147483647
>> elements.
>>
>> https://stat.ethz.ch/pipermail/r-help/2007-June/133238.html
>
> That post is from 2007.  The limits were raised considerably when R
> 3.0.0 was released; they are now 2^48 elements for disk-based operations
> and 2^52 for working in memory.
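>
> (Those figures can be checked from an R prompt; .Machine$integer.max is
> the old 2^31 - 1 cap:)
>
> .Machine$integer.max    # 2147483647 = 2^31 - 1, the pre-3.0.0 length limit
> 2^52                    # upper bound for long vectors held in memory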
>
> Duncan Murdoch
>
>
>>
>> cheers
>>
>>> On 12.12.2013 12:34, Romeo Kienzler wrote:
>>>> OK, so 200K predictors and 10M observations would work?
>>>
>>>
>>> On 12/12/2013 12:12 PM, Eik Vettorazzi wrote:
>>>> It is simply because you can't do a regression with more predictors
>>>> than observations.
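>>>>
>>>> (A toy illustration, with made-up data, of what happens when there are
>>>> more predictors than observations: most coefficients simply can't be
>>>> estimated:)
>>>>
>>>> set.seed(1)
>>>> d <- as.data.frame(matrix(rnorm(5 * 11), nrow = 5))  # 5 obs, V1 plus 10 predictors
>>>> fit <- glm(V1 ~ ., data = d)   # gaussian family, just for illustration
>>>> coef(fit)                      # everything beyond the first 5 estimates is NA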
>>>>
>>>> Cheers.
>>>>
>>>> On 12.12.2013 09:00, Romeo Kienzler wrote:
>>>>> Dear List,
>>>>>
>>>>> I'm quite new to R and want to do logistic regression on a
>>>>> 200K-feature data set (around 150 training examples).
>>>>>
>>>>> I'm aware that I should use Naive Bayes, but I have a more general
>>>>> question about R's ability to handle very high-dimensional data.
>>>>>
>>>>> Please consider the following R code where "mygenestrain.tab" is a 150
>>>>> by 200000 matrix:
>>>>>
>>>>> traindata <- read.table('mygenestrain.tab');
>>>>> mylogit <- glm(V1 ~ ., data = traindata, family = "binomial");
>>>>>
>>>>> When executing this code I get the following error:
>>>>>
>>>>> Error in terms.formula(formula, data = data) :
>>>>>     allocMatrix: too many elements specified
>>>>> Calls: glm ... model.frame -> model.frame.default -> terms ->
>>>>> terms.formula
>>>>> Execution halted
>>>>>
>>>>> Is this because R can't handle 200K features or am I doing something
>>>>> completely wrong here?
>>>>>
>>>>> Thanks a lot for your help!
>>>>>
>>>>> Best regards,
>>>>>
>>>>> Romeo
>>>>>
>>>
>>
>


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
