ok, so 200K predictors an 10M observations would work?


On 12/12/2013 12:12 PM, Eik Vettorazzi wrote:
it is simply because you can't do a regression with more predictors than
observations.

Cheers.

Am 12.12.2013 09:00, schrieb Romeo Kienzler:
Dear List,

I'm quite new to R and want to do logistic regression with a 200K
feature data set (around 150 training examples).

I'm aware that I should use Naive Bayes but I have a more general
question about the capability of R handling very high dimensional data.

Please consider the following R code where "mygenestrain.tab" is a 150
by 200000 matrix:

traindata <- read.table('mygenestrain.tab');
mylogit <- glm(V1 ~ ., data = traindata, family = "binomial");

When executing this code I get the following error:

Error in terms.formula(formula, data = data) :
   allocMatrix: too many elements specified
Calls: glm ... model.frame -> model.frame.default -> terms -> terms.formula
Execution halted

Is this because R can't handle 200K features or am I doing something
completely wrong here?

Thanks a lot for your help!

best Regards,

Romeo

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to