[R] Performance issue on sparse matrix object

2015-03-08 Thread Romeo Kienzler
Hi,

I'm writing to a sparse 2500x18 matrix in a column-wise random-access
pattern and I'm seeing very poor performance, which I don't see with the
dense implementation (where I run into main-memory issues instead).

Is there another way to solve this?

Best regards

Romeo
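
One possible approach, sketched here under assumptions rather than taken from the thread: element-wise assignment into a compressed sparse matrix tends to be slow, because each newly created non-zero forces the internal index vectors to be rebuilt. Collecting the writes as (row, column, value) triplets and building the matrix once with sparseMatrix() from the Matrix package usually avoids this. The write pattern below (n_writes, i, j, x) is hypothetical.

```r
library(Matrix)

set.seed(1)
n_row <- 2500L
n_col <- 18L
n_writes <- 5000L  # hypothetical number of writes

# Hypothetical write pattern: random rows, visited column by column.
j <- sort(sample.int(n_col, n_writes, replace = TRUE))
i <- sample.int(n_row, n_writes, replace = TRUE)
x <- runif(n_writes)

# Keep only the last value written to each cell, as repeated assignment
# would; sparseMatrix() would otherwise sum duplicated (i, j) pairs.
keep <- !duplicated(cbind(i, j), fromLast = TRUE)

# Build the sparse matrix once from the collected triplets.
m <- sparseMatrix(i = i[keep], j = j[keep], x = x[keep],
                  dims = c(n_row, n_col))
```

If the values really have to be written one at a time, accumulating them in plain vectors first and converting at the end is usually the cheaper route.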


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Logistic Regression with 200K features in R?

2013-12-12 Thread Romeo Kienzler

Dear List,

I'm quite new to R and want to do logistic regression with a 200K 
feature data set (around 150 training examples).


I'm aware that I should use Naive Bayes but I have a more general 
question about the capability of R handling very high dimensional data.


Please consider the following R code where mygenestrain.tab is a 150 
by 20 matrix:


traindata <- read.table('mygenestrain.tab');
mylogit <- glm(V1 ~ ., data = traindata, family = binomial);

When executing this code I get the following error:

Error in terms.formula(formula, data = data) :
  allocMatrix: too many elements specified
Calls: glm ... model.frame -> model.frame.default -> terms -> terms.formula
Execution halted

Is this because R can't handle 200K features or am I doing something 
completely wrong here?


Thanks a lot for your help!

best Regards,

Romeo



Re: [R] Logistic Regression with 200K features in R?

2013-12-12 Thread Romeo Kienzler

OK, so 200K predictors and 10M observations would work?


On 12/12/2013 12:12 PM, Eik Vettorazzi wrote:

it is simply because you can't do a regression with more predictors than
observations.

Cheers.
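
A minimal sketch of what this looks like in practice, using made-up data (the sizes n and p and the data frame toy are illustrative, not from the thread): when there are more predictors than observations, glm() can estimate at most as many coefficients as there are rows and reports the rest as NA.

```r
set.seed(42)
n <- 10  # observations (illustrative)
p <- 20  # predictors (illustrative)

# Toy data: binary response V1 plus p random predictors.
toy <- data.frame(V1 = rbinom(n, 1, 0.5), matrix(rnorm(n * p), n, p))

# The fit is saturated, so glm() warns about fitted probabilities of 0 or 1;
# the warnings are suppressed here to keep the output short.
fit <- suppressWarnings(glm(V1 ~ ., data = toy, family = binomial))

n_estimated <- sum(!is.na(coef(fit)))  # cannot exceed n
n_aliased <- sum(is.na(coef(fit)))     # coefficients glm() could not estimate
```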



Re: [R] Logistic Regression with 200K features in R?

2013-12-12 Thread Romeo Kienzler

Dear Eik,

thank you so much for your help!

best Regards,

Romeo

On 12/12/2013 12:51 PM, Eik Vettorazzi wrote:

I thought so (with all the limitations due to collinearity and so on),
but actually there is a limit for the maximum size of an array which is
independent of your memory size and is due to the way arrays are
indexed. You can't create an object with more than 2^31-1 = 2147483647
elements.

https://stat.ethz.ch/pipermail/r-help/2007-June/133238.html

cheers
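
As a rough illustration of how the limit could be reached in the original call (a sketch, assuming the error comes from the variables-by-terms "factors" matrix that terms() builds for the formula, and assuming 200K columns including the response):

```r
# Small stand-in for the data set: response V1 plus four predictors.
small <- data.frame(V1 = c(0, 1, 0, 1), V2 = 1:4, V3 = 4:1,
                    V4 = c(2, 7, 1, 8), V5 = c(3, 1, 4, 1))

# terms() records which variable appears in which term as a matrix
# with one row per variable and one column per term.
f <- attr(terms(V1 ~ ., data = small), "factors")
dim(f)  # 5 variables by 4 terms

# Scale the same shape up to the assumed 200K columns.
n_vars <- 2e5
n_terms <- n_vars - 1
n_cells <- n_vars * n_terms

# Compare against the largest integer index, 2^31 - 1.
n_cells > .Machine$integer.max
```

Under these assumptions the matrix would need about 4e10 cells, far beyond 2^31-1, regardless of how many observations the data set has.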
