[R] eleaps in package subselect crashes when using the include argument
I'm using eleaps to build a forward selection algorithm iteratively, but the program unexpectedly crashes. In fact, it completely closes my RStudio session. The first 39 steps work fine, but on the 40th step it stops abruptly with no error message. I've isolated the problem to the code snippet below. There are 39 predictors already selected, and I'm searching for the 40th best. I've passed in a Fisher information matrix and the H matrix. I can't figure out why the process is stopping at the 40th iteration.

include <- scan()
1 2 3 5 8 16 18 19 25 32 33 34 36 37 38 40 41 42 46 49 52 54 55 58 60 62 63 66 67 70 72 74 78 81 83 88 89 100 105

lps <- eleaps(x$mat, 40, 40, 1, NULL, include, H = x$H, r = 1)

--Nathan
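For readability, here is the same call with the argument names spelled out. The names are taken from the subselect::eleaps help page as I read it (mat, kmin, kmax, nsol, exclude, include, ...), so treat this as a restatement to be checked against the installed version, not a tested fix:

lps <- eleaps(x$mat,
              kmin    = 40,        # smallest subset size to search
              kmax    = 40,        # largest subset size to search
              nsol    = 1,         # number of solutions to report
              exclude = NULL,
              include = include,   # the 39 forced-in predictors
              H       = x$H,
              r       = 1)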
[R] Optimization inconsistencies
I have a very simple maximization problem where I'm solving for the vector x:

objective function: w'x = value to maximize
box constraints (for all elements of x): low < x < high
equality constraint: sum(x) = 1

But I get inconsistent results depending on what starting values I use. I've tried various packages, but none seem to match the results I get from Excel's Solver. Any recommendations on what packages or functions I should try?

--Nathan
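Since the objective and all of the constraints are linear, this can be posed as a linear program, which needs no starting values at all. A minimal sketch with the lpSolve package, assuming low >= 0 (lp() treats the variables as non-negative by default) and with made-up values for w, low, and high:

library(lpSolve)

w    <- c(0.2, 0.5, 0.3)   # made-up objective weights
low  <- 0.1                # made-up box limits
high <- 0.6
n    <- length(w)

const.mat <- rbind(rep(1, n),  # sum(x) = 1
                   diag(n),    # x_i <= high
                   diag(n))    # x_i >= low
const.dir <- c("=", rep("<=", n), rep(">=", n))
const.rhs <- c(1, rep(high, n), rep(low, n))

sol <- lp(direction = "max", objective.in = w,
          const.mat = const.mat, const.dir = const.dir, const.rhs = const.rhs)
sol$solution   # the maximizing x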
[R] glmnet sparse matrix error: dim specifies too large an array
I'm running into an unexpected error using the glmnet and Matrix packages. I have a matrix that is 8 million rows by 100 columns, with 75% of the entries being zero. When I run a vanilla glmnet logistic model on my server with 300 GB of RAM, the task completes in 20 minutes:

> x # 8 million x 100 matrix
> model1 <- glmnet(x, y, 'binomial', alpha = 1) # run time 20 minutes

But if I convert the matrix to a sparse matrix using the Matrix package, the model does not run at all:

> x2 <- Matrix(x, sparse = T) # 75% sparse
> model2 <- glmnet(x2, y, 'binomial', alpha = 1) # error
Error in array(0, c(n, p)) : 'dim' specifies too large an array

This result is the opposite of what I would have expected: the non-sparse data runs fine, but the sparse data fails because it is "too large". Is this a glmnet issue or an R memory issue? Is there a way to work around it in glmnet?

--Nathan
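A small-scale, simulated version of the two calls, in case it helps anyone reproduce the behaviour (the data are made up and the dimensions shrunk here, so it may well run cleanly where the full-size sparse fit does not):

library(glmnet)
library(Matrix)

set.seed(1)
n <- 10000; p <- 100
x_sim <- matrix(rbinom(n * p, 1, 0.25), n, p)     # roughly 75% zeros
y_sim <- rbinom(n, 1, 0.5)                        # binary response

dense_fit  <- glmnet(x_sim, y_sim, family = "binomial", alpha = 1)

x_sparse   <- Matrix(x_sim, sparse = TRUE)        # dgCMatrix
sparse_fit <- glmnet(x_sparse, y_sim, family = "binomial", alpha = 1)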
[R] What is the largest in-memory data object you've worked with in R?
For me, I've found that I can easily work with 1 GB datasets; this includes linear models and aggregations. Working with 5 GB becomes cumbersome, and anything over that makes R croak. I'm using a dual quad-core Dell with 48 GB of RAM. I'm wondering if there is anyone out there running jobs in the 100 GB range. If so, what does your hardware look like?

--Nathan
[R] MinHash
Does anyone know of a MinHash (min-wise hashing) implementation written in R?

--Nathan
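In case a package never turns up, here is a bare-bones base R sketch of the idea (random linear hash functions over a large prime, with the signature being the per-hash minimum over the set); purely illustrative, with made-up example sets:

# Elements are assumed to be coded as smallish positive integers so that
# a * x stays exact in double precision.
minhash_signature <- function(set, k = 100, p = 2147483647, seed = 1) {
  set.seed(seed)
  a <- sample.int(p - 1, k)   # random hash coefficients
  b <- sample.int(p - 1, k)
  vapply(seq_len(k),
         function(i) min((a[i] * set + b[i]) %% p),
         numeric(1))
}

# The fraction of matching signature positions estimates Jaccard similarity.
jaccard_estimate <- function(sig1, sig2) mean(sig1 == sig2)

s1 <- c(1, 2, 3, 4, 5, 6)
s2 <- c(4, 5, 6, 7, 8, 9)
jaccard_estimate(minhash_signature(s1), minhash_signature(s2))  # true Jaccard is 1/3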
[R] R jobs keep hanging Linux server despite mem.limits modifications
My group is working with datasets between 100 MB and 1 GB in size, using multiple logins. From the documentation, it appears that vsize is limited to 2^30 - 1, which tends to prove too restrictive for our use. When we drop that restriction (set vsize = NA) we end up hanging the server, which then requires a restart. Is there any way to increase the memory limits in R while keeping our jobs from hanging? Having to restart the server is a major inconvenience, second only to the memory limitations in R.

> mem.limits()
nsize vsize
   NA    NA
> mem.limits(vsize = 2^30)
nsize      vsize
   NA 1073741824
> mem.limits(vsize = 2^31)
nsize      vsize
   NA 1073741824
Warning message:
In structure(.Internal(mem.limits(as.integer(nsize), as.integer(vsize))), :
  NAs introduced by coercion

--Nathan
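One observation on the warning above (only a guess at the mechanism, based on the coercion shown in the warning itself): mem.limits() runs its arguments through as.integer(), and R integers are 32-bit, so any value above .Machine$integer.max (2^31 - 1) is coerced to NA rather than applied:

.Machine$integer.max   # 2147483647, i.e. 2^31 - 1
as.integer(2^30)       # 1073741824 -- representable
as.integer(2^31)       # NA, with "NAs introduced by coercion" warning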