Re: [R] re-sampling of large sacle data
On Jul 28, 2010, at 12:09 AM, jd6688 wrote:

> d <- apply(s, 2, sample, size = 10000*nrow(s), replace = TRUE)
>
> Why does the code above return the following error?
>
> Error: cannot allocate vector of size 218.8 Mb

Possibilities:

Your workspace is full of other junk?

Your workspace used to be full of other junk, and its memory is now too fragmented to find a contiguous chunk?

Your computer is full of other junk?

You have not read the R-FAQ (or the RW-FAQ) items on the topic of memory usage on whatever operating system you are working with.

--
David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
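To check the first possibility, it can help to estimate how much memory the result needs before asking for it. The following is only a rough sketch: the real `s` is not shown in the thread, so a small made-up matrix stands in for it, and `n_rep` is a hypothetical name for the multiplier on `nrow(s)`.

```r
# Stand-in data (assumption): the real 's' does not appear in the thread.
s <- matrix(rnorm(23 * 5), nrow = 23, ncol = 5)
n_rep <- 10000  # hypothetical multiplier on nrow(s)

# A numeric (double) value takes 8 bytes, so the sampled matrix alone
# needs roughly this much memory:
est_bytes <- 8 * n_rep * nrow(s) * ncol(s)
est_mb <- est_bytes / 2^20
est_mb

# Size of an object that already exists in the workspace.
print(object.size(s), units = "Mb")

# Report current memory use and trigger a garbage collection.
gc()
```

The estimate covers only the final result. apply() builds intermediate copies along the way, so the peak requirement is likely to be higher.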
Re: [R] re-sampling of large sacle data
On Jul 27, 2010, at 6:44 PM, jd6688 wrote:

> I am trying to do the following to accomplish the task. Can anybody
> simplify the solution?
>
> Thanks,
>
> for (i in 1:10000){
>   d<-apply(s,2,sample)
>   pos_neg_tem<-t(apply(d,1,doit))
>   if (i>1){
>     pos_neg_pool<-rbind(pos_neg_pool,pos_neg_tem)
>   }else{
>     pos_neg_pool<- pos_neg_tem
>   }}

A bit of efficiency advice: incremental creation of objects is generally a major source of slowness. Consider creating pos_neg_pool before the loop and then "filling it in" within the loop. That would also let you remove the "if{}else{}" statement.

--
David Winsemius, MD
West Hartford, CT
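A minimal sketch of the "create first, then fill in" idea, under some assumptions: `s` is replaced by made-up data, `doit` is copied from the original post, and `n_runs` is a hypothetical name set to a small value so the example runs quickly.

```r
# Stand-in data (assumption); 'doit' follows the definition in the original post.
set.seed(1)
s <- matrix(rnorm(23 * 5), nrow = 23, ncol = 5)
doit <- function(x) c(sum_positive = sum(x[-1][x[-1] > 0]),
                      sum_negative = sum(x[-1][x[-1] < 0]))

n_runs <- 100  # hypothetical; kept small for illustration
n <- nrow(s)

# Allocate the full result once, before the loop.
pos_neg_pool <- matrix(NA_real_, nrow = n * n_runs, ncol = 2,
                       dimnames = list(NULL, c("sum_positive", "sum_negative")))

for (i in seq_len(n_runs)) {
  d <- apply(s, 2, sample)            # shuffle each column independently
  rows <- ((i - 1) * n + 1):(i * n)   # block of rows that belongs to run i
  pos_neg_pool[rows, ] <- t(apply(d, 1, doit))
}
```

Because each run writes into its own block of rows, no rbind() call is needed and the first iteration needs no special case.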
Re: [R] re-sampling of large sacle data
d <- apply(s, 2, sample, size = 10000*nrow(s), replace = TRUE)

Why does the code above return the following error?

Error: cannot allocate vector of size 218.8 Mb

--
View this message in context: http://r.789695.n4.nabble.com/re-sampling-of-large-sacle-data-tp2304165p2304413.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] re-sampling of large sacle data
It looks to me like you keep sampling from some dataset 's' 10,000 times. Since you can sample() with replacement, I wonder if you could just take a single sample of the size you want, rather than calling sample() inside a loop. Perhaps along these lines:

    d <- apply(s, 2, sample, size = 10000*nrow(s), replace = TRUE)
    pos_neg_tem <- t(apply(d,1,doit))

Josh

On Tue, Jul 27, 2010 at 3:44 PM, jd6688 wrote:
>
> I am trying to do the following to accomplish the task. Can anybody
> simplify the solution?
>
> Thanks,
>
> for (i in 1:10000){
>   d<-apply(s,2,sample)
>   pos_neg_tem<-t(apply(d,1,doit))
>   if (i>1){
>     pos_neg_pool<-rbind(pos_neg_pool,pos_neg_tem)
>   }else{
>     pos_neg_pool<- pos_neg_tem
>   }}

--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/
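One point to keep in mind with this suggestion: sample() with replace = TRUE draws a resample in which values may repeat, whereas the original apply(s, 2, sample) permutes each column without replacement. A small sketch of the difference, using made-up values rather than the poster's data:

```r
set.seed(42)
x <- c(1.5, -2, 3, 4.25, -5)  # made-up column values (assumption)

# Permutation: every value appears exactly once, only the order changes.
perm <- sample(x)

# Resample with replacement: values may repeat or be left out.
boot <- sample(x, size = length(x), replace = TRUE)

# A permutation always has the same sorted values as the input.
identical(sort(perm), sort(x))
```

Whether the two are interchangeable depends on what the resampling is meant to estimate, which the thread does not state.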
Re: [R] re-sampling of large sacle data
I am trying to do the following to accomplish the task. Can anybody simplify the solution?

Thanks,

for (i in 1:10000){
  d<-apply(s,2,sample)
  pos_neg_tem<-t(apply(d,1,doit))
  if (i>1){
    pos_neg_pool<-rbind(pos_neg_pool,pos_neg_tem)
  }else{
    pos_neg_pool<- pos_neg_tem
  }}

--
View this message in context: http://r.789695.n4.nabble.com/re-sampling-of-large-sacle-data-tp2304165p2304221.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] re-sampling of large sacle data
Write a function that incorporates "doit" and the column shuffle. Let's call it "doitbetter":

    replicate(10000, doitbetter())

You'll probably want to read the help for "replicate" to make sure the defaults are what you want.

--Gray

On Tue, Jul 27, 2010 at 4:43 PM, jd6688 wrote:
>
> myDF:
>
>           d1          d2          d3          d4           d5
> -0.166910351 0.022304377 -0.00825924 0.008330689 -0.000925938
> -0.166910351 0.022304377 -0.00825924 0.008330689 -0.000925938
> -0.166910351 0.022304377 -0.00825924 0.008330689 -0.168225938
> -0.166910351 0.022304377 -0.00825924 0.008330689 -0.168225938
> -0.166910351 0.022304377 -0.00825924 0.008330689 -0.168225938
> -0.166910351 0.022304377 -0.00825924 0.008330689 -0.168225938
> -0.166910351 0.022304377 -0.00825924 0.008330689 -0.168225938
> -0.166910351 0.022304377 -0.00825924 0.008330689 -0.168225938
> -0.166910351 0.022304377 -0.00825924 0.008330689 -0.168225938
> -0.166910351 0.022304377 -0.00825924 0.008330689 -0.168225938
> -0.166910351 0.022304377 -0.00825924 0.008330689 -0.168225938
> -0.166910351 0.022304377 -0.00825924 0.008330689 -0.168225938
> -0.166910351 0.022304377 -0.00825924 0.008330689 -0.168225938
> -0.166910351 0.022304377 -0.00825924 0.008330689 -0.168225938
> -0.166910351 0.022304377 -0.00825924 0.008330689 -0.168225938
> -0.166910351 0.022304377 -0.00825924 0.008330689 -0.168225938
> -0.166910351 0.022304377 -0.00825924 0.008330689 -0.168225938
> -0.166910351 0.022304377 -0.00825924 0.008330689 -0.168225938
> -0.166910351 0.022304377 -0.00825924 0.008330689 -0.168225938
> -0.166910351 0.022304377 -0.00825924 0.008330689 -0.168225938
> -0.166910351 0.022304377 -0.00825924 0.008330689 -0.168225938
> -0.166910351 0.022304377 -0.00825924 0.008330689 -0.168225938
> -0.166910351 0.022304377 -0.00825924 0.008330689 -0.168225938
>
> Per the data frame above:
>
> Step 1: do the following
>
> doit=function(x)c(sum_positive=sum(x[-1][x[-1]>0]),sum_negative=sum(x[-1][x[-1]<0]))
>
> pos_neg_pool<-t(apply(myDF,1,doit))
>
> If this is not the first run, append the data to pos_neg_pool.
>
> Step 2: reshuffle the data by columns, then do step 1. This step needs
> to run 10000 times.
>
> The output will be 23*10000=230,000 rows.
>
> Can anyone point out how to automate these 10000 runs in R?
>
> Thanks,
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/re-sampling-of-large-sacle-data-tp2304165p2304165.html
> Sent from the R help mailing list archive at Nabble.com.

--
Gray Calhoun
Assistant Professor of Economics, Iowa State University
http://www.econ.iastate.edu/~gcalhoun/
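The replicate() suggestion could be sketched roughly as follows. The body of `doitbetter` is an assumption about what was intended (column shuffle followed by `doit` on every row), `s` is made-up stand-in data for myDF, and `n_runs` is a hypothetical name set to a small value for illustration.

```r
set.seed(1)
s <- matrix(rnorm(23 * 5), nrow = 23, ncol = 5)  # stand-in for myDF (assumption)
doit <- function(x) c(sum_positive = sum(x[-1][x[-1] > 0]),
                      sum_negative = sum(x[-1][x[-1] < 0]))

# One run: shuffle each column, then apply 'doit' to every row.
doitbetter <- function(dat = s) {
  d <- apply(dat, 2, sample)
  t(apply(d, 1, doit))
}

n_runs <- 100  # hypothetical; the thread asks for many more runs

# simplify = FALSE keeps each run as a separate matrix in a list ...
runs <- replicate(n_runs, doitbetter(), simplify = FALSE)

# ... which can then be stacked into one matrix with nrow(s) * n_runs rows.
pos_neg_pool <- do.call(rbind, runs)
dim(pos_neg_pool)
```

With the default `simplify` setting, replicate() would instead try to combine the per-run matrices into an array, which is why the help page is worth reading before choosing.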