Re: [R] Using multicores in R
Thanks for the help.

Perhaps I should elaborate a bit. I am working on a bioinformatics project in which I am trying to run a forward selection algorithm for machine learning classification of two biological conditions. At each iteration I want to find the gene that, in addition to those I have already found, gives the best classification. It looks something like this:

    library(e1071)  # provides naiveBayes()

    # idx (columns selected so far) and res (results table) are assumed
    # to be initialised before the loop; column 20118 holds the class label.
    for (j in 1:5030) {
      tp <- 0
      for (i in 1:5030) {
        if (!(i %in% idx)) {
          # Fit on the candidate gene plus the genes already selected
          classifier <- naiveBayes(trn[, c(i, idx)], trn[, 20118])
          tbl <- table(predict(classifier, trn[, -20118]), trn[, 20118])
          # Fraction of correctly classified samples (diagonal of the 2x2 table)
          success <- (tbl[[1]] + tbl[[4]]) / (tbl[[1]] + tbl[[4]] + tbl[[2]] + tbl[[3]])
          if (success > tp) {
            tp <- success
            ind <- i
            gene <- names(trn)[i]
          }
        }
      }
      idx <- c(idx, ind)
      res <- rbind(res, data.frame(Iteration = j, Success = tp * 100, Gene = gene))
    }

I am no expert when it comes to programming, so I am not sure how to optimize my relatively primitive code in the best way...

Thanks,
Moriah

--
View this message in context: http://r.789695.n4.nabble.com/Using-multicores-in-R-tp4651808p4652034.html
Sent from the R help mailing list archive at Nabble.com.
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
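One possible way to spread the loop above over several cores, as a minimal sketch rather than a drop-in replacement: the outer loop has to stay sequential because each step depends on the genes already chosen, but the candidates inside one step are independent and can be scored with mclapply() from the "parallel" package. The names score_subset() and forward_select(), the nearest-centroid scoring rule (a stand-in for naiveBayes() so the example runs without extra packages) and the toy data are all assumptions made for illustration.

```r
library(parallel)  # ships with R

# Hypothetical stand-in for the naiveBayes() step: training accuracy of a
# nearest-centroid rule that uses only the columns in `cols`.
score_subset <- function(cols, x, y) {
  xs <- as.matrix(x[, cols, drop = FALSE])
  # Squared distance of every row to each class centroid (one column per class)
  d <- sapply(levels(y), function(l) {
    centre <- colMeans(xs[y == l, , drop = FALSE])
    rowSums(sweep(xs, 2, centre)^2)
  })
  pred <- levels(y)[apply(d, 1, which.min)]
  mean(pred == y)
}

# Forward selection: sequential over steps, parallel over candidates.
forward_select <- function(x, y, n_steps, cores = 1L) {
  idx <- integer(0)
  res <- NULL
  for (j in seq_len(n_steps)) {
    candidates <- setdiff(seq_len(ncol(x)), idx)
    scores <- unlist(mclapply(candidates,
                              function(i) score_subset(c(i, idx), x, y),
                              mc.cores = cores))
    best <- which.max(scores)  # first maximum, like `success > tp`
    idx <- c(idx, candidates[best])
    res <- rbind(res, data.frame(Iteration = j,
                                 Success = 100 * scores[best],
                                 Gene = names(x)[candidates[best]]))
  }
  res
}

# Toy data (assumption): six "genes", only X3 differs between the classes.
set.seed(1)
y <- factor(rep(c("a", "b"), each = 50))
x <- data.frame(matrix(rnorm(100 * 6), nrow = 100))
x$X3 <- x$X3 + ifelse(y == "b", 3, 0)

# mclapply() forks, so more than one core is not available on Windows.
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L
res <- forward_select(x, y, n_steps = 2, cores = n_cores)
print(res)
```

Whether this helps in practice depends on how expensive one model fit is compared with the cost of forking and collecting results.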
Re: [R] Using multicores in R
Moriah,

Since you are doing nested loops, Rcpp may be an easy speed-up. Follow all the links at http://blog.revolutionanalytics.com/2012/11/hadleys-guide-to-high-performance-r-with-rcpp.html for details.

HTH,
Jim Porzak
Minted.com
San Francisco, CA
www.linkedin.com/in/jimporzak
use R! Group SF: www.meetup.com/R-Users/
Re: [R] Using multicores in R
1. Have you looked at the CRAN Task View "High-Performance and Parallel Computing with R" (http://cran.r-project.org/web/views/HighPerformanceComputing.html)?

2. Have you tried the "compiler" package? If I understand correctly, it translates R functions into byte code, which is then run by a byte code interpreter. If my memory is correct, this can cut compute time noticeably for loop-heavy code, although the gain depends strongly on what the code does.

3. Have you reviewed the section on "Profiling R code for speed" in the "Writing R Extensions" manual that becomes available after help.start()? The profiling tools discussed there help identify the portion of more complex code that takes the most time. The standard advice then is to experiment with writing the most time-consuming portion several different ways. I've seen many examples where writing what appears to be the same thing in R several different ways identifies one that is easily 10 and maybe 100 or 1000 times faster than the slowest alternative tried.

4. Have you tried using the "sos" package to search for other functions and packages in R that may already have good code doing some of the things you want to do? The "findFn" function in "sos" searches the "functions" subset of the "RSiteSearch" database and returns the result sorted by package. There are also "union" and "writeFindFn2xls" functions, described in a vignette, that make it easy to manipulate and evaluate the results. It's the best literature search I know for anything statistical: if I don't find it there, it's OK to look someplace else. [Caveat: I'm the lead author of "sos", so I'm biased.]
Best Wishes,
Spencer

--
Spencer Graves, PE, PhD
President and Chief Technology Officer
Structure Inspection and Monitoring, Inc.
751 Emerson Ct.
San José, CA 95126
ph: 408-655-4567
web: www.structuremonitoring.com
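Points 2 and 3 above could be tried out roughly as follows. This is only a sketch: squares_loop() and squares_vec() are made-up functions chosen because they compute the same thing in a slow and a fast way, and the timings will differ from machine to machine. Note also that recent versions of R byte-compile functions automatically, so cmpfun() may make little or no difference there.

```r
library(compiler)  # ships with R; provides cmpfun()

# Hypothetical slow function: grows a vector one element at a time
squares_loop <- function(n) {
  out <- numeric(0)
  for (i in seq_len(n)) out <- c(out, i^2)
  out
}

# The same result written in vectorised form
squares_vec <- function(n) seq_len(n)^2

# Profile the slow version to see where the time goes
prof_file <- tempfile()
Rprof(prof_file, interval = 0.01)
invisible(squares_loop(2e4))
Rprof(NULL)
# NULL if the run was too short for any samples to be recorded
prof <- tryCatch(summaryRprof(prof_file)$by.self, error = function(e) NULL)
print(head(prof))

# Byte-compile the slow version explicitly
squares_cmp <- cmpfun(squares_loop)

# Compare the three variants
print(system.time(squares_loop(2e4)))
print(system.time(squares_cmp(2e4)))
print(system.time(squares_vec(2e4)))
```

In a case like this the rewrite usually matters far more than the compilation step, which is the point of trying several formulations of the hot spot.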
Re: [R] Using multicores in R
And also:

On Monday, December 3, 2012, Uwe Ligges wrote:
> Errr, but otherwise multicore does not have an effect ...
>
> See package "parallel" that offers various functions for parallel
> computations. We cannot help much more if you do not tell us what the
> technical reasons are why mclapply() does not work.

If the work you are doing within each iteration of the loop is trivial, you will likely even see a decrease in performance if you try to parallelize it.

Without more info from you regarding your problem, there's little we can do to help, tho.

-Steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
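The point about trivial per-iteration work can be checked with a quick timing experiment along these lines. It is a rough sketch: trivial() is a made-up placeholder, and the elapsed times depend on the machine, so no particular numbers should be expected.

```r
library(parallel)  # ships with R

# mclapply() forks, so more than one core is not available on Windows.
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

# Hypothetical per-iteration task that does almost no work
trivial <- function(i) sqrt(i)

t_serial <- system.time(r_serial <- lapply(1:10000, trivial))
t_parallel <- system.time(
  r_parallel <- mclapply(1:10000, trivial, mc.cores = n_cores)
)

# Elapsed seconds for each variant; with work this cheap, the cost of
# forking and collecting results can outweigh any gain from extra cores.
print(c(serial = t_serial[["elapsed"]], parallel = t_parallel[["elapsed"]]))
```

Both calls return the same list, so the only thing that changes is where the time goes.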
Re: [R] Using multicores in R
On 03.12.2012 11:14, moriah wrote:
> Hi,
>
> I have an R script which is time consuming because it has two nested loops
> in it of at least 5000 iterations each, I have tried to use the multicore
> package but it doesn't seem to improve the elapsed time of the script (a
> shorter script for example) and I can't use mclapply because of technical
> reasons.

Errr, but otherwise multicore does not have an effect ...

See package "parallel" that offers various functions for parallel computations. We cannot help much more if you do not tell us what the technical reasons are why mclapply() does not work.

Best,
Uwe Ligges

> I was wondering how can I make my script use more cores and memory because I
> am running it on a server and it is a shame that it uses only one core.
>
> Thanks!
> Moriah
[R] Using multicores in R
Hi,

I have an R script which is time consuming because it contains two nested loops of at least 5000 iterations each. I have tried to use the multicore package, but it doesn't seem to improve the elapsed time of the script (a shorter script, for example), and I can't use mclapply for technical reasons.

I was wondering how I can make my script use more cores and memory, because I am running it on a server and it is a shame that it uses only one core.

Thanks!
Moriah