Re: [R] Multiple CPU HowTo in Linux?
Hello all,

Thanks for your input, and helping to clear things up on where to go. I will try out the multicore package and see if there are further bottlenecks. It looks like some loops might need special treatment with parallelization.

I have been pampered with the excellent walk-through vignettes of the packages I have used so far. The HPC package guides lacked something in the practical aspects of their usage.

lapply <- mclapply at the beginning of my script? Well, I never would have thought of such a thing. Thanks!

I might be back on the list when I run out of physical RAM ;-)

Edwin

--
On Tue, 14 Sep 2010 12:11:17 -0700 Martin Morgan wrote:
> Hi Edwin -- Since you have a Bioconductor package, you might ask on the
> Bioconductor list, as the authors of some computationally intensive
> tasks have provided facilities for relatively transparent use of, e.g.,
> multicore or Rmpi. In ShortRead, for instance, loading multicore is
> enough to distribute some tasks across cores, and the srapply function
> can help (or not; things might be as easy as lapply <- mclapply at the
> top of your script) with your own lapply-like code.
>
> http://bioconductor.org/help/mailing-list/
>
> Martin
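For readers wondering what "special treatment" for a loop might look like, here is a minimal sketch, not taken from the thread. It assumes the loop iterations are independent of one another; `fit_one` is a hypothetical stand-in for the real per-iteration work, and two worker processes are an arbitrary choice. `mclapply()` is loaded from the `parallel` package, which has shipped with R since 2.14 and took over this function from multicore.

```r
library(parallel)

# Assumption: two workers; forking is unavailable on Windows, so fall back to 1.
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

# Hypothetical stand-in for the expensive body of the loop.
fit_one <- function(i) sum(sqrt(seq_len(i * 1000)))

# Original style: a for loop filling a preallocated vector.
loop_out <- numeric(6)
for (i in 1:6) loop_out[i] <- fit_one(i)

# Same work with the loop body passed as a function, one call per index.
par_out <- unlist(mclapply(1:6, fit_one, mc.cores = n_cores))

# Both versions should agree.
print(all.equal(loop_out, par_out))
```

On the RAM remark: forked workers start out sharing the parent's memory, but data they modify and results they return take extra space, so memory use can grow with the number of workers.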
Re: [R] Multiple CPU HowTo in Linux?
On 09/14/2010 08:36 AM, Christian Raschke wrote:
> On 09/14/2010 10:01 AM, Edwin Groot wrote:
>> Hello all,
>> I upgraded my R workstation, and to my dismay, only one core appears to
>> be used during intensive computation of a bioconductor function.

Hi Edwin -- Since you have a Bioconductor package, you might ask on the Bioconductor list, as the authors of some computationally intensive tasks have provided facilities for relatively transparent use of, e.g., multicore or Rmpi. In ShortRead, for instance, loading multicore is enough to distribute some tasks across cores, and the srapply function can help (or not; things might be as easy as lapply <- mclapply at the top of your script) with your own lapply-like code.

http://bioconductor.org/help/mailing-list/

Martin

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
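The "lapply <- mclapply at the top of your script" suggestion could look roughly like the sketch below. This is an illustration under assumptions, not Martin's code: `slow_square` is a hypothetical placeholder for real work, the worker count of two is arbitrary, and `mclapply()` is taken from the `parallel` package that later absorbed multicore.

```r
library(parallel)

# Assumption: two workers; forking is unavailable on Windows, so fall back to 1.
# mclapply() reads its default mc.cores from this option.
options(mc.cores = if (.Platform$OS.type == "windows") 1L else 2L)

# Mask base::lapply for code evaluated in the global environment of this script.
lapply <- mclapply

# Hypothetical stand-in for an expensive per-item computation.
slow_square <- function(x) {
  Sys.sleep(0.1)
  x^2
}

squares <- lapply(1:8, slow_square)  # dispatches to mclapply()
print(unlist(squares))

# Remove the mask when done so later code sees base::lapply again.
rm(lapply)
```

Note that the mask only affects calls that resolve `lapply` through the global environment. Functions inside a package look it up in their own namespace and keep using `base::lapply`, which is why package authors have to provide parallel support themselves.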
Re: [R] Multiple CPU HowTo in Linux?
Edwin,

I'm not sure what you mean by "adapting"; other than installing multicore, there is nothing else to set up. How and whether you could then parallelise your code strongly depends on the specific problem you are facing.

What I have done in the past was to look at the source of the functions from whatever package I was using that produced the bottleneck. If what is taking the longest time is actually embarrassingly parallel, mclapply() from package multicore can help. In the simplest case you could simply replace lapply() in the code with an appropriate mclapply(). Check out ?mclapply. But then again you might have to get a little more creative, depending on exactly what in the code is taking so long to run. If your problem is inherently sequential then even multicore won't help.

Christian

On 09/14/2010 09:35 AM, Edwin Groot wrote:
> Hello Cedrick,
> Ah, yes, that looks like it would apply to my situation. I was
> previously reading on snow, which is tailored for clusters, rather than
> a single desktop computer.
> Anyone with experience adapting multicore to an R-script?
> I have to admit I know little about parallel processing,
> multiprocessing and cluster processing.
>
> Edwin

--
Christian Raschke
Department of Economics and ISDS Research Lab (HSRG)
Louisiana State University
Patrick Taylor Hall, Rm 2128
Baton Rouge, LA 70803
cras...@lsu.edu
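To make the distinction above concrete, here is a small sketch contrasting an embarrassingly parallel task with an inherently sequential one. The data and the `summarise_one` function are made up for illustration, the worker count of two is an assumption, and `mclapply()` comes from the `parallel` package (the successor to multicore).

```r
library(parallel)

# Assumption: two workers; forking is unavailable on Windows, so fall back to 1.
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

# Embarrassingly parallel: each list element is processed independently.
set.seed(1)
inputs <- replicate(8, rnorm(1e5), simplify = FALSE)  # hypothetical data
summarise_one <- function(x) median(sort(x))          # hypothetical bottleneck

serial_res   <- lapply(inputs, summarise_one)
parallel_res <- mclapply(inputs, summarise_one, mc.cores = n_cores)

# The drop-in replacement should return the same list, in the same order.
print(identical(serial_res, parallel_res))

# Inherently sequential: each step needs the previous result,
# so there is nothing to hand out to separate workers.
state <- 1
for (i in 1:5) state <- state * 0.5 + i
print(state)
```

The first half is the "replace lapply() with mclapply()" case; the second half is the kind of recurrence where adding cores gives no speedup.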
Re: [R] Multiple CPU HowTo in Linux?
Hello Cedrick,

Ah, yes, that looks like it would apply to my situation. I was previously reading on snow, which is tailored for clusters, rather than a single desktop computer.

Anyone with experience adapting multicore to an R-script? I have to admit I know little about parallel processing, multiprocessing and cluster processing.

Edwin

On Tue, 14 Sep 2010 10:15:42 -0400 "Johnson, Cedrick W." wrote:
> ?multicore perhaps

Dr. Edwin Groot, postdoctoral associate
AG Laux
Institut fuer Biologie III
Schaenzlestr. 1
79104 Freiburg, Deutschland
+49 761-2032945
Re: [R] Multiple CPU HowTo in Linux?
?multicore perhaps

On 09/14/2010 10:01 AM, Edwin Groot wrote:
> Hello all,
> I upgraded my R workstation, and to my dismay, only one core appears to
> be used during intensive computation of a bioconductor function.
[R] Multiple CPU HowTo in Linux?
Hello all,

I upgraded my R workstation, and to my dismay, only one core appears to be used during intensive computation of a bioconductor function.

What I have now is two dual-core Xeon 5160 CPUs and 10 GB RAM. When I fully load it, top reports about 25% user, 75% idle and 0.98 short-term load.

The archives gave nothing helpful besides mention of snow. I thought of posting to HPC, but this system is fairly modest WRT processing power.

Any pointers of where to start?

---
#Not running anything at the moment
> sessionInfo()
R version 2.11.1 (2010-05-31)
x86_64-pc-linux-gnu

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_GB.UTF-8
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] tools_2.11.1
---
$ uname -a
Linux laux29 2.6.26-2-amd64 #1 SMP Sun Jun 20 20:16:30 UTC 2010 x86_64 GNU/Linux
---

Thanks for your help,
Edwin

--
Dr. Edwin Groot, postdoctoral associate
AG Laux
Institut fuer Biologie III
Schaenzlestr. 1
79104 Freiburg, Deutschland
+49 761-2032945
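The top figures in the post above are what one would expect from a single-threaded process on this machine. The sketch below just works through that arithmetic; the core count of four is taken from the "two dual-core" description, and `detectCores()` from the `parallel` package (not available in R 2.11.1, where multicore was the option) is shown only as a way to check what R sees on a current installation.

```r
library(parallel)

# What R reports on the machine running this snippet (may be NA on some platforms).
n_detected <- detectCores()
print(n_detected)

# The workstation in the post: two CPUs with two cores each.
cores_in_post <- 2 * 2

# One fully busy single-threaded process occupies one core out of four.
busy_share <- 1 / cores_in_post
cat(sprintf("One busy core out of %d = %.0f%% user, %.0f%% idle\n",
            cores_in_post, 100 * busy_share, 100 * (1 - busy_share)))
```

A load average near 1 fits the same picture: roughly one process is runnable at any time, so the other three cores have nothing to do.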