Re: [R] Multiple CPU HowTo in Linux?

2010-09-15 Thread Edwin Groot
Hello all,
Thanks for your input, and for helping to clear up where to go next.
I will try out the multicore package and see if there are further
bottlenecks. It looks like some loops might need special treatment for
parallelization.
I have been pampered by the excellent walk-through vignettes of the
packages I have used so far. The HPC package guides were lacking in
the practical aspects of their usage.
lapply <- mclapply at the beginning of my script? Well, I never would
have thought of such a thing. Thanks!
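
On the point about loops needing special treatment: one common approach is to rewrite the loop body as a function of the index, so that the iterations can be handed to mclapply(). The sketch below assumes the parallel package (which absorbed multicore as of R 2.14.0) and a Linux machine; work() and mc.cores = 2 are hypothetical placeholders, and it only applies when iterations do not depend on each other.

```r
library(parallel)  # provides mclapply(); multicore was merged into it

# Hypothetical per-iteration work; each iteration depends only on i.
work <- function(i) sum(sqrt(seq_len(i)))

# Loop form: fills a preallocated vector one element at a time.
n <- 6
out_loop <- numeric(n)
for (i in seq_len(n)) {
  out_loop[i] <- work(i)
}

# Same computation expressed as a function applied over the indices,
# so the iterations can be distributed across forked worker processes.
# mc.cores = 2 is an assumed value; forking is not available on Windows.
out_par <- unlist(mclapply(seq_len(n), work, mc.cores = 2L))

# Compare the two results.
all.equal(out_loop, out_par)
```

A loop whose iteration i needs the result of iteration i - 1 cannot be converted this way without restructuring the algorithm.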

I might be back on the list when I run out of physical RAM ;-)

Edwin

Re: [R] Multiple CPU HowTo in Linux?

2010-09-14 Thread Martin Morgan
On 09/14/2010 10:01 AM, Edwin Groot wrote:
> Hello all,
> I upgraded my R workstation, and to my dismay, only one core appears to
> be used during intensive computation of a bioconductor function.

Hi Edwin -- Since you are using a Bioconductor package, you might ask on
the Bioconductor list, as the authors of some computationally intensive
tasks have provided facilities for relatively transparent use of, e.g.,
multicore or Rmpi. In ShortRead, for instance, loading multicore is
enough to distribute some tasks across cores, and the srapply function
can help (or not; things might be as easy as lapply <- mclapply at the
top of your script) with your own lapply-like code.
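
A minimal sketch of the lapply <- mclapply idea, assuming the parallel package (which absorbed multicore as of R 2.14.0) and a Linux machine. The wrapper function and mc.cores = 2 are illustrative choices and are not taken from ShortRead. Note that the masking only affects calls written in the script itself: functions inside packages look up lapply through their own namespace and keep using base::lapply.

```r
library(parallel)

# Mask base lapply() for code in this script. The wrapper forwards all
# arguments to mclapply(); mc.cores = 2 is an assumed value.
lapply <- function(X, FUN, ...) mclapply(X, FUN, ..., mc.cores = 2L)

# Calls written in the script now run through mclapply().
script_result <- lapply(1:4, function(i) i * 10)

# Remove the mask to get the ordinary serial lapply() back.
rm(lapply)
restored <- identical(lapply, base::lapply)
```

Because package code is unaffected, this trick helps with your own lapply-like code but will not by itself parallelise a bottleneck that sits inside a package function.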

http://bioconductor.org/help/mailing-list/

Martin

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Multiple CPU HowTo in Linux?

2010-09-14 Thread Christian Raschke

Edwin,

I'm not sure what you mean by "adapting"; other than installing 
multicore, there is nothing else to set up. How and whether you could 
then parallelise your code strongly depends on the specific problem you 
are facing.


What I have done in the past was to look at the source of the functions
from whatever package I was using that produced the bottleneck. If what
is taking the longest time is actually embarrassingly parallel,
mclapply() from package multicore can help. In the simplest case you
could simply replace lapply() in the code with an appropriate mclapply().
Check out ?mclapply. But then again you might have to get a little more
creative, depending on exactly what in the code is taking so long to
run. If your problem is inherently sequential then even multicore won't
help.
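
The simplest case described above can be sketched as follows. This assumes the parallel package, which has shipped with R since 2.14.0 and includes mclapply() from multicore; slow_square() is a hypothetical stand-in for an expensive, independent per-item computation, and the core count is an assumed value.

```r
library(parallel)  # provides mclapply()

# Hypothetical stand-in for an expensive, independent per-item computation.
slow_square <- function(x) {
  Sys.sleep(0.1)
  x^2
}

inputs <- 1:8

# Serial version.
serial <- lapply(inputs, slow_square)

# Parallel version: same arguments plus mc.cores. Forking is not available
# on Windows, where mc.cores must stay at 1.
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L
parallel_res <- mclapply(inputs, slow_square, mc.cores = n_cores)

# Both calls return a list with one element per input, in input order.
all.equal(serial, parallel_res)
```

Wrapping each call in system.time() is an easy way to check whether the parallel version actually pays off for a given workload.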


Christian

--
Christian Raschke
Department of Economics
and
ISDS Research Lab (HSRG)
Louisiana State University
Patrick Taylor Hall, Rm 2128
Baton Rouge, LA 70803
cras...@lsu.edu


Re: [R] Multiple CPU HowTo in Linux?

2010-09-14 Thread Edwin Groot
Hello Cedrick,
Ah, yes, that looks like it would apply to my situation. I was
previously reading about snow, which is tailored for clusters rather than
a single desktop computer.
Does anyone have experience adapting an R script to use multicore?
I have to admit I know little about parallel processing,
multiprocessing and cluster processing.

Edwin

On Tue, 14 Sep 2010 10:15:42 -0400
 "Johnson, Cedrick W."  wrote:
>   ?multicore perhaps
> 

Dr. Edwin Groot, postdoctoral associate
AG Laux
Institut fuer Biologie III
Schaenzlestr. 1
79104 Freiburg, Deutschland
+49 761-2032945


Re: [R] Multiple CPU HowTo in Linux?

2010-09-14 Thread Johnson, Cedrick W.

 ?multicore perhaps


[R] Multiple CPU HowTo in Linux?

2010-09-14 Thread Edwin Groot
Hello all,
I upgraded my R workstation, and to my dismay, only one core appears to
be used during intensive computation of a Bioconductor function.
What I have now is two dual-core Xeon 5160 CPUs and 10 GB RAM. When I
fully load it, top reports about 25% user, 75% idle and 0.98 short-term
load.
The archives gave nothing helpful besides mention of snow. I thought of
posting to HPC, but this system is fairly modest WRT processing power.
Any pointers on where to start?
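
As a quick sanity check on the numbers above: two dual-core CPUs give four cores, and one fully busy single-threaded R process on four cores shows up as roughly 25% user time with a load average near 1. The sketch below works through that arithmetic; it assumes the parallel package (in R since 2.14.0) for detectCores(), and the variable names are illustrative.

```r
library(parallel)

# Cores visible to R on the current machine (can be NA on some platforms).
n_cores <- detectCores()

# Two dual-core CPUs, as described in the post.
cores_in_post <- 2 * 2

# Share of total CPU capacity used by one fully busy single-threaded process.
busy_fraction <- 1 / cores_in_post   # 0.25
```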
---
#Not running anything at the moment
> sessionInfo()
R version 2.11.1 (2010-05-31) 
x86_64-pc-linux-gnu 

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_GB.UTF-8
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base


loaded via a namespace (and not attached):
[1] tools_2.11.1
---
$ uname -a
Linux laux29 2.6.26-2-amd64 #1 SMP Sun Jun 20 20:16:30 UTC 2010 x86_64
GNU/Linux
---
Thanks for your help,
Edwin
-- 
Dr. Edwin Groot, postdoctoral associate
AG Laux
Institut fuer Biologie III
Schaenzlestr. 1
79104 Freiburg, Deutschland
+49 761-2032945
