Re: [R] Amazon AWS, RGenoud, Parallel Computing

2011-06-11 Thread Mike Marchywka

 Date: Sat, 11 Jun 2011 13:03:10 +0200
 From: lui.r.proj...@googlemail.com
 To: r-help@r-project.org
 Subject: [R] Amazon AWS, RGenoud, Parallel Computing

 Dear R group,


[...]

 I am a little bit puzzled now about what I could do... It seems like
 there are only very limited options for me to increase the
 performance. Does anybody have experience with parallel computations
 with rGenoud or parallelized sorting algorithms? I think one major
 problem is that each sort is rather quick (only a few hundred
 entries to sort), but needs to be done very frequently (population
 size 2000, iterations 500), so I guess the housekeeping of the
 parallel computation diminishes all the benefits.

Is your sort part of the algorithm, or do you have to sort results
after getting them back out of order from async processes? One of
my favorite anecdotes is how I used a bash sort on a huge data
file to make a program run faster (it went from impractically slow at
zero percent CPU to very fast at full CPU, and a lack of CPU
saturation is exactly what you are complaining about). A couple of
comments. First, if you have specialized apps you need optimized,
you may want to write dedicated C++ code. However, this won't help if
you don't find the bottleneck. Lack of CPU saturation could
easily be due to waiting on things like disk IO or VM
swap. You really ought to find the bottleneck first; it
could be anything (except the CPU, apparently). The sort
that I used prevented VM thrashing with no change to the app
code - the app got sorted data, so VM paging became infrequent.
If you can specify the problem precisely, you may be able to find
a simple solution.
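As a first step, here is a minimal sketch of finding the hot spot with
R's own sampling profiler (the function f below is a made-up stand-in
for your objective, not rgenoud itself):

    ## stand-in for the per-individual work: sort a few hundred values
    f <- function(x) sum(sort(runif(300))[seq_along(x)] * x^2)

    Rprof("prof.out")                       # start the sampling profiler
    res <- replicate(1e5, f(rnorm(10)))     # enough calls to get samples
    Rprof(NULL)                             # stop profiling
    summaryRprof("prof.out")$by.self        # where the time really goes

If the top entries are not the sort itself, more cores won't buy you
anything.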



Re: [R] Amazon AWS, RGenoud, Parallel Computing

2011-06-11 Thread Lui ##
Hello Mike,

thank you very much for your response!
To the best of my knowledge, the sort algorithm in R is already
backed by compiled C code rather than written natively in R. Writing
the code in C++ is not really an option either (I think rGenoud is
also written in C++). I am not sure whether there really is a
bottleneck on the machine side - I/O is pretty low, plenty of RAM
left, etc. It really seems to me as if parallelizing is not easily
possible, or only at costs so high that the benefits are eaten up by
all the coordination and handling needed...
Did anybody use rGenoud in cluster mode and experience something similar?
Are there quicksort packages that use multiple processors efficiently?
(I didn't find any... :-( )

I am by no means an expert on parallel processing, but is it possible
that the benefits of parallelizing a process greatly diminish when a
large set of variables/functions has to be made available to the
workers while the actual function call (in this case sorting a few
hundred entries) is quite short and the number of calls is very high?
It was quite striking that the first run usually took several hours
(instead of half an hour) while the subsequent runs were much, much
faster...
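Roughly what I mean, as a minimal sketch (sizes made up to mirror my
problem; the parallel package used here exposes the same cluster API
as snow):

    library(parallel)
    pop <- replicate(2000, runif(300), simplify = FALSE)

    cl <- makeCluster(4)
    t.par <- system.time(parLapply(cl, pop, sort))  # ship, sort, collect
    stopCluster(cl)

    t.ser <- system.time(lapply(pop, sort))         # plain serial sorts
    rbind(parallel = t.par, serial = t.ser)         # serial usually wins

For a few hundred elements per call, shipping the data seems to cost
far more than the sort itself, and the slow first run would fit the
one-time cost of starting the workers and exporting everything to them.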

There is so much happening behind the scenes that it is a little
hard for me to tell what might help - and what will not...

Help appreciated :-)
Thank you

Lui

On Sat, Jun 11, 2011 at 4:42 PM, Mike Marchywka marchy...@hotmail.com wrote:
 [...]


Re: [R] Amazon AWS, RGenoud, Parallel Computing

2011-06-11 Thread Mike Marchywka

 Date: Sat, 11 Jun 2011 19:57:47 +0200
 Subject: Re: [R] Amazon AWS, RGenoud, Parallel Computing
 From: lui.r.proj...@googlemail.com
 To: marchy...@hotmail.com
 CC: r-help@r-project.org

 Hello Mike,

 thank you very much for your response!
 To the best of my knowledge, the sort algorithm in R is already
 backed by compiled C code rather than written natively in R. Writing
 the code in C++ is not really an option either (I think rGenoud is
 also written in C++). I am not sure whether there really is a
 bottleneck on the machine side - I/O is pretty low, plenty of RAM
 left, etc. It really seems to me as if parallelizing is not easily
 possible, or only at costs so high that the benefits are eaten up by
 all the coordination and handling needed...
 Did anybody use rGenoud in cluster mode and experience something similar?
 Are there quicksort packages that use multiple processors efficiently?
 (I didn't find any... :-( )

I'm no expert, but these don't seem to be terribly subtle problems
in most cases. Sure, if the task is not suited to parallelism and
you force it to be parallel, it can spend all its time syncing
up. Just making more tasks fight over the bottleneck - memory,
CPU, locks - can easily make things worse.
I think I posted a link earlier to an IEEE blurb showing
how easy it is for many cores to make things worse on non-contrived
benchmarks.
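If you do try parallel sorts anyway, the standard mitigation is coarser
granularity, so each dispatch carries real work. A sketch (sizes made
up, not tested against rGenoud):

    library(parallel)
    pop <- replicate(2000, runif(300), simplify = FALSE)
    chunks <- split(pop, rep(1:4, each = 500))   # 4 chunks of 500 sorts

    cl <- makeCluster(4)
    t.fine   <- system.time(parLapply(cl, pop, sort))
    t.coarse <- system.time(parLapply(cl, chunks,
                                      function(ch) lapply(ch, sort)))
    stopCluster(cl)
    rbind(fine = t.fine, coarse = t.coarse)      # coarse amortizes dispatch

Same total work, one message per 500 sorts instead of one per sort.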

 I am by no means an expert on parallel processing, but is it possible
 that the benefits of parallelizing a process greatly diminish when a
 large set of variables/functions has to be made available to the
 workers while the actual function call (in this case sorting a few
 hundred entries) is quite short and the number of calls is very high?
 It was quite striking that the first run usually took several hours
 (instead of half an hour) while the subsequent runs were much, much
 faster...

 There is so much happening behind the scenes that it is a little
 hard for me to tell what might help - and what will not...

 Help appreciated :-)
 Thank you

 Lui

 On Sat, Jun 11, 2011 at 4:42 PM, Mike Marchywka  wrote:
 [...]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.