Re: [R] Amazon AWS, RGenoud, Parallel Computing
Date: Sat, 11 Jun 2011 13:03:10 +0200
From: lui.r.proj...@googlemail.com
To: r-help@r-project.org
Subject: [R] Amazon AWS, RGenoud, Parallel Computing

> Dear R group, [...] I am a little bit puzzled now about what I could do... It seems like there are only very limited options for me to increase the performance. Does anybody have experience with parallel computations with rGenoud or parallelized sorting algorithms? I think one major problem is that the sorting happens rather quickly (only a few hundred entries to sort) but needs to be done very frequently (population size 2000, 500 iterations), so I guess the housekeeping overhead of the parallel computation diminishes all the benefits.

Is the sort part of your algorithm, or do you have to sort results after getting them back out of order from asynchronous processes? One of my favorite anecdotes is how I used a bash sort on a huge data file to make a program run faster (from an impractical zero percent CPU to very fast with full CPU usage, and you are complaining about exactly such a lack of CPU saturation).

A couple of comments. First, if you have specialized apps you need optimized, you may want to write dedicated C++ code. However, this won't help if you don't find the bottleneck. Lack of CPU saturation could easily be due to waiting on something like disk I/O or VM swap. You really ought to find the bottleneck first; it could be anything (except, perhaps, the CPU). The sort that I used prevented VM thrashing with no change to the app code: the app received sorted data, so VM paging became infrequent. If you can specify the problem precisely, you may be able to find a simple solution.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
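The pre-sort trick described above can be sketched in a few lines of shell. The filenames and key column here are hypothetical stand-ins (the actual app and data were never posted); the point is that the external sort runs once, outside the app, so the app then reads records in key order and its working set stays local.

```shell
# Sketch of the external pre-sort trick (filenames and the sort key
# are hypothetical; adapt to the real data).

# Stand-in for the "huge data file": a small unsorted CSV.
printf 'beta,2\nalpha,1\ngamma,3\n' > big_input.csv

# Before optimizing, check whether the app is even CPU-bound: high
# I/O-wait ('wa') or swap ('si'/'so') columns in `vmstat 5` point
# away from the CPU.

# GNU/POSIX sort is a disk-backed merge sort, so it handles files
# larger than RAM. Sort once on the first comma-separated field:
sort -t',' -k1,1 big_input.csv > big_input.sorted.csv

cat big_input.sorted.csv
# The app itself is unchanged -- it simply reads the sorted copy:
# ./myapp big_input.sorted.csv
```

Because the app then touches pages in roughly sequential order, VM paging drops without any change to the application code, which is the whole anecdote in miniature.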
Re: [R] Amazon AWS, RGenoud, Parallel Computing
Hello Mike,

thank you very much for your response! To the best of my knowledge, the sort algorithm implemented in R is already backed by C++ code and not written natively in R, and writing the code in C++ myself is not really an option either (I think rGenoud is also written in C++). I am not sure there really is a bottleneck on the machine side: I/O is pretty low, there is plenty of RAM left, and so on. It really seems to me as if parallelizing is not easily possible, or only at such high cost that the benefits are eaten up by all the coordination and handling needed. Did anybody use rGenoud in cluster mode and experience something similar? Are there quicksort packages that use multiple processors efficiently? (I didn't find any... :-( )

I am by no means an expert on parallel processing, but is it possible that the benefits of parallelizing a process greatly diminish if a large set of variables/functions needs to be made available to the workers while the actual function (in this case sorting a few hundred entries) is quite short and is called a very large number of times? It was quite striking that the first run usually took several hours (instead of half an hour) and the subsequent runs were much, much faster. There is so much happening behind the scenes that it is a little hard for me to tell what might help and what will not.

Help appreciated :-) Thank you, Lui

On Sat, Jun 11, 2011 at 4:42 PM, Mike Marchywka <marchy...@hotmail.com> wrote:
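The granularity problem Lui describes (a sort that takes microseconds, dispatched thousands of times) can be illustrated with a rough shell sketch. The counts and file are arbitrary stand-ins; the point is that every dispatched task pays a fixed setup cost (process or worker startup, data transfer) that can easily exceed the cost of sorting a few hundred entries.

```shell
# Stand-in for one small sort job: a tiny file of numbers.
printf '5\n3\n4\n1\n2\n' > tiny.txt

# Variant 1: 50 tiny sorts run back to back in one shell.
time sh -c 'for i in $(seq 50); do sort -n tiny.txt > /dev/null; done'

# Variant 2: the same 50 sorts dispatched as separate tasks, four at
# a time, via xargs -P. Each task adds its own fork/exec and
# scheduling overhead on top of the actual sort.
time seq 50 | xargs -P 4 -I{} sh -c 'sort -n tiny.txt > /dev/null'

# For work this small, the "parallel" variant is often no faster and
# can be slower: the housekeeping dominates, which matches the
# behaviour described in the thread. The timings themselves will
# vary by machine, so no specific numbers are claimed here.
```

The same logic applies inside R's `parallel`/snow machinery: shipping the environment to each worker is the analogue of the fork/exec cost here, and it is paid per dispatch, not per byte of useful work.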
Re: [R] Amazon AWS, RGenoud, Parallel Computing
Date: Sat, 11 Jun 2011 19:57:47 +0200
Subject: Re: [R] Amazon AWS, RGenoud, Parallel Computing
From: lui.r.proj...@googlemail.com
To: marchy...@hotmail.com
CC: r-help@r-project.org

> Hello Mike, [[elided Hotmail spam]]
> [...] Did anybody use rGenoud in cluster mode and experience something similar? Are there quicksort packages that use multiple processors efficiently? (I didn't find any... :-( )

I'm no expert, but these don't seem to be terribly subtle problems in most cases. Sure, if the task is not suited to parallelism and you force it to be parallel so that it spends all its time syncing up, that can be a problem. Just creating more tasks to fight over the bottleneck (memory, CPU, locks) can easily make things worse. I think I posted a link earlier to an IEEE blurb showing how easy it is for many cores to make things worse on non-contrived benchmarks.

> I am by no means an expert on parallel processing, but is it possible that the benefits of parallelizing a process greatly diminish if a large set of variables/functions needs to be made available while the actual function (in this case sorting a few hundred entries) is quite short and is called a very large number of times? It was quite striking that the first run usually took several hours (instead of half an hour) and the subsequent runs were much, much faster. There is so much happening behind the scenes that it is a little hard for me to tell what might help and what will not.
> Help appreciated :-) Thank you, Lui
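The "more cores can make things worse" point is easy to see first-hand with GNU sort, which has a built-in `--parallel` option (coreutils 8.6 and later). On an input as small as the one in this thread, extra threads buy nothing; any speedup only appears once the input is large enough to amortize the coordination cost. This is a sketch, not a benchmark:

```shell
# A few hundred lines, shuffled -- roughly the size of one rGenoud
# population sort from the thread.
seq 100 | shuf > small.txt

# Same sort, one thread vs. eight (GNU coreutils >= 8.6):
time sort -n --parallel=1 small.txt > /dev/null
time sort -n --parallel=8 small.txt > /dev/null

# Expect the two (tiny) times to be close: eight workers fighting
# over one small buffer cannot help, and the extra merge/lock
# overhead can make the parallel run slower. No specific timings
# are claimed; they depend on the machine.
```

If even a purpose-built parallel sort shows no gain at this input size, it is a fair sign that parallelizing the sort inside the genetic-algorithm loop is attacking the wrong bottleneck.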