On Thu, Jan 17, 2013 at 12:58 PM, Nanditha Rao <[email protected]> wrote:
> 1. I need to run multiple jobs on a multicore (and multithreaded) machine. I
> am using the GNU Parallel utility to distribute jobs across the cores to
> speed up the task. The commands to be executed are available in a file
> called 'commands'. I use the following command to run the GNU Parallel.
>
> cat commands | parallel -j +0
>
> As per the guidance at this location- gnu parallel, this command is supposed
> to use all the cores to run this task. My machine has 2 cores and 2 threads
> per core.

I take it that you have a CPU with hyperthreading.

> The system monitor however shows 4 CPUs (CPU1 and CPU2 belong to
> core1, CPU3 and CPU4 belong to core2). Each job (simulation) takes about 20
> seconds to run on a single core. I ran 2 jobs in parallel using this GNU
> parallel utility with the command above. I observe in the system monitor

What system monitor are you using?

> that, if the 2 jobs are assigned to cpu1 and cpu2 (that is the same core),
> there is obviously no speed-up.

Why obviously? Normally I measure a speedup of 30-70% when using
hyperthreading.

> They take about 40seconds to finish, which
> is about the time they would take if run sequentially. However, sometimes
> the tool distributes the 2 jobs to CPU1 and CPU3 or CPU4 (which means, 2
> jobs are assigned to 2 different cores). In this case, both jobs finish
> parallely in 20 seconds.

GNU Parallel does not do the distributing; it simply spawns jobs. The
distribution is done by your operating system.

> Now, I want to know if there is a way in which I can force the tool to run
> on different "cores" and not on different "threads" on the same core, so
> that there is appreciable speed-up. Any help is appreciated. Thanks!

If you are using GNU/Linux you can use taskset, which sets a mask of the
CPUs a task may be scheduled on. To use every other CPU the mask is
1010 (binary) = 0xA. For a 128 core machine you could run:

  cat commands | taskset 0xaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa parallel -j +0

> 2. Also, I want to know if there is a way to run this utility over a cluster
> of machines.. say, there are four 12-core machines in a cluster (making it a
> 48-core cluster).

  cat commands | parallel -j +0 -S server1,server2,server3,server4

Please read
http://www.gnu.org/software/parallel/man.html#example__using_remote_computers
or watch the intro videos:
https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1

/Ole
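
The every-other-CPU mask passed to taskset can be computed instead of typed
by hand. What follows is only a minimal sketch in POSIX shell. It assumes
hyperthread siblings are numbered adjacently, as in the CPU1/CPU2 and
CPU3/CPU4 layout described in the question. every_other_mask is a
hypothetical helper name, not part of taskset or GNU Parallel. Because shell
arithmetic is usually 64-bit, the sketch is only meant for machines with
fewer than 64 logical CPUs; a wider mask such as the 128-bit one has to be
written out by hand.

```shell
# Hypothetical helper: print a hex affinity mask that selects every other
# logical CPU (bits 1, 3, 5, ...), matching 1010 (binary) = 0xa for 4 CPUs.
# Assumption: sibling hyperthreads are numbered adjacently (0,1 / 2,3 / ...).
every_other_mask() {
    n=$1        # number of logical CPUs, assumed to be below 64
    mask=0
    i=1
    while [ "$i" -lt "$n" ]; do
        mask=$(( mask | (1 << i) ))   # set bit i
        i=$(( i + 2 ))                # skip the sibling thread
    done
    printf '0x%x\n' "$mask"
}

every_other_mask 4    # prints 0xa

# Possible use with the command from this mail:
#   cat commands | taskset "$(every_other_mask 4)" parallel -j +0
```

On GNU/Linux the actual sibling layout can be checked in
/sys/devices/system/cpu/cpu0/topology/thread_siblings_list. If the siblings
are not adjacent there, the mask has to be built differently.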
