Dear All,

I am trying to use GNU parallel on a cluster with several nodes, where each 
node has up to 12 cores.

Here, each line of "file_name" contains the two parameters that are fed to my 
script "program.sh", which needs two parameters to run.

The file "login_server_name" contains the address of each of the nodes in the 
cluster.

Example of what my "file_name" contains:
1     1
1     2
1     3
..........
1   12

Example of what my "login_server_name" file contains. For a single node, it 
will contain entries like these:

node_1_server_address
node_1_server_address

-----
node_1_server_address


or equivalently 

12/node_1_server_address
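
As a minimal sketch, the one-line form could be written to the login file as 
follows. The file name "login_server_name" and the placeholder 
"node_1_server_address" are the ones used above, not real host names; the 
"12/" prefix is the number of job slots requested on that host.

```shell
# Write a one-line sshlogin file in the form "<jobslots>/<host>".
# node_1_server_address is a placeholder, not a real host name.
login_server_name=${login_server_name:-login_server_name}
printf '%s/%s\n' 12 node_1_server_address > "$login_server_name"

# Show what was written.
cat "$login_server_name"
```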

Here is my understanding of how to use GNU parallel on a single node:

cat $file_name | parallel -u -j 12 --sshloginfile $login_server_name \
    --colsep '  ' program.sh {1} {2}
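
One way I could check the column splitting locally, before involving ssh, is a 
dry run that only prints the commands instead of executing them. This is only 
a sketch under some assumptions: "params.txt" is a made-up file name, the 
three sample lines stand in for my real parameter file, and the separator is 
the regular expression ' +' (one or more spaces) rather than the two-space 
separator in my command, because the columns in my file are separated by a 
varying number of spaces.

```shell
# Small sample parameter file; params.txt is an assumed name.
file_name=${file_name:-params.txt}
for j in 1 2 3; do
    printf '1     %d\n' "$j"
done > "$file_name"

# Only run the check if the installed "parallel" is GNU parallel.
if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    # --dry-run prints each command instead of running it;
    # -k keeps the output in input order;
    # --colsep takes a regex, here "one or more spaces".
    parallel --dry-run -k --colsep ' +' program.sh {1} {2} < "$file_name"
else
    echo "GNU parallel not found; skipping dry run" >&2
fi
```

If the printed commands show both parameters in the right places, the same 
--colsep value can be carried over to the command with --sshloginfile.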


My idea is for the above command to distribute my jobs across the cores of a 
single node, one job per core. If my implementation is correct, this is what I 
expect GNU parallel to do:

At core 1 of node 1:

"program.sh 1    1"  ought to run


At core 2 of node 1:

"program.sh  1     2"  ought to run

(and so on)

At core 12 of node 1:

"program.sh  1     12"  ought to run



Please confirm whether this is correct.

All the best,

Yacob

PS. I assume the above command will also work in the same way for several nodes.
