Dear Ole Tange,
Thanks for this great tool and for helping me out on this.
We computational chemists widely run long, large-scale calculations on
biological data, and GNU Parallel is very helpful in achieving this.
As per your suggestion, I've introduced a simple function and tested it on
my local machine, which has 4 CPUs, with a list containing 10,000 entries:
top shows 4 jobs running when using --pipepart --block -10k (option 2), but
only 1-2 running in the others (options 1 & 3).
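(For what it's worth, I assume the number of running jobs can also be
counted directly instead of eyeballed in top; pgrep -f matches the full
command line and -c prints the count:

watch -n 1 'pgrep -fc run_script.sh'
)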
Here are the completion times with and without the new function in the script:
*1. Introducing the new function (example_10k.lst contains only 10,000 entries)*
job_script.sh
#!/bin/bash
dowork() {
    export WDIR=/shared/TF_data/work_dir
    cd "$WDIR" || exit 1
    # the inner parallel reads the list entries from stdin
    parallel --wd "$WDIR" sh run_script.sh {}
}
export -f dowork
cat example_10k.lst | dowork
Completes the job in
real 4m4.599s
user 2m29.206s
sys 0m30.267s
*2. Introducing the new function, with --pipepart --block -10k instead of cat in
job_script.sh*
parallel -a example_10k.lst --pipepart --block -10 dowork
Completes the job in
real 4m44.067s
user 2m58.884s
sys 0m46.153s
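If I understand the nesting correctly, option 2 runs an outer parallel that
starts one dowork per block, and each dowork starts its own inner parallel
reading that block from stdin (if I read the man page right, a negative
--block such as -10 means roughly that many blocks per jobslot). A sketch
with the job counts pinned explicitly, where the -j values are only my
guesses for a 4-CPU box:

dowork() {
    # inner parallel: run the entries of one block, up to 4 at a time
    parallel -j4 --wd /shared/TF_data/work_dir sh run_script.sh {}
}
export -f dowork
# outer parallel: feed one block of the list at a time into dowork
parallel -j1 -a example_10k.lst --pipepart --block -10 dowork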
*3. With no new function*
#!/bin/bash
export WDIR=/shared/TF_data/work_dir
cd "$WDIR" || exit 1
cat example_10k.lst | parallel --wd "$WDIR" sh run_script.sh {}
Completes the job in
real 4m5.139s
user 2m28.339s
sys 0m30.723s
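(As an aside, if I read the man page right, the cat pipeline in options 1
and 3 should be equivalent to letting parallel read the list itself with -a,
assuming WDIR is exported as in the script above:

parallel --wd "$WDIR" -a example_10k.lst sh run_script.sh {}
)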
Since I tested on a 4-CPU local machine, I wonder whether testing on more
CPUs (72 x 5 nodes) or the full list (200k entries) would give better CPU
utilization?
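For the multi-node test, I assume something like --sshloginfile would spread
the jobs over the nodes, provided /shared is mounted everywhere (nodes.txt
and example_200k.lst are just placeholder names here):

# nodes.txt: one host per line, e.g. "72/node1" to run up to 72 jobs on node1
parallel --sshloginfile nodes.txt --wd /shared/TF_data/work_dir \
    -a example_200k.lst sh run_script.sh {}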
Any suggestions are much appreciated.
Best,
Rajiv