Hello, I have an interactive Perl script that prompts the user to answer a few questions and then, based on the responses, runs a binary (each time with different arguments) thousands of times. This is what I've tried so far.
salloc -n total_cores_across_nodes_1-20 -w "node[1-20]"
Inside the script:
foreach my $i (1 .. $num_runs) {
    chdir "output_dir_$i" or die "chdir failed: $!";
    # '&' so the step runs in the background and the loop can continue
    system("srun -J job$i -o %J.out executable ../input_file other_arguments &");
    chdir ".." or die "chdir failed: $!";
}
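In shell form, the loop amounts to something like the sketch below. It is a dry run: it only builds and prints the srun commands rather than executing them, and `NUM_RUNS`, `executable`, and the argument names are placeholders standing in for the real values.

```shell
# Dry-run sketch of the loop inside the Perl script: print one srun
# job-step command per output directory instead of running it.
NUM_RUNS=3                      # placeholder for the real run count
cmds=""
for i in $(seq 1 "$NUM_RUNS"); do
  # one job step per iteration, named job$i, stdout to <jobid>.out
  cmd="srun -J job$i -o %J.out executable ../input_file other_arguments"
  cmds="$cmds$cmd
"
  echo "$cmd"
done
```

Each printed line corresponds to one job step launched inside the existing salloc allocation.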
Is this work flow reasonable/recommended?
Instead of supplying -n with the total core count across nodes 1-20, is there
a way to specify that each job step should run on a single core? Each node
has a different number of cores.
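One possible way to pin each step to a single core is to request one task on one node per step and mark the step exclusive, so concurrent steps don't share cores (on recent Slurm releases the step-level flag is --exact rather than --exclusive). The sketch below is again a dry run with placeholder names, printing what such a loop would submit:

```shell
# Hedged sketch: single-core job steps, backgrounded so they run
# concurrently; a trailing 'wait' would block until all steps finish.
# -n1 -N1 = one task on one node; --exclusive keeps steps off shared
# cores (assumed semantics; --exact on newer Slurm versions).
steps=""
for i in 1 2 3; do             # placeholder range
  s="srun -n1 -N1 --exclusive -J job$i -o %J.out executable ../input_file &"
  steps="$steps$s
"
  echo "$s"
done
echo "wait"
```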
squeue -s seems to get stuck after showing roughly the first 15 job steps,
and from then on it appears to update only a single entry in the list as new
steps start.
So, for example I see something like:
STEPID    NAME     PARTITION  USER  TIME   NODELIST
1180.185  js_66    all        jrm   0:06   awarnach[1-15]
1180.2    js_1-2   all        jrm   41:09  awarnach[1-15]
1180.3    js_1-3   all        jrm   41:09  awarnach[1-15]
...
1180.19   js_1-19  all        jrm   41:05  awarnach[1-15]
1180.1    js_1-1   all        jrm   41:09  awarnach[1-15]
or it simply shows a single entry even when many job steps are running.
Am I misunderstanding something?
Thanks,
Joseph
