Dear list members,

I have the problem that jobs submitted with the parallel environment smp stay in the queue in state 'qw' and never run.
First I defined the parallel environment smp:

    ~> qconf -sp smp
    pe_name            smp
    slots              999999
    user_lists         NONE
    xuser_lists        NONE
    start_proc_args    NONE
    stop_proc_args     NONE
    allocation_rule    $round_robin
    control_slaves     FALSE
    job_is_first_task  FALSE
    urgency_slots      min
    accounting_summary FALSE

Then I created a simple job script:

    ~> cat teste.ttt
    #!/bin/bash
    #
    #$ -cwd
    #$ -S /bin/bash
    #$ -o out.txt
    #$ -e err.txt
    #$ -pe smp 8
    /bin/date > /tmp/ddd

Then I submitted the job:

    qsub teste.ttt

It stays in the qw state forever. There is nothing about it in the messages files, neither on the qmaster nor on the nodes.

When I delete the -pe line from the file:

    ~> cat teste.ttt
    #!/bin/bash
    #
    #$ -cwd
    #$ -S /bin/bash
    #$ -o out.txt
    #$ -e err.txt
    /bin/date > /tmp/ddd

the job runs on all nodes.

How can I find out what keeps the job in the qw state forever? What is it waiting for?

With kind regards,
ulrich
