Dear list members,

I have the problem that jobs submitted with the parallel environment smp
stay in the queue in state 'qw' and never run.

First I defined the smp parallel environment:
~>  qconf -sp smp
pe_name            smp
slots              999999
user_lists         NONE
xuser_lists        NONE
start_proc_args    NONE
stop_proc_args     NONE
allocation_rule    $round_robin
control_slaves     FALSE
job_is_first_task  FALSE
urgency_slots      min
accounting_summary FALSE

Then I created a simple job script:
~> cat teste.ttt
#!/bin/bash
#
#$ -cwd
#$ -S /bin/bash
#$ -o out.txt
#$ -e err.txt
#$ -pe smp 8
/bin/date > /tmp/ddd

Then I submitted the job:
qsub teste.ttt

and it stays in the qw state forever.

There is nothing about it in the messages files, neither on the qmaster
nor on the nodes.

When I delete the -pe line from the file:
~> cat teste.ttt
#!/bin/bash
#
#$ -cwd
#$ -S /bin/bash
#$ -o out.txt
#$ -e err.txt
/bin/date > /tmp/ddd


The job runs on all nodes.
How can I find out what keeps it in the qw state forever? What is it
waiting for?
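
One possible check, sketched here under assumptions rather than taken from
this setup: a job requesting a PE can only be scheduled into a queue whose
pe_list names that PE. The helper below (pe_in_queue is a hypothetical name,
not a Grid Engine command) reads `qconf -sq <queue>` output on stdin and
reports whether the PE is listed. It only handles a plain, single-line
pe_list, not host-group overrides or continued lines. Separately,
`qstat -j <job_id>` may print the scheduler's reasons for leaving a job
pending, provided schedd_job_info is enabled in the scheduler configuration.

```shell
# Hypothetical helper: succeed (exit 0) if the PE given as $1 appears in the
# pe_list line of the `qconf -sq <queue>` output read from stdin.
# Assumes a simple space-separated pe_list on a single line.
pe_in_queue() {
    awk -v pe="$1" '
        $1 == "pe_list" {
            for (i = 2; i <= NF; i++)
                if ($i == pe) found = 1
        }
        END { exit !found }
    '
}

# Example use on a cluster (the queue name all.q is an assumption):
#   qconf -sq all.q | pe_in_queue smp || echo "smp is not in pe_list of all.q"
```

If the PE turns out to be missing, it would normally be added to the queue's
pe_list with `qconf -mq <queue>`.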

With kind regards, ulrich