On 11.02.2014 at 23:20, Stephen Spencer wrote:

> The definition of "qconf -sconf" is as you expected: all "builtin."
>
> Could you please be specific as to the commands you'd like me to try from the
> next line?
>
> Any output when you use the "-q ..." for `qrsh` too? In addition, you can try
> "-w v" and "-w p" too.

I meant:

$ qrsh -q all.q@n20 hostname

(queue@host, did you swap them?)

-- Reuti

>
> I tried "qrsh -w v" and "qrsh -w p" and both returned "verification: found
> suitable queue(s)".
> "qrsh -q all.q" gave me a shell, surprisingly, on one of the troublesome
> nodes. (Actually, it was three for three.)
> All nodes have "BIP" for "qtype" - no limitations there.
>
> Best,
> Stephen
>
>
> On Tue, Feb 11, 2014 at 1:57 PM, Reuti <[email protected]> wrote:
> Hi,
>
> On 11.02.2014 at 22:37, Stephen Spencer wrote:
>
> > I have a sixty-node cluster running SGE 6.2u5 (RHEL 6.5).
> >
> > The immediate issue is that a user has jobs in the "qw" state, and there
> > are idle nodes in the cluster which appear to be able to accept the jobs.
> >
> > What works and doesn't work?
> >    • "qsub -q [email protected] job.sh" works - the job runs on "n20"
> >    • Repeated invocations of "qrsh hostname" will not, however, result
> >      in the job running on one of the troublesome hosts.
>
> What is the definition of:
>
> $ qconf -sconf
> ...
> qlogin_command               builtin
> qlogin_daemon                builtin
> rlogin_command               builtin
> rlogin_daemon                builtin
> rsh_command                  builtin
> rsh_daemon                   builtin
>
> Any output when you use the "-q ..." for `qrsh` too? In addition, you can try
> "-w v" and "-w p" too.
>
>
> > Things I've tried, and know, so far:
> >    • I've restarted the troublesome nodes - no change.
> >    • "sge_execd" is running on the troublesome nodes.
> >    • The troublesome nodes are in the execution host list and the submit
> >      host list.
> >    • Most of the rest of the cluster is pretty busy.
> >    • Interestingly, the troublesome nodes don't show up in the
> >      "scheduling info" list produced as part of the "qstat -j <jobid>"
> >      command's output.
> > Short of restarting the entire cluster, I'm at a loss as to what to look at
> > next.
>
> Is "qtype INTERACTIVE" limited to certain nodes/queues?
>
> -- Reuti
>
> > --
> > Stephen Spencer
> > [email protected]
> > _______________________________________________
> > users mailing list
> > [email protected]
> > https://gridengine.org/mailman/listinfo/users
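
For reference, the checks suggested in this thread can be collected in one place. The following is only a sketch: the function name `sge_diag_cmds` and its three arguments (queue, host, job id) are made up for illustration, and combining "-w v" / "-w p" with "-q" in a single call is an assumption (in the thread they were run separately). It prints the commands rather than running them, so they can be reviewed first.

```shell
#!/bin/sh
# Print (do not run) the diagnostic commands discussed in this thread.
# sge_diag_cmds is a hypothetical helper; queue, host and jobid are
# placeholders supplied by the caller.
sge_diag_cmds() {
    queue=$1
    host=$2
    jobid=$3
    # Interactive job pinned to one queue instance (order is queue@host).
    printf 'qrsh -q %s@%s hostname\n' "$queue" "$host"
    # Validation runs; adding -q here is an assumption, not from the thread.
    printf 'qrsh -w v -q %s@%s hostname\n' "$queue" "$host"
    printf 'qrsh -w p -q %s@%s hostname\n' "$queue" "$host"
    # "scheduling info" for the pending job.
    printf 'qstat -j %s\n' "$jobid"
    # Queue configuration; the qtype line should list INTERACTIVE.
    printf 'qconf -sq %s\n' "$queue"
}
```

For example, `sge_diag_cmds all.q n20 <jobid>` lists the five commands for the "n20" instance of "all.q"; once they look right, they can be pasted into a shell on a submit host.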
