Hi,

> On 09.01.2019 at 01:14, Derrick Lin <klin...@gmail.com> wrote:
>
> Hi guys,
>
> I just brought up a new SGE cluster, but somehow the qrsh session does not
> work:
>
> tester@login-gpu:~$ qrsh
> ^Cerror: error while waiting for builtin IJS connection: "got select timeout"
>
> After I hit Enter, the session just hangs there forever instead of bringing me
> to a compute node. I have to press Ctrl+C to terminate it, and it gives the
> above error.
>
> I noticed that SGE did send my qrsh request to a compute node, as I could tell
> from qstat:
>
> ---------------------------------------------------------------------------------
> short.q@zeta-4-15.local        BIP   0/1/80         0.01     lx-amd64
>      15 0.55500 QRLOGIN    tester       r     01/09/2019 10:47:13     1
>
> We have a prolog script configured globally. The script deals with local disk
> quota and writes all output to a log file for each job. So I went to that
> compute node to check, and found that a log file was created but it was empty.
>
> So my thinking so far is that my qrsh is stuck because the prolog script is not
> fully executed.
Is there any statement in the prolog that could be waiting for stdin? In a batch job there is simply no stdin, so the prolog would just continue there. You could test this by submitting a batch job with "-i".

-- Reuti

> qsub jobs are working fine.
>
> Any ideas will be appreciated.
>
> Cheers,
> Derrick
> _______________________________________________
> users mailing list
> users@gridengine.org
> https://gridengine.org/mailman/listinfo/users
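
To illustrate the stdin point: below is a minimal sketch, assuming the prolog is a POSIX shell script. The function name prolog_step is hypothetical and only stands in for whatever command in the real prolog might read from stdin. With stdin redirected from /dev/null, the read sees end-of-file immediately and returns instead of waiting.

```shell
#!/bin/sh
# Hypothetical stand-in for a prolog command that reads from stdin.
prolog_step() {
    # 'read' blocks until a line or end-of-file arrives on stdin.
    read -r answer
    echo "answer=${answer:-none}"
}

# Redirecting stdin from /dev/null gives 'read' an immediate EOF,
# so the step cannot hang waiting for input.
result=$(prolog_step </dev/null)
echo "$result"

# Assumption: to apply the same guard to a whole prolog, a line such as
#   exec </dev/null
# near the top of the script would redirect stdin for everything after it.
```

If the prolog then runs to completion under qrsh with such a redirect in place, that would point to a stdin read as the cause. It is only a diagnostic sketch, not a statement about what the actual prolog does.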