Hi,

> On 09.01.2019 at 01:14, Derrick Lin <klin...@gmail.com> wrote:
> 
> Hi guys,
> 
> I just brought up a new SGE cluster, but somehow the qrsh session does not 
> work:
> 
> tester@login-gpu:~$ qrsh
> ^Cerror: error while waiting for builtin IJS connection: "got select timeout"
> 
> After I hit Enter, the session just hangs forever instead of bringing me
> to a compute node. I have to press Ctrl+C to terminate it, which gives the
> above error.
> 
> I noticed that SGE did send my qrsh request to a compute node, as I could tell
> from qstat:
> 
> ---------------------------------------------------------------------------------
> short.q@zeta-4-15.local        BIP   0/1/80         0.01     lx-amd64
>      15 0.55500 QRLOGIN    tester       r    01/09/2019 10:47:13     1
> 
> We have a prolog script configured globally; it handles the local disk
> quota and writes all of its output to a per-job log file. So I went to that
> compute node and checked: a log file had been created, but it was empty.
>
> So my thinking so far is that my qrsh hangs because the prolog script does
> not run to completion.

Is there any statement in the prolog that could wait for stdin? In a batch
job there is simply no stdin, so such a statement would just continue. This
could be tested by passing "-i" to a batch job.
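
To illustrate the idea, here is a minimal sketch. It is not taken from your
prolog: `prolog_step` is a hypothetical stand-in for any prolog command that
reads stdin. With stdin at end-of-file (redirected from /dev/null here) the
read returns at once; with an open but silent stream, which may be what the
prolog sees under qrsh, the same read would wait indefinitely.

```shell
#!/bin/sh
# Hypothetical stand-in for a prolog command that reads from stdin.
prolog_step() {
    # `read` returns non-zero immediately at end-of-file,
    # so it cannot block when stdin is /dev/null.
    if read -r answer; then
        echo "got input: $answer"
    else
        echo "no stdin, continuing"
    fi
}

# Batch-like case: stdin is empty, so the step falls through at once.
prolog_step < /dev/null
```

If this turns out to be the cause, redirecting stdin from /dev/null for the
offending command in the prolog should be one way to keep it from blocking.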

-- Reuti


> qsub jobs are working fine.
> 
> Any ideas would be appreciated.
> 
> Cheers,
> Derrick
> _______________________________________________
> users mailing list
> users@gridengine.org
> https://gridengine.org/mailman/listinfo/users


