Hi,

We have added the following qmaster_params to the SGE configuration:

qmaster_params               gdi_timeout=240 gdi_retries=-1 cl_ping=true

Could you explain the difference between gdi_timeout and gdi_retries? Why is 
there a separate gdi_retries parameter? Why can't gdi_timeout alone be used to 
retry indefinitely, for example by allowing a value of -1 for gdi_timeout? I 
don't see the specific purpose of the extra gdi_retries parameter.

I ask because when we have NFS latency issues we receive the error "failed 
receiving gdi request", yet the job is still submitted, which is causing 
confusion.

Regards,
Sudha

-----Original Message-----
From: William Hay [mailto:w....@ucl.ac.uk]
Sent: Wednesday, June 22, 2016 1:31 PM
To: Sudha Padmini Penmetsa (BAS) <sudha.penme...@wipro.com>
Cc: users@gridengine.org; Jeevan Behara Patnaik (GIS) <jeevan.patn...@wipro.com>
Subject: Re: [gridengine users] Error message- failed receving gdi request when 
calling qsub, but job is started

On Tue, Jun 21, 2016 at 04:12:35PM +0000, sudha.penme...@wipro.com wrote:
>    Hi,
>
>    Since this morning, sometimes users are facing an issue in grid while
>    submitting qsub jobs.
>
>    When submitting the job, it displays error message: "Unable to run job:
>    failed receiving gdi request. Exiting"
>
>    But the job runs successfully when it is seen later with qstat.
>
That sounds like some sort of connectivity problem to me.  The job is 
successfully submitted but the acknowledgement doesn't make it back to the 
client.  I'd try poking at things with qping and checking the network config on 
the qmaster and submit hosts.
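
To make the failure mode above concrete, here is a toy Python model of a client 
waiting for an acknowledgement. It is not SGE code: wait_for_ack and submit are 
hypothetical names, and the semantics (gdi_timeout as the wait per receive 
attempt, gdi_retries as the number of extra attempts, -1 meaning retry forever) 
are my reading of sge_conf(5), so treat it as a sketch under those assumptions.

```python
import itertools


def wait_for_ack(ack_delay, gdi_timeout, gdi_retries):
    """Toy model of a client waiting for the master's acknowledgement.

    ack_delay   -- simulated seconds until the ack actually arrives
    gdi_timeout -- seconds each receive attempt waits (assumed semantics)
    gdi_retries -- extra attempts after the first; -1 means retry forever
    Returns (ok, attempts_made). No real sleeping is done.
    """
    if gdi_timeout <= 0:
        raise ValueError("gdi_timeout must be positive")
    if gdi_retries < 0:
        attempts = itertools.count(1)          # never gives up
    else:
        attempts = range(1, gdi_retries + 2)   # first try plus retries
    waited = 0
    for attempt in attempts:
        waited += gdi_timeout                  # one receive attempt times out or succeeds
        if ack_delay <= waited:
            return True, attempt
    return False, gdi_retries + 1


def submit(job_queue, job, ack_delay, gdi_timeout, gdi_retries):
    """Hypothetical submit: the master accepts the job, then the client waits."""
    job_queue.append(job)                      # job is accepted before the ack is sent
    ok, _ = wait_for_ack(ack_delay, gdi_timeout, gdi_retries)
    return "submitted" if ok else "failed receiving gdi request"


if __name__ == "__main__":
    queue = []
    # Ack delayed to 90 s (e.g. by slow storage), one 60 s attempt, no retries.
    print(submit(queue, "job1", ack_delay=90, gdi_timeout=60, gdi_retries=0))
    print(queue)  # the job is in the queue even though the client saw an error
    # Same delay with the settings from this thread.
    print(submit(queue, "job2", ack_delay=90, gdi_timeout=240, gdi_retries=-1))
```

In this model the error only means the client stopped waiting; the job was 
already accepted. A longer timeout or more retries changes how long the client 
waits, not whether the job was submitted.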

William
