Hi,

We have added the following qmaster_params to the SGE configuration:

qmaster_params gdi_timeout=240 gdi_retries=-1 cl_ping=true

Could you let me know the difference between gdi_timeout and gdi_retries? Why is there a separate gdi_retries parameter? Why can't gdi_timeout alone be used to retry indefinitely, for example by allowing -1 as a value for gdi_timeout? I don't understand the specific purpose of the extra gdi_retries parameter.

I ask because when we have NFS latency issues we receive the error "failed receiving gdi request", yet the job is still submitted, which is causing confusion.

Regards,
Sudha

-----Original Message-----
From: William Hay [mailto:w....@ucl.ac.uk]
Sent: Wednesday, June 22, 2016 1:31 PM
To: Sudha Padmini Penmetsa (BAS) <sudha.penme...@wipro.com>
Cc: users@gridengine.org; Jeevan Behara Patnaik (GIS) <jeevan.patn...@wipro.com>
Subject: Re: [gridengine users] Error message- failed receving gdi request when calling qsub, but job is started

On Tue, Jun 21, 2016 at 04:12:35PM +0000, sudha.penme...@wipro.com wrote:
> Hi,
>
> Since this morning, sometimes users are facing an issue in grid while
> submitting qsub jobs.
>
> When submitting the job, it displays error message: "Unable to run job:
> failed receiving gdi request. Exiting"
>
> But the job runs successfully when it is seen later with qstat.
>

That sounds like some sort of connectivity problem to me. The job is successfully submitted but the acknowledgement doesn't make it back to the client. I'd try poking at things with qping and checking the network config on the qmaster and submit hosts.

William
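For illustration only, here is a toy Python model of the situation William describes: the job is accepted, but the acknowledgement arrives late. It assumes that gdi_timeout bounds a single wait for the reply, and that gdi_retries is the number of additional waits before the client gives up, with -1 meaning wait forever. That reading should be checked against sge_conf(5). The function `submit` and its `reply_delay` argument are hypothetical names for this sketch and are not part of SGE.

```python
def submit(reply_delay, gdi_timeout, gdi_retries):
    """Toy model of a client waiting for the qmaster's acknowledgement.

    reply_delay  -- seconds until the acknowledgement reaches the client
                    (hypothetical input for this sketch)
    gdi_timeout  -- assumed: seconds one receive attempt waits
    gdi_retries  -- assumed: extra receive attempts after the first;
                    -1 means keep waiting indefinitely
    """
    # In this model the qmaster has already accepted the job;
    # only the reply to the client is slow.
    job_queued = True
    attempt = 0
    while True:
        attempt += 1
        # The reply has arrived if it fits inside the total time waited so far.
        if reply_delay <= attempt * gdi_timeout:
            return {"job_queued": job_queued, "attempts": attempt, "error": None}
        # Give up once the first attempt plus all retries are used up.
        if gdi_retries != -1 and attempt > gdi_retries:
            return {
                "job_queued": job_queued,
                "attempts": attempt,
                "error": "failed receiving gdi request",
            }


if __name__ == "__main__":
    # Acknowledgement takes 300 s, each wait is 240 s.
    print(submit(300, 240, 0))   # no retries: error, but the job is queued
    print(submit(300, 240, -1))  # unlimited retries: succeeds on the second wait
```

Under these assumptions the two parameters answer different questions: the timeout says how long one wait may last before the client notices a stall, and the retry count says how many such stalls it tolerates. In the first call the client reports an error even though the job is queued, which matches the symptom seen with qsub and qstat.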
_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users