Hi,

> On 21.08.2017 at 09:18, John_Tai <john_...@smics.com> wrote:
>
> I changed gid_range, it used to be just 20000. Now it's 20000-20200

Unless you have more than 201 cores per exechost, this is fine.

> However now when I submit a job the host goes in error state. I checked the
> messages log:
>
> 08/21/2017 15:06:55| main|BJSMICDS126|E|shepherd of job 89.1 exited with exit status = 7
> 08/21/2017 15:06:55| main|BJSMICDS126|E|can't open pid file "active_jobs/89.1/pid" for job 89.1
>
> There must be another config problem.

Can the exechosts write to the location of the spool directory? Often it's better to have at least the nodes writing to a local place. This can even be done after installation: shut down the exechosts, change the setting of the spool directory to a local place on the exechosts (`qconf -mconf`), and create these directories, e.g. /var/spool/sge (the exechost-specific directory will be created when sge_execd starts up).

https://arc.liv.ac.uk/SGE/howto/nfsreduce.html

-- Reuti

> Any ideas?
>
> -----Original Message-----
> From: Reuti [mailto:re...@staff.uni-marburg.de]
> Sent: Friday, August 18, 2017 4:37
> To: John_Tai
> Cc: users@gridengine.org
> Subject: Re: [gridengine users] error reason 1: can not find an unused add_grp_id
>
> Hi,
>
>> On 18.08.2017 at 02:30, John_Tai <john_...@smics.com> wrote:
>>
>> When I submit more than 1 job to a queue, the job is queued even though
>> there are free slots available. When I check this waiting job status with
>> qstat -j I find this error message:
>>
>> error reason 1: can not find an unused add_grp_id
>>
>> What does it mean?
>
> Each job in SGE gets an additional group ID attached, which enables SGE to
> track the consumed resources.
>
> What is your setting of:
>
> $ qconf -sconf
> #global:
> …
> gid_range                    20000-20100
>
> Is this range in your case lower than the number of installed cores per
> exechost? As there might be a delay when old group IDs are released again, it
> would help to have some more IDs than the real number of cores (or threads,
> in case you use them).
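
To make the arithmetic behind the gid_range question concrete, here is a minimal sketch that counts how many group IDs a given gid_range value provides. The function name gid_range_size is made up for this illustration, and the handling of comma-separated lists is an assumption based on my reading of sge_conf(5); the thread itself only shows a single ID and a single range.

```shell
#!/bin/sh
# Hypothetical helper (not part of SGE): count the group IDs in a
# gid_range value. Accepts a single ID ("20000"), a range
# ("20000-20200"), or a comma-separated list of those (assumed syntax).
gid_range_size() {
    total=0
    old_ifs=$IFS
    IFS=,
    for part in $1; do
        case $part in
            *-*) lo=${part%-*}; hi=${part#*-} ;;   # "m-n"
            *)   lo=$part; hi=$part ;;             # "m" means "m-m"
        esac
        total=$((total + hi - lo + 1))
    done
    IFS=$old_ifs
    echo "$total"
}

gid_range_size 20000          # old setting: prints 1
gid_range_size 20000-20100    # prints 101
gid_range_size 20000-20200    # new setting: prints 201
```

With the original setting of just 20000 there is exactly one ID, which matches the symptom that a second job could not start. The result would be compared against the core (or thread) count of the largest exechost, leaving some headroom for IDs that have not been released yet.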
>
> -- Reuti
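
For the question of whether the exechosts can write to the spool directory, a quick check along the following lines may help. This is only a sketch: the function name check_spool_writable is made up here, it tests only the directory you pass in, and it has to be run on the exechost under whichever account writes the spool files on your installation.

```shell
#!/bin/sh
# Hypothetical helper (not part of SGE): report whether the current
# user can create files in the given directory.
check_spool_writable() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi
    # Really create a file instead of trusting the permission bits:
    # on NFS the server may refuse a write that looks allowed locally
    # (for example with root squashing).
    if probe=$(mktemp "$dir/.write_test.XXXXXX" 2>/dev/null); then
        rm -f "$probe"
        echo "writable: $dir"
        return 0
    fi
    echo "not writable: $dir"
    return 1
}

# Example call; /var/spool/sge is the local path suggested in the reply.
check_spool_writable /var/spool/sge || true
```

If this reports "missing" or "not writable" for the configured spool location, that would be consistent with the "can't open pid file" message in the log, although other causes are possible.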