Hi Chris,

On 08.03.2016 at 18:29, Christopher Black wrote:
> Thanks for the reply Reuti!
>
> Sounds like some of the suggestions are moving limits out of RQS and into
> complexes and consumable resources.

Yep.

> How do we make that happen without requiring users to add -l bits to
> their qsubs?

You can use a JSV (job submission verifier): each time a queue is requested,
the JSV requests a consumable complex for this type of queue in addition.
It could also overwrite any value the user specified for this consumable
complex (otherwise users could request zero of them - AFAIK only UGE also
introduced a lower limit for the consumption which could be requested). A
rough sketch of such a JSV is at the end of this message. Do your users need
the ability to specify more than one queue per submission?

> On 3/8/16, 7:32 AM, "Reuti" <re...@staff.uni-marburg.de> wrote:
>
>> I saw cases where an RQS blocks further scheduling and shows up in
>> `qstat -j` with a cryptic message. Although this was in 6.2u5, I don't
>> know whether there was any work in this area to fix it.
>>
>> Often you can spot in the scheduling output that an RQS was violated,
>> although the rule is in fact not violated. For me it kicked in when I
>> requested a complex with a load value in the submission command.
>>
>> cannot run because it exceeds limit "////node20/" in rule "general/slots"
>
> I've also seen cryptic qstat -j messages about slots not available that
> ended up requiring changing values in a pe, but that was settled a while
> ago.
>
> Within queue definitions we have:
> slots    1,[@16core=16],[@20core=20],[@28core=28]
>
> Those are per-node limits; unsure how to change this to total slots for a
> queue across all eligible nodes.

Not at all. These can stay as they are. Only the overall usage per queue
(which is in the RQS) would be rephrased. There is no need to change
anything on the queue-instance level.

> We have 20+ queues and use RQS, host groups and disabling/enabling queue
> instances to manage balancing nodes and load between queues.
>
> When I turn schedd_job_info back on and look at qstat -j, I sometimes (but
> not often) see those "exceeds limit" entries, but before that there are
> MANY entries like:
>
> queue instance "de...@pnode073.nygenome.org" dropped because it is
> temporarily not available
> queue instance "cus...@pnode077.nygenome.org" dropped because it is
> disabled
>
> And these are for queues other than the one specified in -q
> hard_queue_list. I am wondering if qmaster is giving up checking eligible
> matching queue instances after checking all of these disabled instances
> for other queues. Perhaps utilizing hostgroups in queue definitions would
> be more efficient than disabling queue instances.
>
>> AFAICS:
>
>>> Some config snippets showing non-default and potentially-relevant
>>> values, I can put full output to a pastebin if it is useful:
>>>
>>> qconf -srqs:
>>> {
>>>    name         slots_per_host
>>>    description  Limit slots per host
>>>    enabled      TRUE
>>>    limit        hosts {@16core} to slots=16
>>>    limit        hosts {@20core} to slots=20
>>>    limit        hosts {@28core} to slots=28
>>>    limit        hosts {!@physicalNodes} to slots=2
>>> }
>>
>> The above RQS could be put into individual complex_values per exechost.
>> Yes - the above is handy, I know.
>
> Is the idea to define a new complex (qconf -mc) and then set it per
> exechost (qconf -me), or re-use "slots" somehow?

It would mean one complex per type of queue: dev_slots, io_slots,
pipeline_slots.
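Roughly - untested, and the shortcuts below are only examples; the 250/300/4000
limits mirror the io/dev/pipeline RQS quoted further down. Check the exact
qconf column layout and options against your Grid Engine version:

# qconf -mc: one consumable per type of queue
#name            shortcut  type  relop  requestable  consumable  default  urgency
dev_slots        devs      INT   <=     YES          YES         0        0
io_slots         ios       INT   <=     YES          YES         0        0
pipeline_slots   pls       INT   <=     YES          YES         0        0

# overall capacity per queue goes onto the global host
# (use -aattr instead of -mattr if the entries don't exist there yet)
qconf -mattr exechost complex_values \
    dev_slots=250,io_slots=300,pipeline_slots=4000 global

# the per-host ceiling from the slots_per_host RQS becomes a one-time
# complex_values setting per exechost, e.g. for the @16core hosts:
for h in $(qconf -shgrp_resolved @16core); do
   qconf -mattr exechost complex_values slots=16 "$h"
done

As the complexes are consumable YES, a request of 1 is counted per slot, so a
parallel job debits the complex by its slot count - the same way the RQS
counts slots today.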
> Current qconf -mc |grep slot:
> slots    s    INT    <=    YES    YES    1    1000
>
> Putting slots per exechost wouldn't be too awful to script (qconf -mattr)
> but we'd need it to work without asking users to change qsub commands.
> Would this be more efficient than having it in the RQS stanza above?

Usually the number of slots across all queues per machine is fixed, so it
should be a one-time setting for the above-mentioned 16, 20, 28 and 2. Only
the values on the global host, which reflect the allowed usage per queue,
will change over time. This can also be done with `qconf -mattr exechost
complex_values limit_dev=99 global`.

>>> {
>>>    name         io
>>>    description  Limit max concurrent io.q slots
>>>    enabled      TRUE
>>>    limit        queues io.q to slots=300
>>> }
>>> {
>>>    name         dev
>>>    description  Limit max concurrent dev.q slots
>>>    enabled      TRUE
>>>    limit        queues dev.q to slots=250
>>> }
>>> {
>>>    name         pipeline
>>>    description  Limit max concurrent pipeline.q slots
>>>    enabled      TRUE
>>>    limit        queues pipeline.q to slots=4000
>>> }
>>> ...other queues...
>>
>> Here one could use a global complex for each type of queue, as long as
>> the users specify the particular queue. One loses the ability that a job
>> may potentially be scheduled to different types of queues, as long as its
>> resource requests are met.
>
> Sounds like we could switch from RQS for queue core limits to complex per
> queue, but I'm not sure how to make all slots for qsub -q foo.q
> automatically debit against a consumable resource like foo_slots. What
> would we need to do beyond defining the global complex, setting value and
> marking it consumable?

Yes.

--
Reuti

> This would be a big change to the way we currently limit load across
> queues and nodes but we are willing to try to get past our issues.
>
>> I can't predict whether this would improve anything in the situation you
>> face.
>
> Understood!
>
> Thanks!
> Chris
>
>>> qconf -sc|grep mem (note default mem per job is 8GB and this is
>>> consumable):
>>> h_vmem    mem    MEMORY    <=    YES    JOB    8G    0
>>>
>>> A typical exechost qconf -se:
>>> complex_values    h_vmem=240G,exclusive=true
>>>
>>> qconf -sconf:
>>> shell_start_mode             unix_behavior
>>> reporting_params             accounting=true reporting=false \
>>>                              flush_time=00:00:15 joblog=true sharelog=00:00:00
>>> finished_jobs                100
>>> gid_range                    20000-20100
>>> max_aj_instances             3000
>>> max_aj_tasks                 75000
>>> max_u_jobs                   0
>>> max_jobs                     0
>>> max_advance_reservations     50
>>>
>>> qconf -msconf:
>>> schedule_interval                 0:0:45
>>> maxujobs                          0
>>> queue_sort_method                 load
>>> schedd_job_info                   false   (this used to be true, as
>>>                                   qstat -j on a stuck job can be useful)
>>> params                            monitor=false
>>> max_functional_jobs_to_schedule   1000
>>> max_pending_tasks_per_job         50
>>> max_reservation                   0   (used to be 50 to allow large jobs
>>>                                   with -R y to have a better chance to run)
>>> default_duration                  4320:0:0
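To pick up the JSV idea from the top of this message: a minimal, untested
sketch of a server-side shell JSV that adds the matching consumable whenever
one of these queues is requested. It assumes the dev_slots/io_slots/
pipeline_slots consumables from the earlier sketch exist and uses the helper
functions shipped in $SGE_ROOT/util/resources/jsv/jsv_include.sh; the
parameter and function names (q_hard, jsv_sub_add_param, ...) are from the
JSV shell interface, so double-check them against your version:

#!/bin/sh
# minimal JSV sketch - untested
. ${SGE_ROOT}/util/resources/jsv/jsv_include.sh

jsv_on_start()
{
   # nothing needed at start, we only look at submission parameters
   return
}

jsv_on_verify()
{
   changed=0
   # hard queue request as given with "qsub -q ..." (may be empty; a job
   # requesting several queues would need more handling here)
   q=`jsv_get_param q_hard`

   case "$q" in
      *dev.q*)      jsv_sub_add_param l_hard dev_slots 1;      changed=1 ;;
      *io.q*)       jsv_sub_add_param l_hard io_slots 1;       changed=1 ;;
      *pipeline.q*) jsv_sub_add_param l_hard pipeline_slots 1; changed=1 ;;
   esac

   if [ $changed -eq 1 ]; then
      jsv_correct "consumable for the requested queue was added"
   else
      jsv_accept "no queue-specific consumable needed"
   fi
   return
}

jsv_main

Because the JSV sets the l_hard entry itself, it also overrides users who try
to request 0 of the consumable on their own. Attach it via jsv_url in the
cluster configuration (qconf -mconf) so it runs on the qmaster for every
submission.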
_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users