Hi,

> On 26.01.2016 at 19:30, Dan Hyatt <dhy...@dsgmail.wustl.edu> wrote:
>
> I am looking to use this differently.
> The problem I am having is that I have users with 200-1000 jobs. I have 80
> servers with almost 1000 cores.
> For my normal queue, I want SGE PE to create up to 4 jobs per server until
> it runs out of servers, then add up to 4 more until all the jobs are
> allocated. (1 per server is fine, as long as it will round-robin and start
> adding a second job per server, then a third, until it runs out of jobs.)
>
> Does the allocation rule limit the number of jobs per server PER qsub, or
> the total jobs allowed per server?
Not per se. But a fixed allocation rule of 16 on a machine with 16 cores has this effect, of course: only one job can land there. Likewise two jobs with a fixed allocation rule of 8 on such a machine.

> The problem I am having is that I get 20 jobs per server and overload a
> couple of servers

Why? Do the jobs request a PE with the proper number of cores? Is the job (i.e. the final application) able to honor the granted list of machines where it should start its slaves?

> while 80 servers running idle. Each has 10 cores and 128 GB of RAM so they
> can handle up to 20 light jobs each.

What do you mean by "light" jobs? If you overload a machine, it may double the execution time per (serial) job. In my opinion, overloading a machine (and having an alarm_threshold > 1) was/is meant for the case where a parallel job is badly parallelized and would often leave cores idle. It could even be adjusted so that the parallel job gets a priority (i.e. nice value) of 0 in the queue definition, while the serial jobs which should use the otherwise idling cores get only a priority of 19.

> Also, for the heavy CPU jobs, I want a max of 4 jobs per server, so for
> pe_slots would I just put the integer 4 in there?

No, an "allocation_rule 4" means that each job gets exactly 4 cores on such a machine. Note that it will only start jobs whose slot request is divisible by 4, i.e. a job requesting 13 slots would never run if it requests this particular PE.

Unfortunately there is no default complex which could be limited to 4 per machine by an RQS, but you can set up a consumable complex with a default value of 1 and the attribute consumable set to "JOB". This can then be attached and limited on an exechost level to 4 (this works unless a user fouls the system and requests 0 for this complex, but a JSV could handle that). The complex could also be attached with an arbitrarily high value on the cluster level, and an RQS could limit it on certain machines. (A sketch of this setup follows after the quoted questions below.)

-- Reuti

> Should I create a third PE, let's say "dan", with the desired settings? When I
> tried this before it would throw errors.
>
> Am I correct that I want to change these settings? I suspect I really
> want to make a custom PE, as these are the defaults.
>
> I was looking at http://linux.die.net/man/5/sge_pe and
> http://www.softpanorama.org/HPC/Grid_engine/parallel_environment.shtml but
> they seem to assume I comprehend the details of each. Such as: can I only
> put one allocation rule setting per PE, and one PE per queue?
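The setup sketched above could look like this (untested sketch; the complex name "jobs_ph", the host "node01" and the hostgroup "@blades" are only placeholders):

    # 1. Add a consumable complex via `qconf -mc`, i.e. append a row like:
    #
    #   #name     shortcut  type  relop  requestable  consumable  default  urgency
    #   jobs_ph   jph       INT   <=     YES          JOB         1        0

    # 2. Cap it per exec host (repeat per host, or script over a host list):
    qconf -aattr exechost complex_values jobs_ph=4 node01

    # 2b. ...or attach an arbitrarily high value on the "global" host and
    #     limit certain machines by an RQS (`qconf -arqs`), e.g.:
    #
    #   {
    #      name         max_jobs_per_host
    #      description  "At most 4 concurrent jobs per host in @blades"
    #      enabled      TRUE
    #      limit        hosts {@blades} to jobs_ph=4
    #   }

Because the complex is consumable per "JOB" (not per slot), each running job debits exactly one unit, so at most 4 jobs land on each host regardless of their slot counts.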
>
> [root@blade5-1-1 ~]# qconf -sp make
> pe_name            make
> slots              999
> user_lists         NONE
> xuser_lists        NONE
> start_proc_args    NONE
> stop_proc_args     NONE
> allocation_rule    $round_robin
> control_slaves     TRUE
> job_is_first_task  FALSE
> urgency_slots      min
> accounting_summary TRUE
> qsort_args         NONE
>
> [root@blade5-1-1 ~]# qconf -sp smp
> pe_name            smp
> slots              999
> user_lists         NONE
> xuser_lists        NONE
> start_proc_args    NONE
> stop_proc_args     NONE
> allocation_rule    $pe_slots
> control_slaves     TRUE
> job_is_first_task  TRUE
> urgency_slots      min
> accounting_summary TRUE
> qsort_args         NONE
> [root@blade5-1-1 ~]# echo $pe_slots
>
>
> [root@blade5-1-1 ~]# qconf -sp DAN
> pe_name            DAN
> slots              999
> user_lists         NONE
> xuser_lists        NONE
> start_proc_args    NONE
> stop_proc_args     NONE
> allocation_rule    $round_robin
> control_slaves     TRUE
> job_is_first_task  FALSE
> urgency_slots      min
> accounting_summary TRUE
> qsort_args         NONE
>
> [root@blade5-1-1 ~]# qconf -sp smp
> pe_name            smp
> slots              999
> user_lists         NONE
> xuser_lists        NONE
> start_proc_args    NONE
> stop_proc_args     NONE
> allocation_rule    4
> control_slaves     TRUE
> job_is_first_task  TRUE
> urgency_slots      min
> accounting_summary TRUE
> qsort_args         NONE
> [root@blade5-1-1 ~]# echo $pe_slots
>

>>>> Yep, we use functional tickets to accomplish this exact goal. Every user
>>>> gets 1000 functional tickets via auto_user_fshare in sge_conf(5), though
>>>> your exact number will depend on the number of tickets and weights you
>>>> have elsewhere in your policy configuration.
>>>
>>> Also, the waiting time should be set to 0 and the urgency given less
>>> importance (as the default is to grant 1000 urgency points per slot in the
>>> complex configuration - meaning more slots make a job more important):
>>>
>>> weight_user                       0.900000
>>> weight_project                    0.000000
>>> weight_department                 0.000000
>>> weight_job                        0.100000
>>> weight_tickets_functional         1000000
>>> weight_tickets_share              0
>>> share_override_tickets            TRUE
>>> share_functional_shares           TRUE
>>> max_functional_jobs_to_schedule   200
>>> report_pjob_tickets               TRUE
>>> max_pending_tasks_per_job         50
>>> halflife_decay_list               none
>>> policy_hierarchy                  F
>>> weight_ticket                     1.000000
>>> weight_waiting_time               0.000000
>>> weight_deadline                   3600000.000000
>>> weight_urgency                    0.100000
>>> weight_priority                   1.000000
>>> max_reservation                   32
>>> default_duration                  8760:00:00
>>
>> We actually do weight waiting time, but at half the value of both
>> functional and urgency tickets. We then give big urgency boosts to
>> difficult-to-schedule jobs (i.e. lots of memory or CPUs in one spot). It
>> took us a while to arrive at a decent mix of short-run / small jobs vs
>> long-run / big jobs, and it definitely will be a site-dependent decision.
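For reference, the knobs behind the ticket policy quoted above (untested sketch; the share value 1000 is taken from the quoted posts):

    # Show the current scheduler policy (sched_conf(5)):
    qconf -ssconf

    # Edit it, e.g. to set weight_waiting_time to 0.000000 and
    # weight_tickets_functional to 1000000 as quoted above:
    qconf -msconf

    # Have every new user created automatically with functional shares;
    # enforce_user and auto_user_fshare are sge_conf(5) parameters:
    qconf -mconf
    #   enforce_user       auto
    #   auto_user_fshare   1000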