Am 16.03.2011 um 13:18 schrieb Erik Soyez:

> Well, that's probably true, "exclusive" resources are not the best choice.
> But the concept could still work if you defined that resource as an ordinary
> "consumable", e.g.:
>
> Complex definition:
> ------------------------------------------------------------------------
> exclusive    excl    INT    <=    YES    YES    0    0
> ------------------------------------------------------------------------
>
> Exec host definition (each host):
> ------------------------------------------------------------------------
> complex_values exclusive=1
> ------------------------------------------------------------------------
>
> sge request (e.g. Nx6-CPU jobs):
> ------------------------------------------------------------------------
> -soft -l exclusive=0.1665
> ------------------------------------------------------------------------
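For context, the 0.1665 value appears to be chosen per slot so that a 6-slot chunk consumes almost the whole per-host capacity of 1; this interpretation is an assumption, as the thread does not spell it out:

```shell
# Assumed arithmetic behind the suggestion: each host offers exclusive=1
# and every slot of a job soft-requests 0.1665 of it.
awk 'BEGIN {
    one_chunk = 6 * 0.1665      # 0.999 -- a single 6-slot chunk just fits
    two_chunks = 2 * one_chunk  # 1.998 -- a second chunk would not
    printf "one chunk:  %.3f (fits: %s)\n", one_chunk,  (one_chunk  <= 1 ? "yes" : "no")
    printf "two chunks: %.3f (fits: %s)\n", two_chunks, (two_chunks <= 1 ? "yes" : "no")
}'
```

Under this reading, a *hard* request for two such chunks on one host would be rejected (1.998 > 1), which is exactly why the request has to be soft.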
Exactly this soft consumable is the main problem:

Unable to run job: denied: soft requests on consumables like "exclusive" are not supported.

There was a discussion on the former mailing list about how to change this behavior.

-- Reuti


> Erik Soyez.
>
>
> On Wed, 16 Mar 2011, Reuti wrote:
>
>> Am 16.03.2011 um 12:28 schrieb Erik Soyez:
>>
>>> Good day Alex,
>>>
>>> you could try implementing an "exclusive" resource and requesting it with
>>> "-soft", e.g. "-soft -l exclusive" in the sge_request file as a default.
>>
>> Won't this block the nodes completely? As soon as one job is occupying
>> 6 slots, the second job can't start, as the "-soft -l exclusive" can't be
>> revoked again in the future once the soft request was granted. I think
>> this is the main reason why soft consumables are denied: the intended
>> behavior is not really clear (this could be changed so that granted soft
>> requests are handled as hard requests later on).
>>
>>
>>> I have never tried this combination, but have a look at "man complex";
>>> it's just an idea.... Erik Soyez.
>>>
>>>
>>> On Wed, 16 Mar 2011, Alex Phillips wrote:
>>>
>>>> Dear List,
>>>>
>>>> We have a cluster of 1920 cores spread over 160 nodes (12 cores/node). We
>>>> only run one code in one queue, with jobs of between 48 and 256 cores
>>>> using an MPI PE.
>>>>
>>>> When benchmarking our code we found a 14-15% speedup by running on 6
>>>> cores/node compared with 12 cores/node. We also found that if we ran on
>>>> 6 cores/node, with a second job on the other 6 cores/node, we still had
>>>> a 5-6% speedup.
>>>>
>>>> So I have configured our MPI PE with allocation_rule = 6, and this works;
>>>> however, as the cluster fills up, the scheduler starts a second job on
>>>> some nodes before all the nodes are busy. How can we configure the
>>>> scheduler to run one job on all the nodes before starting a second job?
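Erik's consumable could be set up along these lines; this is only a hedged sketch using standard qconf subcommands (-sc/-Mc, -sel, -mattr), untested here, and as Reuti points out the scheduler rejects the -soft request on a consumable anyway:

```shell
# Sketch only: register the "exclusive" consumable and give each exec host
# a capacity of 1. Column layout follows complex(5):
#   name shortcut type relop requestable consumable default urgency
qconf -sc > complex_list.txt
echo "exclusive excl INT <= YES YES 0 0" >> complex_list.txt
qconf -Mc complex_list.txt

# Attach the capacity to every execution host.
for host in $(qconf -sel); do
    qconf -mattr exechost complex_values exclusive=1 "$host"
done

# Cluster-wide default in $SGE_ROOT/$SGE_CELL/common/sge_request:
#   -soft -l exclusive=0.1665
```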
>>>> I have tried defining the number of slots as a complex value on the
>>>> execution hosts; I've tried -np_load_avg, np_load_avg, slots, and -slots
>>>> as the load_formula, but I can't get it to work.
>>>>
>>>> I've read
>>>> http://blogs.sun.com/sgrell/entry/grid_engine_scheduler_hacks_least but
>>>> I can't set the allocation rule to $pe_slots, as we only want to run on
>>>> 6 cores/node, not 12.
>>>>
>>>> Any suggestions?

_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
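For the original question, the approach in the referenced blog post (least-used-host-first scheduling) is usually expressed through the scheduler configuration. A hedged sketch, with parameter names taken from sched_conf(5); whether this fully solves the allocation_rule=6 case here is untested:

```shell
# Sketch: make the scheduler prefer hosts with the fewest used slots, so
# idle nodes fill up before a second job lands on a half-busy one.
#
# Edit the scheduler configuration (opens $EDITOR):
#   qconf -msconf
# and set:
#   queue_sort_method    load
#   load_formula         slots
#
# Verify the current settings:
qconf -ssconf | grep -E 'queue_sort_method|load_formula'
```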
