Trey,

In http://slurm.schedmd.com/fair_tree.html#fairshare, take a look at the definition for "S". Basically, normalized shares only matter between sibling associations, and the siblings' values sum to 1.0. If an association has no siblings, its value is 1.0. If each of four siblings in an account has the same Raw Shares value (as defined in sacctmgr), the normalized shares value for each is 0.25. This is because the Level Fairshare calculations are done only within an account, comparing siblings to each other. Note that Norm Usage is still presented in sshare but is not used in the calculations.
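
As a rough illustration of the sibling-only normalization, here is a minimal Python sketch. It is not Slurm code: the function name normalized_shares is made up for this example, and it assumes each association's value is simply its Raw Shares divided by the sum of Raw Shares over itself and its siblings. The inputs are the Raw Shares of the children of "science" from listing [2].

```python
def normalized_shares(raw_shares):
    """Return each sibling's raw shares as a fraction of the siblings' total.

    Hypothetical helper for illustration only; assumes at least one sibling
    has a positive Raw Shares value.
    """
    total = sum(raw_shares.values())
    return {name: shares / total for name, shares in raw_shares.items()}


# Raw Shares of the children of "science", copied from listing [2].
science = {
    "acad": 10,
    "chem": 10,
    "iamcs": 10,
    "math-dept": 20,
    "physics": 700,
    "stat": 10,
}

for name, share in normalized_shares(science).items():
    print(f"{name:<10} {share:.6f}")

# An association with no siblings always normalizes to 1.0.
print(normalized_shares({"hepx": 1}))
```

Under that assumption, physics comes out as 700/760 and acad as 10/760, which agrees with the 0.921053 and 0.013158 shown in listing [2], and an only child such as hepx gets 1.0 regardless of its Raw Shares value.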

The sshare manpage has a section about the Fair Tree modifications to existing columns: http://slurm.schedmd.com/sshare.html#SECTION_FAIR_TREE%20MODIFICATIONS

Ryan

On 06/03/2015 02:47 PM, Trey Dockendorf wrote:
FAIR_TREE in SLURM 14.11
My site is currently on 14.03.10 and we are evaluating and testing 14.11.7 as well as moving from PriorityFlags=DEPTH_OBLIVIOUS,SMALL_RELATIVE_TO_TIME to using PriorityFlags=FAIR_TREE,SMALL_RELATIVE_TO_TIME.

Our account hierarchy is very deep and is intended to represent the org structure of departments and research organizations that are using our cluster [1]. We were able to make the normalized share ratio match up so all non-stakeholders were equal (0.000323) and all stakeholders had the correct ratio based on their contributions to the cluster. The Shares value assigned represents CPUs funded. All the CPUs no longer belonging to stakeholders were given to the "mgmt" group so that the Shares given to the top level (tamu) had a meaningful value when divided up amongst all the accounts.

While testing FAIR_TREE I noticed the normalized shares were drastically different [2]. In particular, the current stakeholders (idhmc and hepx) both ended up with 1.0. I'm guessing this is due to having no sibling accounts.

The docs for FAIR_TREE only describe the formula used to calculate the Level FairShare. Does the method for calculating normalized shares change for FAIR_TREE? Is the hierarchy we are using not a good fit for FAIR_TREE? The description and benefits of FAIR_TREE appeal to our use case, so modifying our hierarchy is within the realm of things I'm willing to change.

Any advice on migrating into FAIR_TREE is more than welcome. Right now I've been running "sleep" jobs under different UIDs to simulate usage to try and work out how we may need to adjust things for a migration to FAIR_TREE.

I used the attached spreadsheet to work out the share values we are using with 14.03.10.

Thanks,
- Trey

[1]:
Account              User       Raw Shares Norm Shares   Raw Usage Effectv Usage  FairShare
-------------------- ---------- ---------- ----------- ----------- ------------- ----------
root                                          1.000000   114089982      1.000000   0.870551
 root                root                1    0.000323           0      0.000000   1.000000
 grid                                    1    0.000323        3688      0.000032   0.986174
  cms                                   10    0.000269        3688      0.000027   0.986155
  suragrid                               1    0.000027           0      0.000000   1.000000
 tamu                                 3096    0.999354   114086294      0.999968   0.870477
  agriculture                           20    0.006671        2697      0.000024   0.999507
   aglife                                1    0.003336        2697      0.000012   0.999507
   genomics                              1    0.003336           0      0.000000   1.000000
  engineering                           10    0.003336           0      0.000000   1.000000
   pete                                  1    0.003336           0      0.000000   1.000000
  general                               10    0.003336        5542      0.000049   0.997977
  geo                                   10    0.003336           2      0.000000   0.999999
   atmo                                  1    0.003336           2      0.000000   0.999999
  liberalarts                          128    0.042696           0      0.000000   1.000000
   idhmc                                 1    0.042696           0      0.000000   1.000000
  mgmt                                2058    0.686472          16      0.000000   1.000000
  science                              760    0.253508   114078034      0.999895   0.578806
   acad                                 10    0.003336           0      0.000000   1.000000
   chem                                 10    0.003336           0      0.000000   1.000000
   iamcs                                10    0.003336     3506549      0.030649   0.279777
   math-dept                            20    0.006671    11735411      0.102422   0.119035
    math                                10    0.003336    11735411      0.102422   0.014169
    secant                              10    0.003336           0      0.000000   1.000000
   physics                             700    0.233494    98836073      0.919795   0.579205
    hepx                               700    0.233494    98836073      0.919795   0.579205
   stat                                 10    0.003336           0      0.000000   1.000000
    carroll                             10    0.003336           0      0.000000   1.000000

[2]:
Account              User       Raw Shares Norm Shares   Raw Usage Effectv Usage  FairShare
-------------------- ---------- ---------- ----------- ----------- ------------- ----------
root                                          0.000000       53229      1.000000
 root                root                1    0.000323           0      0.000000   1.000000
 grid                                    1    0.000323           0      0.000000
  cms                                   10    0.909091           0      0.000000
  suragrid                               1    0.090909           0      0.000000
 tamu                                 3096    0.999354       53229      1.000000
  agriculture                           20    0.006676           0      0.000000
   aglife                                1    0.500000           0      0.000000
   genomics                              1    0.500000           0      0.000000
  engineering                           10    0.003338           0      0.000000
   pete                                  1    1.000000           0      0.000000
  general                               10    0.003338        6326      0.118860
  geo                                   10    0.003338           0      0.000000
   atmo                                  1    1.000000           0      0.000000
  liberalarts                          128    0.042724       13122      0.246522
   idhmc                                 1    1.000000       13122      1.000000
  mgmt                                2058    0.686916       20984      0.394237
  science                              760    0.253672       12795      0.240382
   acad                                 10    0.013158           0      0.000000
   chem                                 10    0.013158           0      0.000000
   iamcs                                10    0.013158           0      0.000000
   math-dept                            20    0.026316           0      0.000000
    math                                10    0.500000           0      0.000000
    secant                              10    0.500000           0      0.000000
   physics                             700    0.921053       12795      1.000000
    hepx                                 1    1.000000       12795      1.000000
   stat                                 10    0.013158           0      0.000000
    carroll                              1    1.000000           0      0.000000

=============================

Trey Dockendorf
Systems Analyst I
Texas A&M University
Academy for Advanced Telecommunications and Learning Technologies
Phone: (979)458-2396
Email: treyd...@tamu.edu
Jabber: treyd...@tamu.edu

--
Ryan Cox
Operations Director
Fulton Supercomputing Lab
Brigham Young University
