Brian, your name sounded familiar - now I know why ;-)
Of course you have access to UniSight. UniSight is part of the Univa Grid Engine product bundle. The fair-share reporting & analytics functionality I was referring to is, however, relatively recent, and you probably do not have it yet in the version you have installed.

Let's take this off-line and check on that. And please get in touch with our support with any questions you might have.

Cheers,

Fritz

Sent from my iPhone

On 09.05.2013 at 23:01, Brian McNally <[email protected]> wrote:

> Fritz,
>
> We're actually a UGE customer, although the particular cluster I was looking
> at hasn't yet been migrated to UGE. I'm not sure if we have a
> license/support for UniSight though.
>
> --
> Brian McNally
>
> On 05/09/2013 12:32 AM, Fritz Ferstl wrote:
>> Jesse is right that it is "real" half-life as in radioactive decay
>> formulas. So after each half-life interval the impact of a recorded
>> amount of resource consumption attributed to a job (and thus to a
>> user/project leaf node in the share tree) will have been cut in half.
>> If you feel that the share tree policy "forgets" too quickly then simply
>> increase the half-life. There's also the compensation factor and usage
>> scaling factors you can play with to adjust the policy to how you want
>> it to behave.
>>
>> While the policy's inherent algorithm is totally deterministic it is
>> definitely challenging to try following what's going on. You'd do this
>> for debugging reasons but otherwise it is about as helpful as
>> calculating the pertinent laws of physics when steering a car around a
>> corner. There are way too many variables which are constantly changing. So
>> my advice would be the same as for driving a car: try to point in the
>> right direction and make adjustments as you see fit.
>>
>> As for relating the accounting info to what's happening in the
>> share-tree policy: that's next to impossible ...
>> which is why we've chosen to augment the UniSight accounting/reporting
>> technology in our proprietary Univa Grid Engine version to include
>> reporting on share tree history. Sorry for the commercial note here but
>> it's the only solution I can point you to in this regard.
>>
>> Cheers,
>>
>> Fritz
>>
>> Jesse Becker wrote:
>>> On Wed, May 08, 2013 at 02:02:04PM -0700, Brian McNally wrote:
>>>> Thanks Reuti, you're awesome!
>>>>
>>>> I thought the halftime just dictated the length of time usage took
>>>> before it was half its original value. It seems to me that that is
>>>> not the same as how long the scheduler keeps usage information for
>>>> jobs. Although, at some point, say 4-5 halflife cycles, the decayed
>>>> usage is very small and doesn't have much of an impact.
>>>
>>> I seem to recall hearing that "5 halflives" is how long radioactive
>>> stuff has to decay before it's "safe." Don't quote me on that though.
>>> :)
>>>
>>> Poking around a bit in the sgeee.c file (from SoGE version 8.1.1, which
>>> is what I have handy ATM), it looks like, even though halftime is
>>> specified in hours, the calculations are actually done in minutes.
>>>
>>> It also looks like a real exponential decay is used, instead of a linear
>>> decrease (as in some of the load calculations). I think that the actual
>>> decay rates come from the following (sge_support.c):
>>>
>>> /*--------------------------------------------------------------------
>>>  * calculate_decay_constant - calculates decay rate and constant based
>>>  * on the decay half life and usage interval. The halftime argument
>>>  * is in minutes.
>>>  *--------------------------------------------------------------------*/
>>>
>>> void
>>> calculate_decay_constant( double halftime,
>>>                           double *decay_rate,
>>>                           double *decay_constant )
>>> {
>>>    if (halftime < 0) {
>>>       *decay_rate = 1.0;
>>>       *decay_constant = 0;
>>>    } else if (halftime == 0) {
>>>       *decay_rate = 0;
>>>       *decay_constant = 1.0;
>>>    } else {
>>>       *decay_rate = - log(0.5) / (halftime * 60);
>>>       *decay_constant = 1 - (*decay_rate * sge_usage_interval);
>>>    }
>>>    return;
>>> }
>>>
>>> This is especially interesting since it implies that negative halftimes
>>> are acceptable. Sure enough, setting a negative value zeros out
>>> historical usage:
>>> https://blogs.oracle.com/sgrell/entry/a_couple_lines_on_halftime
>>>
>>> So yes, you'd need to keep your accounting files around for some number
>>> of halftimes. At 5 halflives, you're at 1/32nd of the original
>>> weighting, or about 3%.
>>>
>>>> --
>>>> Brian McNally
>>>>
>>>> On 05/08/2013 01:46 PM, Reuti wrote:
>>>>> Hi,
>>>>>
>>>>> On 08.05.2013 at 22:30, Brian McNally wrote:
>>>>>
>>>>>> qacct reports usage from a file, but GE has its own internal
>>>>>> database for tracking jobs and usage.
>>>>>
>>>>> You mean for the share tree policy? Yes.
>>>>>
>>>>>> Is this correct? If so, what controls the length of time GE keeps
>>>>>> job data for?
>>>>>
>>>>> The "halftime" setting in the scheduler configuration (`man
>>>>> sched_conf`).
>>>>>
>>>>>> It seems that using qacct to display overall usage per user (-o),
>>>>>> for example, might be a little misleading if the actual accounting
>>>>>> information is stored internally. Users might draw conclusions
>>>>>> about their usage and how that'll impact their job priorities based
>>>>>> on potentially incorrect data.
>>>>>
>>>>> Unfortunately this is correct. You can even remove the accounting
>>>>> file or rotate it, which might lead to different output yet again. It
>>>>> would be hard to mimic the internal computation.
>>>>> Maybe setting "report_pjob_tickets" to true could give them a hint
>>>>> at which position their jobs are in the pending list (usually it's
>>>>> switched off for performance reasons).
>>>>>
>>>>> -- Reuti
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> --
>>>>>> Brian McNally
>>>>>> _______________________________________________
>>>>>> users mailing list
>>>>>> [email protected]
>>>>>> https://gridengine.org/mailman/listinfo/users
>>
>> --
>>
>> Fritz Ferstl | CTO and Business Development, EMEA
>> Univa Corporation <http://www.univa.com/> | The Data Center Optimization
>> Company
>> E-Mail: [email protected] | Phone: +49.9471.200.195 | Mobile:
>> +49.170.819.7390
>>
>> Where Grid Engine lives
