On Wed, May 08, 2013 at 02:02:04PM -0700, Brian McNally wrote:
Thanks Reuti, you're awesome!

I thought the halftime just dictated how long it took for usage to decay to half its original value. It seems to me that this is not the same as how long the scheduler keeps usage information for jobs. Although at some point, say after 4-5 halflife cycles, the decayed usage is very small and doesn't have much of an impact.

I seem to recall hearing that "5 halflives" is how long radioactive
stuff has to decay before it's "safe."  Don't quote me on that though.
:)

Poking around a bit in the sgeee.c file (from SoGE version 8.1.1, which
is what I have handy ATM), it looks like, even though halftime is
specified in hours, the calculations are actually done in minutes.

It also looks like a real exponential decay is used, instead of a linear
decrease (as in some of the load calculations).  I think that the actual
decay rates come from the following (sge_support.c):

/*--------------------------------------------------------------------
 * calculate_decay_constant - calculates decay rate and constant based
 * on the decay half life and usage interval. The halftime argument
 * is in minutes.
 *--------------------------------------------------------------------*/

void
calculate_decay_constant( double halftime,
                          double *decay_rate,
                          double *decay_constant )
{
   if (halftime < 0) {
      *decay_rate = 1.0;
      *decay_constant = 0;
   } else if (halftime == 0) {
      *decay_rate = 0;
      *decay_constant = 1.0;
   } else {
      *decay_rate = - log(0.5) / (halftime * 60);
      *decay_constant = 1 - (*decay_rate * sge_usage_interval);
   }
   return;
}
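
To see what that constant does in practice, here is a minimal sketch that repeats the arithmetic of the halftime > 0 branch and applies the result once per usage interval. The names decay_constant_for, decayed_usage and usage_interval are hypothetical, the 60-second interval is an assumption standing in for sge_usage_interval, and the assumption that the scheduler multiplies usage by the constant once per interval is mine, not something confirmed from the source.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical stand-in for sge_usage_interval; assumed to be in seconds. */
static const double usage_interval = 60.0;

/* Same arithmetic as the halftime > 0 branch quoted above.
 * halftime_minutes is the half-life in minutes. */
double decay_constant_for(double halftime_minutes)
{
    assert(halftime_minutes > 0.0);
    double decay_rate = -log(0.5) / (halftime_minutes * 60.0);
    return 1.0 - decay_rate * usage_interval;
}

/* Usage left after `minutes`, assuming the constant is applied
 * multiplicatively once per usage interval. */
double decayed_usage(double usage, double halftime_minutes, double minutes)
{
    double c = decay_constant_for(halftime_minutes);
    long steps = (long)(minutes * 60.0 / usage_interval);
    for (long i = 0; i < steps; i++)
        usage *= c;
    return usage;
}
```

With a halftime of 168 hours (10080 minutes), decayed_usage(100.0, 10080.0, 10080.0) comes out very close to 50, since (1 - ln 2 / N)^N approaches 0.5 for large N.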

This is especially interesting since it implies that negative halftimes
are acceptable.  Sure enough, setting a negative value zeros out
historical usage:
https://blogs.oracle.com/sgrell/entry/a_couple_lines_on_halftime


So yes, you'd need to keep your accounting files around for some number
of halftimes.  At 5 halflives, you're at 1/32nd of the original
weighting, or about 3%.




--
Brian McNally

On 05/08/2013 01:46 PM, Reuti wrote:
Hi,

Am 08.05.2013 um 22:30 schrieb Brian McNally:

qacct reports usage from a file, but GE has its own internal database for 
tracking jobs and usage.

You mean for the share tree policy? Yes.


Is this correct? If so, what controls the length of time GE keeps job data for?

The "halftime" setting in the scheduler configuration (`man sched_conf`).


It seems that using qacct to display overall usage per user (-o), for example, 
might be a little misleading if the actual accounting information is stored 
internally. Users might draw conclusions about their usage and how that'll 
impact their job priorities based on potentially incorrect data.

Unfortunately this is correct. You can even remove the accounting file or rotate it, which 
might lead to different output again. It would be hard to mimic the internal computation. 
Maybe setting "report_pjob_tickets" to true could give them a hint at which 
position their jobs are in the pending list (usually it's switched off for performance 
reasons).

-- Reuti



Thanks,

--
Brian McNally
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users


--
Jesse Becker
NHGRI Linux support (Digicon Contractor)