I spent a bit of time looking at things to replace ARCO, which I found
more trouble than it's worth.  You *could* get it to work, but doing so
never paid off, especially since it's simple to parse the accounting
and reporting files yourself.
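
To give a sense of how little is involved, here is a minimal Python
sketch of parsing one accounting record.  It assumes the classic
colon-delimited layout described in accounting(5); only the first few
fields are named, since UGE and newer releases append extra ones.  The
function name and the "rest" key are made up for this illustration.

```python
# Leading fields of an SGE accounting line, per accounting(5).
# Assumption: later fields vary between SGE/UGE releases, so they are
# kept as an unnamed list instead of being given names here.
LEADING_FIELDS = (
    "qname", "hostname", "group", "owner", "job_name", "job_number",
)


def parse_accounting_line(line):
    """Return a dict for one record, or None for comments and blank lines."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    parts = line.split(":")
    if len(parts) < len(LEADING_FIELDS):
        return None  # truncated or malformed record
    record = dict(zip(LEADING_FIELDS, parts))
    record["rest"] = parts[len(LEADING_FIELDS):]
    return record
```

Everything comes back as strings; converting timestamps and usage
numbers is left to whatever consumes the records.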

I looked at a few different open source projects to do SGE reporting,
including S-GAE (which looked good).  I went with XDMoD largely because
it superseded UBMoD, the built-in reporting abilities are
decent, and it supports multiple schedulers.  I also seriously considered
writing my own, and put some work towards that (also an excuse to play
with Redis a bit).  But then we'd just have N+1 implementations[1].

I'd be very interested in hearing about S-GAE, since I didn't get around
to playing much with it.

One thing that we *have* learned is that you should keep all of the
raw records.  They compress well, and disk space is cheap.  Our UGE
logs shrink by about 85% using gzip -9, which is fast.  Other methods
(xz) get almost 90%, but take about 100 times longer to compress.
(The specific method doesn't matter; even LZO would do nicely.)
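
If you want to check the trade-off on your own logs, something like
the following Python sketch will do.  It uses the standard gzip and
lzma modules in place of the command-line tools, so the timings will
not match gzip(1) and xz(1) exactly; the function name is made up, and
the numbers you get depend entirely on your data.

```python
import gzip
import lzma
import time


def compare_compression(data):
    """Return {method: (fraction_saved, seconds)} for non-empty bytes."""
    methods = (
        ("gzip -9", lambda d: gzip.compress(d, compresslevel=9)),
        ("xz", lzma.compress),  # default preset, .xz container
    )
    results = {}
    for name, compress in methods:
        start = time.perf_counter()
        packed = compress(data)
        elapsed = time.perf_counter() - start
        results[name] = (1 - len(packed) / len(data), elapsed)
    return results
```

Feed it the bytes of one day's accounting file and compare the two
entries before deciding what to use for the archive.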

This is important, because you can "quickly" re-ingest all of your
historical records into a new system in case you:

   1) Change systems.
   2) Botch an ingest and have to start over.
   3) Have a catastrophic failure of {host,database,hardware} and have
      to recover.
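
As a rough sketch of what replaying the archive can look like, the
Python below walks a set of gzip-compressed files in order and yields
the raw record lines for whatever ingest step comes next.  It assumes
the archive file names sort chronologically (for example a date in
the name); the naming scheme and the function name are hypothetical.

```python
import glob
import gzip


def iter_archived_records(pattern):
    """Yield raw record lines from gzip archives matching a glob pattern."""
    # Assumption: sorting the file names gives chronological order.
    for path in sorted(glob.glob(pattern)):
        with gzip.open(path, "rt", encoding="utf-8", errors="replace") as fh:
            for line in fh:
                # Skip blank lines and '#' comment headers.
                if line.strip() and not line.startswith("#"):
                    yield line.rstrip("\n")
```

Because it is a generator, the full history never has to fit in
memory; the new system can consume records as fast as it can ingest.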

In the case of XDMoD, "backfilling" records requires a little trickery
based on how they are processed, but it's nothing too complicated.




[1] Obligatory XKCD:  http://xkcd.com/927/



On Mon, Mar 02, 2015 at 11:54:26AM -0500, Chris Dagdigian wrote:

ooh the various MoD ("metrics on demand") look pretty interesting. Would love to chat about how people have made XDMoD and other variants work with Grid Engine(s) -- can we get a little thread going on best practices and recommendations for 3rd party reporting/metrics tools? Suspect there would be decent interest in this ...

-Chris


Tina Friedrich <mailto:[email protected]>
March 2, 2015 at 11:37 AM
Yes, there's an additional field - job_class.

I'm not using S-GAE, so got nothing for you I'm afraid; I had a similar problem with UBMoD (which I'm still running), where I had to make (probably similar) changes to make it work (keep it working, rather).

Tina




_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users

--
Jesse Becker (Contractor)