Our scratch filesystem is organized as:
/scratch/$PROJECT/$USER
$PROJECT identifies the group that funds the use of the system, and we want to
track each project's usage over time.
In the past I used:
rbh-report -i --top-users -P/scratch/$PROJECT
That gives us the information we want. We would like to run it on a regular
basis (once a week) to build a time series of per-user, per-project scratch
usage, both by file count and by size. RBH provides all of this data.
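A rough sketch of the weekly run I have in mind is below. It assumes every directory directly under /scratch is a project, and the names SCRATCH_ROOT, OUT_DIR, RBH_REPORT and run_weekly_report are placeholders I made up for illustration. The rbh-report flags are the same ones I used in the past.

```shell
#!/bin/sh
# Sketch only: one dated report file per project, for building a time series.
# SCRATCH_ROOT, OUT_DIR and RBH_REPORT are hypothetical names; adjust to taste.
SCRATCH_ROOT="${SCRATCH_ROOT:-/scratch}"
OUT_DIR="${OUT_DIR:-/var/tmp/rbh-weekly}"
RBH_REPORT="${RBH_REPORT:-rbh-report}"

run_weekly_report() {
    stamp=$(date +%Y-%m-%d)
    mkdir -p "$OUT_DIR" || return 1
    for project_dir in "$SCRATCH_ROOT"/*/; do
        # Skip the unexpanded glob when there are no project directories.
        [ -d "$project_dir" ] || continue
        project=$(basename "$project_dir")
        # -P restricts the report to this project's subtree.
        "$RBH_REPORT" -i --top-users -P "$SCRATCH_ROOT/$project" \
            > "$OUT_DIR/$project.$stamp.txt"
    done
}
```

Calling run_weekly_report from a weekly cron job would leave one file per project per week, which is enough to plot usage over time. It only helps if each rbh-report call finishes in reasonable time, which is the problem described next.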
I am running a git clone of the current RBH tree. The scan went fast, but the
invocation of:
rbh-report -i -P/scratch/$PROJECT
is actually taking more time than the entire filesystem scan :-(
The scan took 17 hours, while rbh-report -i -P /scratch/aero_flux started on
Jan 14th and is still not done.
Is there a way I can 'train' RBH to compute these totals as it scans, much like
the summary for the entire filesystem? Or is there a problem with what I am
doing? The database appears to be CPU bound, and disk I/O is close to nothing:
21559 | robinhood | localhost | robinhood_scratch | Query | 168744 |
statistics | SELECT parent_id, name INTO pid, n from NAMES where id=
NAME_CONST('pid',_latin1'DAB1E06F:45411CC5'
Any thoughts on this?
Brock Palen
www.umich.edu/~brockp
CAEN Advanced Computing
XSEDE Campus Champion
[email protected]
(734)936-1985
_______________________________________________ robinhood-support mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/robinhood-support
