Our Scratch filesystem is organized in the format:

/scratch/$PROJECT/$USER

$PROJECT is the group that actually funds use of the system, and we want to 
track each project's usage over time. 

In the past I used:

rbh-report -i --top-users -P/scratch/$PROJECT

to get the information we want. We would like to run this on a regular basis 
(once a week) to build a time series of per-user, per-project scratch usage, 
by both file count and size. RBH already provides all of this data.
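
A weekly collection along these lines could be sketched in Python as follows. 
This is only an illustration: the output directory and the helper names are 
hypothetical, and the rbh-report flags are just the ones quoted above, passed 
through unchanged. It does nothing about how long each report takes to run.

```python
#!/usr/bin/env python3
"""Sketch: save one rbh-report snapshot per project per run (e.g. weekly from cron)."""
import datetime
import pathlib
import subprocess

SCRATCH_ROOT = pathlib.Path("/scratch")           # layout: /scratch/$PROJECT/$USER
OUTPUT_ROOT = pathlib.Path("/var/tmp/rbh-usage")  # hypothetical output location


def report_command(project):
    """Build the rbh-report call using only the flags quoted in this message."""
    return ["rbh-report", "-i", "--top-users", "-P", str(SCRATCH_ROOT / project)]


def snapshot_path(project, day):
    """One dated text file per project, so the files form a time series."""
    return OUTPUT_ROOT / project / f"{day.isoformat()}.txt"


def collect(projects, day=None):
    """Run the report for each project and store its raw output."""
    day = day or datetime.date.today()
    for project in projects:
        out = snapshot_path(project, day)
        out.parent.mkdir(parents=True, exist_ok=True)
        with out.open("w") as fh:
            subprocess.run(report_command(project), stdout=fh, check=True)


if __name__ == "__main__":
    # Treat every directory directly under /scratch as a project.
    collect(sorted(p.name for p in SCRATCH_ROOT.iterdir() if p.is_dir()))
```

The raw report text is stored as-is rather than parsed, since the exact output 
format is not shown here.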

I am running a build from a git clone of the current RBH tree. The scan went 
fast, but the invocation of:

rbh-report -i -P/scratch/$PROJECT

is actually taking more time than the entire filesystem scan :-(

The scan took 17 hours, while rbh-report -i -P /scratch/aero_flux, started 
Jan 14th, is still not done. 

Is there a way I can 'train' RBH to compute these totals as it scans, much 
like the summary for the entire filesystem? Or is there a problem with what I 
am doing? The database appears to be CPU bound, and disk I/O is close to nothing:

21559 | robinhood | localhost | robinhood_scratch | Query   | 168744 | 
statistics   | SELECT parent_id, name INTO pid, n from NAMES where id= 
NAME_CONST('pid',_latin1'DAB1E06F:45411CC5'  

Any thoughts on this?

Brock Palen
www.umich.edu/~brockp
CAEN Advanced Computing
XSEDE Campus Champion
[email protected]
(734)936-1985

_______________________________________________
robinhood-support mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/robinhood-support