Howdy All,
I am building a tool that uses RBH's database scheme to traverse our
scanned filesystem and report the users, along with all the folders
that contain over N amount of entries in them (1000 for example). In
addition, we will be reporting on old file usage based on mtime (we do
not support atime on our filesystem). These different reports will
then automatically email the user, us, etc. about it.
We have 75 million entires in total spread across a few RBH databases
(one database for each filesystem).
The part I am having trouble with, is _efficiently_ starting from the
lowest parent_id (6BCDE980:2), the root of the filesystem, and
traversing all the way to the lowest node on the graph. Effectlying
saying, given this parent_id of 6BCDE980:2, output all of its
children, and their children until no more children exists.
I believe part of the issue is perhaps my lack of understanding self
referencing databases, where id of one entry, is a parent_id of
another entry. I do not believe MySQL can handle this sort of data
structure. Is this correct? Have others done something similar,
traversing the RBH scheme from the root node to the bottom?
Below is the pseudocode that I have written which I is the brute force
/ naive solution for RBH 2.4.3 SCHEME. Any sort of comments will be
greatly appreciated.
function wrapper (string startPath) {
query := SELECT usr.id, usr.fullpath AS fullpath FROM ENTRIES AS
grp LEFT JOIN ENTRIES AS usr ON usr.parent_id = grp.id WHERE
grp.fullpath = 'startPath' AND usr.fullpath != 'startPath'
// ^^ Produces one level depth of children on the parent_id of startPath
foreach result in query {
find1000Entries(result)
output what is on the stack and email users
}
}
function find1000Entries (string startPath) {
query := SELECT COUNT(usr.fullpath) AS zotCount FROM ENTRIES AS
grp LEFT JOIN ENTRIES AS usr ON usr.parent_id = grp.id WHERE
grp.fullpath = 'startPath'
// ^^ Produces one level depth of children on the parent_id of startPath
if(zotCount from query >= 1000)
report a match and push to stack
queryNext := SELECT usr.id, usr.fullpath AS fullpath FROM ENTRIES
AS grp LEFT JOIN ENTRIES AS usr ON usr.parent_id = grp.id WHERE
grp.fullpath = 'startPath' AND usr.type = 'dir'
// ^^ Produces one level depth of children on the parent_id of startPath
// that are only directories
foreach result in queryNext {
find1000Entries(result)
^^^ Call this function again, but now on directory given by queryNext
^^^ keep doing this till no more entires exist
}
}
--
Adam Brenner
Computer Science, Undergraduate Student
Donald Bren School of Information and Computer Sciences
Research Computing Support
Office of Information Technology
http://www.oit.uci.edu/rcs/
University of California, Irvine
www.ics.uci.edu/~aebrenne/
[email protected]
------------------------------------------------------------------------------
Rapidly troubleshoot problems before they affect your business. Most IT
organizations don't have a clear picture of how application performance
affects their revenue. With AppDynamics, you get 100% visibility into your
Java,.NET, & PHP application. Start your 15-day FREE TRIAL of AppDynamics Pro!
http://pubads.g.doubleclick.net/gampad/clk?id=84349831&iu=/4140/ostg.clktrk
_______________________________________________
robinhood-support mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/robinhood-support