Hello Thomas,

Sorry, I just saw the email you sent to the robinhood-support mailing list; it was blocked waiting for admin validation.

About multiple robinhood instances, the documentation says that you can split the features across different nodes: basically, the database server can run on one machine, the FS scan on another machine, disk resource monitoring and purging on another machine, etc. But you must only run a single instance of each feature at a given time.

Thomas Roth wrote:
>> Is there a way to "partition" a file system for Robinhood? Tell an
>> instance to only scan certain directories? Because I think the issue is
>> not a really broken data base, but simply a later coming Robin scanning
>> files that were already done?

What is your need exactly? Do you want to speed up the scan by running several robinhood instances, or do you only want to scan certain directories?

- About speed, robinhood already performs scans in parallel with multiple threads, each one scanning different directories. So if you want more parallelism, increase the number of scan threads.
- If your need is to scan only some parts of the namespace, you can ignore directories by specifying "ignore" rules in the configuration file (FS_Scan section).
  E.g.
      ignore { path == "/lustre/xyz*" }
  if you know the path you want to ignore, or a negation:
      ignore { not ( path == "/lustre/dir1" or path == "/lustre/dir2/subdir*" ) }
  if you know the paths you want to scan.

>> > > ListMgr | DB query failed in ListMgr_Insert line 340...
>> > and assorted messages, which seem to indicate that the new robinhood
>> > scan tries to put something into the DB that is already there, and
>> > stumbles on this. Or maybe that happens when several robins are
>> > running simultaneously.
>> Are you running several instances for scanning the same filesystem??
>
> Well, yes, tried that also. Actually I was under the impression that
> this is a feature of Robinhood - of course, now that I am looking for
> this in the documentation I can't find it.
>
> But these errors from the DB definitely did arise first when I
> restarted robinhood anew after some changes (location of log file,
> debug level, ...) in the config file. But since there was no change in
> the robinhood version, I did not empty the database. After this
> restart, I immediately got a lot of
>
> 2010/11/04 11:27:45 robinhood[1489/4]: EntryProc | Error 3
> performing database operation.
>
> 2010/11/04 11:27:45 robinhood[1489/8]: ListMgr | DB query failed in
> ListMgr_Insert line 340: pk='54051386:6D286C', code=3: Duplicate entry
> '54051386:6D286C' for key 1
>
> I suppose this is something that should not happen when one is feeding
> a database?

Yes, these errors seem to be caused by several feeders running concurrently. This is not a sane situation, and the DB content may be inconsistent now. So I recommend that you stop all your running instances, clear the DB content (command "rbh-config empty_db"), and then start only a single instance for scanning.

Best regards,
Thomas.
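
For illustration only, the "Duplicate entry" failure described above can be reproduced in miniature. This is a minimal sketch, assuming a much simpler setup than the real one: it uses Python's built-in sqlite3 module instead of robinhood's actual database, and the table layout, the `feed` helper, the second key, and the paths are all hypothetical. Only the key '54051386:6D286C' is taken from the log above. It shows a second feeder trying to insert entries that a first feeder has already recorded.

```python
import sqlite3


def feed(conn, entries):
    """Insert entries the way a single feeder might (hypothetical helper).

    Returns the list of ids the database rejected as duplicates.
    """
    rejected = []
    for entry_id, path in entries:
        try:
            conn.execute(
                "INSERT INTO entries (id, path) VALUES (?, ?)",
                (entry_id, path),
            )
        except sqlite3.IntegrityError:
            # The primary key already exists: another feeder got there first.
            rejected.append(entry_id)
    conn.commit()
    return rejected


# Hypothetical one-table layout; robinhood's real schema is different.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (id TEXT PRIMARY KEY, path TEXT)")

# Both feeders scan the same part of the namespace (paths are made up).
scan = [
    ("54051386:6D286C", "/lustre/dir1/a"),
    ("54051386:6D286D", "/lustre/dir1/b"),
]

first = feed(conn, scan)   # first feeder: all inserts succeed
second = feed(conn, scan)  # second feeder: every insert is a duplicate

print("rejected for first feeder:", first)
print("rejected for second feeder:", second)
```

In this sketch the second pass is rejected entry by entry, which is the same kind of primary-key conflict as in the log. It does not model the other ways concurrent feeders can leave the content inconsistent, which is why emptying the database and restarting a single scanning instance is the safer route.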