Hi Christoph,
Christoph Kiehl wrote:
We've got very big indexes/workspaces on our production servers, with 3,000,000 to 8,000,000 nodes, and they are still growing because of version creation and the addition of new nodes.
When the VM in which Jackrabbit lives crashes during a write operation, Jackrabbit nicely applies the redo log on restart, which completes quite quickly, but then starts its consistency check. This check takes from 30 minutes to 2 hours depending on the repository. During this time our application is offline, which we would of course like to avoid ;) Our system uses an Oracle bundle persistence manager, which probably doesn't make things better.
I had a quick glance at the consistency check code and it seems there is nothing that could be substantially optimized there. I thought it might be possible to check only those index segments that were used while replaying the redo log, but given the way the consistency check works, this is not possible.
I think the only way to speed up startup is to avoid the errors the check looks for in the first place. Since the redo log mechanism seems quite good, I'm not sure if those errors (MissingAncestor, MultipleEntries, NodeDeleted, UnknownParent) can still occur. Could you maybe elaborate on the situations where you expect those errors to arise?
IIRC the consistency check was introduced in Jackrabbit first and the redo log mechanism later, which makes the consistency check somewhat superfluous.
For now I'm thinking about disabling consistency checks by default altogether and running them in a maintenance window at night (something like the configuration sketch below). Unfortunately this might be a bit dangerous if parts of the application rely on certain nodes being found by queries :/
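What I have in mind would look roughly like this in the SearchIndex section of workspace.xml. This is only a sketch; I'm assuming the enableConsistencyCheck and forceConsistencyCheck parameters of the Lucene-based SearchIndex, so the exact parameter names and defaults should be verified against the Jackrabbit version in use:

  <SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
    <param name="path" value="${wsp.home}/index"/>
    <!-- do not run the index consistency check on startup -->
    <param name="enableConsistencyCheck" value="false"/>
    <param name="forceConsistencyCheck" value="false"/>
  </SearchIndex>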
WDYT?
I agree with you. We could introduce a third configuration value for forceConsistencyCheck (in addition to 'true' and 'false'): 'disabled'. That would then be the default in the next released version of Jackrabbit.
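Configuration-wise that would boil down to something like the following. This is purely a sketch of the proposal; the 'disabled' value does not exist in any released version yet:

  <param name="forceConsistencyCheck" value="disabled"/>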
WDYT?
regards
marcel