My project has run into problems along these lines as a result of
clustering. Since our application servers are managed very strictly by a
centralized infrastructure group, we have no guarantees of having the same
filesystem available if we have to bring up another server to meet load or
recover from a failure. In that case, the entire repository gets reindexed over
JDBC, which can take quite a while with 15,000 documents.

Of course, I appreciate the performance benefits of having the index
locally, but I wonder if there might be a middle ground that could
substantially help clustered deployments. Perhaps each server could maintain
its own index locally, syncing at intervals with a central datasource. If
the index were stored in a protected JCR node, the existing journaling might
be enough to keep things current. Then, if a new repository joins the cluster, it could pull
down a ready-to-go index.
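
To make the idea a bit more concrete, here is a rough in-memory sketch in
Java. Everything in it is hypothetical: CentralStore, LocalIndex, Snapshot and
Change are made-up names for illustration, not Jackrabbit or JCR classes, and
a plain map stands in for the real search index. It only shows the shape of
the proposal: replay a shared journal at intervals, publish a snapshot, and
let a new node start from that snapshot instead of reindexing everything.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only; none of these types exist in Jackrabbit or JCR.
class IndexSyncSketch {

    // One journaled change: text != null means "index this", null means "remove".
    static final class Change {
        final long revision;
        final String docId;
        final String text;
        Change(long revision, String docId, String text) {
            this.revision = revision;
            this.docId = docId;
            this.text = text;
        }
    }

    // A published copy of an index, tagged with the journal revision it reflects.
    static final class Snapshot {
        final Map<String, String> docs;
        final long revision;
        Snapshot(Map<String, String> docs, long revision) {
            this.docs = new HashMap<>(docs);
            this.revision = revision;
        }
    }

    // Stand-in for the central datasource: a journal plus the latest snapshot.
    static final class CentralStore {
        private final List<Change> journal = new ArrayList<>();
        private Snapshot snapshot = new Snapshot(new HashMap<>(), 0);

        synchronized long append(String docId, String text) {
            long revision = journal.size() + 1; // revisions start at 1
            journal.add(new Change(revision, docId, text));
            return revision;
        }

        // Changes with a revision strictly greater than the one given.
        synchronized List<Change> changesSince(long revision) {
            return new ArrayList<>(journal.subList((int) revision, journal.size()));
        }

        // Keep only the newest snapshot; ignore stale publications.
        synchronized void publish(Snapshot candidate) {
            if (candidate.revision > snapshot.revision) {
                snapshot = candidate;
            }
        }

        synchronized Snapshot latestSnapshot() {
            return snapshot;
        }
    }

    // The index each server keeps on its own filesystem (here: just a map).
    static final class LocalIndex {
        private final CentralStore central;
        private Map<String, String> docs = new HashMap<>();
        private long revision = 0;

        LocalIndex(CentralStore central) {
            this.central = central;
        }

        // New cluster node: pull the ready-made snapshot rather than reindexing.
        void bootstrap() {
            Snapshot s = central.latestSnapshot();
            docs = new HashMap<>(s.docs);
            revision = s.revision;
        }

        // Periodic sync: replay journal entries not yet applied locally.
        int sync() {
            List<Change> pending = central.changesSince(revision);
            for (Change c : pending) {
                if (c.text == null) {
                    docs.remove(c.docId);
                } else {
                    docs.put(c.docId, c.text);
                }
                revision = c.revision;
            }
            return pending.size();
        }

        // Share the current local state so later nodes can bootstrap from it.
        void publish() {
            central.publish(new Snapshot(docs, revision));
        }

        boolean contains(String docId) {
            return docs.containsKey(docId);
        }

        long revision() {
            return revision;
        }
    }
}
```

In this sketch an existing node would call sync() and then publish() on a
timer, while a freshly started node would call bootstrap() once followed by
sync() to pick up whatever was journaled after the snapshot was taken.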

No idea if this is feasible, of course, just dreaming out loud :)

Cheers,
Adam Foltzer
