[ http://issues.apache.org/jira/browse/JCR-169?page=comments#action_12432083 ]

Marcel Reutegger commented on JCR-169:
--------------------------------------
Ian, thanks a lot for your comments.

Here are my current thoughts on clustering the search index in Jackrabbit:

I think the preferred approach is to put the index into the repository itself.
See: http://article.gmane.org/gmane.comp.apache.jackrabbit.devel/8530 and following messages

This would also allow us to distribute index updates to cluster nodes using the repository's internal observation mechanism, e.g. the update of a deleted documents file or new index segments.

> I found the best indexing strategy was to have local copies of segments,
> stored centrally as masters.

I agree. Specifically, the design of Lucene, where index files are only created but never modified, supports this approach very nicely.

> In the search application, speed of update of segments is not that critical,
> you probably have a different requirement in JCR.

JCR is more restrictive in that respect, at least if we want to be compliant with the specification. As soon as a node is created in the workspace, it must be searchable using a query. For most real-life systems this is not a hard requirement though. E.g. when a document is added to a repository, it usually doesn't matter if it is retrievable by query only after a couple of seconds and not right away.

> Make Jackrabbit clusterable
> ---------------------------
>
>          Key: JCR-169
>          URL: http://issues.apache.org/jira/browse/JCR-169
>      Project: Jackrabbit
>   Issue Type: New Feature
>   Components: core
>     Reporter: Marcel Reutegger
>     Priority: Minor
>
> This jira issue discusses the technical implications on the current design of
> Jackrabbit to introduce clustering.
> Particularly the following areas require thorough investigation:
> - SharedItemStateManager and its cache
> - cache integrity
> - cache design: look aside, write through?
> - hook for distributed cache, interface?
> - isolation level
> - transaction integrity within Jackrabbit, interaction with transient layer
> - VirtualItemStateProvider
> - same strategy as SharedItemStateManager?
> - Search index
> - single or per cluster node index?
> - Observation
> Please state more areas if needed.
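
To illustrate the "local copies of segments, stored centrally as masters" idea from the comment: if index files are only created and never modified, a cluster node could refresh its local copy by comparing file names alone, without looking at file contents. The following is only a rough sketch under that assumption. `SegmentSync` and its methods are hypothetical names, not Jackrabbit or Lucene API, and the master is modelled as a plain directory rather than as content stored in the repository.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper for illustration; not part of Jackrabbit or Lucene. */
final class SegmentSync {

    /** Names listed by the master that the local copy does not have yet. */
    static Set<String> missingLocally(Set<String> master, Set<String> local) {
        Set<String> result = new TreeSet<>(master);
        result.removeAll(local);
        return result;
    }

    /** Names in the local copy that the master no longer lists. */
    static Set<String> obsoleteLocally(Set<String> master, Set<String> local) {
        Set<String> result = new TreeSet<>(local);
        result.removeAll(master);
        return result;
    }

    /** Returns the names of the regular files directly inside dir. */
    static Set<String> list(Path dir) throws IOException {
        Set<String> names = new TreeSet<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path p : stream) {
                if (Files.isRegularFile(p)) {
                    names.add(p.getFileName().toString());
                }
            }
        }
        return names;
    }

    /** Brings localDir in line with masterDir and returns the names copied. */
    static Set<String> sync(Path masterDir, Path localDir) throws IOException {
        Set<String> master = list(masterDir);
        Set<String> local = list(localDir);
        Set<String> toCopy = missingLocally(master, local);
        for (String name : toCopy) {
            // Assumption: files are write-once, so a local file with the same
            // name is taken to be identical and is never copied again.
            Files.copy(masterDir.resolve(name), localDir.resolve(name));
        }
        for (String name : obsoleteLocally(master, local)) {
            Files.delete(localDir.resolve(name));
        }
        return toCopy;
    }
}
```

A node would call `SegmentSync.sync(masterDir, localDir)` whenever it is notified of an index change, for instance through observation. A file that is rewritten under the same name, such as an updated deleted documents file, does not fit the write-once assumption and would need separate handling.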