Hi,

On Wed, May 29, 2013 at 3:01 PM, Thomas Mueller <muel...@adobe.com> wrote:
>>There could be various reasons for why an indexer might not be
>>available for an extended amount of time
>
> Possibly you are right, but let me try to challenge this assumption:
>
> Wouldn't it be a problem if the index isn't updated for a long time?

Not necessarily. For example during large batch imports or content
migrations it might be useful to be able to speed things up by
disabling things like full text indexing. Or it could be that an
external index server like Solr is down for maintenance or other
reasons. Such cases would obviously lead to some loss of
functionality, but probably wouldn't be too troublesome if the
relevant indexers were able to automatically pick up from where they
left off.

> Don't we need a protection against an outdated index?

Any asynchronous indexes will in any case need to be resilient against
some mismatch between the index and repository content. Whether that
is measured in minutes or days should be irrelevant to the index
implementation; the only impact would be on the freshness assumptions
that applications or end users might have.

>>a) If, like in the Segment and H2 MKs, we could rely on the MKs
>>supporting cheap copies and diffs across subtrees, we could implement
>>this without API changes by keeping a copy of the last indexed/seen
>>state of the repository in a hidden subtree. The indexer would refresh
>>this copy on each index update, and could thus always know what
>>content has already been indexed. Unfortunately there probably isn't
>>any easy way to do this in the MongoMK.
>
> It sounds like reading with old revisions.

Not really; let me rephrase. What I'm suggesting is something like this:

    NodeState root = branch.getHead();
    NodeState index =
        root.getChildNode("oak:index").getChildNode("someIndex");
    // Last indexed state of the repository, kept in a hidden subtree
    NodeState before = index.getChildNode(":before");

    NodeBuilder rootBuilder = root.builder();
    NodeBuilder indexBuilder =
        rootBuilder.getChildNode("oak:index").getChildNode("someIndex");

    // Pass only the changes since the last indexed state to the indexer
    root.compareAgainstBaseState(
        before, new IndexUpdate(indexBuilder.getChildNode(":index")));

    // Remember the state that has now been indexed
    indexBuilder.setChildNode(":before", root);
    branch.setRoot(rootBuilder.getNodeState());
    branch.merge();

I.e.
instead of tracking things by revision, we'd just make a full copy of
the entire content tree that has already been indexed. Unfortunately,
AFAICT, in MongoMK this would require the duplication of the entire
subtree instead of the copy by reference that the Segment and H2 MKs
could do.

BR,

Jukka Zitting
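
PS. To illustrate why copy by reference makes this cheap, here is a
minimal self-contained sketch. It does not use the Oak API: the Node
class, the SnapshotIndexSketch class and its update() method are all
hypothetical stand-ins, and removed nodes are ignored for brevity. The
point is only that remembering the last indexed state costs one
reference, and that the diff can skip any subtree shared between the
two states.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for an indexer over an immutable content tree.
// None of these names come from Oak.
final class SnapshotIndexSketch {

    /** Immutable node: a value plus named children (sorted by name). */
    static final class Node {
        final String value;
        final Map<String, Node> children;

        Node(String value, Map<String, Node> children) {
            this.value = value;
            this.children = Collections.unmodifiableMap(new TreeMap<>(children));
        }

        /** New node with one child replaced; siblings are shared by reference. */
        Node withChild(String name, Node child) {
            Map<String, Node> copy = new TreeMap<>(children);
            copy.put(name, child);
            return new Node(value, copy);
        }
    }

    static final Node EMPTY = new Node("", new TreeMap<>());

    /** Collects paths of nodes added or changed between two states. */
    static void diff(Node before, Node after, String path, List<String> changed) {
        if (before == after) {
            return; // shared subtree, nothing below can differ
        }
        if (before == null || !before.value.equals(after.value)) {
            changed.add(path);
        }
        for (Map.Entry<String, Node> e : after.children.entrySet()) {
            Node old = before == null ? null : before.children.get(e.getKey());
            diff(old, e.getValue(), path + "/" + e.getKey(), changed);
        }
    }

    /** Plays the role of the hidden ":before" copy. */
    private Node lastIndexed = null;

    /** Returns the paths that need (re)indexing and remembers the new state. */
    List<String> update(Node head) {
        List<String> changed = new ArrayList<>();
        diff(lastIndexed, head, "", changed);
        lastIndexed = head; // a reference, not a deep copy
        return changed;
    }

    public static void main(String[] args) {
        Node root = EMPTY
            .withChild("a", new Node("1", new TreeMap<>()))
            .withChild("b", new Node("2", new TreeMap<>()));
        SnapshotIndexSketch indexer = new SnapshotIndexSketch();
        System.out.println(indexer.update(root));
        Node next = root.withChild("b", new Node("3", new TreeMap<>()));
        System.out.println(indexer.update(next));
    }
}
```

An indexer that was offline for days would simply run update() against
the current head when it comes back. A backend without structural
sharing would have to physically duplicate the tree to keep the
equivalent of lastIndexed, which is the MongoMK concern above.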