Hi,

On Wed, May 29, 2013 at 12:28 PM, Thomas Mueller <muel...@adobe.com> wrote:
> What would happen if you stop a cluster node for a long time (for example
> 1 day)? Would async indexing be done on another cluster node? If yes, I
> guess we need a way to ensure that's the case. If not, then the problem
> might be that old revisions are no longer available.

There could be various reasons why an indexer might not be available
for an extended amount of time, so I think in any case we need some
mechanism for it to pick up from where it left off. As you mentioned,
journaled observation will need some similar mechanism.

I see at least the following options:

a) If, like in the Segment and H2 MKs, we could rely on the MKs
supporting cheap copies and diffs across subtrees, we could implement
this without API changes by keeping a copy of the last indexed/seen
state of the repository in a hidden subtree. The indexer would refresh
this copy on each index update, and could thus always know what content
has already been indexed. Unfortunately there probably isn't any easy
way to do this in the MongoMK.

b) Have some way to mark specific revisions as ones that should be kept
around for a longer time (e.g. using a lease mechanism). The indexer
could then store such a revision id as a part of an index update as a
record of what content was last indexed.

c) Keep a log of all changes since the last index update. This is
probably the least attractive solution as it adds quite a bit of write
overhead and, unless the log is maintained by the MK, we'd still have
to worry about potential lost updates.

BR,

Jukka Zitting
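
PS. To make option b) a bit more concrete, here is a rough sketch in
Java of how a lease on the last indexed revision could work. All names
here (RevisionLeases, AsyncIndexer, checkpoint, isRetained) are made up
for illustration and are not part of any existing MK API. The sketch
also assumes that revision garbage collection would consult
isRetained() before discarding a revision, and that the indexer would
persist the last indexed revision id together with the index instead of
keeping it in a field.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical lease registry; NOT an existing MicroKernel API. */
class RevisionLeases {
    // revision id -> lease expiry time in milliseconds
    private final Map<String, Long> expiryByRevision = new HashMap<>();

    /** Acquires or extends a lease until at least nowMillis + durationMillis. */
    synchronized void renew(String revisionId, long nowMillis, long durationMillis) {
        expiryByRevision.merge(revisionId, nowMillis + durationMillis, Long::max);
    }

    /** Drops the lease so that the revision may be garbage collected. */
    synchronized void release(String revisionId) {
        expiryByRevision.remove(revisionId);
    }

    /** Assumed to be called by revision GC before discarding a revision. */
    synchronized boolean isRetained(String revisionId, long nowMillis) {
        Long expiry = expiryByRevision.get(revisionId);
        return expiry != null && expiry > nowMillis;
    }
}

/** Hypothetical indexer that records the last indexed revision. */
class AsyncIndexer {
    private final RevisionLeases leases;
    private final long leaseMillis;
    // In a real setup this would be stored as part of the index update.
    private String lastIndexedRevision;

    AsyncIndexer(RevisionLeases leases, long leaseMillis) {
        this.leases = leases;
        this.leaseMillis = leaseMillis;
    }

    /** Called once all content up to headRevision has been indexed. */
    void checkpoint(String headRevision, long nowMillis) {
        // Lease the new revision first, so there is never a moment
        // without a retained base revision to diff against.
        leases.renew(headRevision, nowMillis, leaseMillis);
        if (lastIndexedRevision != null && !lastIndexedRevision.equals(headRevision)) {
            leases.release(lastIndexedRevision);
        }
        lastIndexedRevision = headRevision;
    }

    String lastIndexedRevision() {
        return lastIndexedRevision;
    }
}
```

With something like this, an indexer that comes back after a long pause
could diff from lastIndexedRevision() to the current head, provided the
lease was long enough (or was renewed by someone else) to cover the
downtime. If the lease has expired we are back to the original problem,
so the lease duration would have to be chosen with that in mind.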