I'm actually pretty lazy about index updates and haven't needed them to be efficient, since my requirement is only that new documents be available by the next working day.
I reindex everything from scratch every night (400,000 docs) and store it in a timestamped index. When the reindexing is done, I alert a controller about the new active index. I keep a few versions of the index in case of a failure somewhere, and I can always send a message to the controller telling it to use an old index.

cheers,
sv

On Tue, 13 Apr 2004, petite_abeille wrote:

> On Apr 13, 2004, at 02:45, Kevin A. Burton wrote:
>
> > He mentioned that I might be able to squeeze 5-10% out of index merges
> > this way.
>
> Talking of which... what strategy(ies) do people use to minimize
> downtime when updating an index?
>
> My current "strategy" is as follows:
>
> (1) use a temporary RAMDirectory for ongoing updates.
> (2) perform a "copy on write" when flushing the RAMDirectory into the
> persistent index.
>
> The second step means that I create an offline copy of a live index
> before invoking addIndexes() and then substitute the old index with the
> new, updated one. While this effectively increases the time it takes to
> update an index, it nonetheless reduces the *perceived* downtime.
>
> Thoughts? Alternative strategies?
>
> TIA.
>
> Cheers,
>
> PA.
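For what it's worth, the nightly rotation I described (timestamped index directories, a controller pointed at the active one, a few old versions kept for rollback) can be sketched with plain `java.nio` file operations. The layout here (`index-<timestamp>` directories and a `CURRENT` pointer file) is my own convention for illustration, not anything Lucene-specific; the actual index building would happen inside the new directory with a normal IndexWriter.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.*;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Sketch of nightly index rotation: build into a fresh timestamped
// directory, atomically repoint a CURRENT file at it, and prune old
// versions while keeping a few around for rollback.
public class IndexRotation {

    // Create a fresh directory for tonight's full rebuild.
    static Path newIndexDir(Path root, String timestamp) throws IOException {
        Path dir = root.resolve("index-" + timestamp);
        Files.createDirectories(dir);
        return dir;
    }

    // "Alert the controller": write the new index name to a temp file,
    // then atomically rename it over CURRENT so readers never see a
    // half-written pointer. Rolling back is just activating an old dir.
    static void activate(Path root, Path indexDir) throws IOException {
        Path tmp = root.resolve("CURRENT.tmp");
        Files.write(tmp, indexDir.getFileName().toString()
                                 .getBytes(StandardCharsets.UTF_8));
        Files.move(tmp, root.resolve("CURRENT"),
                   StandardCopyOption.ATOMIC_MOVE,
                   StandardCopyOption.REPLACE_EXISTING);
    }

    // Which index is live right now, as the controller would read it.
    static Path activeIndex(Path root) throws IOException {
        String name = new String(Files.readAllBytes(root.resolve("CURRENT")),
                                 StandardCharsets.UTF_8).trim();
        return root.resolve(name);
    }

    // Keep only the newest `keep` versions; timestamps sort lexically.
    static void prune(Path root, int keep) throws IOException {
        try (Stream<Path> s = Files.list(root)) {
            List<Path> indexes = s
                .filter(p -> p.getFileName().toString().startsWith("index-"))
                .sorted(Comparator.comparing((Path p) -> p.getFileName().toString()))
                .collect(Collectors.toList());
            for (int i = 0; i < indexes.size() - keep; i++) {
                deleteRecursively(indexes.get(i));
            }
        }
    }

    // Delete children before parents.
    static void deleteRecursively(Path dir) throws IOException {
        try (Stream<Path> s = Files.walk(dir)) {
            for (Path p : s.sorted(Comparator.reverseOrder())
                           .collect(Collectors.toList())) {
                Files.delete(p);
            }
        }
    }
}
```

The atomic rename of the pointer file is the important bit: searchers re-opening against `activeIndex()` either see the old index or the new one, never a partial state.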
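The copy-on-write step in the quoted message can be sketched the same way: clone the live index offline, apply the buffered updates to the clone, then swap the clone in. The directory names (`live`, `staging`, `old`) are illustrative assumptions; step 2 is where you would actually open an IndexWriter on the copy and call addIndexes() with the RAMDirectory holding the buffered updates, which this stdlib-only sketch leaves as a comment.

```java
import java.io.IOException;
import java.nio.file.*;

// Copy-on-write index update: copy the live index, update the copy,
// swap it in. Only the two renames at the end touch the live path.
public class CopyOnWriteIndex {

    // Step 1: make an offline copy of the live index.
    // Files.walk visits parents before children, so plain copies suffice.
    static void copyRecursively(Path src, Path dst) throws IOException {
        try (java.util.stream.Stream<Path> s = Files.walk(src)) {
            for (Path p : (Iterable<Path>) s::iterator) {
                Files.copy(p, dst.resolve(src.relativize(p).toString()),
                           StandardCopyOption.COPY_ATTRIBUTES);
            }
        }
    }

    // Step 2 (not shown here): open an IndexWriter on the staging copy
    // and merge in the RAMDirectory of pending updates via addIndexes().

    // Step 3: substitute the old index with the new, updated one. The
    // previous live index is kept aside rather than deleted, so a bad
    // update can be rolled back by renaming it back into place.
    static void swapIn(Path live, Path staging, Path old) throws IOException {
        Files.move(live, old, StandardCopyOption.ATOMIC_MOVE);
        Files.move(staging, live, StandardCopyOption.ATOMIC_MOVE);
    }
}
```

The cost profile matches what the quoted message describes: the full copy makes the update slower overall, but searchers only ever see the live directory, so perceived downtime shrinks to the two renames.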