On 17 Feb 2010, at 15:47, Alexander Klimetschek wrote:
> Thus a broken search index must not break repository startup or it
> must be possible to delete it with a tool w/o requiring a full
> repository start.

Over in an earlier version of Sakai we have a distributed Lucene index, where the index is produced by all nodes in the cluster indexing items and distributing the updates to segments. On the one hand this distributes the load well and gives a single index over the whole cluster. On the other hand, there is increased latency compared to the JR implementation and, critically, when the index gets corrupted it's often hard to recover. Snapshots are taken in real time, but you have to know how far back to go and then reindex.

In 24 months of running JR 1.4 and this search index we have had to recover the distributed search index several times, but have not had to recover the JR index once, although it has become corrupted almost as many times (once or twice at most).

The distributed index uses local file storage and segment update shipping. We did try various types of off-machine storage but found them all to be way too slow. IMVHO seek performance is critical to Lucene performance.

Ian
