On Sun, Mar 15, 2009 at 07:21:13PM -0400, Michael McCandless wrote:

> Right. I guess it's because Lucene buffers up deletes that it can continue
> to accept adds & deletes even during the blip. But it cannot write a new
> segment (materialize the adds & deletes) during the blip.
OK, I think that makes sense. Lucene isn't so much performing deletions as
promising to perform deletions at some point in the future. There's still a
window where no new deletions are being performed (the "blip"), and the
process of reconciling deletions finishes during this window.
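In toy form -- this is just the shape of the behavior you describe, not
Lucene's actual IndexWriter internals:

    import java.util.ArrayList;
    import java.util.List;

    // Toy model only -- not Lucene's real IndexWriter, just the shape of it.
    class BufferingWriter {
        private final List<String> bufferedDeleteTerms = new ArrayList<>();

        void deleteDocuments(String term) {
            bufferedDeleteTerms.add(term);  // cheap; keeps working during the blip
        }

        void flushSegment() {
            // The "blip": the promised deletes only become real (bits in a
            // deletions vector) when the next segment gets written.
            applyDeletes(bufferedDeleteTerms);
            bufferedDeleteTerms.clear();
            // ... write the new segment's files (elided) ...
        }

        private void applyDeletes(List<String> terms) {
            // Resolve each term to doc nums and flip bits in the affected
            // segments' deletions vectors (elided).
        }
    }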
> Does this mean you can run multiple writers against the same index, to gain
> concurrency? That would be fab.

I hadn't thought it would be possible, but maybe we can get there...

> (Though... that's tricky, with deletes; oh maybe because you store new
> deletes for an old segment along with the new segment that's OK? Hmm, it
> still seems like you'd have a staleness problem).

What if we have the deletions reader OR together all bit vectors against a
given segment? Search-time performance would dive, of course, but I believe
we'd get logically correct results.
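Something like this, in rough Java (made-up names; the real thing would
decode deletions files from disk rather than hold BitSets in memory):

    import java.util.BitSet;
    import java.util.List;

    // Sketch only: merge every deletions file written against one segment.
    class DeletionsMerger {
        static BitSet mergeDeletions(List<BitSet> deletionFiles, int maxDoc) {
            BitSet merged = new BitSet(maxDoc);
            for (BitSet dels : deletionFiles) {
                merged.or(dels);  // a delete recorded by any writer wins
            }
            return merged;
        }
    }

Since OR is commutative and idempotent, the reader doesn't care which writer
wrote which file, or in what order.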
Under the Lucene bit-vector naming scheme, you'd need to keep every deletions
file around for the life of a given segment -- at least until you had a
consolidator process lock everything down and write an authoritative bit
vector. With the current KS bit-vector naming scheme, out-of-date bit-vector
files would be zapped by the merging process (which in this case means the
consolidator). I don't think it's any more efficient, though it's arguably
cleaner.

The tombstone approach would work for the same reason. It doesn't matter if
multiple tombstone rows contain a tombstone for the same document, because
the priority queue ORs together the results. Therefore, you don't need to
coordinate the addition of new tombstones.
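A sketch, modeling a tombstone row as a sorted array of doc nums -- the real
on-disk format would differ:

    import java.util.BitSet;
    import java.util.List;
    import java.util.PriorityQueue;

    // Sketch of the tombstone merge. Queue entries are {docNum, row, pos}.
    class TombstoneMerger {
        static BitSet mergeTombstones(List<int[]> rows, int maxDoc) {
            PriorityQueue<int[]> queue =
                new PriorityQueue<>((a, b) -> Integer.compare(a[0], b[0]));
            for (int i = 0; i < rows.size(); i++) {
                if (rows.get(i).length > 0) {
                    queue.add(new int[] { rows.get(i)[0], i, 0 });
                }
            }
            BitSet deletions = new BitSet(maxDoc);
            while (!queue.isEmpty()) {
                int[] top = queue.poll();
                // Duplicate tombstones for one doc just set the bit again.
                deletions.set(top[0]);
                int[] row = rows.get(top[1]);
                int pos = top[2] + 1;
                if (pos < row.length) {
                    queue.add(new int[] { row[pos], top[1], pos });
                }
            }
            return deletions;
        }
    }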
Claiming a new segment directory and committing a new master file
(segments_XXX in Lucene, snapshot_XXX.json in KS) wouldn't require
synchronization: if those ops fail because your process lost out in the race
condition, you just retry. The only time we have a true synchronization
requirement is during merging.

So... if we were to somehow make tombstones perform adequately at
search-time, I think we could make a many-writers-single-merger model work.
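Sketched with an illustrative naming scheme, using atomic create-if-absent as
a stand-in for whatever the real claim op would be:

    import java.io.IOException;
    import java.nio.file.DirectoryStream;
    import java.nio.file.FileAlreadyExistsException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.StandardOpenOption;

    // Sketch of the no-lock commit: claim the next snapshot_XXX.json name
    // by atomic create, and simply retry if another writer got there first.
    class MasterFileCommitter {
        static Path commit(Path indexDir, byte[] masterFile) throws IOException {
            while (true) {
                long gen = nextGeneration(indexDir);
                Path target = indexDir.resolve("snapshot_" + gen + ".json");
                try {
                    // CREATE_NEW fails atomically if the name is already
                    // taken -- i.e. we lost the race -- so no lock is needed.
                    Files.write(target, masterFile, StandardOpenOption.CREATE_NEW);
                    return target;
                } catch (FileAlreadyExistsException lostRace) {
                    // Another writer claimed this generation; retry.
                }
            }
        }

        static long nextGeneration(Path indexDir) throws IOException {
            long max = 0;
            try (DirectoryStream<Path> dir = Files.newDirectoryStream(indexDir)) {
                for (Path p : dir) {
                    String name = p.getFileName().toString();
                    if (name.startsWith("snapshot_") && name.endsWith(".json")) {
                        String gen = name.substring(9, name.length() - 5);
                        max = Math.max(max, Long.parseLong(gen));
                    }
                }
            }
            return max + 1;
        }
    }

The loop only spins when another writer actually wins the race, so there's
nothing to starve on.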
> Ugh, lock starvation. Really the OS should provide a FIFO lock queue of
> some sort.

Well, I think this would be less of a headache if we didn't need portability.
It's just that the locking and IPC mechanisms provided by various operating
systems out there are wildly incompatible.

Unfortunately, I don't think there's any other way to implement background
merging for all Lucy target hosts besides the multiple-process approach.
Lucy will never work with Perl ithreads.

PS: FYI, your messages today have premature line-wrapping issues -- your
original text, not just the quotes.

Marvin Humphrey
