Based on our discussions, it seems best to get realtime search going in
small steps.  Below are some possible steps to take.

Patch #1: Expose an IndexWriter.getReader method that returns the current
reader and shares the write lock.
Patch #2: Implement a realtime ram index class.
Patch #3: Implement realtime transactions in IndexWriter, or in a subclass of
IndexWriter, by adding a createTransaction method that generates a
realtime Transaction object.  When the transaction is flushed, the
transaction's index modifications become available via IndexWriter's
getReader method.  (A rough sketch of this API follows the list.)
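To make the proposal concrete, here is a sketch of what the interfaces from
the three patches might look like.  This is hypothetical API surface only;
none of it exists in IndexWriter today, and the names (RealtimeWriter,
Transaction, createTransaction) just follow the steps above:

import java.io.IOException;

import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;

// Hypothetical API surface; the names follow the patch list above and
// nothing here exists in IndexWriter today.
public interface RealtimeWriter {

  // Patch #1: a reader over the current index state, sharing the write lock.
  IndexReader getReader() throws IOException;

  // Patch #3: start a realtime transaction backed by the ram index
  // from Patch #2.
  Transaction createTransaction() throws IOException;

  interface Transaction {
    void addDocument(Document doc) throws IOException;
    void deleteDocuments(Term term) throws IOException;

    // Makes this transaction's modifications visible via getReader().
    // Blocks while another transaction's flush is in progress.
    void flush() throws IOException;
  }
}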

The remaining question is how to synchronize the flushes to disk with
IndexWriter's other index-update locking mechanisms.  The flushing could
simply use IW.addIndexes, which already has a locking mechanism in place.
After flushing to disk, queued deletes would be applied to the newly copied
disk segments.  I think this entails opening the newly copied disk segments
and applying the deletes that were made against the corresponding ram
segments, by cloning the new disk segments, replacing the deleted-docs bit
vector, and then flushing the deleted docs to disk.  This scheme would allow
us to avoid storing a UID in each document.  (A rough sketch of this flush
path follows.)
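Very roughly, the flush path could look like the sketch below.  Only
IndexWriter.addIndexes is a real call; RamIndex, ramDirectory(),
queuedDeletes(), and applyDeletesByCloning() are placeholders for the
realtime ram index from Patch #2, its queued deletes, and the
clone-and-replace of the deleted-docs bit vector:

import java.io.IOException;
import java.util.List;

import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.Directory;

// Sketch of the flush path described above.  Only IndexWriter.addIndexes
// is a real call; everything else is a placeholder.
class RamFlusher {

  private final IndexWriter writer;

  RamFlusher(IndexWriter writer) {
    this.writer = writer;
  }

  // Placeholder for the realtime ram index from Patch #2.
  interface RamIndex {
    Directory ramDirectory();
    List<Integer> queuedDeletes();
  }

  synchronized void flushRamIndex(RamIndex ram) throws IOException {
    // 1. Copy the ram segments to disk; addIndexes already synchronizes
    //    with IndexWriter's other index-update locking.
    writer.addIndexes(new Directory[] { ram.ramDirectory() });

    // 2. Re-apply the deletes that were queued against the ram segments
    //    to the newly copied disk segments.
    applyDeletesByCloning(ram.queuedDeletes());
  }

  // Placeholder: open each newly copied disk segment, clone it, mark the
  // queued deletions in the clone's deleted-docs bit vector, and flush
  // the bit vector back to disk.
  private void applyDeletesByCloning(List<Integer> queuedDeletes) {
    // intentionally left unimplemented in this sketch
  }
}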

The API needs to clearly separate realtime transactions from the existing
index-update methods such as addDocument, deleteDocuments, and
updateDocument.  I don't think it's possible to implement both transparently,
because the underlying implementations behave differently.  It is expected
that multiple transactions may be created at once; however, the
Transaction.flush method would block.  (A hypothetical usage example is
below.)
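For illustration, usage might look something like this, again assuming the
hypothetical Transaction API sketched earlier (doc1 and the Term used for
the delete are assumed to exist):

// Hypothetical usage of the sketched Transaction API.
Transaction txn1 = writer.createTransaction();
Transaction txn2 = writer.createTransaction();   // multiple open at once

txn1.addDocument(doc1);
txn2.deleteDocuments(new Term("id", "42"));

txn1.flush();   // blocks until txn1's changes are applied
txn2.flush();   // blocks; flushes do not overlap

IndexReader reader = writer.getReader();   // sees both transactions

The blocking flush would keep only one batch of ram segments moving to disk
at a time, which lines up with the addIndexes-based locking described above.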
