M.M.: "That reader is opened using IndexWriter's SegmentInfos instance, so it
can read segments & deletions that have been flushed but not
committed.  It's allowed to do its own deletions & norms updating.
When reopen() is called, it grabs the writer's SegmentInfos again."

Are you referring to the IW.pendingCommit SegmentInfos variable?  When you
say "flushed", do you mean the IW.prepareCommit method?
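
As a rough illustration of the "flushed but not committed" distinction in
the quoted description, the following toy model in plain Java separates three
visibility levels: documents in the RAM buffer, documents flushed into the
writer's live segment list, and documents at the last commit point. None of
these classes are Lucene's; ToyWriter, ToyReader and openCommitted are
hypothetical names, and whole documents stand in for segments. It is a sketch
of the described behaviour under those assumptions, not the proposed patch.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: these classes are hypothetical stand-ins, not Lucene's API.
public class NrtSketch {

    /** Stand-in for IndexWriter, tracking three visibility levels for documents. */
    static class ToyWriter {
        private final List<String> ramBuffer = new ArrayList<>();  // added, not yet flushed
        private final List<String> flushed = new ArrayList<>();    // models the writer's SegmentInfos
        private List<String> committed = new ArrayList<>();        // models the last commit point

        void addDocument(String doc) { ramBuffer.add(doc); }

        /** Moves buffered documents into the writer's live segment list. */
        void flush() {
            flushed.addAll(ramBuffer);
            ramBuffer.clear();
        }

        /** Flushes, then publishes the live segment list as the new commit point. */
        void commit() {
            flush();
            committed = new ArrayList<>(flushed);
        }

        /** Reader over a snapshot of the live segment list (flushed, possibly uncommitted). */
        ToyReader getReader() { return new ToyReader(this, new ArrayList<>(flushed)); }

        /** What a reader opened independently on the directory would see. */
        List<String> openCommitted() { return new ArrayList<>(committed); }
    }

    /** Stand-in for the reader returned by the proposed getReader method. */
    static class ToyReader {
        private final ToyWriter writer;
        private final List<String> snapshot;

        ToyReader(ToyWriter writer, List<String> snapshot) {
            this.writer = writer;
            this.snapshot = snapshot;
        }

        List<String> docs() { return snapshot; }

        /** Grabs the writer's current segment list again. */
        ToyReader reopen() { return writer.getReader(); }
    }

    public static void main(String[] args) {
        ToyWriter w = new ToyWriter();
        w.addDocument("a");
        w.flush();
        ToyReader r = w.getReader();
        w.addDocument("b");
        w.flush();
        System.out.println(r.docs());           // snapshot taken before "b" was flushed
        System.out.println(r.reopen().docs());  // reopen picks up "b"
        System.out.println(w.openCommitted());  // nothing has been committed yet
    }
}
```

In this model a reader from getReader sees flushed documents that a reader
opened on the committed index does not, which is the difference in question.
Deletes are left out on purpose, since their visibility is exactly what is
still undecided in this thread.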

I think step #1 is important and should be generally useful outside of
realtime search. However, it's unclear how and when calls to
IW.deleteDocument will be reflected in IW.getReader. I assumed that
IW.commit would make IW.deleteDocument changes show up in IW.getReader,
while calls to Transaction.deleteDocument/flush would show up immediately.
Otherwise the semantics of realtime indexing vs. IW-based batch indexing
are unclear to the user.

With IW indexing, one adds documents, deletes documents, and then does a
global commit to the main directory. Interleaving deletes with added
documents isn't possible, because documents still in the IW RAM buffer are
not necessarily deleted. So if the semantics are that IW.commit or
IW.prepareCommit exposes deletes via IW.getReader, what is the difference
compared to IndexReader.reopen on the index, except for the shared write
lock? OK, perhaps this is all one gets and, as you mentioned, the rest is
placed on a level above IW, which hopefully does not confuse the user.

M.M.: "

> Patch #2: Implement a realtime ram index class
>

I think this one is optional, or, rather an optimization that we can
swap in later if/when necessary?  I.e., for starters little segments are
written into the main Directory."

If this is swapped in later, how is the system realtime in the meantime,
except perhaps for deletes?

M.M.: "Can't this be layered on top?
Or... are you looking to add support for multiple transactions in
flight at once on IndexWriter?"

The initial version can be layered on top; that will make testing easier.
Adding support for multiple transactions in flight at once on IndexWriter,
outside of the realtime transactions, seems to require a lot of refactoring.


On Fri, Jan 9, 2009 at 5:39 AM, Michael McCandless <
luc...@mikemccandless.com> wrote:

>
> Jason Rutherglen wrote:
>
>  Patch #1: Expose an IndexWriter.getReader method that returns the current
>> reader and shares the write lock
>>
>
> I tentatively like this approach so far...
>
> That reader is opened using IndexWriter's SegmentInfos instance, so it
> can read segments & deletions that have been flushed but not
> committed.  It's allowed to do its own deletions & norms updating.
> When reopen() is called, it grabs the writer's SegmentInfos again.
>
>  Patch #2: Implement a realtime ram index class
>>
>
> I think this one is optional, or, rather an optimization that we can
> swap in later if/when necessary?  I.e., for starters little segments are
> written into the main Directory.
>
>  Patch #3: Implement realtime transactions in IndexWriter or in a subclass
>> of IndexWriter by implementing a createTransaction method that generates a
>> realtime Transaction object.  When the transaction is flushed, the
>> transaction index modifications are available via the getReader method of
>> IndexWriter
>>
>
> Can't this be layered on top?
>
> Or... are you looking to add support for multiple transactions in
> flight at once on IndexWriter?
>
> Mike
>
