[jira] [Commented] (LUCENE-2312) Search on IndexWriter's RAM Buffer

Jason Rutherglen (JIRA) Fri, 09 Sep 2011 11:05:38 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENE-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101391#comment-13101391
 ]


Jason Rutherglen commented on LUCENE-2312:
------------------------------------------

There are many important use cases for immediate / zero delay index readers.

I'm not sure people realize it, but one of the major gains from this issue is 
the ability to obtain a reader after every indexed document.

In this case, instead of performing an array copy of the RT data structures, we 
will queue the changes and then apply them to the new reader.  For arrays like 
term freqs, we will use a temporary hash map of the changes made since the main 
array was created (when the hash map grows too large, we can perform a full 
array copy).
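
To make the idea concrete, here is a rough sketch of the array-plus-hash-map approach. It is only an illustration, not code from the attached patches: the class OverlayTermFreqs, its Snapshot type, and the maxOverlaySize threshold are all hypothetical names, and the sketch is single-threaded and assumes a fixed-size array.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch: a base int[] plus a small hash map of changes made
 * since the base array was created. None of these names exist in Lucene.
 */
final class OverlayTermFreqs {
    private int[] base;                                   // never mutated once published
    private final Map<Integer, Integer> changes = new HashMap<>();
    private final int maxOverlaySize;                     // assumed tuning knob

    OverlayTermFreqs(int[] base, int maxOverlaySize) {
        this.base = base.clone();
        this.maxOverlaySize = maxOverlaySize;
    }

    /** Records a change in the overlay instead of touching the base array. */
    void set(int index, int freq) {
        changes.put(index, freq);
        if (changes.size() > maxOverlaySize) {
            compact();
        }
    }

    int get(int index) {
        Integer changed = changes.get(index);
        return changed != null ? changed : base[index];
    }

    int overlaySize() {
        return changes.size();
    }

    /** Full array copy, done only when the overlay has grown too large. */
    private void compact() {
        int[] copy = base.clone();
        for (Map.Entry<Integer, Integer> e : changes.entrySet()) {
            copy[e.getKey()] = e.getValue();
        }
        base = copy;          // older snapshots keep the previous array
        changes.clear();
    }

    /** Cheap point-in-time view: shares the base array, copies only the overlay. */
    Snapshot snapshot() {
        return new Snapshot(base, new HashMap<>(changes));
    }

    static final class Snapshot {
        private final int[] base;
        private final Map<Integer, Integer> changes;

        private Snapshot(int[] base, Map<Integer, Integer> changes) {
            this.base = base;
            this.changes = changes;
        }

        int get(int index) {
            Integer changed = changes.get(index);
            return changed != null ? changed : base[index];
        }
    }
}
```

With this layout, the cost of obtaining a view after each document is proportional to the number of pending changes rather than to the size of the array, which is the point of avoiding the array copy. A real implementation would also need to handle array growth and concurrent readers, which the sketch leaves out.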



> Search on IndexWriter's RAM Buffer
> ----------------------------------
>
>                 Key: LUCENE-2312
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2312
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: core/search
>    Affects Versions: 4.0
>            Reporter: Jason Rutherglen
>            Assignee: Michael Busch
>         Attachments: LUCENE-2312-FC.patch, LUCENE-2312.patch, 
> LUCENE-2312.patch, LUCENE-2312.patch
>
>
> In order to offer users near realtime search, without incurring
> an indexing performance penalty, we can implement search on
> IndexWriter's RAM buffer. This is the buffer that is filled in
> RAM as documents are indexed. Currently the RAM buffer is
> flushed to the underlying directory (usually disk) before being
> made searchable. 
> Today's Lucene-based NRT systems must incur the cost of merging
> segments, which can slow indexing. 
> Michael Busch has good suggestions regarding how to handle deletes using max 
> doc ids.  
> https://issues.apache.org/jira/browse/LUCENE-2293?focusedCommentId=12841923&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12841923
> The area that isn't fully fleshed out is the terms dictionary,
> which needs to be sorted prior to queries executing. Currently
> IW implements a specialized hash table. Michael B has a
> suggestion here: 
> https://issues.apache.org/jira/browse/LUCENE-2293?focusedCommentId=12841915&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12841915


        
