[jira] Updated: (LUCENE-2324) Per thread DocumentsWriters that write their own private segments

Michael Busch (JIRA) Wed, 14 Apr 2010 14:15:19 -0700

     [ https://issues.apache.org/jira/browse/LUCENE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Michael Busch updated LUCENE-2324:
----------------------------------

    Attachment: lucene-2324.patch

The patch removes all *PerThread classes downstream of DocumentsWriter.

This simplifies much of the flushing logic in the different consumers.  The 
patch also removes FreqProxMergeState, because we no longer have to interleave 
posting lists from different threads.  I really like these simplifications!

There is still a lot to do: the changes in DocumentsWriter and IndexWriter are 
currently just experimental, enough to make everything compile.  Next I will 
introduce DocumentsWriterPerThread and implement the sequenceID logic (which 
was discussed in earlier comments on this issue) and the new RAM management.  
I also want to go through the indexing chain once again - there are probably a 
few more things to clean up or simplify.

The patch compiles and a surprising number of tests actually pass.  Only the 
multi-threaded tests seem to fail, which is not very surprising, considering I 
removed all thread-handling logic from DocumentsWriter. :) 

So this patch isn't working yet - just wanted to post my current progress.  

> Per thread DocumentsWriters that write their own private segments
> -----------------------------------------------------------------
>
>                 Key: LUCENE-2324
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2324
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Michael Busch
>            Assignee: Michael Busch
>            Priority: Minor
>             Fix For: 3.1
>
>         Attachments: lucene-2324.patch, LUCENE-2324.patch
>
>
> See LUCENE-2293 for motivation and more details.
> I'm copying here the summary Mike posted on LUCENE-2293:
> Change the approach for how we buffer in RAM to a more isolated
> approach, whereby IW has N fully independent RAM segments
> in-process and when a doc needs to be indexed it's added to one of
> them. Each segment would also write its own doc stores and
> "normal" segment merging (not the inefficient merge we now do on
> flush) would merge them. This should be a good simplification in
> the chain (eg maybe we can remove the *PerThread classes). The
> segments can flush independently, letting us make much better
> concurrent use of IO & CPU.
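
To make the idea in the summary above concrete, here is a rough sketch of N 
independent RAM segments that each buffer and flush on their own.  It is only 
an illustration under simplifying assumptions, not the code in the patch: 
RamSegment and SketchWriter are hypothetical names, documents are plain 
strings, and a "flush" just hands the buffered docs over as a list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for one private in-RAM segment; not a Lucene class. */
final class RamSegment {
    private final List<String> docs = new ArrayList<>();
    private final int maxBufferedDocs;

    RamSegment(int maxBufferedDocs) {
        this.maxBufferedDocs = maxBufferedDocs;
    }

    /** Buffers a doc; returns the flushed docs once the buffer is full, else null. */
    synchronized List<String> add(String doc) {
        docs.add(doc);
        if (docs.size() < maxBufferedDocs) {
            return null;
        }
        List<String> flushed = new ArrayList<>(docs);
        docs.clear();
        return flushed;
    }
}

/** Hypothetical writer that binds each indexing thread to one of N segments. */
final class SketchWriter {
    private final RamSegment[] segments;
    private final AtomicInteger nextSlot = new AtomicInteger();
    private final ThreadLocal<RamSegment> bound;

    /** Segments that were flushed independently, in completion order. */
    final ConcurrentLinkedQueue<List<String>> flushedSegments =
        new ConcurrentLinkedQueue<>();

    SketchWriter(int numSegments, int maxBufferedDocs) {
        segments = new RamSegment[numSegments];
        for (int i = 0; i < numSegments; i++) {
            segments[i] = new RamSegment(maxBufferedDocs);
        }
        // Each thread is assigned a segment round-robin on its first add.
        bound = ThreadLocal.withInitial(
            () -> segments[nextSlot.getAndIncrement() % segments.length]);
    }

    /** Adds a doc to the calling thread's segment; no interleaving across threads. */
    void addDocument(String doc) {
        List<String> flushed = bound.get().add(doc);
        if (flushed != null) {
            flushedSegments.add(flushed);
        }
    }
}
```

Because every segment holds only its own docs, nothing has to be interleaved 
at flush time; the flushed segments would be combined later by ordinary 
segment merging, which this sketch leaves out.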


        
