[ https://issues.apache.org/jira/browse/LUCENE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael Busch updated LUCENE-2324:
----------------------------------
Attachment: lucene-2324.patch
The patch removes all *PerThread classes downstream of DocumentsWriter.
This simplifies a lot of the flushing logic in the different consumers. The
patch also removes FreqProxMergeState, because we no longer have to interleave
posting lists from different threads. I really like these simplifications!
There is still a lot to do. The changes in DocumentsWriter and IndexWriter are
currently just experimental, to make everything compile. Next I will introduce
DocumentsWriterPerThread and implement the sequenceID logic (which was
discussed here in earlier comments) and the new RAM management. I also want to
go through the indexing chain once again - there are probably a few more things
to clean up or simplify.
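
To make the per-thread idea concrete, here is a minimal sketch. It is not code
from the patch. All names in it (PerThreadBufferSketch, PerThreadWriter,
PrivateSegment, indexConcurrently) are hypothetical, and the shared AtomicLong
counter is only an assumed stand-in for the sequenceID logic, which is not
implemented yet. The point it illustrates is that each thread buffers into its
own private segment without locking, and only the hand-off of a flushed segment
is synchronized.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

public class PerThreadBufferSketch {

    /** Hypothetical stand-in for a private in-RAM segment owned by one thread. */
    static final class PrivateSegment {
        final String name;
        final List<String> docs = new ArrayList<>();
        PrivateSegment(String name) { this.name = name; }
    }

    /** Hypothetical per-thread writer; NOT the DocumentsWriterPerThread planned for the patch. */
    static final class PerThreadWriter {
        private final int id;
        private final int maxBufferedDocs;
        private final AtomicLong sequence;          // shared counter (assumed ordering mechanism)
        private final List<PrivateSegment> flushed; // shared list, guarded by its own monitor
        private PrivateSegment current;
        private int segmentCount;

        PerThreadWriter(int id, int maxBufferedDocs, AtomicLong sequence,
                        List<PrivateSegment> flushed) {
            this.id = id;
            this.maxBufferedDocs = maxBufferedDocs;
            this.sequence = sequence;
            this.flushed = flushed;
            this.current = newSegment();
        }

        private PrivateSegment newSegment() {
            return new PrivateSegment("_t" + id + "_" + (segmentCount++));
        }

        /** Buffers a doc in this thread's private segment; the buffer itself needs no lock. */
        long addDocument(String doc) {
            current.docs.add(doc);
            long seq = sequence.incrementAndGet();
            if (current.docs.size() >= maxBufferedDocs) {
                flush();
            }
            return seq;
        }

        /** Publishes the private segment; only this short hand-off is synchronized. */
        void flush() {
            if (current.docs.isEmpty()) {
                return;
            }
            synchronized (flushed) {
                flushed.add(current);
            }
            current = newSegment();
        }
    }

    /** Runs one writer per thread and returns every segment that was flushed. */
    static List<PrivateSegment> indexConcurrently(int threads, int docsPerThread,
                                                  int maxBufferedDocs) {
        AtomicLong sequence = new AtomicLong();
        List<PrivateSegment> flushed = new ArrayList<>();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            PerThreadWriter writer = new PerThreadWriter(t, maxBufferedDocs, sequence, flushed);
            final int tid = t;
            Thread worker = new Thread(() -> {
                for (int d = 0; d < docsPerThread; d++) {
                    writer.addDocument("doc-" + tid + "-" + d);
                }
                writer.flush(); // flush whatever is still buffered
            });
            workers.add(worker);
            worker.start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return flushed;
    }

    public static void main(String[] args) {
        List<PrivateSegment> segments = indexConcurrently(4, 25, 10);
        int total = 0;
        for (PrivateSegment s : segments) {
            total += s.docs.size();
        }
        System.out.println(segments.size() + " segments, " + total + " docs");
    }
}
```

With 4 threads, 25 docs per thread and a buffer limit of 10, every thread
flushes three private segments (10, 10 and 5 docs), so the run ends with 12
segments holding 100 docs. In this sketch nothing interleaves postings across
threads; the separate segments would be combined later by a normal merge.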
The patch compiles, and a surprising number of tests actually pass. Only
multi-threaded tests seem to fail, which is not very surprising, considering
I removed all thread-handling logic from DocumentsWriter. :)
So this patch isn't working yet; I just wanted to post my current progress.
> Per thread DocumentsWriters that write their own private segments
> -----------------------------------------------------------------
>
> Key: LUCENE-2324
> URL: https://issues.apache.org/jira/browse/LUCENE-2324
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Reporter: Michael Busch
> Assignee: Michael Busch
> Priority: Minor
> Fix For: 3.1
>
> Attachments: lucene-2324.patch, LUCENE-2324.patch
>
>
> See LUCENE-2293 for motivation and more details.
> I'm copying here Mike's summary he posted on 2293:
> Change the approach for how we buffer in RAM to a more isolated
> approach, whereby IW has N fully independent RAM segments
> in-process and when a doc needs to be indexed it's added to one of
> them. Each segment would also write its own doc stores and
> "normal" segment merging (not the inefficient merge we now do on
> flush) would merge them. This should be a good simplification in
> the chain (eg maybe we can remove the *PerThread classes). The
> segments can flush independently, letting us make much better
> concurrent use of IO & CPU.