[ https://issues.apache.org/jira/browse/LUCENE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13005188#comment-13005188 ]
Michael McCandless commented on LUCENE-2324: -------------------------------------------- Concurrency today is similar to DWPT in that we simply write into multiple segments in RAM. But the, on flush, we do a merge (in RAM) of these segments and write a single on-disk segment. Vs this change which instead writes N on-disk segments and lets "normal" merging merge them. I think making a different data structure to hold low-DF terms would actually be a big boost in RAM efficiency. The RAM-per-unique-term is fairly high... > Per thread DocumentsWriters that write their own private segments > ----------------------------------------------------------------- > > Key: LUCENE-2324 > URL: https://issues.apache.org/jira/browse/LUCENE-2324 > Project: Lucene - Java > Issue Type: Improvement > Components: Index > Reporter: Michael Busch > Assignee: Michael Busch > Priority: Minor > Fix For: Realtime Branch > > Attachments: LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, > LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, > LUCENE-2324.patch, LUCENE-2324.patch, LUCENE-2324.patch, LUCENE-2324.patch, > lucene-2324.patch, lucene-2324.patch, test.out, test.out, test.out, test.out > > > See LUCENE-2293 for motivation and more details. > I'm copying here Mike's summary he posted on 2293: > Change the approach for how we buffer in RAM to a more isolated > approach, whereby IW has N fully independent RAM segments > in-process and when a doc needs to be indexed it's added to one of > them. Each segment would also write its own doc stores and > "normal" segment merging (not the inefficient merge we now do on > flush) would merge them. This should be a good simplification in > the chain (eg maybe we can remove the *PerThread classes). The > segments can flush independently, letting us make much better > concurrent use of IO & CPU. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org