[ https://issues.apache.org/jira/browse/LUCENE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983316#action_12983316 ]
Michael McCandless commented on LUCENE-2324:
--------------------------------------------

The branch is looking very nice!! Very clean :)

Random comments:

Why does DW.anyDeletions need to be sync'd?

Missing headers on at least DocumentsWriterPerThreadPool, ThreadAffinityDWTP.

IWC.setIndexerThreadPool's javadoc is stale.

On ThreadAffinityDWTP... it may be better if we had a single queue, where threads wait in line, if no DWPT is available? And when a DWPT finishes it then notifies any waiting threads? (Ie, instead of queue-per-DWPT).

I see the fieldInfos.update(dwpt.getFieldInfos()) (in DW.updateDocument) -- is there a risk that two threads bring a new field into existence at the same time, but w/ different config? Eg one doc omitsTFAP and the other doesn't? Or, on flush, does each DWPT use its private FieldInfos to correctly flush the segment? (Hmm: do we seed each DWPT w/ the original FieldInfos created by IW on init?).

How are we handling the case of open IW, do delete-by-term but no added docs?

Does DW.pushDeletes really need to sync on IW? BufferedDeletes is sync'd already.

DW.substractFlushedDocs is mis-spelled (not sure it's used though).

In DW.deleteTerms... shouldn't we skip a DWPT if it has no buffered docs?

> Per thread DocumentsWriters that write their own private segments
> -----------------------------------------------------------------
>
>                 Key: LUCENE-2324
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2324
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Michael Busch
>            Assignee: Michael Busch
>            Priority: Minor
>             Fix For: Realtime Branch
>
>         Attachments: LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch,
> LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch,
> LUCENE-2324.patch, LUCENE-2324.patch, LUCENE-2324.patch, lucene-2324.patch,
> lucene-2324.patch, LUCENE-2324.patch, test.out, test.out, test.out, test.out
>
>
> See LUCENE-2293 for motivation and more details.
> I'm copying here Mike's summary he posted on 2293:
>
> Change the approach for how we buffer in RAM to a more isolated
> approach, whereby IW has N fully independent RAM segments
> in-process and when a doc needs to be indexed it's added to one of
> them. Each segment would also write its own doc stores and
> "normal" segment merging (not the inefficient merge we now do on
> flush) would merge them. This should be a good simplification in
> the chain (eg maybe we can remove the *PerThread classes). The
> segments can flush independently, letting us make much better
> concurrent use of IO & CPU.
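
To make the single-queue suggestion from the comment concrete (threads wait in one line when no DWPT is available, and a finishing DWPT notifies a waiter), here is a rough sketch. It is not code from the branch: the class name SingleQueueWriterPool and the methods acquire, release and idleCount are hypothetical, and the type parameter W merely stands in for a DWPT. It assumes a fair ReentrantLock with a single Condition is an acceptable way to model "waiting in line".

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Deque;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/**
 * Hypothetical sketch: one shared waiting line for all indexing threads,
 * instead of one queue per writer. W stands in for a per-thread writer (DWPT).
 */
final class SingleQueueWriterPool<W> {
    // Fair lock: the longest-waiting thread is favored when the lock is handed over.
    private final ReentrantLock lock = new ReentrantLock(true);
    // Signalled whenever a writer is returned to the pool.
    private final Condition writerFree = lock.newCondition();
    // Writers that are currently idle and can be handed out.
    private final Deque<W> idle = new ArrayDeque<>();

    SingleQueueWriterPool(Collection<W> writers) {
        idle.addAll(writers);
    }

    /** Blocks until some writer is idle, then hands it to the caller. */
    W acquire() {
        lock.lock();
        try {
            while (idle.isEmpty()) {
                // Releases the lock while waiting; re-check the condition on wake-up.
                writerFree.awaitUninterruptibly();
            }
            return idle.pollFirst();
        } finally {
            lock.unlock();
        }
    }

    /** Returns a writer to the pool and wakes one waiting thread, if any. */
    void release(W writer) {
        lock.lock();
        try {
            idle.addLast(writer);
            writerFree.signal();
        } finally {
            lock.unlock();
        }
    }

    /** Number of writers currently idle (for monitoring or tests). */
    int idleCount() {
        lock.lock();
        try {
            return idle.size();
        } finally {
            lock.unlock();
        }
    }
}
```

An indexing thread would call acquire(), index its document with the returned writer, and call release() in a finally block. The trade-off against queue-per-DWPT is that a thread no longer sticks to "its" writer, so any affinity benefit is given up in exchange for never waiting behind a busy writer while another one sits idle.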