[ https://issues.apache.org/jira/browse/LUCENE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12985148#action_12985148 ]
Jason Rutherglen commented on LUCENE-2324:
------------------------------------------

bq. the cost of replaying the log, assuming the log is "smallish"

This is recording and replaying the doc-ids? How/when does a previous BV become 'free' to be used by the next reader? What if they're open at the same time? And if it's a previous previous reader that's been closed, won't that be quite a few docids to save? Eg, a delete-by-query has removed thousands of docs, I guess we'd use System.arraycopy then. The most usual case is updateDocument with [N]RT, which'd generate few doc-ids.

bq. System.arraycopy, while fast, is still O(N)

Right, the larger segments will really adversely affect performance, as they do today, however the indexing is so much slower with NRT + clone that it's not noticeable.

bq. Using RT/NRT shouldn't slow down searching

Right! The cost needs to be in the indexing and/or reopen threads.

> Per thread DocumentsWriters that write their own private segments
> -----------------------------------------------------------------
>
>                 Key: LUCENE-2324
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2324
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Michael Busch
>            Assignee: Michael Busch
>            Priority: Minor
>             Fix For: Realtime Branch
>
>         Attachments: LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch,
> LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch,
> LUCENE-2324.patch, LUCENE-2324.patch, LUCENE-2324.patch, lucene-2324.patch,
> lucene-2324.patch, LUCENE-2324.patch, test.out, test.out, test.out, test.out
>
>
> See LUCENE-2293 for motivation and more details.
> I'm copying here Mike's summary he posted on 2293:
> Change the approach for how we buffer in RAM to a more isolated
> approach, whereby IW has N fully independent RAM segments
> in-process and when a doc needs to be indexed it's added to one of
> them. Each segment would also write its own doc stores and
> "normal" segment merging (not the inefficient merge we now do on
> flush) would merge them. This should be a good simplification in
> the chain (eg maybe we can remove the *PerThread classes). The
> segments can flush independently, letting us make much better
> concurrent use of IO & CPU.
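
For reference, here is a rough sketch of the trade-off raised in the comment: cloning the deleted-docs bit vector with System.arraycopy versus reusing a vector freed by a closed reader and replaying the doc-ids it missed. Everything in it is an assumption made for illustration: the Bits class, the logPosition field and both method names are hypothetical and are not Lucene's BitVector or anything from the attached patches.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class DeletesSketch {

    /** Hypothetical deleted-docs bit vector; a stand-in, not Lucene's BitVector. */
    static final class Bits {
        final long[] words;
        int logPosition; // number of log entries already applied to this vector

        Bits(int maxDoc) {
            words = new long[(maxDoc + 63) >>> 6];
        }

        void set(int docId) {
            words[docId >>> 6] |= 1L << (docId & 63);
        }

        boolean get(int docId) {
            return (words[docId >>> 6] & (1L << (docId & 63))) != 0;
        }
    }

    /** Clone: copies every word, so the cost is O(maxDoc) even for a single new delete. */
    static Bits cloneAndCatchUp(Bits current, List<Integer> log) {
        Bits next = new Bits(current.words.length << 6);
        System.arraycopy(current.words, 0, next.words, 0, current.words.length);
        next.logPosition = current.logPosition;
        return replay(next, log);
    }

    /**
     * Replay: mutates a stale vector in place, so it is only safe once the
     * reader that owned it is closed. Cost is O(log entries it missed).
     */
    static Bits replay(Bits stale, List<Integer> log) {
        for (int i = stale.logPosition; i < log.size(); i++) {
            stale.set(log.get(i));
        }
        stale.logPosition = log.size();
        return stale;
    }

    public static void main(String[] args) {
        List<Integer> log = new ArrayList<>();          // deleted doc-ids, in order
        Bits readerA = replay(new Bits(128), log);      // no deletes yet

        log.add(3);
        log.add(70);
        Bits readerB = cloneAndCatchUp(readerA, log);   // A is still open, so clone

        log.add(100);
        // Assume reader A is now closed: its vector is free but missed 3 entries.
        Bits readerC = replay(readerA, log);
        Bits viaClone = cloneAndCatchUp(readerB, log);

        // Both strategies should end up with the same deleted docs.
        System.out.println(Arrays.equals(readerC.words, viaClone.words));
    }
}
```

The sketch also shows the "previous previous reader" concern: the older the freed vector, the more log entries have to be retained and replayed, so after something like a large delete-by-query the clone may well be the cheaper path.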