----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/60154/#review178121 -----------------------------------------------------------
lucene/core/src/java/org/apache/lucene/index/BinaryDocValuesFieldUpdates.java Lines 126 (patched) <https://reviews.apache.org/r/60154/#comment251933> can you add a message to this assertion, if it trips we really wanna see what the values are? lucene/core/src/java/org/apache/lucene/index/BinaryDocValuesFieldUpdates.java Lines 210 (patched) <https://reviews.apache.org/r/60154/#comment251934> oh this is good!! lucene/core/src/java/org/apache/lucene/index/BufferedUpdatesStream.java Lines 168 (patched) <https://reviews.apache.org/r/60154/#comment251935> it took me a while to figure out what you are doing here. I wonder if it makes sense to extract this in an inner class like: ```Java static class FinishedSegments { private int completedDelGen; private final Set<Long> finishedDelGens = new HashSet<>(); // do your thing here... } ``` I think that way you can also sync on this instance instead of using the synchronized on the BufferedUpdatesStream instance (nice sideeffect) and it's clear that the two members belong together. I also would appreciate if you could leave a comment why we need the hashset since we need to deal with potential holes in the finished segments history and that's why we need to wait until they are consecutive in order to be deleted. lucene/core/src/java/org/apache/lucene/index/BufferedUpdatesStream.java Line 202 (original), 243 (patched) <https://reviews.apache.org/r/60154/#comment251936> use `waitFor.isEmpty()` instead? lucene/core/src/java/org/apache/lucene/index/BufferedUpdatesStream.java Lines 276 (patched) <https://reviews.apache.org/r/60154/#comment251937> is it possible that an indexing thread is currently handling this? in that case we would do it twice? I'd love if you could add some in-line comment about that lucene/core/src/java/org/apache/lucene/index/BufferedUpdatesStream.java Line 225 (original), 286 (patched) <https://reviews.apache.org/r/60154/#comment251938> do we really need this inner method, can't we inline this into `applyPacket` lucene/core/src/java/org/apache/lucene/index/BufferedUpdatesStream.java Lines 342 (patched) <https://reviews.apache.org/r/60154/#comment251939> this is pretty scary if you do that without a try / finally block. if we fail before we can decrement the refs we might leak the file references? it's pretty unclear to me where it will be released and I wonder if we sould use some kind of datastructure where we can record all the files and make sure that if we exit this loop we release them? ie. have a single place where we release, like a list we add to? lucene/core/src/java/org/apache/lucene/index/BufferedUpdatesStream.java Lines 348 (patched) <https://reviews.apache.org/r/60154/#comment251940> this loop is pretty scary while I get what you are doing. maybe it would be simpler if we can break out part of it into seperate methods? the size of the loop also makes it hard to reason about if we leak any resources lucene/core/src/java/org/apache/lucene/index/BufferedUpdatesStream.java Line 295 (original), 377 (patched) <https://reviews.apache.org/r/60154/#comment251941> this entire finally block should be a seperate method lucene/core/src/java/org/apache/lucene/index/DocumentsWriter.java Lines 720 (patched) <https://reviews.apache.org/r/60154/#comment251943> `catch(Throwable t)` really :) I know IW#tragicEvent will sort it out for us lucene/core/src/java/org/apache/lucene/index/DocumentsWriterFlushControl.java Line 442 (original), 436 (patched) <https://reviews.apache.org/r/60154/#comment251944> oh that is great! lucene/core/src/java/org/apache/lucene/index/DocumentsWriterPerThreadPool.java Line 213 (original), 217 (patched) <https://reviews.apache.org/r/60154/#comment251945> is this a leftover? lucene/core/src/java/org/apache/lucene/index/FrozenBufferedUpdates.java Lines 246 (patched) <https://reviews.apache.org/r/60154/#comment251948> this really should have a dedicated untitest if possible lucene/core/src/java/org/apache/lucene/index/FrozenBufferedUpdates.java Lines 449 (patched) <https://reviews.apache.org/r/60154/#comment251947> I'd love to see dedicated unittests for these methods here if there aren't any. I mean it's private so maybe we should add some? It's pretty intense code here lucene/core/src/java/org/apache/lucene/index/FrozenBufferedUpdates.java Lines 590 (patched) <https://reviews.apache.org/r/60154/#comment251946> maybe: ```Java int docID; while ((docID = postingsEnum.nextDoc()) != NO_MORE_DOCS) { } ``` ? lucene/core/src/java/org/apache/lucene/index/IndexWriter.java Lines 593 (patched) <https://reviews.apache.org/r/60154/#comment251951> an assertion message would be good here no? lucene/core/src/java/org/apache/lucene/index/IndexWriter.java Lines 2229 (patched) <https://reviews.apache.org/r/60154/#comment251950> can you please use a real lock for this and not an atomicboolean... you can still do `if (lock.tryLock())` there but it has clear semantics lucene/core/src/java/org/apache/lucene/index/SegmentCoreReaders.java Line 95 (original), 95 (patched) <https://reviews.apache.org/r/60154/#comment251953> unrelated debugging leftover, can you remove? lucene/core/src/java/org/apache/lucene/index/SegmentCoreReaders.java Line 167 (original), 167 (patched) <https://reviews.apache.org/r/60154/#comment251954> unrelated debugging leftover, can you remove? lucene/core/src/java/org/apache/lucene/index/SegmentReader.java Lines 165 (patched) <https://reviews.apache.org/r/60154/#comment251952> is this a bug? should this be fixed seperately? - Simon Willnauer On June 16, 2017, 1:20 p.m., Mike McCandless wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/60154/ > ----------------------------------------------------------- > > (Updated June 16, 2017, 1:20 p.m.) > > > Review request for lucene. > > > Repository: lucene-solr > > > Description > ------- > > Today Lucene uses a single thread to resolve buffered delete/update terms to > actual docIDs, but this is a costly process. This change uses multiple > threads (the incoming indexing threads) to resolve terms concurrently. > > Jira issue: https://issues.apache.org/jira/browse/LUCENE-7868 > > > Diffs > ----- > > > lucene/core/src/java/org/apache/lucene/index/BinaryDocValuesFieldUpdates.java > f8cece9 > lucene/core/src/java/org/apache/lucene/index/BufferedUpdates.java 1c3494f > lucene/core/src/java/org/apache/lucene/index/BufferedUpdatesStream.java > 9955626 > lucene/core/src/java/org/apache/lucene/index/CoalescedUpdates.java bf92ac1 > lucene/core/src/java/org/apache/lucene/index/DocValuesFieldUpdates.java > 528d4bf > lucene/core/src/java/org/apache/lucene/index/DocValuesUpdate.java 1c85f33 > lucene/core/src/java/org/apache/lucene/index/DocumentsWriter.java 2807517 > > lucene/core/src/java/org/apache/lucene/index/DocumentsWriterDeleteQueue.java > db0e571 > > lucene/core/src/java/org/apache/lucene/index/DocumentsWriterFlushControl.java > a5b4b7c > lucene/core/src/java/org/apache/lucene/index/DocumentsWriterFlushQueue.java > 2c62487 > lucene/core/src/java/org/apache/lucene/index/DocumentsWriterPerThread.java > c929ba2 > > lucene/core/src/java/org/apache/lucene/index/DocumentsWriterPerThreadPool.java > cc72342 > lucene/core/src/java/org/apache/lucene/index/FlushByRamOrCountsPolicy.java > a85c98b > lucene/core/src/java/org/apache/lucene/index/FlushPolicy.java e70959f > lucene/core/src/java/org/apache/lucene/index/FreqProxTermsWriter.java > 1ca2830 > lucene/core/src/java/org/apache/lucene/index/FrozenBufferedUpdates.java > 4f482ad > lucene/core/src/java/org/apache/lucene/index/IndexFileDeleter.java f7f196d > lucene/core/src/java/org/apache/lucene/index/IndexWriter.java 14fbbae > lucene/core/src/java/org/apache/lucene/index/IndexWriterConfig.java 0fdbc3e > lucene/core/src/java/org/apache/lucene/index/LiveIndexWriterConfig.java > d9e1bc7 > > lucene/core/src/java/org/apache/lucene/index/MergedPrefixCodedTermsIterator.java > cd14eec > > lucene/core/src/java/org/apache/lucene/index/NumericDocValuesFieldUpdates.java > 4dd3cd0 > lucene/core/src/java/org/apache/lucene/index/PrefixCodedTerms.java df1653b > lucene/core/src/java/org/apache/lucene/index/ReadersAndUpdates.java d4dd4a4 > lucene/core/src/java/org/apache/lucene/index/SegmentCommitInfo.java b1084a6 > lucene/core/src/java/org/apache/lucene/index/SegmentCoreReaders.java > ce2d448 > lucene/core/src/java/org/apache/lucene/index/SegmentInfo.java 1c02441 > lucene/core/src/java/org/apache/lucene/index/SegmentReader.java c8235d5 > lucene/core/src/java/org/apache/lucene/index/SerialMergeScheduler.java > 5a8f98b > lucene/core/src/java/org/apache/lucene/index/TieredMergePolicy.java 668f1ec > > lucene/core/src/java/org/apache/lucene/util/packed/AbstractPagedMutable.java > c5fac1e > > lucene/core/src/test/org/apache/lucene/index/TestBinaryDocValuesUpdates.java > ed2b66f > > lucene/core/src/test/org/apache/lucene/index/TestDocumentsWriterDeleteQueue.java > c60f54d > > lucene/core/src/test/org/apache/lucene/index/TestFlushByRamOrCountsPolicy.java > aa2901c > lucene/core/src/test/org/apache/lucene/index/TestForceMergeForever.java > 0379395 > lucene/core/src/test/org/apache/lucene/index/TestIndexWriter.java 6897f06 > lucene/core/src/test/org/apache/lucene/index/TestIndexWriterConfig.java > 2014c16 > lucene/core/src/test/org/apache/lucene/index/TestIndexWriterDelete.java > 25817d9 > lucene/core/src/test/org/apache/lucene/index/TestIndexWriterExceptions.java > c0907a5 > lucene/core/src/test/org/apache/lucene/index/TestIndexWriterReader.java > 584e03c > lucene/core/src/test/org/apache/lucene/index/TestNRTReaderWithThreads.java > 871715f > > lucene/core/src/test/org/apache/lucene/index/TestNumericDocValuesUpdates.java > 94da587 > lucene/core/src/test/org/apache/lucene/index/TestPerSegmentDeletes.java > 112a108 > lucene/core/src/test/org/apache/lucene/index/TestPrefixCodedTerms.java > 89d4ad1 > > lucene/core/src/test/org/apache/lucene/search/TestControlledRealTimeReopenThread.java > a1b2a5c > lucene/join/src/test/org/apache/lucene/search/join/TestJoinUtil.java > 49dd333 > > lucene/sandbox/src/java/org/apache/lucene/codecs/idversion/IDVersionPostingsWriter.java > 334f784 > > lucene/sandbox/src/java/org/apache/lucene/codecs/idversion/VersionBlockTreeTermsWriter.java > d83b915 > > lucene/test-framework/src/java/org/apache/lucene/index/BaseDocValuesFormatTestCase.java > 8cb6665 > > lucene/test-framework/src/java/org/apache/lucene/index/BaseIndexFileFormatTestCase.java > 959466a > lucene/test-framework/src/java/org/apache/lucene/util/LuceneTestCase.java > 0243a56 > > > Diff: https://reviews.apache.org/r/60154/diff/1/ > > > Testing > ------- > > I ran performance tests on my internal corpus; details described here: > https://issues.apache.org/jira/browse/LUCENE-7868?focusedCommentId=16050729&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16050729 > > I also uncovered a pre-existing bug if you use DV updates with index sorting > and update recently indexed documents; I'll open a separate issue to fix that > on 6.x. > > > Thanks, > > Mike McCandless > >
