> On June 19, 2017, 10:16 a.m., Simon Willnauer wrote: > > lucene/core/src/java/org/apache/lucene/index/FrozenBufferedUpdates.java > > Lines 332 (patched) > > <https://reviews.apache.org/r/60154/diff/2/?file=1752832#file1752832line345> > > > > can we already assign this here? it's going to be counted more than > > once when we retry no? I am not sure I understand why we add this up here > > maybe it's a leftover?
Good question! It smells funny, but it's actually safe; I'll add comments clarifying why. > On June 19, 2017, 10:16 a.m., Simon Willnauer wrote: > > lucene/core/src/java/org/apache/lucene/index/FrozenBufferedUpdates.java > > Lines 373 (patched) > > <https://reviews.apache.org/r/60154/diff/2/?file=1752832#file1752832line386> > > > > if we really want to log the sum of the del docs we should also log how > > many iterations we took? I do log that (add to message, right below this) if it's more than one. > On June 19, 2017, 10:16 a.m., Simon Willnauer wrote: > > lucene/core/src/java/org/apache/lucene/index/IndexWriter.java > > Lines 2236 (patched) > > <https://reviews.apache.org/r/60154/diff/2/?file=1752834#file1752834line2257> > > > > so lets say we index heavily and we have a heavy flushing party going > > on but we are kind of skipping most of the maybeMerge calls since we only > > wanna do this once and don't block etc. we might miss a merge when the last > > thread comes in here but skips since it can't acquire the lock. I think to > > be absolutely correct here we also need to retry if there is a thread that > > couldn't lock the `maybeMergeLock` otherwise we might miss to kick off a > > merge? I mean in practice this might not be a massive issue but testing > > will fail at some point here. > > > > We can piggyback on the `AtomicBoolean maybeMerge` and set it to `true` > > if we can't acquire the lock here. if the value is still true after we > > called `mergeScheduler.merge(this, trigger, newMergesFound);` we retry but > > we should release the lock first. something like this: > > > > ```Java > > do { > > maybeMerge.set(true); // the thread checking will retry > > if (maybeMergeLock.tryLock()) { > > if (maybeMerge.compareAndSet(true, false)) { // only try if we have > > to... some other thread might have been faster? > > maybeMerge.set(false); // we are checking now if there is another > > trigger coming it will be true again > > try { > > boolean newMergesFound = updatePendingMerges(mergePolicy, > > trigger, maxNumSegments); > > mergeScheduler.merge(this, trigger, newMergesFound); > > } finally { > > maybeMergeLock.unlock(); > > } > > } > > } > > } while (maybeMerge.get()); // retry until there is no pending merge > > check > > ``` Hmm, I tried this, but it resulting in infinite loops for some tests; then I tried to improve it, but still infinite loops. Then, I simply reverted it (back to trunk) and it works again! And the reason I had made this change was in fact invalid: it was due to a separate bug (now fixed) that was causing way too much segment flushing, so I think it's best to just leave it as it was :) > On June 19, 2017, 10:16 a.m., Simon Willnauer wrote: > > lucene/core/src/java/org/apache/lucene/index/IndexWriter.java > > Line 3110 (original), 3289 (patched) > > <https://reviews.apache.org/r/60154/diff/2/?file=1752834#file1752834line3316> > > > > man this `boolean[]` is/was scary :) YES!! > On June 19, 2017, 10:16 a.m., Simon Willnauer wrote: > > lucene/core/src/java/org/apache/lucene/index/IndexWriter.java > > Lines 3627 (patched) > > <https://reviews.apache.org/r/60154/diff/2/?file=1752834#file1752834line3689> > > > > should we assert that we don't hold the IW lock here.. afaik it could > > be an issue and deadlock? Good catch, I'll add. > On June 19, 2017, 10:16 a.m., Simon Willnauer wrote: > > lucene/core/src/java/org/apache/lucene/index/IndexWriterConfig.java > > Line 95 (original), 95 (patched) > > <https://reviews.apache.org/r/60154/diff/2/?file=1752835#file1752835line95> > > > > it would be good for me and the future readers to have an inline > > comment here when we changed this to true and why. Changes like this are > > not obvious and we really need to prepare for others to catch up with this. > > We already have a crazy bus-factor on IW. Amen! I'll add a comment. > On June 19, 2017, 10:16 a.m., Simon Willnauer wrote: > > lucene/core/src/java/org/apache/lucene/index/ReadersAndUpdates.java > > Line 261 (original) > > <https://reviews.apache.org/r/60154/diff/2/?file=1752840#file1752840line329> > > > > is this assertino on the IW lock not necessary anymore? I recall you > > saying you didn't parallelize the file writing yet!? Good catch, I'll put it back. And yes, unfortunately, no concurrent delete/update file writes yet. - Mike ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/60154/#review178210 ----------------------------------------------------------- On June 18, 2017, 1:22 p.m., Mike McCandless wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/60154/ > ----------------------------------------------------------- > > (Updated June 18, 2017, 1:22 p.m.) > > > Review request for lucene. > > > Repository: lucene-solr > > > Description > ------- > > Today Lucene uses a single thread to resolve buffered delete/update terms to > actual docIDs, but this is a costly process. This change uses multiple > threads (the incoming indexing threads) to resolve terms concurrently. > > Jira issue: https://issues.apache.org/jira/browse/LUCENE-7868 > > > Diffs > ----- > > > lucene/core/src/java/org/apache/lucene/index/BinaryDocValuesFieldUpdates.java > f8cece9 > lucene/core/src/java/org/apache/lucene/index/BufferedUpdates.java 1c3494f > lucene/core/src/java/org/apache/lucene/index/BufferedUpdatesStream.java > 9955626 > lucene/core/src/java/org/apache/lucene/index/CoalescedUpdates.java bf92ac1 > lucene/core/src/java/org/apache/lucene/index/DocValuesFieldUpdates.java > 528d4bf > lucene/core/src/java/org/apache/lucene/index/DocValuesUpdate.java 1c85f33 > lucene/core/src/java/org/apache/lucene/index/DocumentsWriter.java 2807517 > > lucene/core/src/java/org/apache/lucene/index/DocumentsWriterDeleteQueue.java > db0e571 > > lucene/core/src/java/org/apache/lucene/index/DocumentsWriterFlushControl.java > a5b4b7c > lucene/core/src/java/org/apache/lucene/index/DocumentsWriterFlushQueue.java > 2c62487 > lucene/core/src/java/org/apache/lucene/index/DocumentsWriterPerThread.java > c929ba2 > > lucene/core/src/java/org/apache/lucene/index/DocumentsWriterPerThreadPool.java > cc72342 > lucene/core/src/java/org/apache/lucene/index/FlushByRamOrCountsPolicy.java > a85c98b > lucene/core/src/java/org/apache/lucene/index/FlushPolicy.java e70959f > lucene/core/src/java/org/apache/lucene/index/FreqProxTermsWriter.java > 1ca2830 > lucene/core/src/java/org/apache/lucene/index/FrozenBufferedUpdates.java > 4f482ad > lucene/core/src/java/org/apache/lucene/index/IndexFileDeleter.java f7f196d > lucene/core/src/java/org/apache/lucene/index/IndexWriter.java 14fbbae > lucene/core/src/java/org/apache/lucene/index/IndexWriterConfig.java 0fdbc3e > lucene/core/src/java/org/apache/lucene/index/LiveIndexWriterConfig.java > d9e1bc7 > > lucene/core/src/java/org/apache/lucene/index/MergedPrefixCodedTermsIterator.java > cd14eec > > lucene/core/src/java/org/apache/lucene/index/NumericDocValuesFieldUpdates.java > 4dd3cd0 > lucene/core/src/java/org/apache/lucene/index/PrefixCodedTerms.java df1653b > lucene/core/src/java/org/apache/lucene/index/ReadersAndUpdates.java d4dd4a4 > lucene/core/src/java/org/apache/lucene/index/SegmentCommitInfo.java b1084a6 > lucene/core/src/java/org/apache/lucene/index/SegmentCoreReaders.java > ce2d448 > lucene/core/src/java/org/apache/lucene/index/SegmentInfo.java 1c02441 > lucene/core/src/java/org/apache/lucene/index/SegmentReader.java c8235d5 > lucene/core/src/java/org/apache/lucene/index/SerialMergeScheduler.java > 5a8f98b > lucene/core/src/java/org/apache/lucene/index/TieredMergePolicy.java 668f1ec > > lucene/core/src/java/org/apache/lucene/util/packed/AbstractPagedMutable.java > c5fac1e > > lucene/core/src/test/org/apache/lucene/index/TestBinaryDocValuesUpdates.java > ed2b66f > > lucene/core/src/test/org/apache/lucene/index/TestDocumentsWriterDeleteQueue.java > c60f54d > > lucene/core/src/test/org/apache/lucene/index/TestFlushByRamOrCountsPolicy.java > aa2901c > lucene/core/src/test/org/apache/lucene/index/TestForceMergeForever.java > 0379395 > lucene/core/src/test/org/apache/lucene/index/TestIndexWriter.java 6897f06 > lucene/core/src/test/org/apache/lucene/index/TestIndexWriterConfig.java > 2014c16 > lucene/core/src/test/org/apache/lucene/index/TestIndexWriterDelete.java > 25817d9 > lucene/core/src/test/org/apache/lucene/index/TestIndexWriterExceptions.java > c0907a5 > lucene/core/src/test/org/apache/lucene/index/TestIndexWriterReader.java > 584e03c > lucene/core/src/test/org/apache/lucene/index/TestNRTReaderWithThreads.java > 871715f > > lucene/core/src/test/org/apache/lucene/index/TestNumericDocValuesUpdates.java > 94da587 > lucene/core/src/test/org/apache/lucene/index/TestPerSegmentDeletes.java > 112a108 > lucene/core/src/test/org/apache/lucene/index/TestPrefixCodedTerms.java > 89d4ad1 > > lucene/core/src/test/org/apache/lucene/search/TestControlledRealTimeReopenThread.java > a1b2a5c > lucene/join/src/test/org/apache/lucene/search/join/TestJoinUtil.java > 49dd333 > > lucene/sandbox/src/java/org/apache/lucene/codecs/idversion/IDVersionPostingsWriter.java > 334f784 > > lucene/sandbox/src/java/org/apache/lucene/codecs/idversion/VersionBlockTreeTermsWriter.java > d83b915 > > lucene/test-framework/src/java/org/apache/lucene/index/BaseDocValuesFormatTestCase.java > 8cb6665 > > lucene/test-framework/src/java/org/apache/lucene/index/BaseIndexFileFormatTestCase.java > 959466a > lucene/test-framework/src/java/org/apache/lucene/util/LuceneTestCase.java > 0243a56 > > > Diff: https://reviews.apache.org/r/60154/diff/2/ > > > Testing > ------- > > I ran performance tests on my internal corpus; details described here: > https://issues.apache.org/jira/browse/LUCENE-7868?focusedCommentId=16050729&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16050729 > > I also uncovered a pre-existing bug if you use DV updates with index sorting > and update recently indexed documents; I'll open a separate issue to fix that > on 6.x. > > > Thanks, > > Mike McCandless > >