[GitHub] [lucene-solr] mikemccand commented on a change in pull request #1361: LUCENE-8118: Throw exception if DWPT grows beyond it's maximum ram limit

GitBox Sat, 28 Mar 2020 10:44:52 -0700

mikemccand commented on a change in pull request #1361: LUCENE-8118: Throw 
exception if DWPT grows beyond it's maximum ram limit
URL: https://github.com/apache/lucene-solr/pull/1361#discussion_r399689324


 ##########
 File path: lucene/core/src/java/org/apache/lucene/index/DocumentsWriter.java
 ##########
 @@ -435,55 +436,81 @@ private void ensureInitialized(ThreadState state) throws 
IOException {
   long updateDocuments(final Iterable<? extends Iterable<? extends 
IndexableField>> docs, final Analyzer analyzer,
                        final DocumentsWriterDeleteQueue.Node<?> delNode) 
throws IOException {
     boolean hasEvents = preUpdate();
-
     final ThreadState perThread = flushControl.obtainAndLock();
     DocumentsWriterPerThread flushingDWPT = null;
+    final boolean isUpdate = delNode != null && delNode.isDelete();
+    final int numDocsBefore = perThread.dwpt == null ? 0 : 
perThread.dwpt.getNumDocsInRAM();
     final long seqNo;
     try {
-      try {
-        // This must happen after we've pulled the ThreadState because IW.close
-        // waits for all ThreadStates to be released:
-        ensureOpen();
-        ensureInitialized(perThread);
-        assert perThread.isInitialized();
-        final DocumentsWriterPerThread dwpt = perThread.dwpt;
-        final int dwptNumDocs = dwpt.getNumDocsInRAM();
-        try {
-          seqNo = dwpt.updateDocuments(docs, analyzer, delNode, 
flushNotifications);
-          perThread.updateLastSeqNo(seqNo);
-        } finally {
-          // We don't know how many documents were actually
-          // counted as indexed, so we must subtract here to
-          // accumulate our separate counter:
-          numDocsInRAM.addAndGet(dwpt.getNumDocsInRAM() - dwptNumDocs);
-          if (dwpt.isAborted()) {
-            flushControl.doOnAbort(perThread);
-          } else if (dwpt.getNumDocsInRAM() > 0) {
-            // we need to check if we have at least one doc in the DWPT. This 
can be 0 if we fail
-            // due to exceeding total number of docs etc.
-            final boolean isUpdate = delNode != null && delNode.isDelete();
+      innerUpdateDocuments(perThread, docs, analyzer, delNode);
+      flushingDWPT = flushControl.doAfterDocument(perThread, isUpdate);
+    } catch (MaxBufferSizeExceededException ex) {
+      if (perThread.dwpt.isAborted()) {
+        throw ex;
+      } else {
+        // we hit an exception but still need to flush this DWPT
+        // let's run postUpdate to make sure we flush stuff to disk
+        // in the case we exceed ram limits etc.
+        hasEvents = doAfterDocumentRejected(perThread, isUpdate, hasEvents);
+        // we retry if we the DWPT had more than one document indexed and was 
flushed
+        boolean shouldRetry = perThread.dwpt == null && numDocsBefore > 0;
+        if (shouldRetry) {
+          try {
+            // we retry into a brand new DWPT, if it doesn't fit in here we 
can't index the document
 
 Review comment:
   Oh, I see: we create a new DWPT and send the doc there, ok.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[GitHub] [lucene-solr] mikemccand commented on a change in pull request #1361: LUCENE-8118: Throw exception if DWPT grows beyond it's maximum ram limit

Reply via email to