>From Janhavi Tripurwar <[email protected]>:
Attention is currently required from: Michael Blow, Ritik Raj.
Hello Jenkins, Michael Blow, Ritik Raj,
I'd like you to do a code review.
Please visit
https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/21034?usp=email
to review the following change.
Change subject: [NO ISSUE][HYR] Fix NPE in AbstractIoOperation and improve
cloud interrupt handling
......................................................................
[NO ISSUE][HYR] Fix NPE in AbstractIoOperation and improve cloud interrupt
handling
- user model changes: no
- storage format changes: no
- interface changes: no
Details:
During a distributed job cancellation (e.g., HYR0034),
the Hyracks framework interrupts active threads.
If a thread is interrupted during an LSM bulk load,
it enters an abort sequence that triggers
AbstractIoOperation.cleanup().
Because the interrupt originated externally,
the I/O operation never registered an internal
failure. This resulted in getFailure() returning
null, causing a NullPointerException when
cleanup() attempted to attach suppressed
teardown exceptions. This NPE masked the true
state of the system.
Furthermore, CloudRetryableRequestUtil failed
to recognize the AWS SDK's AbortedException as
a standard interrupt, causing it to violently
rethrow the exception instead of cleanly
suppressing it during cloud teardowns.
Change-Id: If36c1336f8e8377f2d410c8878d168f1209bcf87
Reviewed-on: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/20967
Integration-Tests: Michael Blow <[email protected]>
Tested-by: Jenkins <[email protected]>
Reviewed-by: Michael Blow <[email protected]>
Reviewed-by: Ritik Raj <[email protected]>
---
M
asterixdb/asterix-app/src/test/java/org/apache/asterix/test/dataflow/GlobalVirtualBufferCacheTest.java
M
hyracks-fullstack/hyracks/hyracks-cloud/src/main/java/org/apache/hyracks/cloud/util/CloudRetryableRequestUtil.java
M
hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/impls/AbstractIoOperation.java
M
hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/impls/LSMIndexDiskComponentBulkLoader.java
M
hyracks-fullstack/hyracks/hyracks-storage-common/src/main/java/org/apache/hyracks/storage/common/buffercache/BufferCache.java
5 files changed, 17 insertions(+), 7 deletions(-)
git pull ssh://asterix-gerrit.ics.uci.edu:29418/asterixdb
refs/changes/34/21034/1
diff --git
a/asterixdb/asterix-app/src/test/java/org/apache/asterix/test/dataflow/GlobalVirtualBufferCacheTest.java
b/asterixdb/asterix-app/src/test/java/org/apache/asterix/test/dataflow/GlobalVirtualBufferCacheTest.java
index 1ea6941..a286a33 100644
---
a/asterixdb/asterix-app/src/test/java/org/apache/asterix/test/dataflow/GlobalVirtualBufferCacheTest.java
+++
b/asterixdb/asterix-app/src/test/java/org/apache/asterix/test/dataflow/GlobalVirtualBufferCacheTest.java
@@ -155,7 +155,7 @@
}
if (exceptionRef.get() != null) {
exceptionRef.get().printStackTrace();
- Assert.fail("Exception in insert threads: " +
exceptionRef.get().getMessage());
+ Assert.fail("Exception in insert threads: " +
exceptionRef.get());
}
for (int i = 0; i < NUM_PARTITIONS; i++) {
List<ILSMDiskComponent> diskComponents = new
ArrayList<>(primaryIndexes[i].getDiskComponents());
diff --git
a/hyracks-fullstack/hyracks/hyracks-cloud/src/main/java/org/apache/hyracks/cloud/util/CloudRetryableRequestUtil.java
b/hyracks-fullstack/hyracks/hyracks-cloud/src/main/java/org/apache/hyracks/cloud/util/CloudRetryableRequestUtil.java
index 5461ed5..43fbfc9 100644
---
a/hyracks-fullstack/hyracks/hyracks-cloud/src/main/java/org/apache/hyracks/cloud/util/CloudRetryableRequestUtil.java
+++
b/hyracks-fullstack/hyracks/hyracks-cloud/src/main/java/org/apache/hyracks/cloud/util/CloudRetryableRequestUtil.java
@@ -111,10 +111,14 @@
try {
return doRun(request, retry, RETRY_ALWAYS_PREDICATE);
} catch (Throwable e) {
- // First, clear the interrupted flag
- interrupted |= Thread.interrupted();
- if (ExceptionUtils.causedByInterrupt(e)) {
+ // clear & check any interrupted status
+ boolean currentCycleInterrupted = Thread.interrupted();
+ interrupted |= currentCycleInterrupted;
+ if (ExceptionUtils.causedByInterrupt(e) ||
currentCycleInterrupted) {
interrupted = true;
+ LOGGER.debug(
+ "Suppressing interrupt in run().
causedByInterrupt: {}, currentCycleInterrupted: {}",
+ ExceptionUtils.causedByInterrupt(e),
currentCycleInterrupted);
} else {
// The cause isn't an interruption, rethrow
throw e;
@@ -197,9 +201,12 @@
}
try {
if (Thread.currentThread().isInterrupted() ||
!retryPolicy.retry(e)) {
+ LOGGER.debug("Exiting doRun() loop. isInterrupted: {},
attempt: {}/{}",
+ Thread.currentThread().isInterrupted(),
attempt, NUMBER_OF_RETRIES);
throw HyracksDataException.create(e);
}
} catch (InterruptedException interruptedEx) {
+ LOGGER.debug("Thread interrupted while waiting for retry
policy backoff.");
throw HyracksDataException.create(interruptedEx);
}
attempt++;
diff --git
a/hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/impls/AbstractIoOperation.java
b/hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/impls/AbstractIoOperation.java
index 8ae0571..7176e22 100644
---
a/hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/impls/AbstractIoOperation.java
+++
b/hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/impls/AbstractIoOperation.java
@@ -96,7 +96,7 @@
bufferCache.deleteFile(file);
}
} catch (HyracksDataException hde) {
- getFailure().addSuppressed(hde);
+ ExceptionUtils.suppress(failure, hde);
}
}
}
diff --git
a/hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/impls/LSMIndexDiskComponentBulkLoader.java
b/hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/impls/LSMIndexDiskComponentBulkLoader.java
index 841d389..3d73289 100644
---
a/hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/impls/LSMIndexDiskComponentBulkLoader.java
+++
b/hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/impls/LSMIndexDiskComponentBulkLoader.java
@@ -84,9 +84,9 @@
@Override
public void abort() throws HyracksDataException {
- opCtx.getIoOperation().setStatus(LSMIOOperationStatus.FAILURE);
- fail(null);
try {
+ opCtx.getIoOperation().setStatus(LSMIOOperationStatus.FAILURE);
+ fail(null);
try {
componentBulkLoader.abort();
} finally {
diff --git
a/hyracks-fullstack/hyracks/hyracks-storage-common/src/main/java/org/apache/hyracks/storage/common/buffercache/BufferCache.java
b/hyracks-fullstack/hyracks/hyracks-storage-common/src/main/java/org/apache/hyracks/storage/common/buffercache/BufferCache.java
index 8169fb9..40020a6 100644
---
a/hyracks-fullstack/hyracks/hyracks-storage-common/src/main/java/org/apache/hyracks/storage/common/buffercache/BufferCache.java
+++
b/hyracks-fullstack/hyracks/hyracks-storage-common/src/main/java/org/apache/hyracks/storage/common/buffercache/BufferCache.java
@@ -1011,6 +1011,9 @@
sweepAndFlush(fInfo, false);
try {
if (fInfo.getReferenceCount() > 0) {
+ LOGGER.info(
+ "Resource leak detected: Force-deleting file with
active references. fileId: {}, referenceCount: {}",
+ fileId, fInfo.getReferenceCount());
throw new HyracksDataException("Deleting open file");
}
} finally {
--
To view, visit https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/21034?usp=email
To unsubscribe, or for help writing mail filters, visit
https://asterix-gerrit.ics.uci.edu/settings?usp=email
Gerrit-MessageType: newchange
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Change-Id: If36c1336f8e8377f2d410c8878d168f1209bcf87
Gerrit-Change-Number: 21034
Gerrit-PatchSet: 1
Gerrit-Owner: Janhavi Tripurwar <[email protected]>
Gerrit-Reviewer: Jenkins <[email protected]>
Gerrit-Reviewer: Michael Blow <[email protected]>
Gerrit-Reviewer: Ritik Raj <[email protected]>
Gerrit-Attention: Michael Blow <[email protected]>
Gerrit-Attention: Ritik Raj <[email protected]>