[ https://issues.apache.org/jira/browse/HADOOP-19567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17951228#comment-17951228 ]
Steve Loughran commented on HADOOP-19567: ----------------------------------------- Full log. me trying to do minimal bandwidth test and calling control-c mid download. I don't mind the noise as much as the way the cancellation exception was escalated to become the failure cause. {code} 2025-05-13 17:09:16,705 [main] INFO util.ExitUtil (ExitUtil.java:terminate(241)) - Exiting with status -1: java.util.concurrent.CancellationException {code} Proposed: service stop code in {{AnalyticsStreamFactory}} to catch and swallow those. {code} > bin/hadoop jar $CLOUDSTORE bandwidth -csv out.csv 1M s3a://stevel-usw2/bw Bandwidth test against s3a://stevel-usw2/bw with data size 1m ============================================================= Block size 1 MB Saving statistics as CSV data to out.csv Using filesystem s3a://stevel-usw2 Upload size in Megabytes 1 MB Writing data as 1 blocks each of size 1,048,576 bytes Starting: Opening s3a://stevel-usw2/bw for upload Duration of Opening s3a://stevel-usw2/bw for upload: 0:00:00.018 Write block 0 in 0.015 seconds Starting: upload stream close() Duration of upload stream close(): 0:00:03.793 Progress callbacks 4; in close 4 Download s3a://stevel-usw2/bw ============================= Starting: open s3a://stevel-usw2/bw ^C2025-05-13 17:09:16,700 [shutdown-hook-0] INFO s3a.S3AFileSystem (S3AFileSystem.java:processDeleteOnExit(4331)) - Ignoring failure to deleteOnExit for path s3a://stevel-usw2/bw 2025-05-13 17:09:16,702 [s3a-transfer-stevel-usw2-unbounded-pool2-t1] ERROR s3.telemetry (LogHelper.java:logAtLevel(38)) - [2025-05-13T17:09:15.996Z] [failure] [6lu5cof10v8ch] metadata.store.head.join(thread_id=30, uri=s3://stevel-usw2/bw): 705,819,458 ns [java.util.concurrent.CancellationException: 'null'] java.util.concurrent.CancellationException at java.util.concurrent.CompletableFuture.cancel(CompletableFuture.java:2276) at software.amazon.s3.analyticsaccelerator.io.physical.data.MetadataStore.safeCancel(MetadataStore.java:147) at java.util.LinkedHashMap$LinkedValues.forEach(LinkedHashMap.java:608) at java.util.Collections$SynchronizedCollection.forEach(Collections.java:2064) at software.amazon.s3.analyticsaccelerator.io.physical.data.MetadataStore.close(MetadataStore.java:157) at software.amazon.s3.analyticsaccelerator.S3SeekableInputStreamFactory.close(S3SeekableInputStreamFactory.java:162) at org.apache.hadoop.util.functional.LazyAutoCloseableReference.close(LazyAutoCloseableReference.java:82) at org.apache.hadoop.fs.s3a.impl.streams.AnalyticsStreamFactory.serviceStop(AnalyticsStreamFactory.java:101) at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:53) at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:102) at org.apache.hadoop.service.CompositeService.stop(CompositeService.java:160) at org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:134) at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) at org.apache.hadoop.service.AbstractService.close(AbstractService.java:248) at org.apache.hadoop.fs.s3a.S3AUtils.closeAutocloseables(S3AUtils.java:1567) at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$stopAllServices$24(S3AFileSystem.java:4375) at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547) at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528) at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449) at org.apache.hadoop.fs.s3a.S3AFileSystem.stopAllServices(S3AFileSystem.java:4374) at org.apache.hadoop.fs.s3a.S3AFileSystem.close(S3AFileSystem.java:4354) at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:3820) at org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer.run(FileSystem.java:3837) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:750) 2025-05-13 17:09:16,702 [shutdown-hook-0] ERROR s3.telemetry (LogHelper.java:logAtLevel(38)) - [2025-05-13T17:09:15.995Z] [failure] [272un7pgiak2a] metadata.store.head.async(thread_id=30, uri=s3://stevel-usw2/bw): 706,204,250 ns [java.util.concurrent.CancellationException: 'null'] java.util.concurrent.CancellationException at java.util.concurrent.CompletableFuture.cancel(CompletableFuture.java:2276) at software.amazon.s3.analyticsaccelerator.io.physical.data.MetadataStore.safeCancel(MetadataStore.java:147) at java.util.LinkedHashMap$LinkedValues.forEach(LinkedHashMap.java:608) at java.util.Collections$SynchronizedCollection.forEach(Collections.java:2064) at software.amazon.s3.analyticsaccelerator.io.physical.data.MetadataStore.close(MetadataStore.java:157) at software.amazon.s3.analyticsaccelerator.S3SeekableInputStreamFactory.close(S3SeekableInputStreamFactory.java:162) at org.apache.hadoop.util.functional.LazyAutoCloseableReference.close(LazyAutoCloseableReference.java:82) at org.apache.hadoop.fs.s3a.impl.streams.AnalyticsStreamFactory.serviceStop(AnalyticsStreamFactory.java:101) at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:53) at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:102) at org.apache.hadoop.service.CompositeService.stop(CompositeService.java:160) at org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:134) at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) at org.apache.hadoop.service.AbstractService.close(AbstractService.java:248) at org.apache.hadoop.fs.s3a.S3AUtils.closeAutocloseables(S3AUtils.java:1567) at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$stopAllServices$24(S3AFileSystem.java:4375) at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547) at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528) at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449) at org.apache.hadoop.fs.s3a.S3AFileSystem.stopAllServices(S3AFileSystem.java:4374) at org.apache.hadoop.fs.s3a.S3AFileSystem.close(S3AFileSystem.java:4354) at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:3820) at org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer.run(FileSystem.java:3837) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:750) java.util.concurrent.CancellationException at java.util.concurrent.CompletableFuture.cancel(CompletableFuture.java:2276) at software.amazon.s3.analyticsaccelerator.io.physical.data.MetadataStore.safeCancel(MetadataStore.java:147) at java.util.LinkedHashMap$LinkedValues.forEach(LinkedHashMap.java:608) at java.util.Collections$SynchronizedCollection.forEach(Collections.java:2064) at software.amazon.s3.analyticsaccelerator.io.physical.data.MetadataStore.close(MetadataStore.java:157) at software.amazon.s3.analyticsaccelerator.S3SeekableInputStreamFactory.close(S3SeekableInputStreamFactory.java:162) at org.apache.hadoop.util.functional.LazyAutoCloseableReference.close(LazyAutoCloseableReference.java:82) at org.apache.hadoop.fs.s3a.impl.streams.AnalyticsStreamFactory.serviceStop(AnalyticsStreamFactory.java:101) at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:53) at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:102) at org.apache.hadoop.service.CompositeService.stop(CompositeService.java:160) at org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:134) at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) at org.apache.hadoop.service.AbstractService.close(AbstractService.java:248) at org.apache.hadoop.fs.s3a.S3AUtils.closeAutocloseables(S3AUtils.java:1567) at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$stopAllServices$24(S3AFileSystem.java:4375) at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547) at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528) at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449) at org.apache.hadoop.fs.s3a.S3AFileSystem.stopAllServices(S3AFileSystem.java:4374) at org.apache.hadoop.fs.s3a.S3AFileSystem.close(S3AFileSystem.java:4354) at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:3820) at org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer.run(FileSystem.java:3837) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:750) 2025-05-13 17:09:16,705 [main] INFO util.ExitUtil (ExitUtil.java:terminate(241)) - Exiting with status -1: java.util.concurrent.CancellationException 2025-05-13 17:09:17,429 [sdk-async-response-2-1] ERROR s3.telemetry (LogHelper.java:logAtLevel(38)) - [2025-05-13T17:09:15.993Z] [failure] [-38acq15b21262] s3.client.head(thread_id=30, uri=s3://stevel-usw2/bw): 1,435,559,459 ns [java.util.concurrent.CompletionException: 'software.amazon.awssdk.core.exception.SdkClientException: Unable to execute HTTP request: executor not accepting a task'] java.util.concurrent.CompletionException: software.amazon.awssdk.core.exception.SdkClientException: Unable to execute HTTP request: executor not accepting a task at software.amazon.awssdk.utils.CompletableFutureUtils.errorAsCompletionException(CompletableFutureUtils.java:64) at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncExecutionFailureExceptionReportingStage.lambda$execute$0(AsyncExecutionFailureExceptionReportingStage.java:51) at java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:836) at java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:811) at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990) at software.amazon.awssdk.utils.CompletableFutureUtils.lambda$forwardExceptionTo$0(CompletableFutureUtils.java:78) at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774) at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750) at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990) at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage2$RetryingExecutor.maybeAttemptExecute(AsyncRetryableStage2.java:135) at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage2$RetryingExecutor.maybeRetryExecute(AsyncRetryableStage2.java:152) at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage2$RetryingExecutor.lambda$attemptExecute$1(AsyncRetryableStage2.java:113) at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774) at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750) at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990) at software.amazon.awssdk.utils.CompletableFutureUtils.lambda$forwardExceptionTo$0(CompletableFutureUtils.java:78) at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774) at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750) at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990) at software.amazon.awssdk.core.internal.http.pipeline.stages.MakeAsyncHttpRequestStage.lambda$execute$0(MakeAsyncHttpRequestStage.java:108) at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774) at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750) at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990) at software.amazon.awssdk.core.internal.http.pipeline.stages.MakeAsyncHttpRequestStage.completeResponseFuture(MakeAsyncHttpRequestStage.java:255) at software.amazon.awssdk.core.internal.http.pipeline.stages.MakeAsyncHttpRequestStage.lambda$executeHttpRequest$3(MakeAsyncHttpRequestStage.java:167) at java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:836) at java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:811) at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:456) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:750) Caused by: software.amazon.awssdk.core.exception.SdkClientException: Unable to execute HTTP request: executor not accepting a task at software.amazon.awssdk.core.exception.SdkClientException$BuilderImpl.build(SdkClientException.java:111) at software.amazon.awssdk.core.exception.SdkClientException.create(SdkClientException.java:47) at software.amazon.awssdk.core.internal.http.pipeline.stages.utils.RetryableStageHelper2.setLastException(RetryableStageHelper2.java:226) at software.amazon.awssdk.core.internal.http.pipeline.stages.utils.RetryableStageHelper2.setLastException(RetryableStageHelper2.java:221) at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage2$RetryingExecutor.maybeRetryExecute(AsyncRetryableStage2.java:151) ... 23 more Suppressed: software.amazon.awssdk.core.exception.SdkClientException: Request attempt 1 failure: Unable to execute HTTP request: The connection was closed during the request. The request will usually succeed on a retry, but if it does not: consider disabling any proxies you have configured, enabling debug logging, or performing a TCP dump to identify the root cause. If this is a streaming operation, validate that data is being read or written in a timely manner. Channel Information: ChannelDiagnostics(channel=[id: 0x801baead, L:0.0.0.0/0.0.0.0:59534 ! R:bucket.vpce-0117f6033eaf7aee5-2hxd9fg4.s3.us-west-2.vpce.amazonaws.com/10.80.134.179:443], channelAge=PT0.676S, requestCount=1, responseCount=0, lastIdleDuration=PT0.006284125S) Caused by: java.lang.IllegalStateException: executor not accepting a task at software.amazon.awssdk.thirdparty.io.netty.resolver.AddressResolverGroup.getResolver(AddressResolverGroup.java:61) at software.amazon.awssdk.thirdparty.io.netty.bootstrap.Bootstrap.doResolveAndConnect0(Bootstrap.java:208) at software.amazon.awssdk.thirdparty.io.netty.bootstrap.Bootstrap.access$000(Bootstrap.java:47) at software.amazon.awssdk.thirdparty.io.netty.bootstrap.Bootstrap$1.operationComplete(Bootstrap.java:189) at software.amazon.awssdk.thirdparty.io.netty.bootstrap.Bootstrap$1.operationComplete(Bootstrap.java:175) at software.amazon.awssdk.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:590) at software.amazon.awssdk.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:557) at software.amazon.awssdk.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:492) at software.amazon.awssdk.thirdparty.io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:636) at software.amazon.awssdk.thirdparty.io.netty.util.concurrent.DefaultPromise.setSuccess0(DefaultPromise.java:625) at software.amazon.awssdk.thirdparty.io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:105) at software.amazon.awssdk.thirdparty.io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:84) at software.amazon.awssdk.thirdparty.io.netty.channel.AbstractChannel$AbstractUnsafe.safeSetSuccess(AbstractChannel.java:988) at software.amazon.awssdk.thirdparty.io.netty.channel.AbstractChannel$AbstractUnsafe.register0(AbstractChannel.java:515) at software.amazon.awssdk.thirdparty.io.netty.channel.AbstractChannel$AbstractUnsafe.access$200(AbstractChannel.java:428) at software.amazon.awssdk.thirdparty.io.netty.channel.AbstractChannel$AbstractUnsafe$1.run(AbstractChannel.java:485) at software.amazon.awssdk.thirdparty.io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173) at software.amazon.awssdk.thirdparty.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166) at software.amazon.awssdk.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472) at software.amazon.awssdk.thirdparty.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569) at software.amazon.awssdk.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) at software.amazon.awssdk.thirdparty.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ... 1 more {code} > S3A) error stack traces printed on analytics stream factory close > ----------------------------------------------------------------- > > Key: HADOOP-19567 > URL: https://issues.apache.org/jira/browse/HADOOP-19567 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 > Affects Versions: 3.4.2 > Reporter: Steve Loughran > Priority: Major > > When you close an s3a filesystem there is a lot of ERROR level stack traces > about a CancellationException -despite that being exactly what is wanted. > Core of it comes from netty. > {code} > Suppressed: software.amazon.awssdk.core.exception.SdkClientException: > Request attempt 1 failure: Unable to execute HTTP request: The connection was > closed during the request. The request will usually succeed on a retry, but > if it does not: consider disabling any proxies you have configured, enabling > debug logging, or performing a TCP dump to identify the root cause. If this > is a streaming operation, validate that data is being read or written in a > timely manner. Channel Information: ChannelDiagnostics(channel=[id: > 0x801baead, L:0.0.0.0/0.0.0.0:59534 ! > R:bucket.vpce-0117f6033eaf7aee5-2hxd9fg4.s3.us-west-2.vpce.amazonaws.com/10.80.134.179:443], > channelAge=PT0.676S, requestCount=1, responseCount=0, > lastIdleDuration=PT0.006284125S) > Caused by: java.lang.IllegalStateException: executor not accepting a task > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) --------------------------------------------------------------------- To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org