gocho1 opened a new issue #8534: Ingest task succeeded then failed but stays running forever URL: https://github.com/apache/incubator-druid/issues/8534 ### Affected Version Encountered in Druid 0.15.0, no other version tested yet. ### Description A task that succeeded switch to failed status for no visible reason. Then some resources are stopped and an exception occured on a thread that need a resource ("ServiceEmitter") that has been stopped. Here is the log of the task. ``` 2019-09-08T18:43:42,477 INFO [task-runner-0-priority-0] org.apache.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: { "id" : "index_hadoop_datasource_2019-09-08T18:42:42.831Z", "status" : "SUCCESS", "duration" : 54326, "errorMsg" : null } 2019-09-08T18:43:42,482 INFO [main] org.apache.druid.java.util.common.lifecycle.Lifecycle - Stopping lifecycle [module] stage [ANNOUNCEMENTS] 2019-09-08T18:43:42,483 INFO [main] org.apache.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void org.apache.druid.curator.announcement.Announcer.stop()] on object[org.apache.druid.curator.announcement.Announcer@75bf9e67]. 2019-09-08T18:43:42,484 INFO [main] org.apache.druid.curator.announcement.Announcer - Stopping announcer 2019-09-08T18:43:42,484 INFO [main] org.apache.druid.curator.announcement.Announcer - unannouncing [/druid_iods/listeners/lookups/__default/http:ip-XXX.eu-west-1.compute.internal:8100] 2019-09-08T18:43:42,497 INFO [main] org.apache.druid.java.util.common.lifecycle.Lifecycle - Stopping lifecycle [module] stage [SERVER] 2019-09-08T18:43:42,497 INFO [main] org.apache.druid.server.initialization.jetty.JettyServerModule - Stopping Jetty Server... 2019-09-08T18:43:42,501 INFO [main] org.eclipse.jetty.server.AbstractConnector - Stopped ServerConnector@1d39a880{HTTP/1.1,[http/1.1]}{0.0.0.0:8100} 2019-09-08T18:43:42,501 INFO [main] org.eclipse.jetty.server.session - node0 Stopped scavenging 2019-09-08T18:43:42,503 INFO [main] org.eclipse.jetty.server.handler.ContextHandler - Stopped o.e.j.s.ServletContextHandler@751686aa{/,null,UNAVAILABLE} 2019-09-08T18:43:42,506 INFO [main] org.apache.druid.java.util.common.lifecycle.Lifecycle - Stopping lifecycle [module] stage [NORMAL] 2019-09-08T18:43:42,506 INFO [main] org.apache.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void org.apache.druid.server.listener.announcer.ListenerResourceAnnouncer.stop()] on object[org.apache.druid.query.lookup.LookupResourceListenerAnnouncer@54bb1194]. 2019-09-08T18:43:42,506 INFO [main] org.apache.druid.server.listener.announcer.ListenerResourceAnnouncer - Unannouncing start time on [/druid_iods/listeners/lookups/__default/http:ip-XXX.eu-west-1.compute.internal:8100] 2019-09-08T18:43:42,506 INFO [main] org.apache.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void org.apache.druid.indexing.worker.executor.ExecutorLifecycle.stop() throws java.lang.Exception] on object[org.apache.druid.indexing.worker.executor.ExecutorLifecycle@46702c26]. 2019-09-08T18:43:42,507 INFO [main] org.apache.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner.stop()] on object[org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner@12c76d6e]. 2019-09-08T18:43:42,507 INFO [main] org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner - Starting graceful shutdown of task[index_hadoop_datasource_2019-09-08T18:42:42.831Z]. 2019-09-08T18:43:42,507 INFO [main] org.apache.druid.indexing.overlord.TaskRunnerUtils - Task [index_hadoop_datasource_2019-09-08T18:42:42.831Z] status changed to [FAILED]. 2019-09-08T18:43:42,519 INFO [main] org.apache.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void org.apache.druid.client.cache.CaffeineCache.close()] on object[org.apache.druid.client.cache.CaffeineCache@f08d676]. 2019-09-08T18:43:42,520 INFO [main] org.apache.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void org.apache.druid.discovery.DruidLeaderClient.stop()] on object[org.apache.druid.discovery.DruidLeaderClient@5d27b4c1]. 2019-09-08T18:43:42,520 INFO [main] org.apache.druid.discovery.DruidLeaderClient - Stopped. 2019-09-08T18:43:42,520 INFO [main] org.apache.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void org.apache.druid.curator.discovery.ServerDiscoverySelector.stop() throws java.io.IOException] on object[org.apache.druid.curator.discovery.ServerDiscoverySelector@23396fc0]. 2019-09-08T18:43:42,521 INFO [main] org.apache.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void org.apache.druid.security.basic.authentication.db.cache.CoordinatorPollingBasicAuthenticatorCacheManager.stop()] on object[org.apache.druid.security.basic.authentication.db.cache.CoordinatorPollingBasicAuthenticatorCacheManager@32b112a1]. 2019-09-08T18:43:42,522 INFO [main] org.apache.druid.security.basic.authentication.db.cache.CoordinatorPollingBasicAuthenticatorCacheManager - DefaultBasicAuthenticatorCacheManager is stopping. 2019-09-08T18:43:42,522 INFO [main] org.apache.druid.security.basic.authentication.db.cache.CoordinatorPollingBasicAuthenticatorCacheManager - DefaultBasicAuthenticatorCacheManager is stopped. 2019-09-08T18:43:42,522 INFO [main] org.apache.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void org.apache.druid.java.util.metrics.MonitorScheduler.stop()] on object[org.apache.druid.java.util.metrics.MonitorScheduler@34549979]. 2019-09-08T18:43:42,522 INFO [main] org.apache.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void org.apache.druid.security.basic.authorization.db.cache.CoordinatorPollingBasicAuthorizerCacheManager.stop()] on object[org.apache.druid.security.basic.authorization.db.cache.CoordinatorPollingBasicAuthorizerCacheManager@ce655b9]. 2019-09-08T18:43:42,522 INFO [main] org.apache.druid.security.basic.authorization.db.cache.CoordinatorPollingBasicAuthorizerCacheManager - CoordinatorPollingBasicAuthorizerCacheManager is stopping. 2019-09-08T18:43:42,522 INFO [main] org.apache.druid.security.basic.authorization.db.cache.CoordinatorPollingBasicAuthorizerCacheManager - CoordinatorPollingBasicAuthorizerCacheManager is stopped. 2019-09-08T18:43:42,522 INFO [main] org.apache.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void org.apache.druid.query.lookup.LookupReferencesManager.stop()] on object[org.apache.druid.query.lookup.LookupReferencesManager@171aa66]. 2019-09-08T18:43:42,522 INFO [main] org.apache.druid.query.lookup.LookupReferencesManager - LookupExtractorFactoryContainerProvider is stopping. 2019-09-08T18:43:42,522 INFO [LookupExtractorFactoryContainerProvider-MainThread] org.apache.druid.query.lookup.LookupReferencesManager - Lookup Management loop exited, Lookup notices are not handled anymore. 2019-09-08T18:43:42,522 INFO [main] org.apache.druid.query.lookup.LookupReferencesManager - LookupExtractorFactoryContainerProvider is stopped. 2019-09-08T18:43:42,522 INFO [main] org.apache.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void org.apache.druid.discovery.DruidLeaderClient.stop()] on object[org.apache.druid.discovery.DruidLeaderClient@faec277]. 2019-09-08T18:43:42,522 INFO [main] org.apache.druid.discovery.DruidLeaderClient - Stopped. 2019-09-08T18:43:42,522 INFO [main] org.apache.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void org.apache.druid.curator.discovery.ServerDiscoverySelector.stop() throws java.io.IOException] on object[org.apache.druid.curator.discovery.ServerDiscoverySelector@1a536164]. 2019-09-08T18:43:42,523 INFO [main] org.apache.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void org.apache.druid.curator.discovery.CuratorDruidNodeDiscoveryProvider.stop()] on object[org.apache.druid.curator.discovery.CuratorDruidNodeDiscoveryProvider@3df77dfa]. 2019-09-08T18:43:42,523 INFO [main] org.apache.druid.curator.discovery.CuratorDruidNodeDiscoveryProvider - stopping 2019-09-08T18:43:42,523 INFO [main] org.apache.druid.curator.discovery.CuratorDruidNodeDiscoveryProvider - stopped 2019-09-08T18:43:42,523 INFO [main] org.apache.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void org.apache.druid.java.util.http.client.NettyHttpClient.stop()] on object[org.apache.druid.java.util.http.client.NettyHttpClient@64f3ca6]. 2019-09-08T18:43:42,538 INFO [main] org.apache.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void org.apache.druid.java.util.emitter.service.ServiceEmitter.close() throws java.io.IOException] on object[ServiceEmitter{serviceDimensions={service=druid/middleManager, host=ip-XXX.eu-west-1.compute.internal:8100, version=0.15.0-incubating}, emitter=HttpPostEmitter{config=HttpEmitterConfig{flushMillis=10000, flushCount=500, flushTimeOut=9223372036854775807, basicAuthentication='null', batchingStrategy=ARRAY, maxBatchSize=5191680, contentEncoding=null, batchQueueSizeLimit=9, httpTimeoutAllowanceFactor=2.0, minHttpTimeoutMillis=0, recipientBaseUrl='https://fluentd.common.pps.io/druid_iods'}}}]. 2019-09-08T18:43:49,802 WARN [BasicAuthenticatorCacheManager-Exec--0] org.apache.druid.java.util.common.RetryUtils - Retrying (1 of 9) in 783ms. java.lang.IllegalStateException at com.google.common.base.Preconditions.checkState(Preconditions.java:161) ~[guava-16.0.1.jar:?] at org.apache.druid.discovery.DruidLeaderClient.makeRequest(DruidLeaderClient.java:130) ~[druid-server-0.15.0-incubating.jar:0.15.0-incubating] at org.apache.druid.discovery.DruidLeaderClient.makeRequest(DruidLeaderClient.java:139) ~[druid-server-0.15.0-incubating.jar:0.15.0-incubating] at org.apache.druid.security.basic.authentication.db.cache.CoordinatorPollingBasicAuthenticatorCacheManager.tryFetchUserMapFromCoordinator(CoordinatorPollingBasicAuthenticatorCacheManager.java:250) ~[?:?] at org.apache.druid.security.basic.authentication.db.cache.CoordinatorPollingBasicAuthenticatorCacheManager.lambda$fetchUserMapFromCoordinator$1(CoordinatorPollingBasicAuthenticatorCacheManager.java:191) ~[?:?] at org.apache.druid.java.util.common.RetryUtils.retry(RetryUtils.java:86) ~[druid-core-0.15.0-incubating.jar:0.15.0-incubating] at org.apache.druid.java.util.common.RetryUtils.retry(RetryUtils.java:114) ~[druid-core-0.15.0-incubating.jar:0.15.0-incubating] at org.apache.druid.java.util.common.RetryUtils.retry(RetryUtils.java:104) ~[druid-core-0.15.0-incubating.jar:0.15.0-incubating] at org.apache.druid.security.basic.authentication.db.cache.CoordinatorPollingBasicAuthenticatorCacheManager.fetchUserMapFromCoordinator(CoordinatorPollingBasicAuthenticatorCacheManager.java:189) ~[?:?] at org.apache.druid.security.basic.authentication.db.cache.CoordinatorPollingBasicAuthenticatorCacheManager.lambda$start$0(CoordinatorPollingBasicAuthenticatorCacheManager.java:121) ~[?:?] at org.apache.druid.java.util.common.concurrent.ScheduledExecutors$1.call(ScheduledExecutors.java:55) [druid-core-0.15.0-incubating.jar:0.15.0-incubating] at org.apache.druid.java.util.common.concurrent.ScheduledExecutors$1.call(ScheduledExecutors.java:51) [druid-core-0.15.0-incubating.jar:0.15.0-incubating] at org.apache.druid.java.util.common.concurrent.ScheduledExecutors$2.run(ScheduledExecutors.java:92) [druid-core-0.15.0-incubating.jar:0.15.0-incubating] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_222-ea] at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_222-ea] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_222-ea] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [?:1.8.0_222-ea] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_222-ea] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_222-ea] at java.lang.Thread.run(Thread.java:748) [?:1.8.0_222-ea] . . . 2019-09-08T18:47:45,004 ERROR [CoordinatorPollingBasicAuthorizerCacheManager-Exec--0] org.apache.druid.java.util.common.concurrent.ScheduledExecutors - Uncaught exception. java.util.concurrent.RejectedExecutionException: Service is closed. at org.apache.druid.java.util.emitter.core.HttpPostEmitter.awaitStarted(HttpPostEmitter.java:208) ~[druid-core-0.15.0-incubating.jar:0.15.0-incubating] at org.apache.druid.java.util.emitter.core.HttpPostEmitter.emitAndReturnBatch(HttpPostEmitter.java:233) ~[druid-core-0.15.0-incubating.jar:0.15.0-incubating] at org.apache.druid.java.util.emitter.core.HttpPostEmitter.emit(HttpPostEmitter.java:226) ~[druid-core-0.15.0-incubating.jar:0.15.0-incubating] at org.apache.druid.java.util.emitter.service.ServiceEmitter.emit(ServiceEmitter.java:67) ~[druid-core-0.15.0-incubating.jar:0.15.0-incubating] at org.apache.druid.java.util.emitter.service.ServiceEmitter.emit(ServiceEmitter.java:72) ~[druid-core-0.15.0-incubating.jar:0.15.0-incubating] at org.apache.druid.java.util.emitter.service.AlertBuilder.emit(AlertBuilder.java:88) ~[druid-core-0.15.0-incubating.jar:0.15.0-incubating] at org.apache.druid.java.util.emitter.EmittingLogger$EmittingAlertBuilder.emit(EmittingLogger.java:109) ~[druid-core-0.15.0-incubating.jar:0.15.0-incubating] at org.apache.druid.security.basic.authorization.db.cache.CoordinatorPollingBasicAuthorizerCacheManager.lambda$start$0(CoordinatorPollingBasicAuthorizerCacheManager.java:132) ~[?:?] at org.apache.druid.java.util.common.concurrent.ScheduledExecutors$1.call(ScheduledExecutors.java:55) ~[druid-core-0.15.0-incubating.jar:0.15.0-incubating] at org.apache.druid.java.util.common.concurrent.ScheduledExecutors$1.call(ScheduledExecutors.java:51) ~[druid-core-0.15.0-incubating.jar:0.15.0-incubating] at org.apache.druid.java.util.common.concurrent.ScheduledExecutors$2.run(ScheduledExecutors.java:92) [druid-core-0.15.0-incubating.jar:0.15.0-incubating] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_222-ea] at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_222-ea] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_222-ea] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [?:1.8.0_222-ea] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_222-ea] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_222-ea] at java.lang.Thread.run(Thread.java:748) [?:1.8.0_222-ea] ```
---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org