Oliver Draese created HIVE-22113:
------------------------------------

             Summary: Prevent LLAP shutdown on AMReporter related RuntimeException
                 Key: HIVE-22113
                 URL: https://issues.apache.org/jira/browse/HIVE-22113
             Project: Hive
          Issue Type: Bug
          Components: llap
    Affects Versions: 3.1.1
            Reporter: Oliver Draese
            Assignee: Oliver Draese

If a task attempt cannot be removed from AMReporter (i.e. task attempt was not found), the AMReporter throws a RuntimeException. This exception is not caught and trickles up, causing an LLAP shutdown:

{{2019-08-08T23:34:39,748 ERROR [Wait-Queue-Scheduler-0 ()] org.apache.hadoop.hive.llap.daemon.impl.LlapDaemon: Thread Thread[Wait-Queue-Scheduler-0,5,main] threw an Exception. Shutting down now...}}
{{java.lang.RuntimeException: attempt_1563528877295_18872_3728_01_000003_0 was not registered and couldn't be removed}}
{{ at org.apache.hadoop.hive.llap.daemon.impl.AMReporter$AMNodeInfo.removeTaskAttempt(AMReporter.java:524) ~[hive-llap-server-3.1.0.3.1.0.103-1.jar:3.1.0.3.1.0.103-1]}}
{{ at org.apache.hadoop.hive.llap.daemon.impl.AMReporter.unregisterTask(AMReporter.java:243) ~[hive-llap-server-3.1.0.3.1.0.103-1.jar:3.1.0.3.1.0.103-1]}}
{{ at org.apache.hadoop.hive.llap.daemon.impl.TaskRunnerCallable.killTask(TaskRunnerCallable.java:384) ~[hive-llap-server-3.1.0.3.1.0.103-1.jar:3.1.0.3.1.0.103-1]}}
{{ at org.apache.hadoop.hive.llap.daemon.impl.TaskExecutorService.handleScheduleAttemptedRejection(TaskExecutorService.java:739) ~[hive-llap-server-3.1.0.3.1.0.103-1.jar:3.1.0.3.1.0.103-1]}}
{{ at org.apache.hadoop.hive.llap.daemon.impl.TaskExecutorService.access$1100(TaskExecutorService.java:91) ~[hive-llap-server-3.1.0.3.1.0.103-1.jar:3.1.0.3.1.0.103-1]}}
{{ at org.apache.hadoop.hive.llap.daemon.impl.TaskExecutorService$WaitQueueWorker.run(TaskExecutorService.java:396) ~[hive-llap-server-3.1.0.3.1.0.103-1.jar:3.1.0.3.1.0.103-1]}}
{{ at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_161]}}
{{ at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108) [hive-exec-3.1.0.3.1.0.103-1.jar:3.1.0-SNAPSHOT]}}
{{ at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41) [hive-exec-3.1.0.3.1.0.103-1.jar:3.1.0-SNAPSHOT]}}
{{ at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77) [hive-exec-3.1.0.3.1.0.103-1.jar:3.1.0-SNAPSHOT]}}
{{ at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_161]}}
{{ at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_161]}}
{{ at java.lang.Thread.run(Thread.java:748) [?:1.8.0_161]}}
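
For illustration only, the failure mode described above and two possible ways to contain it could be sketched as follows. This is not Hive code and not necessarily the fix adopted for this issue: the class {{TolerantUnregisterSketch}}, the {{AttemptRegistry}} type, and the methods {{removeStrict}}, {{removeTolerant}}, and {{killTaskSafely}} are all hypothetical stand-ins. The sketch assumes the bookkeeping can be modeled as a set of attempt IDs.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class TolerantUnregisterSketch {

    /** Hypothetical stand-in for per-AM attempt bookkeeping; NOT Hive's AMNodeInfo. */
    static final class AttemptRegistry {
        private final Set<String> attempts = ConcurrentHashMap.newKeySet();

        void register(String attemptId) {
            attempts.add(attemptId);
        }

        /** Strict removal: throws when the attempt is unknown, like the reported behavior. */
        void removeStrict(String attemptId) {
            if (!attempts.remove(attemptId)) {
                throw new RuntimeException(
                    attemptId + " was not registered and couldn't be removed");
            }
        }

        /** Option A: tolerant removal that logs the miss and returns false instead of throwing. */
        boolean removeTolerant(String attemptId) {
            boolean removed = attempts.remove(attemptId);
            if (!removed) {
                System.err.println("WARN: " + attemptId + " was not registered; ignoring");
            }
            return removed;
        }
    }

    /**
     * Option B: caller-side guard. The strict removal is kept, but the exception is
     * caught here so it cannot reach the worker thread's uncaught-exception path.
     */
    static boolean killTaskSafely(AttemptRegistry registry, String attemptId) {
        try {
            registry.removeStrict(attemptId);
            return true;
        } catch (RuntimeException e) {
            System.err.println("WARN: unregister failed: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        AttemptRegistry registry = new AttemptRegistry();
        registry.register("attempt_A");

        // Known attempt: removed normally.
        System.out.println("first removal succeeded: " + killTaskSafely(registry, "attempt_A"));
        // Same attempt again: the miss is reported, but no exception escapes.
        System.out.println("second removal succeeded: " + killTaskSafely(registry, "attempt_A"));
        // Tolerant variant on an attempt that was never registered.
        System.out.println("tolerant removal succeeded: " + registry.removeTolerant("attempt_B"));
    }
}
```

Either option keeps a missing attempt from terminating the scheduler thread. Which one is appropriate depends on whether other callers rely on the exception to detect bookkeeping errors.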