[ https://issues.apache.org/jira/browse/STORM-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
sanghee park updated STORM-3779:
--------------------------------
    Description: 
Hi developers,

We hit a critical issue when killing a Storm topology.

We killed the topology as follows.
{code:java}
Config conf = new Config();
conf.put(Config.NIMBUS_SEEDS, "SOME_NIMBUS_SEED_STRING");

KillOptions opt = new KillOptions();
opt.set_wait_secs_isSet(true);
opt.set_wait_secs(10);

Nimbus.Iface nimbusClient = NimbusClient.getConfiguredClient(conf).getClient();
nimbusClient.killTopologyWithOpts("TOPOLOGY_NAME", opt);
{code}

The topology's workers were distributed across multiple supervisors.
On some supervisors the workers died normally.

But the problem is that,
h3. *Some supervisors' workers never died, with error messages like those below!*

{noformat}
2021-06-29 02:58:44.284 o.a.s.d.s.Container SLOT_6707 [INFO] SET worker-user baef41a4-b5f6-4ea3-8868-5537dfba82f8 root
2021-06-29 02:58:44.284 o.a.s.d.s.Container SLOT_6707 [INFO] Creating symlinks for worker-id: baef41a4-b5f6-4ea3-8868-5537dfba82f8 storm-id: TOPOLOGY_NAME for files(1): [resources]
2021-06-29 02:58:44.284 o.a.s.d.s.BasicContainer SLOT_6707 [INFO] Launching worker with assignment LocalAssignment(topology_id:TOPOLOGY_NAME, executors:[ExecutorInfo(task_start:17, task_end:17), ExecutorInfo(task_start:29, task_end:29), ExecutorInfo(task_start:5, task_end:5)], resources:WorkerResources(mem_on_heap:6272.0, mem_off_heap:0.0, cpu:30.0, shared_mem_on_heap:0.0, shared_mem_off_heap:0.0, resources:{offheap.memory.mb=0.0, onheap.memory.mb=6272.0, cpu.pcore.percent=30.0}, shared_resources:{}), owner:root) for this supervisor d2ee514a-e40e-40fb-b119-59763f3bb95d-10.233.112.14 on port 6707 with id baef41a4-b5f6-4ea3-8868-5537dfba82f8
2021-06-29 02:58:44.285 o.a.s.d.s.Slot SLOT_6708 [INFO] STATE kill-and-relaunch msInState: 6 topo:TOPOLOGY_NAME worker:d06bb5c5-25e2-4557-8996-4d40045022d1 -> waiting-for-worker-start msInState: 0 topo:TOPOLOGY_NAME worker:d06bb5c5-25e2-4557-8996-4d40045022d1
2021-06-29 02:58:44.286 o.a.s.d.s.Slot SLOT_6707 [INFO] STATE kill-and-relaunch msInState: 7 topo:TOPOLOGY_NAME worker:baef41a4-b5f6-4ea3-8868-5537dfba82f8 -> waiting-for-worker-start msInState: 0 topo:TOPOLOGY_NAME worker:baef41a4-b5f6-4ea3-8868-5537dfba82f8
2021-06-29 02:58:46.799 o.a.s.d.s.BasicContainer Thread-7269 [INFO] Worker Process d06bb5c5-25e2-4557-8996-4d40045022d1 exited with code: 254
2021-06-29 02:58:48.065 o.a.s.d.s.BasicContainer Thread-7270 [INFO] Worker Process baef41a4-b5f6-4ea3-8868-5537dfba82f8 exited with code: 254
2021-06-29 02:59:09.234 o.a.s.d.s.t.SupervisorHealthCheck timer [INFO] Running supervisor healthchecks...
2021-06-29 02:59:09.234 o.a.s.h.HealthChecker timer [INFO] The supervisor healthchecks succeeded.
2021-06-29 02:59:39.234 o.a.s.d.s.t.SupervisorHealthCheck timer [INFO] Running supervisor healthchecks...
2021-06-29 02:59:39.234 o.a.s.h.HealthChecker timer [INFO] The supervisor healthchecks succeeded.
2021-06-29 02:59:53.558 o.a.s.d.s.Supervisor pool-11-thread-9 [INFO] Got an assignments from master, will start to sync with assignments: SupervisorAssignments(...)
2021-06-29 02:59:53.936 o.a.s.d.s.Slot SLOT_6702 [INFO] SLOT 6702: Assignment Changed from LocalAssignment(topology_id:TOPOLOGY_NAME, executors:[ExecutorInfo(task_start:23, task_end:23), ExecutorInfo(task_start:11, task_end:11)], resources:WorkerResources(mem_on_heap:3200.0, mem_off_heap:0.0, cpu:20.0, shared_mem_on_heap:0.0, shared_mem_off_heap:0.0, resources:{offheap.memory.mb=0.0, onheap.memory.mb=3200.0, cpu.pcore.percent=20.0}, shared_resources:{}), owner:root) to null
2021-06-29 02:59:53.939 o.a.s.d.s.Container SLOT_6702 [INFO] Killing d2ee514a-e40e-40fb-b119-59763f3bb95d-10.233.112.14:25976cac-9170-44ec-b835-099377cda893
2021-06-29 02:59:54.293 o.a.s.d.s.Slot SLOT_6708 [INFO] SLOT 6708: Assignment Changed from LocalAssignment(topology_id:TOPOLOGY_NAME, executors:[ExecutorInfo(task_start:10, task_end:10), ExecutorInfo(task_start:22, task_end:22)], resources:WorkerResources(mem_on_heap:3200.0, mem_off_heap:0.0, cpu:20.0, shared_mem_on_heap:0.0, shared_mem_off_heap:0.0, resources:{offheap.memory.mb=0.0, onheap.memory.mb=3200.0, cpu.pcore.percent=20.0}, shared_resources:{}), owner:root) to null
2021-06-29 02:59:54.293 o.a.s.d.s.Slot SLOT_6707 [INFO] SLOT 6707: Assignment Changed from LocalAssignment(topology_id:TOPOLOGY_NAME, executors:[ExecutorInfo(task_start:17, task_end:17), ExecutorInfo(task_start:29, task_end:29), ExecutorInfo(task_start:5, task_end:5)], resources:WorkerResources(mem_on_heap:6272.0, mem_off_heap:0.0, cpu:30.0, shared_mem_on_heap:0.0, shared_mem_off_heap:0.0, resources:{offheap.memory.mb=0.0, onheap.memory.mb=6272.0, cpu.pcore.percent=30.0}, shared_resources:{}), owner:root) to null
2021-06-29 02:59:54.296 o.a.s.d.s.Slot SLOT_6708 [INFO] STATE waiting-for-worker-start msInState: 70011 topo:TOPOLOGY_NAME worker:d06bb5c5-25e2-4557-8996-4d40045022d1 -> kill msInState: 0 topo:TOPOLOGY_NAME worker:d06bb5c5-25e2-4557-8996-4d40045022d1
2021-06-29 02:59:54.296 o.a.s.d.s.Slot SLOT_6707 [INFO] STATE waiting-for-worker-start msInState: 70010 topo:TOPOLOGY_NAME worker:baef41a4-b5f6-4ea3-8868-5537dfba82f8 -> kill msInState: 0 topo:TOPOLOGY_NAME worker:baef41a4-b5f6-4ea3-8868-5537dfba82f8
2021-06-29 02:59:54.298 o.a.s.d.s.Slot SLOT_6708 [INFO] SLOT 6708 all processes are dead...
2021-06-29 02:59:54.298 o.a.s.d.s.Container SLOT_6708 [INFO] Cleaning up d2ee514a-e40e-40fb-b119-59763f3bb95d-10.233.112.14:d06bb5c5-25e2-4557-8996-4d40045022d1
2021-06-29 02:59:54.298 o.a.s.d.s.AdvancedFSOps SLOT_6708 [INFO] Deleting path /storm/workers/d06bb5c5-25e2-4557-8996-4d40045022d1/pids/141225
2021-06-29 02:59:54.298 o.a.s.d.s.AdvancedFSOps SLOT_6708 [INFO] Deleting path /storm/workers/d06bb5c5-25e2-4557-8996-4d40045022d1/heartbeats
2021-06-29 03:00:06.452 o.a.s.d.s.AdvancedFSOps AsyncLocalizer Task Executor - 1 [INFO] Deleting path /storm/supervisor/stormdist/TOPOLOGY_NAME/stormjar.jar
2021-06-29 03:00:06.472 o.a.s.d.s.AdvancedFSOps AsyncLocalizer Task Executor - 1 [INFO] Deleting path /storm/supervisor/stormdist/TOPOLOGY_NAME/stormjar.jar.version
2021-06-29 03:00:06.472 o.a.s.d.s.AdvancedFSOps AsyncLocalizer Task Executor - 1 [INFO] Deleting path /storm/supervisor/stormdist/TOPOLOGY_NAME/resources
2021-06-29 03:00:06.472 o.a.s.l.LocalizedResourceRetentionSet AsyncLocalizer Task Executor - 1 [INFO] Deleted blob: TOPOLOGY_NAME-stormjar.jar (REMOVED FROM CLUSTER).
2021-06-29 03:00:06.475 o.a.s.d.s.AdvancedFSOps AsyncLocalizer Task Executor - 1 [INFO] Deleting path /storm/supervisor/stormdist/TOPOLOGY_NAME/stormconf.ser
2021-06-29 03:00:06.475 o.a.s.d.s.AdvancedFSOps AsyncLocalizer Task Executor - 1 [INFO] Deleting path /storm/supervisor/stormdist/TOPOLOGY_NAME/stormconf.ser.version
2021-06-29 03:00:06.475 o.a.s.l.LocalizedResourceRetentionSet AsyncLocalizer Task Executor - 1 [INFO] Deleted blob: TOPOLOGY_NAME-stormconf.ser (REMOVED FROM CLUSTER).
2021-06-29 03:00:06.477 o.a.s.d.s.AdvancedFSOps AsyncLocalizer Task Executor - 1 [INFO] Deleting path /storm/supervisor/stormdist/TOPOLOGY_NAME/stormcode.ser
2021-06-29 03:00:06.477 o.a.s.d.s.AdvancedFSOps AsyncLocalizer Task Executor - 1 [INFO] Deleting path /storm/supervisor/stormdist/TOPOLOGY_NAME/stormcode.ser.version
2021-06-29 03:00:06.478 o.a.s.l.LocalizedResourceRetentionSet AsyncLocalizer Task Executor - 1 [INFO] Deleted blob: TOPOLOGY_NAME-stormcode.ser (REMOVED FROM CLUSTER).
2021-06-29 03:00:06.478 o.a.s.d.s.AdvancedFSOps AsyncLocalizer Task Executor - 1 [INFO] Deleting path /storm/supervisor/stormdist/TOPOLOGY_NAME
2021-06-29 03:00:07.062 o.a.s.d.s.Supervisor pool-11-thread-10 [WARN] Topology config is not localized yet...
2021-06-29 03:00:07.063 o.a.s.t.ProcessFunction pool-11-thread-10 [ERROR] Internal error processing sendSupervisorWorkerHeartbeat
org.apache.storm.utils.WrappedNotAliveException: TOPOLOGY_NAME does not appear to be alive, you should probably exit
    at org.apache.storm.daemon.supervisor.Supervisor$1.sendSupervisorWorkerHeartbeat(Supervisor.java:448) ~[storm-server-2.2.0.jar:2.2.0]
    at org.apache.storm.generated.Supervisor$Processor$sendSupervisorWorkerHeartbeat.getResult(Supervisor.java:374) ~[storm-client-2.2.0.jar:2.2.0]
    at org.apache.storm.generated.Supervisor$Processor$sendSupervisorWorkerHeartbeat.getResult(Supervisor.java:353) ~[storm-client-2.2.0.jar:2.2.0]
    at org.apache.storm.thrift.ProcessFunction.process(ProcessFunction.java:38) [storm-shaded-deps-2.2.0.jar:2.2.0]
    at org.apache.storm.thrift.TBaseProcessor.process(TBaseProcessor.java:38) [storm-shaded-deps-2.2.0.jar:2.2.0]
    at org.apache.storm.security.auth.SimpleTransportPlugin$SimpleWrapProcessor.process(SimpleTransportPlugin.java:172) [storm-client-2.2.0.jar:2.2.0]
    at org.apache.storm.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:524) [storm-shaded-deps-2.2.0.jar:2.2.0]
    at org.apache.storm.thrift.server.Invocation.run(Invocation.java:18) [storm-shaded-deps-2.2.0.jar:2.2.0]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [?:?]
    at java.lang.Thread.run(Unknown Source) [?:?]
2021-06-29 03:00:07.064 o.a.s.t.ProcessFunction pool-11-thread-3 [ERROR] Internal error processing sendSupervisorWorkerHeartbeat
org.apache.storm.utils.WrappedNotAliveException: TOPOLOGY_NAME does not appear to be alive, you should probably exit
    at org.apache.storm.daemon.supervisor.Supervisor$1.sendSupervisorWorkerHeartbeat(Supervisor.java:448) ~[storm-server-2.2.0.jar:2.2.0]
    at org.apache.storm.generated.Supervisor$Processor$sendSupervisorWorkerHeartbeat.getResult(Supervisor.java:374) ~[storm-client-2.2.0.jar:2.2.0]
    at org.apache.storm.generated.Supervisor$Processor$sendSupervisorWorkerHeartbeat.getResult(Supervisor.java:353) ~[storm-client-2.2.0.jar:2.2.0]
    at org.apache.storm.thrift.ProcessFunction.process(ProcessFunction.java:38) [storm-shaded-deps-2.2.0.jar:2.2.0]
    at org.apache.storm.security.auth.SimpleTransportPlugin$SimpleWrapProcessor.process(SimpleTransportPlugin.java:172) [storm-client-2.2.0.jar:2.2.0]
    at org.apache.storm.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:524) [storm-shaded-deps-2.2.0.jar:2.2.0]
    at org.apache.storm.thrift.server.Invocation.run(Invocation.java:18) [storm-shaded-deps-2.2.0.jar:2.2.0]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [?:?]
    at java.lang.Thread.run(Unknown Source) [?:?]
2021-06-29 03:00:08.106 o.a.s.d.s.Supervisor pool-11-thread-9 [WARN] Topology config is not localized yet...
2021-06-29 03:00:08.107 o.a.s.t.ProcessFunction pool-11-thread-9 [ERROR] Internal error processing sendSupervisorWorkerHeartbeat
org.apache.storm.utils.WrappedNotAliveException: TOPOLOGY_NAME does not appear to be alive, you should probably exit
    at org.apache.storm.daemon.supervisor.Supervisor$1.sendSupervisorWorkerHeartbeat(Supervisor.java:448) ~[storm-server-2.2.0.jar:2.2.0]
    at org.apache.storm.generated.Supervisor$Processor$sendSupervisorWorkerHeartbeat.getResult(Supervisor.java:374) ~[storm-client-2.2.0.jar:2.2.0]
    at org.apache.storm.generated.Supervisor$Processor$sendSupervisorWorkerHeartbeat.getResult(Supervisor.java:353) ~[storm-client-2.2.0.jar:2.2.0]
    at org.apache.storm.thrift.ProcessFunction.process(ProcessFunction.java:38) [storm-shaded-deps-2.2.0.jar:2.2.0]
    at org.apache.storm.thrift.TBaseProcessor.process(TBaseProcessor.java:38) [storm-shaded-deps-2.2.0.jar:2.2.0]
    at org.apache.storm.security.auth.SimpleTransportPlugin$SimpleWrapProcessor.process(SimpleTransportPlugin.java:172) [storm-client-2.2.0.jar:2.2.0]
    at org.apache.storm.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:524) [storm-shaded-deps-2.2.0.jar:2.2.0]
    at org.apache.storm.thrift.server.Invocation.run(Invocation.java:18) [storm-shaded-deps-2.2.0.jar:2.2.0]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [?:?]
    at java.lang.Thread.run(Unknown Source) [?:?]
2021-06-29 03:00:08.108 o.a.s.d.s.Supervisor pool-11-thread-16 [WARN] Topology config is not localized yet...
2021-06-29 03:00:08.108 o.a.s.t.ProcessFunction pool-11-thread-16 [ERROR] Internal error processing sendSupervisorWorkerHeartbeat
org.apache.storm.utils.WrappedNotAliveException: TOPOLOGY_NAME does not appear to be alive, you should probably exit
    at org.apache.storm.daemon.supervisor.Supervisor$1.sendSupervisorWorkerHeartbeat(Supervisor.java:448) ~[storm-server-2.2.0.jar:2.2.0]
    at org.apache.storm.generated.Supervisor$Processor$sendSupervisorWorkerHeartbeat.getResult(Supervisor.java:374) ~[storm-client-2.2.0.jar:2.2.0]
    at org.apache.storm.generated.Supervisor$Processor$sendSupervisorWorkerHeartbeat.getResult(Supervisor.java:353) ~[storm-client-2.2.0.jar:2.2.0]
    at org.apache.storm.thrift.ProcessFunction.process(ProcessFunction.java:38) [storm-shaded-deps-2.2.0.jar:2.2.0]
    at org.apache.storm.thrift.TBaseProcessor.process(TBaseProcessor.java:38) [storm-shaded-deps-2.2.0.jar:2.2.0]
    at org.apache.storm.security.auth.SimpleTransportPlugin$SimpleWrapProcessor.process(SimpleTransportPlugin.java:172) [storm-client-2.2.0.jar:2.2.0]
    at org.apache.storm.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:524) [storm-shaded-deps-2.2.0.jar:2.2.0]
    at org.apache.storm.thrift.server.Invocation.run(Invocation.java:18) [storm-shaded-deps-2.2.0.jar:2.2.0]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [?:?]
{noformat}

*This error message repeated forever until we killed that worker process.*


> killed topology worker does not removed with warn and error that "Topology
> config is not localized yet..."
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: STORM-3779
>                 URL: https://issues.apache.org/jira/browse/STORM-3779
>             Project: Apache Storm
>          Issue Type: Bug
>    Affects Versions: 2.2.0
>            Reporter: sanghee park
>            Priority: Major
>
> Hi developers,
> We hit a critical issue when killing a Storm topology.
>
> We killed the topology as follows.
> {code:java}
> Config conf = new Config();
> conf.put(Config.NIMBUS_SEEDS, "SOME_NIMBUS_SEED_STRING");
>
> KillOptions opt = new KillOptions();
> opt.set_wait_secs_isSet(true);
> opt.set_wait_secs(10);
>
> Nimbus.Iface nimbusClient =
> NimbusClient.getConfiguredClient(conf).getClient();
> nimbusClient.killTopologyWithOpts("TOPOLOGY_NAME", opt);
> {code}
>
> The topology's workers were distributed across multiple supervisors.
> On some supervisors the workers died normally.
>
> But the problem is that,
> h3. *Some supervisors' workers never died, with error messages like those below!*
>
> {noformat}
> 2021-06-29 02:58:44.284 o.a.s.d.s.Container SLOT_6707 [INFO] SET worker-user
> baef41a4-b5f6-4ea3-8868-5537dfba82f8 root
> 2021-06-29 02:58:44.284 o.a.s.d.s.Container SLOT_6707 [INFO] Creating
> symlinks for worker-id: baef41a4-b5f6-4ea3-8868-5537dfba82f8 storm-id:
> TOPOLOGY_NAME for files(1): [resources]
> 2021-06-29 02:58:44.284 o.a.s.d.s.BasicContainer SLOT_6707 [INFO] Launching
> worker with assignment LocalAssignment(topology_id:TOPOLOGY_NAME,
> executors:[ExecutorInfo(task_start:17, task_end:17),
> ExecutorInfo(task_start:29, task_end:29), ExecutorInfo(task_start:5,
> task_end:5)], resources:WorkerResources(mem_on_heap:6272.0, mem_off_heap:0.0,
> cpu:30.0, shared_mem_on_heap:0.0, shared_mem_off_heap:0.0,
> resources:{offheap.memory.mb=0.0, onheap.memory.mb=6272.0,
> cpu.pcore.percent=30.0}, shared_resources:{}), owner:root) for this
> supervisor d2ee514a-e40e-40fb-b119-59763f3bb95d-10.233.112.14 on port 6707
> with id baef41a4-b5f6-4ea3-8868-5537dfba82f8
> 2021-06-29 02:58:44.285 o.a.s.d.s.Slot SLOT_6708 [INFO] STATE
> kill-and-relaunch msInState: 6 topo:TOPOLOGY_NAME
> worker:d06bb5c5-25e2-4557-8996-4d40045022d1 -> waiting-for-worker-start
> msInState: 0 topo:TOPOLOGY_NAME worker:d06bb5c5-25e2-4557-8996-4d40045022d1
> 2021-06-29 02:58:44.286 o.a.s.d.s.Slot SLOT_6707 [INFO] STATE
> kill-and-relaunch msInState: 7 topo:TOPOLOGY_NAME
> worker:baef41a4-b5f6-4ea3-8868-5537dfba82f8 -> waiting-for-worker-start
> msInState: 0 topo:TOPOLOGY_NAME worker:baef41a4-b5f6-4ea3-8868-5537dfba82f8
> 2021-06-29 02:58:46.799 o.a.s.d.s.BasicContainer Thread-7269 [INFO] Worker
> Process d06bb5c5-25e2-4557-8996-4d40045022d1 exited with code: 254
> 2021-06-29 02:58:48.065 o.a.s.d.s.BasicContainer Thread-7270 [INFO] Worker
> Process baef41a4-b5f6-4ea3-8868-5537dfba82f8 exited with code: 254
> 2021-06-29 02:59:09.234 o.a.s.d.s.t.SupervisorHealthCheck timer [INFO]
> Running supervisor healthchecks...
> 2021-06-29 02:59:09.234 o.a.s.h.HealthChecker timer [INFO] The supervisor
> healthchecks succeeded.
> 2021-06-29 02:59:39.234 o.a.s.d.s.t.SupervisorHealthCheck timer [INFO]
> Running supervisor healthchecks...
> 2021-06-29 02:59:39.234 o.a.s.h.HealthChecker timer [INFO] The supervisor
> healthchecks succeeded.
> 2021-06-29 02:59:53.558 o.a.s.d.s.Supervisor pool-11-thread-9 [INFO] Got an
> assignments from master, will start to sync with assignments:
> SupervisorAssignments(...)
> 2021-06-29 02:59:53.936 o.a.s.d.s.Slot SLOT_6702 [INFO] SLOT 6702: Assignment
> Changed from LocalAssignment(topology_id:TOPOLOGY_NAME,
> executors:[ExecutorInfo(task_start:23, task_end:23),
> ExecutorInfo(task_start:11, task_end:11)],
> resources:WorkerResources(mem_on_heap:3200.0, mem_off_heap:0.0, cpu:20.0,
> shared_mem_on_heap:0.0, shared_mem_off_heap:0.0,
> resources:{offheap.memory.mb=0.0, onheap.memory.mb=3200.0,
> cpu.pcore.percent=20.0}, shared_resources:{}), owner:root) to null
> 2021-06-29 02:59:53.939 o.a.s.d.s.Container SLOT_6702 [INFO] Killing
> d2ee514a-e40e-40fb-b119-59763f3bb95d-10.233.112.14:25976cac-9170-44ec-b835-099377cda893
> 2021-06-29 02:59:54.293 o.a.s.d.s.Slot SLOT_6708 [INFO] SLOT 6708: Assignment
> Changed from LocalAssignment(topology_id:TOPOLOGY_NAME,
> executors:[ExecutorInfo(task_start:10, task_end:10),
> ExecutorInfo(task_start:22, task_end:22)],
> resources:WorkerResources(mem_on_heap:3200.0, mem_off_heap:0.0, cpu:20.0,
> shared_mem_on_heap:0.0, shared_mem_off_heap:0.0,
> resources:{offheap.memory.mb=0.0, onheap.memory.mb=3200.0,
> cpu.pcore.percent=20.0}, shared_resources:{}), owner:root) to null
> 2021-06-29 02:59:54.293 o.a.s.d.s.Slot SLOT_6707 [INFO] SLOT 6707: Assignment
> Changed from LocalAssignment(topology_id:TOPOLOGY_NAME,
> executors:[ExecutorInfo(task_start:17, task_end:17),
> ExecutorInfo(task_start:29, task_end:29), ExecutorInfo(task_start:5,
> task_end:5)], resources:WorkerResources(mem_on_heap:6272.0, mem_off_heap:0.0,
> cpu:30.0, shared_mem_on_heap:0.0, shared_mem_off_heap:0.0,
> resources:{offheap.memory.mb=0.0, onheap.memory.mb=6272.0,
> cpu.pcore.percent=30.0}, shared_resources:{}), owner:root) to null
> 2021-06-29 02:59:54.296 o.a.s.d.s.Slot SLOT_6708 [INFO] STATE
> waiting-for-worker-start msInState: 70011 topo:TOPOLOGY_NAME
> worker:d06bb5c5-25e2-4557-8996-4d40045022d1 -> kill msInState: 0
> topo:TOPOLOGY_NAME worker:d06bb5c5-25e2-4557-8996-4d40045022d1
> 2021-06-29 02:59:54.296 o.a.s.d.s.Slot SLOT_6707 [INFO] STATE
> waiting-for-worker-start msInState: 70010 topo:TOPOLOGY_NAME
> worker:baef41a4-b5f6-4ea3-8868-5537dfba82f8 -> kill msInState: 0
> topo:TOPOLOGY_NAME worker:baef41a4-b5f6-4ea3-8868-5537dfba82f8
> 2021-06-29 02:59:54.298 o.a.s.d.s.Slot SLOT_6708 [INFO] SLOT 6708 all
> processes are dead...
2021-06-29 02:59:54.298 o.a.s.d.s.Container SLOT_6708 [INFO] Cleaning up d2ee514a-e40e-40fb-b119-59763f3bb95d-10.233.112.14:d06bb5c5-25e2-4557-8996-4d40045022d1
2021-06-29 02:59:54.298 o.a.s.d.s.AdvancedFSOps SLOT_6708 [INFO] Deleting path /storm/workers/d06bb5c5-25e2-4557-8996-4d40045022d1/pids/141225
2021-06-29 02:59:54.298 o.a.s.d.s.AdvancedFSOps SLOT_6708 [INFO] Deleting path /storm/workers/d06bb5c5-25e2-4557-8996-4d40045022d1/heartbeats
2021-06-29 03:00:06.452 o.a.s.d.s.AdvancedFSOps AsyncLocalizer Task Executor - 1 [INFO] Deleting path /storm/supervisor/stormdist/TOPOLOGY_NAME/stormjar.jar
2021-06-29 03:00:06.472 o.a.s.d.s.AdvancedFSOps AsyncLocalizer Task Executor - 1 [INFO] Deleting path /storm/supervisor/stormdist/TOPOLOGY_NAME/stormjar.jar.version
2021-06-29 03:00:06.472 o.a.s.d.s.AdvancedFSOps AsyncLocalizer Task Executor - 1 [INFO] Deleting path /storm/supervisor/stormdist/TOPOLOGY_NAME/resources
2021-06-29 03:00:06.472 o.a.s.l.LocalizedResourceRetentionSet AsyncLocalizer Task Executor - 1 [INFO] Deleted blob: TOPOLOGY_NAME-stormjar.jar (REMOVED FROM CLUSTER).
2021-06-29 03:00:06.475 o.a.s.d.s.AdvancedFSOps AsyncLocalizer Task Executor - 1 [INFO] Deleting path /storm/supervisor/stormdist/TOPOLOGY_NAME/stormconf.ser
2021-06-29 03:00:06.475 o.a.s.d.s.AdvancedFSOps AsyncLocalizer Task Executor - 1 [INFO] Deleting path /storm/supervisor/stormdist/TOPOLOGY_NAME/stormconf.ser.version
2021-06-29 03:00:06.475 o.a.s.l.LocalizedResourceRetentionSet AsyncLocalizer Task Executor - 1 [INFO] Deleted blob: TOPOLOGY_NAME-stormconf.ser (REMOVED FROM CLUSTER).
2021-06-29 03:00:06.477 o.a.s.d.s.AdvancedFSOps AsyncLocalizer Task Executor - 1 [INFO] Deleting path /storm/supervisor/stormdist/TOPOLOGY_NAME/stormcode.ser
2021-06-29 03:00:06.477 o.a.s.d.s.AdvancedFSOps AsyncLocalizer Task Executor - 1 [INFO] Deleting path /storm/supervisor/stormdist/TOPOLOGY_NAME/stormcode.ser.version
2021-06-29 03:00:06.478 o.a.s.l.LocalizedResourceRetentionSet AsyncLocalizer Task Executor - 1 [INFO] Deleted blob: TOPOLOGY_NAME-stormcode.ser (REMOVED FROM CLUSTER).
2021-06-29 03:00:06.478 o.a.s.d.s.AdvancedFSOps AsyncLocalizer Task Executor - 1 [INFO] Deleting path /storm/supervisor/stormdist/TOPOLOGY_NAME
2021-06-29 03:00:07.062 o.a.s.d.s.Supervisor pool-11-thread-10 [WARN] Topology config is not localized yet...
2021-06-29 03:00:07.063 o.a.s.t.ProcessFunction pool-11-thread-10 [ERROR] Internal error processing sendSupervisorWorkerHeartbeat
org.apache.storm.utils.WrappedNotAliveException: TOPOLOGY_NAME does not appear to be alive, you should probably exit
    at org.apache.storm.daemon.supervisor.Supervisor$1.sendSupervisorWorkerHeartbeat(Supervisor.java:448) ~[storm-server-2.2.0.jar:2.2.0]
    at org.apache.storm.generated.Supervisor$Processor$sendSupervisorWorkerHeartbeat.getResult(Supervisor.java:374) ~[storm-client-2.2.0.jar:2.2.0]
    at org.apache.storm.generated.Supervisor$Processor$sendSupervisorWorkerHeartbeat.getResult(Supervisor.java:353) ~[storm-client-2.2.0.jar:2.2.0]
    at org.apache.storm.thrift.ProcessFunction.process(ProcessFunction.java:38) [storm-shaded-deps-2.2.0.jar:2.2.0]
    at org.apache.storm.thrift.TBaseProcessor.process(TBaseProcessor.java:38) [storm-shaded-deps-2.2.0.jar:2.2.0]
    at org.apache.storm.security.auth.SimpleTransportPlugin$SimpleWrapProcessor.process(SimpleTransportPlugin.java:172) [storm-client-2.2.0.jar:2.2.0]
    at org.apache.storm.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:524) [storm-shaded-deps-2.2.0.jar:2.2.0]
    at org.apache.storm.thrift.server.Invocation.run(Invocation.java:18) [storm-shaded-deps-2.2.0.jar:2.2.0]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [?:?]
    at java.lang.Thread.run(Unknown Source) [?:?]
2021-06-29 03:00:07.064 o.a.s.t.ProcessFunction pool-11-thread-3 [ERROR] Internal error processing sendSupervisorWorkerHeartbeat
org.apache.storm.utils.WrappedNotAliveException: TOPOLOGY_NAME does not appear to be alive, you should probably exit
    at org.apache.storm.daemon.supervisor.Supervisor$1.sendSupervisorWorkerHeartbeat(Supervisor.java:448) ~[storm-server-2.2.0.jar:2.2.0]
    at org.apache.storm.generated.Supervisor$Processor$sendSupervisorWorkerHeartbeat.getResult(Supervisor.java:374) ~[storm-client-2.2.0.jar:2.2.0]
    at org.apache.storm.generated.Supervisor$Processor$sendSupervisorWorkerHeartbeat.getResult(Supervisor.java:353) ~[storm-client-2.2.0.jar:2.2.0]
    at org.apache.storm.thrift.ProcessFunction.process(ProcessFunction.java:38) [storm-shaded-deps-2.2.0.jar:2.2.0]
    at org.apache.storm.security.auth.SimpleTransportPlugin$SimpleWrapProcessor.process(SimpleTransportPlugin.java:172) [storm-client-2.2.0.jar:2.2.0]
    at org.apache.storm.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:524) [storm-shaded-deps-2.2.0.jar:2.2.0]
    at org.apache.storm.thrift.server.Invocation.run(Invocation.java:18) [storm-shaded-deps-2.2.0.jar:2.2.0]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [?:?]
    at java.lang.Thread.run(Unknown Source) [?:?]
2021-06-29 03:00:08.106 o.a.s.d.s.Supervisor pool-11-thread-9 [WARN] Topology config is not localized yet...
2021-06-29 03:00:08.107 o.a.s.t.ProcessFunction pool-11-thread-9 [ERROR] Internal error processing sendSupervisorWorkerHeartbeat
org.apache.storm.utils.WrappedNotAliveException: TOPOLOGY_NAME does not appear to be alive, you should probably exit
    at org.apache.storm.daemon.supervisor.Supervisor$1.sendSupervisorWorkerHeartbeat(Supervisor.java:448) ~[storm-server-2.2.0.jar:2.2.0]
    at org.apache.storm.generated.Supervisor$Processor$sendSupervisorWorkerHeartbeat.getResult(Supervisor.java:374) ~[storm-client-2.2.0.jar:2.2.0]
    at org.apache.storm.generated.Supervisor$Processor$sendSupervisorWorkerHeartbeat.getResult(Supervisor.java:353) ~[storm-client-2.2.0.jar:2.2.0]
    at org.apache.storm.thrift.ProcessFunction.process(ProcessFunction.java:38) [storm-shaded-deps-2.2.0.jar:2.2.0]
    at org.apache.storm.thrift.TBaseProcessor.process(TBaseProcessor.java:38) [storm-shaded-deps-2.2.0.jar:2.2.0]
    at org.apache.storm.security.auth.SimpleTransportPlugin$SimpleWrapProcessor.process(SimpleTransportPlugin.java:172) [storm-client-2.2.0.jar:2.2.0]
    at org.apache.storm.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:524) [storm-shaded-deps-2.2.0.jar:2.2.0]
    at org.apache.storm.thrift.server.Invocation.run(Invocation.java:18) [storm-shaded-deps-2.2.0.jar:2.2.0]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [?:?]
    at java.lang.Thread.run(Unknown Source) [?:?]
2021-06-29 03:00:08.108 o.a.s.d.s.Supervisor pool-11-thread-16 [WARN] Topology config is not localized yet...
2021-06-29 03:00:08.108 o.a.s.t.ProcessFunction pool-11-thread-16 [ERROR] Internal error processing sendSupervisorWorkerHeartbeat
org.apache.storm.utils.WrappedNotAliveException: TOPOLOGY_NAME does not appear to be alive, you should probably exit
    at org.apache.storm.daemon.supervisor.Supervisor$1.sendSupervisorWorkerHeartbeat(Supervisor.java:448) ~[storm-server-2.2.0.jar:2.2.0]
    at org.apache.storm.generated.Supervisor$Processor$sendSupervisorWorkerHeartbeat.getResult(Supervisor.java:374) ~[storm-client-2.2.0.jar:2.2.0]
    at org.apache.storm.generated.Supervisor$Processor$sendSupervisorWorkerHeartbeat.getResult(Supervisor.java:353) ~[storm-client-2.2.0.jar:2.2.0]
    at org.apache.storm.thrift.ProcessFunction.process(ProcessFunction.java:38) [storm-shaded-deps-2.2.0.jar:2.2.0]
    at org.apache.storm.thrift.TBaseProcessor.process(TBaseProcessor.java:38) [storm-shaded-deps-2.2.0.jar:2.2.0]
    at org.apache.storm.security.auth.SimpleTransportPlugin$SimpleWrapProcessor.process(SimpleTransportPlugin.java:172) [storm-client-2.2.0.jar:2.2.0]
    at org.apache.storm.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:524) [storm-shaded-deps-2.2.0.jar:2.2.0]
    at org.apache.storm.thrift.server.Invocation.run(Invocation.java:18) [storm-shaded-deps-2.2.0.jar:2.2.0]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [?:?]
{noformat}
*This error message repeated forever until we killed that worker process.*

--
This message was sent by Atlassian Jira
(v8.3.4#803005)