[ https://issues.apache.org/jira/browse/FLINK-17514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated FLINK-17514:
-----------------------------------
    Labels: pull-request-available  (was: )

> TaskCancelerWatchdog does not kill TaskManager
> ----------------------------------------------
>
>                 Key: FLINK-17514
>                 URL: https://issues.apache.org/jira/browse/FLINK-17514
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Task
>    Affects Versions: 1.10.1, 1.11.0
>            Reporter: Aljoscha Krettek
>            Assignee: Till Rohrmann
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 1.10.1, 1.11.0
>
>
> The watchdog reports a fatal error using {{taskManager.notifyFatalError(msg, null)}}. This should normally lead to the TaskManager being terminated. The code introduced in FLINK-16225 tries to look at the passed exception and will eventually fail with a {{NullPointerException}}, which prevents the TaskManager from being terminated.
> Stacktrace:
> {code:java}
> 2020-05-05 09:43:01,588 ERROR org.apache.flink.runtime.taskmanager.Task - Task did not exit gracefully within 180 + seconds.
> 2020-05-05 09:43:01,588 ERROR org.apache.flink.runtime.taskexecutor.TaskExecutor - Task did not exit gracefully within 180 + seconds.
> 2020-05-05 09:43:01,588 ERROR org.apache.flink.runtime.taskmanager.Task - Error in Task Cancellation Watch Dog
> java.lang.NullPointerException
>     at org.apache.flink.util.ExceptionUtils.isOutOfMemoryErrorWithMessageStartingWith(ExceptionUtils.java:186)
>     at org.apache.flink.util.ExceptionUtils.isMetaspaceOutOfMemoryError(ExceptionUtils.java:170)
>     at org.apache.flink.util.ExceptionUtils.enrichTaskManagerOutOfMemoryError(ExceptionUtils.java:144)
>     at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.onFatalError(TaskManagerRunner.java:249)
>     at org.apache.flink.runtime.taskexecutor.TaskExecutor$TaskManagerActionsImpl.notifyFatalError(TaskExecutor.java:1751)
>     at org.apache.flink.runtime.taskmanager.Task$TaskCancelerWatchDog.run(Task.java:1514)
>     at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
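
For illustration, the failure mode described in the issue (a fatal-error path that inspects a throwable which may be null) can be reproduced outside Flink with a small sketch. Everything below is hypothetical: the class `FatalErrorSketch`, both method names, and the "Metaspace" prefix are stand-ins chosen for the example. It is not the code in `ExceptionUtils` and not necessarily how the issue was fixed. It only shows why an unguarded dereference throws when `null` is passed, and one null-tolerant way such a check could be written.

```java
// Hypothetical sketch; not Flink's ExceptionUtils and not the actual fix.
final class FatalErrorSketch {

    // Illustrative prefix only; assumed for the example.
    private static final String PREFIX = "Metaspace";

    // Unguarded variant: dereferences t directly, so a null throwable
    // (as in notifyFatalError(msg, null)) causes a NullPointerException.
    static boolean unsafeIsMetaspaceOom(Throwable t) {
        return t.getMessage() != null && t.getMessage().startsWith(PREFIX);
    }

    // Null-tolerant variant: instanceof evaluates to false for null,
    // so t is never dereferenced when no throwable was supplied.
    static boolean safeIsMetaspaceOom(Throwable t) {
        return t instanceof OutOfMemoryError
                && t.getMessage() != null
                && t.getMessage().startsWith(PREFIX);
    }

    public static void main(String[] args) {
        System.out.println("null-tolerant check on null: " + safeIsMetaspaceOom(null));
        try {
            unsafeIsMetaspaceOom(null);
        } catch (NullPointerException e) {
            // In the scenario from the issue, an exception at this point would
            // stop the handler before it could terminate the process.
            System.out.println("unguarded check on null threw NullPointerException");
        }
    }
}
```

The design point is that a fatal-error handler should tolerate a missing cause, because any exception thrown while enriching or classifying the error keeps the handler from reaching the step that shuts the process down.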