[ https://issues.apache.org/jira/browse/FLINK-5759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15861024#comment-15861024 ]
ASF GitHub Bot commented on FLINK-5759:
---------------------------------------
Github user StefanRRichter commented on a diff in the pull request:
https://github.com/apache/flink/pull/3290#discussion_r100502133
--- Diff: flink-mesos/src/main/java/org/apache/flink/mesos/runtime/clusterframework/MesosApplicationMasterRunner.java ---
    @@ -216,11 +220,11 @@ protected int runPrivileged(Configuration config, Configuration dynamicPropertie
         futureExecutor = Executors.newScheduledThreadPool(
             numberProcessors,
    -        new NamedThreadFactory("mesos-jobmanager-future-", "-thread-"));
    +        new ExecutorThreadFactory("mesos-jobmanager-future"));
--- End diff --
Just wondering if 'akkaExecutor' and 'mesos-jobmanager-akka' (or
'coordinationFutureExecutor' if we want to be more general than 'akka') would
carry more information for people not familiar with the code. As far as I can
see, this pool is only used by Akka, whereas the name could imply that it is
somehow used for general futures or even async user code.
> Set an UncaughtExceptionHandler for all Thread Pools in JobManager
> ------------------------------------------------------------------
>
> Key: FLINK-5759
> URL: https://issues.apache.org/jira/browse/FLINK-5759
> Project: Flink
> Issue Type: Bug
> Components: JobManager
> Affects Versions: 1.2.0
> Reporter: Stephan Ewen
> Assignee: Stephan Ewen
> Fix For: 1.3.0
>
>
> Currently, the thread pools of the {{JobManager}} do not have any
> {{UncaughtExceptionHandler}}.
> While uncaught exceptions are rare (Flink handles exceptions aggressively in
> most places), when exceptions slip through in these threads (which execute
> future responses and delayed actions), the JobManager may be in an
> inconsistent state and not function properly any more.
> We should add a handler that results in a process kill in the case of
> uncaught exceptions. Letting the JobManager be restarted by the respective
> cluster framework is the only guaranteed way to be safe.
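
The idea in the issue description can be sketched as a thread factory that installs a process-killing handler on every thread it creates. This is a minimal illustration under stated assumptions, not Flink's actual ExecutorThreadFactory: the class name FatalExitThreadFactory, the FATAL_EXIT handler, the "-thread-" name suffix, and the exit code are all hypothetical. The second constructor takes a custom handler so the behavior can be exercised without terminating the JVM.

```java
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch; not the ExecutorThreadFactory from the pull request. */
final class FatalExitThreadFactory implements ThreadFactory {

    /** Logs the uncaught error, then terminates the JVM so the cluster framework can restart it. */
    public static final Thread.UncaughtExceptionHandler FATAL_EXIT = (thread, error) -> {
        try {
            System.err.println("FATAL: uncaught exception in thread '" + thread.getName() + "'");
            error.printStackTrace();
        } finally {
            // halt() skips shutdown hooks, so a hanging hook cannot keep a broken process alive.
            // The exit code is an arbitrary choice for this sketch.
            Runtime.getRuntime().halt(-17);
        }
    };

    private final String poolName;
    private final Thread.UncaughtExceptionHandler handler;
    private final AtomicInteger threadNumber = new AtomicInteger(1);

    /** Creates a factory whose threads kill the process on uncaught exceptions. */
    FatalExitThreadFactory(String poolName) {
        this(poolName, FATAL_EXIT);
    }

    /** Creates a factory with a caller-supplied handler (useful for tests). */
    FatalExitThreadFactory(String poolName, Thread.UncaughtExceptionHandler handler) {
        this.poolName = poolName;
        this.handler = handler;
    }

    @Override
    public Thread newThread(Runnable runnable) {
        Thread t = new Thread(runnable, poolName + "-thread-" + threadNumber.getAndIncrement());
        t.setDaemon(true);
        t.setUncaughtExceptionHandler(handler);
        return t;
    }
}
```

Such a factory could be passed to Executors.newScheduledThreadPool(numberProcessors, new FatalExitThreadFactory("mesos-jobmanager-future")). Note that the handler only fires for exceptions that actually escape a thread's run() method: tasks passed to submit(), and every task run by a scheduled pool, are wrapped in a future that captures the exception instead.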