[
https://issues.apache.org/jira/browse/FLINK-21053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Weijie Guo updated FLINK-21053:
-------------------------------
Affects Version/s: 2.1.0
> Prevent potential RejectedExecutionExceptions in CheckpointCoordinator
> failing JM
> ---------------------------------------------------------------------------------
>
> Key: FLINK-21053
> URL: https://issues.apache.org/jira/browse/FLINK-21053
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Checkpointing
> Affects Versions: 2.1.0
> Reporter: Roman Khachatryan
> Priority: Minor
> Labels: auto-unassigned
> Fix For: 2.0.0
>
>
> In the past, there were multiple bugs caused by throwing/handling
> RejectedExecutionException in CheckpointCoordinator (FLINK-18290,
> FLINK-20992).
>
> I think this is still possible, because there are many places where an
> executor is passed to CompletableFuture.xxxAsync calls even though it may
> already be shut down.
>
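To make the failure mode concrete, here is a minimal standalone sketch (plain JDK, not Flink code; the class and method names are made up for illustration). It shows that CompletableFuture.runAsync throws RejectedExecutionException in the calling thread when the executor it is given has already been shut down and uses the default rejection policy.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

// Hypothetical demo class, not part of Flink.
class RejectedAfterShutdownDemo {

    /** Returns true if submitting async work to a shut-down executor was rejected. */
    static boolean submitAfterShutdownIsRejected() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        executor.shutdown();
        try {
            // runAsync hands the task to executor.execute(...), which rejects it
            // because the pool is shut down (default AbortPolicy).
            CompletableFuture.runAsync(() -> {}, executor);
            return false;
        } catch (RejectedExecutionException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("rejected: " + submitAfterShutdownIsRejected());
    }
}
```

If such a call sits on a code path without a matching catch, the exception propagates to whoever triggered that path.
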
> In FLINK-20992 we discussed two approaches to fix this.
> One approach is to check the executor state inside a synchronized block every
> time it is used.
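A rough sketch of what this first approach could look like, assuming a hypothetical wrapper (GuardedExecutor is not a Flink class) that guards both submission and shutdown with the same lock, so the executor cannot be shut down between the check and the submit:

```java
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;

// Hypothetical wrapper for illustration only.
class GuardedExecutor {
    private final Object lock = new Object();
    private final ExecutorService executor;

    GuardedExecutor(ExecutorService executor) {
        this.executor = executor;
    }

    /** Submits the task only if the executor is still running; otherwise returns empty. */
    Optional<CompletableFuture<Void>> tryRunAsync(Runnable task) {
        synchronized (lock) {
            if (executor.isShutdown()) {
                return Optional.empty();
            }
            return Optional.of(CompletableFuture.runAsync(task, executor));
        }
    }

    /** Shuts the executor down under the same lock used for submission. */
    void shutdown() {
        synchronized (lock) {
            executor.shutdown();
        }
    }
}
```

The cost is that every call site has to go through the guard, and the check only helps if shutdown is also performed under the same lock.
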
> The second approach is to:
> # Create the executors inside CheckpointCoordinator (both the io and timer
> thread pools)
> # Check isShutdown() in their RejectedExecution handlers (if the executor is
> shut down and the exception is a RejectedExecutionException, just log it;
> otherwise delegate to FatalExitExceptionHandler)
> # (this would allow removing such RejectedExecutionException checks from the
> coordinator code)
>
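The second approach could be sketched roughly as follows. This is only an illustration using plain JDK types: ShutdownAwareExecutors, newIoExecutor and the ignoredRejections counter are hypothetical names, a plain Thread.UncaughtExceptionHandler stands in for FatalExitExceptionHandler, and the pool sizing is arbitrary.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical factory for illustration only.
class ShutdownAwareExecutors {

    /** Counts rejections that were only logged (kept for illustration and testing). */
    static final AtomicInteger ignoredRejections = new AtomicInteger();

    /** Creates a single-threaded pool whose rejection handler tolerates shutdown. */
    static ThreadPoolExecutor newIoExecutor(Thread.UncaughtExceptionHandler fatalHandler) {
        RejectedExecutionHandler handler = (task, pool) -> {
            if (pool.isShutdown()) {
                // Expected while shutting down: log and drop the task.
                ignoredRejections.incrementAndGet();
                System.err.println("Executor is shut down, ignoring task: " + task);
            } else {
                // Rejection by a live executor is unexpected: escalate as fatal.
                fatalHandler.uncaughtException(
                        Thread.currentThread(),
                        new RejectedExecutionException("Task rejected: " + task));
            }
        };
        return new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>(), handler);
    }
}
```

One caveat with this design: a task dropped by the handler never runs, so a future returned by CompletableFuture.runAsync for it never completes. Callers that wait on such futures would still need to account for shutdown.
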