[ 
https://issues.apache.org/jira/browse/FLINK-6319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15976628#comment-15976628
 ] 

Stephan Ewen commented on FLINK-6319:
-------------------------------------

On a successful shutdown this already happens, because we first call 
{{quiesceAndAwaitPending()}}, which waits for in-flight timers.
{{shutdownNow()}} is only called on cancellation or failure, where no 
correctness guarantees are given. What does matter there is shutting down as 
fast as possible. That was the initial thinking.
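
To make the two paths concrete, here is a minimal sketch using only plain {{java.util.concurrent}}. It is not the Flink implementation: the class and method names are hypothetical, and {{shutdown()}} plus a bounded {{awaitTermination()}} merely stands in for what {{quiesceAndAwaitPending()}} achieves on the successful path.

```java
import java.util.List;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration class, not part of Flink.
class ShutdownPathsSketch {

    /**
     * Graceful path (assumed stand-in for quiesceAndAwaitPending): stop
     * accepting new timers, then wait for already submitted ones to finish.
     * Returns true if the executor terminated within the timeout.
     */
    static boolean quiesceAndAwait(ScheduledExecutorService timers, long timeoutMillis) {
        timers.shutdown();
        try {
            return timers.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            // Restore the interrupt flag and report that we did not finish waiting.
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /**
     * Cancellation / failure path: drop waiting timers and interrupt running
     * ones without waiting. Returns the number of tasks that never started.
     */
    static int cancelNow(ScheduledExecutorService timers) {
        List<Runnable> neverStarted = timers.shutdownNow();
        return neverStarted.size();
    }
}
```

With the default executor policies, a one-shot timer scheduled before {{quiesceAndAwait}} still fires, whereas {{cancelNow}} hands it back unexecuted.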

I think the fact that the {{LocalBufferPool}} is destroyed before the latency 
marker emission timer task has completed should not matter on cancellation.
If the issue is polluted logs, then my take is that the logging is in the 
wrong place: it sits in code that is unaware of the context (whether the error 
means something or not).

> Add timeout when shutting SystemProcessingTimeService down
> ----------------------------------------------------------
>
>                 Key: FLINK-6319
>                 URL: https://issues.apache.org/jira/browse/FLINK-6319
>             Project: Flink
>          Issue Type: Improvement
>          Components: Local Runtime
>    Affects Versions: 1.3.0
>            Reporter: Till Rohrmann
>            Priority: Minor
>
> A user noted that we simply call {{shutdownNow}} on the 
> {{SystemProcessingTimeService}}'s {{ScheduledThreadPoolExecutor}} when 
> calling {{SystemProcessingTimeService.shutdownService}}. {{shutdownNow}} will 
> halt all waiting tasks but it won't wait until the currently running tasks 
> have been completed. This can lead to unwanted runtime behaviours such as 
> wrong termination orders when shutting down tasks (as reported in 
> https://issues.apache.org/jira/browse/FLINK-4973?focusedCommentId=15965884&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15965884).
> I propose to add a small timeout to wait for currently running tasks to 
> complete. Even though this problem cannot be completely solved since timer 
> tasks might take longer than the specified timeout, a timeout for waiting for 
> running tasks to complete will mitigate the problem.
> We can do this by calling {{timerService.awaitTermination(timeout, 
> timeoutUnit);}} after the {{shutdownNow}} call.
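
For reference, the proposal quoted above could look roughly like the following sketch. It uses only standard {{java.util.concurrent}} calls; the class name and the {{shutdownAndAwait}} helper are hypothetical, and the parameter names simply follow the wording of the issue. It is not the actual {{SystemProcessingTimeService}} code.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration class, not part of Flink.
class ShutdownWithTimeoutSketch {

    /**
     * Cancels waiting timers, interrupts running ones, then waits a bounded
     * time for the running ones to complete. Returns true if the executor
     * terminated within the timeout, false otherwise.
     */
    static boolean shutdownAndAwait(
            ScheduledExecutorService timerService, long timeout, TimeUnit timeoutUnit) {
        timerService.shutdownNow();
        try {
            return timerService.awaitTermination(timeout, timeoutUnit);
        } catch (InterruptedException e) {
            // Preserve the interrupt for the caller; we stop waiting here.
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        ScheduledExecutorService timerService = Executors.newSingleThreadScheduledExecutor();
        CountDownLatch started = new CountDownLatch(1);
        // A timer task that is already running when shutdown begins.
        timerService.execute(() -> {
            started.countDown();
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                // Interrupted by shutdownNow(); fall through and finish.
            }
        });
        started.await();
        boolean terminated = shutdownAndAwait(timerService, 1, TimeUnit.SECONDS);
        System.out.println("terminated within timeout: " + terminated);
    }
}
```

A {{false}} return value corresponds to the case the issue already mentions: a timer task that runs longer than the timeout is still not waited for.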



