[jira] [Updated] (FLINK-24053) stop with savepoint timeout

Jira Mon, 30 Aug 2021 04:47:23 -0700


     [ 
https://issues.apache.org/jira/browse/FLINK-24053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


刘方奇 updated FLINK-24053:
------------------------
    Description: 
Hello, whenever we use the "stop with savepoint" feature, we hit a bug.

We always wait 5 minutes for the application to end, and then the application 
throws a timeout exception.

 
{code:java}
java.util.concurrent.TimeoutException: null
    at org.apache.flink.runtime.concurrent.FutureUtils$Timeout.run(FutureUtils.java:1036) ~[classes/:?]
    at org.apache.flink.runtime.concurrent.DirectExecutorService.execute(DirectExecutorService.java:211) ~[classes/:?]
    at org.apache.flink.runtime.concurrent.FutureUtils.lambda$orTimeout$14(FutureUtils.java:445) ~[classes/:?]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_251]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_251]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) ~[?:1.8.0_251]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) ~[?:1.8.0_251]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_251]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_251]
    at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_251]
{code}
We found that the call to 
org.apache.flink.runtime.rest.handler.job.savepoints.SavepointHandlers.SavepointStatusHandler.closeHandlerAsync()
 always times out, and its timeout is set to 5 minutes.
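
To illustrate the symptom, here is a minimal, self-contained sketch of how a timeout guard on a close future can produce a TimeoutException like the one in the trace (note the FutureUtils.orTimeout frames). This is not Flink's code: the class CloseTimeoutSketch and its helpers orTimeout and describeClose are hypothetical names written only for this example, and a short millisecond timeout stands in for the 5 minutes.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch, not Flink's implementation.
public class CloseTimeoutSketch {

    /** Fails the given future with a TimeoutException unless it completes in time. */
    static <T> CompletableFuture<T> orTimeout(
            CompletableFuture<T> future,
            long timeout,
            TimeUnit unit,
            ScheduledExecutorService timer) {
        ScheduledFuture<?> timeoutTask = timer.schedule(
                () -> { future.completeExceptionally(new TimeoutException()); },
                timeout,
                unit);
        // Once the future completes (normally or not), the timer task is no longer needed.
        future.whenComplete((value, error) -> timeoutTask.cancel(false));
        return future;
    }

    /** Waits for a close future guarded by a timeout and reports how it ended. */
    static String describeClose(CompletableFuture<Void> closeFuture, long timeoutMillis) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        try {
            orTimeout(closeFuture, timeoutMillis, TimeUnit.MILLISECONDS, timer).get();
            return "closed";
        } catch (ExecutionException e) {
            return e.getCause() instanceof TimeoutException ? "timed out" : "failed";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        } finally {
            timer.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // A close future that nothing ever completes: the caller waits for the
        // whole timeout and then sees a TimeoutException.
        System.out.println(describeClose(new CompletableFuture<>(), 200));
        // A close future that is already complete returns immediately.
        System.out.println(describeClose(CompletableFuture.completedFuture(null), 200));
    }
}
```

In this sketch, the first call waits for the full timeout before failing, which matches the delay we observe. The sketch does not show why the real close future never completes.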

Our question is whether this handler's close matters, because the handler only 
serves another handler, 
org.apache.flink.runtime.rest.handler.job.savepoints.SavepointHandlers.StopWithSavepointHandler,
 which has already been closed. Should we skip this close?

PS: We saw no problems when we tested code that skips the handler's close.

 

 

  was:
Hello, when we use the "stop with savepoint" feature, we always meet a bug.

We will always cost 5 mins waiting the application to end, then the application 
will throw a timeout exception.

 
{code:java}
// code placeholder
java.util.concurrent.TimeoutException: 
nulljava.util.concurrent.TimeoutException: null at 
org.apache.flink.runtime.concurrent.FutureUtils$Timeout.run(FutureUtils.java:1036)
 ~[classes/:?] at 
org.apache.flink.runtime.concurrent.DirectExecutorService.execute(DirectExecutorService.java:211)
 ~[classes/:?] at 
org.apache.flink.runtime.concurrent.FutureUtils.lambda$orTimeout$14(FutureUtils.java:445)
 ~[classes/:?] at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
~[?:1.8.0_251] at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
~[?:1.8.0_251] at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
 ~[?:1.8.0_251] at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
 ~[?:1.8.0_251] at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
~[?:1.8.0_251] at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
~[?:1.8.0_251] at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_251]
{code}
And we found there was always the function called 
org.apache.flink.runtime.rest.handler.job.savepoints.SavepointHandlers.SavepointStatusHandler.closeHandlerAsync()
 run timeout, and its timeout setting is 5mins.

There was a question that the handler 's close may be not important, cause the 
handler serves other handler called 
org.apache.flink.runtime.rest.handler.job.savepoints.SavepointHandlers.StopWithSavepointHandler
 which was already closed.So should we skip this close ?

PS : There was no problem when we test the code that skip the handler 's close.

 

 


> stop with savepoint timeout
> ---------------------------
>
>                 Key: FLINK-24053
>                 URL: https://issues.apache.org/jira/browse/FLINK-24053
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Checkpointing, Runtime / REST
>    Affects Versions: 1.11.0, 1.12.0, 1.13.0
>            Reporter: 刘方奇
>            Priority: Major
>
> Hello, whenever we use the "stop with savepoint" feature, we hit a bug.
> We always wait 5 minutes for the application to end, and then the 
> application throws a timeout exception.
>  
> {code:java}
> java.util.concurrent.TimeoutException: null
>     at org.apache.flink.runtime.concurrent.FutureUtils$Timeout.run(FutureUtils.java:1036) ~[classes/:?]
>     at org.apache.flink.runtime.concurrent.DirectExecutorService.execute(DirectExecutorService.java:211) ~[classes/:?]
>     at org.apache.flink.runtime.concurrent.FutureUtils.lambda$orTimeout$14(FutureUtils.java:445) ~[classes/:?]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_251]
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_251]
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) ~[?:1.8.0_251]
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) ~[?:1.8.0_251]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_251]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_251]
>     at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_251]
> {code}
> We found that the call to 
> org.apache.flink.runtime.rest.handler.job.savepoints.SavepointHandlers.SavepointStatusHandler.closeHandlerAsync()
>  always times out, and its timeout is set to 5 minutes.
> Our question is whether this handler's close matters, because the handler 
> only serves another handler, 
> org.apache.flink.runtime.rest.handler.job.savepoints.SavepointHandlers.StopWithSavepointHandler,
>  which has already been closed. Should we skip this close?
> PS: We saw no problems when we tested code that skips the handler's 
> close.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
