[ 
https://issues.apache.org/jira/browse/FLINK-18663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17164229#comment-17164229
 ] 

Till Rohrmann edited comment on FLINK-18663 at 7/24/20, 7:42 AM:
-----------------------------------------------------------------

One suspicion I have is the following: When calling 
{{AbstractHandler.respondAsLeader}} we don't check whether the handler has been 
terminated or not. Hence the following might happen:

1. We receive a REST request and we call into {{AbstractHandler.channelRead0}} 
(inherited from {{LeaderRetrievalHandler}})
2. The {{RestServerEndpoint}} is being shut down, which closes all handlers
3. Since no requests are registered in 
{{AbstractHandler.inFlightRequestTracker}}, we immediately close the handlers
4. After having obtained the leader gateway, we call into 
{{AbstractHandler.respondAsLeader}} which registers the request in the 
{{inFlightRequestTracker}} but does not check whether the handler has been shut 
down.

If this should indeed be the problem, then I would suggest that we check under 
the {{lock}} whether {{terminationFuture}} has been set and also add the 
request to the {{inFlightRequestTracker}} under the lock.
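
To make the suggestion concrete, here is a minimal, self-contained sketch of the "check and register under the same lock" idea. It is not the actual Flink code: the class {{GuardedHandlerSketch}}, the methods {{tryRegisterRequest}} and {{deregisterRequest}}, and the plain counter standing in for {{inFlightRequestTracker}} are all assumptions made for illustration.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical stand-in for a handler; names are illustrative, not Flink's API.
class GuardedHandlerSketch {
    private final Object lock = new Object();

    // Null until closeAsync() has been called; guarded by lock.
    private CompletableFuture<Void> terminationFuture;

    // Simplified stand-in for the in-flight request tracker; guarded by lock.
    private int inFlightRequests;

    /** Registers a request unless shutdown has already started. */
    boolean tryRegisterRequest() {
        synchronized (lock) {
            if (terminationFuture != null) {
                // Shutdown has begun: reject instead of registering a request
                // that nobody would wait for.
                return false;
            }
            inFlightRequests++;
            return true;
        }
    }

    /** Deregisters a request and finishes a pending shutdown if it was the last one. */
    void deregisterRequest() {
        synchronized (lock) {
            inFlightRequests--;
            if (inFlightRequests == 0 && terminationFuture != null) {
                terminationFuture.complete(null);
            }
        }
    }

    /** Starts shutdown; the returned future completes once no requests are in flight. */
    CompletableFuture<Void> closeAsync() {
        synchronized (lock) {
            if (terminationFuture == null) {
                terminationFuture = new CompletableFuture<>();
                if (inFlightRequests == 0) {
                    terminationFuture.complete(null);
                }
            }
            return terminationFuture;
        }
    }
}
```

Because the termination check and the registration happen in one critical section, a request either registers before shutdown starts (and shutdown waits for it) or is rejected. The window described in steps 2 to 4 no longer exists in this sketch.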


> Fix Flink On YARN AM not exit
> -----------------------------
>
>                 Key: FLINK-18663
>                 URL: https://issues.apache.org/jira/browse/FLINK-18663
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / REST
>    Affects Versions: 1.10.0, 1.10.1, 1.11.0
>            Reporter: tartarus
>            Priority: Critical
>              Labels: pull-request-available
>         Attachments: 110.png, 111.png, 
> C49A7310-F932-451B-A203-6D17F3140C0D.png, e18e00dd6664485c2ff55284fe969474.png
>
>
> AbstractHandler throws an NPE because FlinkHttpObjectAggregator is null.
> When a REST handler throws an exception, the following code is executed:
> {code:java}
> private CompletableFuture<Void> handleException(Throwable throwable, ChannelHandlerContext ctx, HttpRequest httpRequest) {
>       FlinkHttpObjectAggregator flinkHttpObjectAggregator = ctx.pipeline().get(FlinkHttpObjectAggregator.class);
>       int maxLength = flinkHttpObjectAggregator.maxContentLength() - OTHER_RESP_PAYLOAD_OVERHEAD;
>       if (throwable instanceof RestHandlerException) {
>               RestHandlerException rhe = (RestHandlerException) throwable;
>               String stackTrace = ExceptionUtils.stringifyException(rhe);
>               String truncatedStackTrace = Ascii.truncate(stackTrace, maxLength, "...");
>               if (log.isDebugEnabled()) {
>                       log.error("Exception occurred in REST handler.", rhe);
>               } else {
>                       log.error("Exception occurred in REST handler: {}", rhe.getMessage());
>               }
>               return HandlerUtils.sendErrorResponse(
>                       ctx,
>                       httpRequest,
>                       new ErrorResponseBody(truncatedStackTrace),
>                       rhe.getHttpResponseStatus(),
>                       responseHeaders);
>       } else {
>               log.error("Unhandled exception.", throwable);
>               String stackTrace = String.format("<Exception on server side:%n%s%nEnd of exception on server side>",
>                       ExceptionUtils.stringifyException(throwable));
>               String truncatedStackTrace = Ascii.truncate(stackTrace, maxLength, "...");
>               return HandlerUtils.sendErrorResponse(
>                       ctx,
>                       httpRequest,
>                       new ErrorResponseBody(Arrays.asList("Internal server error.", truncatedStackTrace)),
>                       HttpResponseStatus.INTERNAL_SERVER_ERROR,
>                       responseHeaders);
>       }
> }
> {code}
> However, flinkHttpObjectAggregator is null in some cases, so handleException
> throws an NPE. This method is called from AbstractHandler#respondAsLeader:
> {code:java}
> requestProcessingFuture
>       .whenComplete((Void ignored, Throwable throwable) -> {
>               if (throwable != null) {
>                       handleException(ExceptionUtils.stripCompletionException(throwable), ctx, httpRequest)
>                               .whenComplete((Void ignored2, Throwable throwable2) -> finalizeRequestProcessing(finalUploadedFiles));
>               } else {
>                       finalizeRequestProcessing(finalUploadedFiles);
>               }
>       });
> {code}
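> Because the NPE is thrown before finalizeRequestProcessing is chained, the
> request is never deregistered. As a result, the InFlightRequestTracker can
> never be cleared, so the CompletableFuture returned by the handler's
> closeAsync never completes.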
> !C49A7310-F932-451B-A203-6D17F3140C0D.png!
> !e18e00dd6664485c2ff55284fe969474.png!
>  
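
The failure mode described in the report can be reproduced in isolation. The sketch below is only an illustration under stated assumptions: {{FinalizeOnFailureSketch}}, {{unguarded}}, {{guarded}}, and the {{inFlight}} counter are hypothetical stand-ins, and the guarded variant shows one possible defensive pattern, not necessarily the fix adopted for this issue.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical reproduction; none of these names exist in Flink.
class FinalizeOnFailureSketch {
    // Stand-in for the in-flight request tracker.
    static final AtomicInteger inFlight = new AtomicInteger();

    // Stand-in for handleException: dereferences the aggregator without a null check.
    static CompletableFuture<Void> handleException(Throwable t, Object aggregator) {
        aggregator.hashCode(); // throws NullPointerException when aggregator is null
        return CompletableFuture.completedFuture(null);
    }

    // Mirrors the reported structure: cleanup is chained onto handleException's result.
    static void unguarded(CompletableFuture<Void> requestFuture) {
        inFlight.incrementAndGet();
        requestFuture.whenComplete((ignored, throwable) -> {
            if (throwable != null) {
                // The NPE is thrown here, before the inner whenComplete is attached,
                // so the decrement below never runs.
                handleException(throwable, null)
                        .whenComplete((i2, t2) -> inFlight.decrementAndGet());
            } else {
                inFlight.decrementAndGet();
            }
        });
    }

    // Defensive variant: cleanup runs even if handleException throws synchronously.
    static void guarded(CompletableFuture<Void> requestFuture) {
        inFlight.incrementAndGet();
        requestFuture.whenComplete((ignored, throwable) -> {
            CompletableFuture<Void> response;
            try {
                response = throwable != null
                        ? handleException(throwable, null)
                        : CompletableFuture.<Void>completedFuture(null);
            } catch (Throwable t) {
                // Fall back to a completed future so the cleanup is still chained.
                response = CompletableFuture.completedFuture(null);
            }
            response.whenComplete((i2, t2) -> inFlight.decrementAndGet());
        });
    }
}
```

In the unguarded variant the exception thrown inside the callback is captured by the future that {{whenComplete}} returns, which nobody inspects, so the counter stays above zero and a shutdown that waits for it would hang.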


