Github user zentol commented on a diff in the pull request:

    https://github.com/apache/flink/pull/6222#discussion_r199248513
  
    --- Diff: 
flink-runtime/src/main/java/org/apache/flink/runtime/rest/handler/job/JobSubmitHandler.java
 ---
    @@ -66,6 +67,9 @@ public JobSubmitHandler(
                }
     
                return gateway.submitJob(jobGraph, timeout)
    -                   .thenApply(ack -> new JobSubmitResponseBody("/jobs/" + 
jobGraph.getJobID()));
    +                   .thenApply(ack -> new JobSubmitResponseBody("/jobs/" + 
jobGraph.getJobID()))
    +                   .exceptionally(exception -> {
    +                           throw new CompletionException(new 
RestHandlerException("Job submission failed.", 
HttpResponseStatus.INTERNAL_SERVER_ERROR, exception));
    --- End diff --
    
    I'm not quite fond of the idea. As alluded in the PR description and JIRA I 
agree that the existing error message isn't _helpful_, yet better than the 
current state.
    
    I rather like that so far the REST API has control over the error messages. 
This ensures that the user only sees messages that were actually meant for him.
    
    In contrast, exception messages are pretty much arbitrary. They may change 
at will, the audience isn't defined (user vs dev), may only helpful if the 
fully stack trace is present, often don't have any message at all (see usages 
of `Preconditions`, or NPEs) and typically only describe what went wrong, not 
why, how to fix it or if it even was a user-error. Given that this would break 
down the barrier between internal/user-facing messages you obviously also run 
into cases where users have _no idea_ what the message even means. Finally you 
end up with mismatches between the error message and error code.
    
    To me the underlying issue is that `submitJob` funnels all manner of 
exceptions into a `FlinkException/JobSubmissionException` that we can't do much 
with. Neither can we categorize them in any way, nor distinguish between who's 
responsible (user vs Flink) nor when in the process the failure occurred.
    Without diving into the implementation you don't _even know which 
exceptions are thrown_, but i suppose this is a general issue of 
`CompletableFutures`.


---

Reply via email to