[
https://issues.apache.org/jira/browse/FLINK-1556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14322884#comment-14322884
]
ASF GitHub Bot commented on FLINK-1556:
---------------------------------------
GitHub user tillrohrmann opened a pull request:
https://github.com/apache/flink/pull/406
[FLINK-1556] Corrects faulty JobClient behaviour in case of a submission
failure
If an error occurs during job submission, a ```SubmissionFailure``` is
sent to the ```JobClient```. Previously, the ```JobClient``` reacted by
terminating itself and forwarding the failure to the ```Client```. However, a
submission failure does not necessarily mean that the job has reached a
terminal state, because the failure handling is executed asynchronously.
The ```JobClient``` now waits until it receives a ```JobResult``` message
indicating that the job has completed and all resources have been properly
returned.
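The changed behaviour can be illustrated with a minimal, self-contained Java sketch. It is not the code from this pull request: the class ```JobClientSketch```, its ```onMessage``` method, the ```CompletableFuture``` used in place of the reply to the ```Client```, and the fields of the two message classes are all hypothetical stand-ins. The sketch also assumes that messages are delivered one at a time, as they would be to an actor. The point it shows is that a ```SubmissionFailure``` is only remembered, and the failure is reported once the ```JobResult``` arrives.

```java
import java.util.concurrent.CompletableFuture;

/** Hypothetical sketch of the new behaviour; not Flink's actual JobClient. */
public class JobClientSketch {

    /** Stand-in for the SubmissionFailure message (field is an assumption). */
    public static final class SubmissionFailure {
        final Throwable cause;
        public SubmissionFailure(Throwable cause) { this.cause = cause; }
    }

    /** Stand-in for the JobResult message: the job reached a terminal state. */
    public static final class JobResult {
        final boolean success;
        public JobResult(boolean success) { this.success = success; }
    }

    // Stand-in for the answer that is eventually sent back to the Client.
    private final CompletableFuture<String> outcome = new CompletableFuture<>();

    // Submission error, remembered until the job is known to be terminal.
    private Throwable submissionError;

    /** Handles one message at a time (single-threaded delivery is assumed). */
    public void onMessage(Object message) {
        if (message instanceof SubmissionFailure) {
            // Old behaviour would report the failure here and stop.
            // New behaviour: only remember the cause and keep waiting.
            submissionError = ((SubmissionFailure) message).cause;
        } else if (message instanceof JobResult) {
            // The job is terminal, so it is now safe to answer the Client.
            if (submissionError != null) {
                outcome.completeExceptionally(submissionError);
            } else if (((JobResult) message).success) {
                outcome.complete("FINISHED");
            } else {
                outcome.completeExceptionally(new RuntimeException("job failed"));
            }
        }
    }

    public CompletableFuture<String> outcome() {
        return outcome;
    }

    public static void main(String[] args) {
        JobClientSketch client = new JobClientSketch();
        client.onMessage(new SubmissionFailure(new IllegalStateException("submission failed")));
        // Prints "done after SubmissionFailure: false"
        System.out.println("done after SubmissionFailure: " + client.outcome().isDone());
        client.onMessage(new JobResult(false));
        // Prints "done after JobResult: true"
        System.out.println("done after JobResult: " + client.outcome().isDone());
    }
}
```

In this sketch the caller only sees the failure after the terminal message, which is when a follow-up job could be submitted without racing against the cleanup of the failed one.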
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/tillrohrmann/flink minorFixes
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/flink/pull/406.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #406
----
commit 2f32e9c6b87b8e295f792c04306d78fbb858f80d
Author: Till Rohrmann
Date: 2015-02-16T09:17:21Z
[FLINK-1556] [runtime] Corrects faulty JobClient behaviour in case of a
submission failure
----
> JobClient does not wait until a job failed completely if submission exception
> -----------------------------------------------------------------------------
>
> Key: FLINK-1556
> URL: https://issues.apache.org/jira/browse/FLINK-1556
> Project: Flink
> Issue Type: Bug
> Reporter: Till Rohrmann
>
> If an exception occurs during job submission, the {{JobClient}} receives a
> {{SubmissionFailure}}. Upon receiving this message, the {{JobClient}}
> terminates itself and returns the error to the {{Client}}. This indicates to
> the user that the job has failed completely, which is not necessarily
> true.
> If the user submits another job directly after such a failure, some slots
> of the previously failed job might not have been returned yet. This
> can lead to a {{NoResourceAvailableException}}.
> We can solve this problem by making the {{JobClient}} wait until the job
> failure has completed.