GitHub user mateiz opened a pull request:
https://github.com/apache/incubator-spark/pull/641
SPARK-1124: Fix infinite retries of reduce stage when a map stage failed
In the previous code, if you had a failing map stage and then tried to run
reduce stages on it repeatedly, the first reduce stage would fail correctly,
but the later ones would mistakenly believe that all map outputs are available
and start failing infinitely with fetch failures from "null". See
https://spark-project.atlassian.net/browse/SPARK-1124 for an example.
This PR also makes a small code style cleanup in a spot that had a variable named
"s" and some convoluted map manipulation.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/mateiz/incubator-spark spark-1124-master
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-spark/pull/641.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #641
----
commit cd32d5e4dee1291e4509e5965322b7ffe620b1f3
Author: Matei Zaharia <[email protected]>
Date: 2014-02-24T07:45:48Z
SPARK-1124: Fix infinite retries of reduce stage when a map stage failed
In the previous code, if you had a failing map stage and then tried to
run reduce stages on it repeatedly, the first reduce stage would fail
correctly, but the later ones would mistakenly believe that all map
outputs are available and start failing infinitely with fetch failures
from "null".
----
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. To do so, please top-post your response.
If your project does not have this feature enabled and you would like it to be, or if the
feature is enabled but not working, please contact infrastructure at
[email protected] or file a JIRA ticket with INFRA.
---