Github user squito commented on the pull request:

    https://github.com/apache/spark/pull/8090#issuecomment-129862901
  
    Thanks @carsonwang, I think I see -- it seems the key part is that the stage is 
skipped (because the shuffle map output was computed by a different stage, and 
that output can then be reused), but on retry, the skipped stage is the one which 
has to compute the missing partitions.  I'm really glad you caught this case; 
it's pretty complicated, definitely not something I think about regularly, and I 
don't think we even have any tests which cover it.
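    
    To make the scenario concrete, here is a minimal sketch (local mode, plain RDD 
API only, not code from this PR -- the object and app names are just for 
illustration) of how a map stage ends up skipped in the first place: two jobs 
share the same shuffle dependency, so the second job reuses the shuffle output 
the first job already wrote.
    
    ```scala
    import org.apache.spark.{SparkConf, SparkContext}

    object SkippedStageSketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(
          new SparkConf().setMaster("local[2]").setAppName("skipped-stage-sketch"))
        try {
          val pairs = sc.parallelize(1 to 1000, numSlices = 4).map(i => (i % 10, i))
          val reduced = pairs.reduceByKey(_ + _) // introduces a shuffle map stage

          reduced.count()   // job 1: runs the shuffle map stage and a result stage
          reduced.collect() // job 2: the map stage is skipped; its shuffle output is reused
        } finally {
          sc.stop()
        }
      }
    }
    ```
    A fetch failure in that second job is then the interesting case: the retry has 
to go back to the skipped stage to recompute the missing map output.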
    
    I'd like to try to add a test which covers this case.  It may be a day or 
two before I can get to it, though, in case anyone else would like to take it on 
sooner.  We could merge a fix without a test case if we really have to, but I 
think we should at least think carefully about what the right fix is.


