Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/4155#issuecomment-71956823
  
    > The users on my side have been able to reproduce the missing files issue 
reliably, so we may just have to live with an empirical verification and be 
done with it.
    
    Given that the actual bug is non-deterministic, I think we could be okay 
without a regression test that reliably reproduces this issue.  There might be 
some value in a non-deterministic regression test, though, as long as it 
detects the bug with sufficiently high probability, since we'd eventually catch 
any regression by noticing that the test had become flaky in Jenkins.
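
    As a rough illustration of what "sufficiently high probability" could mean here: if one test run catches the regression with probability p, and Jenkins builds are assumed to be independent, the chance that at least one of n builds fails is 1 - (1 - p)^n. The sketch below is only back-of-the-envelope arithmetic. The function names and the example numbers are hypothetical and do not come from the pull request.

```python
import math


def detection_probability(p: float, runs: int) -> float:
    """Chance that at least one of `runs` independent test runs fails,
    assuming each run detects the regression with probability `p`."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if runs < 0:
        raise ValueError("runs must be non-negative")
    return 1.0 - (1.0 - p) ** runs


def runs_needed(p: float, target: float) -> int:
    """Smallest number of independent runs whose combined detection
    probability reaches `target` (hypothetical helper for illustration)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    if not 0.0 < target < 1.0:
        raise ValueError("target must be in (0, 1)")
    if p == 1.0:
        return 1  # a deterministic test catches the regression on the first run
    # Solve 1 - (1 - p)^n >= target for n.
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p))


if __name__ == "__main__":
    # Example assumption: the test fails on 10% of runs once the bug is back.
    p = 0.10
    n = runs_needed(p, 0.95)
    print(f"runs needed for 95% detection: {n}")
    print(f"detection probability after {n} runs: {detection_probability(p, n):.4f}")
```

    Under these assumptions, even a test that fails on only a small fraction of runs would probably show up as flaky after a few dozen builds, which is the sense in which a regression would "eventually" be caught.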
    
    Unless we can come up with a better test, in the immediate term I'm okay 
with having unit tests for the individual components plus an empirical 
verification using your reproduction. Even though they aren't regression 
tests for this specific bug, the new tests added here will help prevent 
regressions if anyone changes the OutputCommitCoordinator logic.

