GitHub user squito commented on the pull request:

    https://github.com/apache/spark/pull/5572#issuecomment-105257051
  
    @viirya @cloud-fan good point, I hadn't thought about multiple tasks on one 
executor all pulling the same partition of `rdd2`.  Still, I'm very 
worried about adding the extra local caching if we don't have an effective way 
of undoing it, because I think it will be very confusing to have these extra 
blocks stuck in the cache.  I agree that "idea 1" is not as general a 
solution, but I was hoping it was simple enough to fit your narrow need here.
    
    In any case, this is just my opinion --  I'm not adamantly against this, 
but I would really like to have some other reviewers weigh in before we 
merge those changes.


