Github user cloud-fan commented on the pull request:

    https://github.com/apache/spark/pull/791#issuecomment-43582474
  
    @mridulm Sorry, I may have misunderstood you because of my poor English :(
    Let me list things one by one so that we can make it clear:
    1) Currently Spark's MEMORY_AND_DISK mode is sometimes slower than 
DISK_ONLY mode because of the lock held during IO (dropping blocks).
    2) As the TODO says, the solution is to synchronize only the selection of 
to-be-dropped blocks and do the dropping in parallel.
    3) My solution is fragile, but it works if nothing goes wrong
    4) My solution is not thread-safe. For example, if a block is being dropped 
by one thread while another thread is trying to remove it, oops.
    5) There could be any number of reasons for dropping a block to fail, but 
there would be only a few KINDS of them. As far as I know, one is an exception 
(including disk issues, etc.), one is executor lost, and one is stage cancelled.
    I would appreciate it if you could discuss these with me one by one, as 
listed above. Thanks!
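
To make points 2) and 4) concrete, here is a minimal sketch in Python rather than Scala, purely for illustration. It is not Spark's MemoryStore code: `MemoryStoreSketch`, `select_victims`, `drop`, `remove`, and `drop_in_parallel` are hypothetical names, and the `write_to_disk` callback stands in for the real disk IO. The idea is that the lock is held only while choosing and claiming blocks, the slow write happens outside the lock, and a claimed block cannot be removed by another thread in the meantime.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class MemoryStoreSketch:
    """Toy in-memory block store (hypothetical, not Spark's MemoryStore)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.used = 0
        self.entries = {}      # block_id -> size, kept in insertion order
        self.dropping = set()  # blocks already claimed for dropping
        self.lock = threading.Lock()

    def put(self, block_id, size):
        with self.lock:
            self.entries[block_id] = size
            self.used += size

    def select_victims(self, needed):
        """Short critical section: only choose and claim blocks here."""
        victims = []
        with self.lock:
            freed = 0
            for block_id, size in self.entries.items():
                if self.capacity - self.used + freed >= needed:
                    break
                if block_id in self.dropping:
                    continue  # another thread already owns this block
                self.dropping.add(block_id)
                victims.append(block_id)
                freed += size
        return victims

    def drop(self, block_id, write_to_disk):
        """Slow IO runs outside the lock; only bookkeeping is locked."""
        try:
            write_to_disk(block_id)
        except Exception:
            with self.lock:
                self.dropping.discard(block_id)  # release claim, keep block
            return False
        with self.lock:
            self.dropping.discard(block_id)
            self.used -= self.entries.pop(block_id, 0)
        return True

    def remove(self, block_id):
        """Refuse to remove a block that is in the middle of a drop."""
        with self.lock:
            if block_id in self.dropping or block_id not in self.entries:
                return False
            self.used -= self.entries.pop(block_id)
            return True


def drop_in_parallel(store, needed, write_to_disk, workers=4):
    """Select victims under the lock, then drop them concurrently."""
    victims = store.select_victims(needed)
    if not victims:
        return {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda b: store.drop(b, write_to_disk), victims))
    return dict(zip(victims, results))
```

In this sketch a failed write (the "exception" kind from point 5) simply releases the claim and leaves the block in memory. Executor loss and stage cancellation are not modelled, and the selection does not count space that other threads are still in the process of freeing.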

