GitHub user mateiz commented on the pull request:

    https://github.com/apache/spark/pull/1609#issuecomment-50929329
  
    So I looked through this and I also think it would be good to split it into 
smaller patches for 1.1. As far as I can see there are several orthogonal 
improvements here:
    - Shuffle file consolidation fixes that Aaron copied in 
https://github.com/apache/spark/pull/1678
    - ExternalAppendOnlyMap fixes to deal with writes past the end of the 
stream; we also need these in ExternalSorter
    - Fixes to directory creation in DiskBlockManager (I'm still not sure when 
this would actually be a problem if all accesses to these directories go 
through getFile; this needs some investigation)
    - Fixes to isSymlink (though, as is, this seems like it would only compile 
on Java 7)
    - Improvements to the API of DiskBlockObjectWriter
    
    Of these, the first two are most critical. So I'd like to get those into 
1.1, and then we can do API refactoring and the other fixes on the master 
branch. For the directory creation fix I'd still like to understand when that 
can be a problem (I'm probably just missing something), but it's also one we 
can add in 1.1 during the QA window.
    
    I'm going to update the JIRA to create sub-tasks for these things so we can 
track where each one is fixed. Thanks again for putting this together, Mridul; 
this is very helpful.

