[jira] [Updated] (YARN-7244) ShuffleHandler is not aware of disks that are added
[ https://issues.apache.org/jira/browse/YARN-7244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kuhu Shukla updated YARN-7244:
------------------------------
    Attachment: YARN-7244-branch-2.8.002.patch

Fixing minor new line checkstyle.

> ShuffleHandler is not aware of disks that are added
> ---------------------------------------------------
>
>                 Key: YARN-7244
>                 URL: https://issues.apache.org/jira/browse/YARN-7244
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Kuhu Shukla
>            Assignee: Kuhu Shukla
>             Fix For: 2.9.0, 3.0.0
>
>         Attachments: YARN-7244-branch-2.8.001.patch,
> YARN-7244-branch-2.8.002.patch, YARN-7244.001.patch, YARN-7244.002.patch,
> YARN-7244.003.patch, YARN-7244.004.patch, YARN-7244.005.patch,
> YARN-7244.006.patch, YARN-7244.007.patch, YARN-7244.008.patch,
> YARN-7244.009.patch, YARN-7244.010.patch, YARN-7244.011.patch,
> YARN-7244.012.patch, YARN-7244.013.patch
>
>
> The ShuffleHandler permanently remembers the list of "good" disks on NM
> startup. If disks later are added to the node then map tasks will start using
> them but the ShuffleHandler will not be aware of them. The end result is that
> the data cannot be shuffled from the node leading to fetch failures and
> re-runs of the map tasks.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)

To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7244) ShuffleHandler is not aware of disks that are added
Kuhu Shukla updated YARN-7244:
------------------------------
    Attachment: YARN-7244-branch-2.8.001.patch

Attaching the 2.8 version of the patch, which needed some extra changes. The important one is in LocalDirsHandlerService, which on branch-2.8 was missing the getLocalPathForRead() method that went into trunk as part of YARN-3998. I have added just that method rather than changing the visibility of getPathToRead().
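The backport choice described above (add one narrow read method instead of widening a private helper) might look roughly like the following. This is only a sketch under assumptions: `SketchDirsHandler` is a hypothetical stand-in and not Hadoop's LocalDirsHandlerService, the method signatures are guesses, and whether the real getLocalPathForRead() delegates to getPathToRead() is assumed rather than taken from the patch.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in; NOT Hadoop's LocalDirsHandlerService. */
final class SketchDirsHandler {
  private final List<Path> localDirs;

  SketchDirsHandler(List<Path> localDirs) {
    this.localDirs = new ArrayList<>(localDirs);
  }

  // The existing helper stays private, so no other caller gains access to it.
  private Path getPathToRead(String relativePath, List<Path> dirs) throws IOException {
    for (Path dir : dirs) {
      Path candidate = dir.resolve(relativePath);
      if (Files.exists(candidate)) {
        return candidate;
      }
    }
    throw new IOException("Could not find " + relativePath + " in any local dir");
  }

  // The single new public entry point (assumed here to delegate to the helper).
  public Path getLocalPathForRead(String relativePath) throws IOException {
    return getPathToRead(relativePath, localDirs);
  }
}
```

Adding one method keeps the public surface of the class as small as possible on a maintenance branch, whereas changing the helper's visibility would expose its extra parameter to every caller.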
[jira] [Updated] (YARN-7244) ShuffleHandler is not aware of disks that are added
Kuhu Shukla updated YARN-7244:
------------------------------
    Attachment: YARN-7244.013.patch

I believe the build failed with two separate, unrelated issues. The second time, the cache seems to have picked up the old package for AuxiliaryLocalPathHandler. Re-triggering the build by uploading the same patch again. Please let me know if I missed something.
[jira] [Updated] (YARN-7244) ShuffleHandler is not aware of disks that are added
Kuhu Shukla updated YARN-7244:
------------------------------
    Attachment: YARN-7244.012.patch

Fixing minor checkstyle issues. :(
[jira] [Updated] (YARN-7244) ShuffleHandler is not aware of disks that are added
Kuhu Shukla updated YARN-7244:
------------------------------
    Attachment: YARN-7244.011.patch

Updated patch addressing comments from [~sunilg].
[jira] [Updated] (YARN-7244) ShuffleHandler is not aware of disks that are added
Kuhu Shukla updated YARN-7244:
------------------------------
    Attachment: YARN-7244.010.patch

Attaching a revised patch that addresses review comments. Thanks a lot!
[jira] [Updated] (YARN-7244) ShuffleHandler is not aware of disks that are added
Kuhu Shukla updated YARN-7244:
------------------------------
    Attachment: YARN-7244.009.patch
[jira] [Updated] (YARN-7244) ShuffleHandler is not aware of disks that are added
Kuhu Shukla updated YARN-7244:
------------------------------
    Attachment: YARN-7244.007.patch

Updated patch that fixes the checkstyle issues (almost all; there is one asking to add a getter in a test, which seems excessive to me) and the test failure in testMapFileAccess. My setup did not allow that test to run and required overhauling. Verified that it passes now.
[jira] [Updated] (YARN-7244) ShuffleHandler is not aware of disks that are added
Kuhu Shukla updated YARN-7244:
------------------------------
    Attachment: YARN-7244.006.patch

Thank you for the comments/review [~jlowe]! Updated patch. Will wait for PreCommit before any review requests.
[jira] [Updated] (YARN-7244) ShuffleHandler is not aware of disks that are added
Kuhu Shukla updated YARN-7244:
------------------------------
    Attachment: YARN-7244.005.patch

Fixing TestShuffleHandler failures. The TestDistributedScheduler failure is documented in YARN-7299.
[jira] [Updated] (YARN-7244) ShuffleHandler is not aware of disks that are added
Kuhu Shukla updated YARN-7244:
------------------------------
    Attachment: YARN-7244.004.patch

Rebasing patch on trunk.
[jira] [Updated] (YARN-7244) ShuffleHandler is not aware of disks that are added
Kuhu Shukla updated YARN-7244:
------------------------------
    Attachment: YARN-7244.003.patch

Updated patch that is closer to the design Jason mentioned earlier. It adds a new path handler that is passed from the ContainerManager -> AuxServices -> AuxiliaryService -> ShuffleHandler. I would appreciate any comments on the approach/patch. Thanks a lot!
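To make the hand-off in this revision easier to picture, here is a minimal sketch of a path handler that is injected into an auxiliary service and consulted on every read. All names (`SketchPathHandler`, `LiveDirsPathHandler`, `SketchAuxService`, `SketchShuffleService`, `locateMapOutput`) are hypothetical and are not the classes or signatures in the patch; the point is only that the shuffle side asks a live object owned by the NodeManager instead of keeping its own startup-time list of disks.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical handler interface; not the API added by the patch. */
interface SketchPathHandler {
  Path getLocalPathForRead(String relativePath) throws IOException;
}

/** Owned by the node manager side; its directory list can change at runtime. */
final class LiveDirsPathHandler implements SketchPathHandler {
  private final List<Path> localDirs = new CopyOnWriteArrayList<>();

  void addLocalDir(Path dir) {
    localDirs.add(dir);
  }

  @Override
  public Path getLocalPathForRead(String relativePath) throws IOException {
    // Consult the current directory list on every call.
    for (Path dir : localDirs) {
      Path candidate = dir.resolve(relativePath);
      if (Files.exists(candidate)) {
        return candidate;
      }
    }
    throw new IOException("Could not find " + relativePath + " in any local dir");
  }
}

/** Base class for auxiliary services; receives the handler from its container. */
abstract class SketchAuxService {
  private SketchPathHandler pathHandler;

  void setPathHandler(SketchPathHandler pathHandler) {
    this.pathHandler = pathHandler;
  }

  protected SketchPathHandler getPathHandler() {
    return pathHandler;
  }
}

/** Shuffle-like service that keeps no disk list of its own. */
final class SketchShuffleService extends SketchAuxService {
  Path locateMapOutput(String relativePath) throws IOException {
    return getPathHandler().getLocalPathForRead(relativePath);
  }
}
```

Because the service holds only a reference to the handler, a disk registered after startup becomes visible to the next read without restarting or reconfiguring the service.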
[jira] [Updated] (YARN-7244) ShuffleHandler is not aware of disks that are added
Kuhu Shukla updated YARN-7244:
------------------------------
    Attachment: YARN-7244.002.patch

Fixing a minor test issue for the newly added YARN config key.
[jira] [Updated] (YARN-7244) ShuffleHandler is not aware of disks that are added
Kuhu Shukla updated YARN-7244:
------------------------------
    Attachment: YARN-7244.001.patch

v1 patch that adds a new LocalDirAllocator#getLocalPathToRead() that decides whether to filter out the bad directories based on a boolean. Changing the original call would be more pervasive. The patch does modify the AllocatorPerContext#getLocalPathToRead() signature, since that is a private static class within LocalDirAllocator.

The ShuffleHandler uses a YARN config to decide whether or not to filter bad dirs. When this value is false, bad directories are never taken out, and hence any changes to local dirs would not impact the ShuffleHandler reads. Even if the mkdirs and exists checks fail, we want the dirs to be listed in the localdirs member when the config is false. For testing purposes, I have added a package-private getter for lDirAllocator.

I would appreciate any comments/corrections on this patch.

Another way to handle this would have been to change the AuxiliaryServices to pass the NMContext or the LocalDirAllocator from the NM. The former approach needs nodemanager dependencies to be added, and the latter is tricky, as I am not sure how the AuxServices class would pass the object without adding it as a member. I would appreciate suggestions on any alternative approaches as well.
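The boolean-controlled lookup in this v1 approach can be sketched as follows. This is an illustration under assumptions, not the patch itself: `SketchDirAllocator` is a hypothetical class rather than Hadoop's LocalDirAllocator, the `filterBadDirs` parameter name is invented, and a plain "is it a directory" test stands in for the mkdirs/exists checks mentioned above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in; NOT Hadoop's LocalDirAllocator. */
final class SketchDirAllocator {
  private final List<Path> configuredDirs;
  private final List<Path> goodDirsAtStartup = new ArrayList<>();

  SketchDirAllocator(List<Path> configuredDirs) {
    this.configuredDirs = new ArrayList<>(configuredDirs);
    // Stand-in for the startup health check: only directories that exist
    // right now are remembered as "good".
    for (Path dir : configuredDirs) {
      if (Files.isDirectory(dir)) {
        goodDirsAtStartup.add(dir);
      }
    }
  }

  /**
   * With filterBadDirs == true only the directories that passed the startup
   * check are searched; with false every configured directory is searched,
   * including ones that failed the check at startup.
   */
  Path getLocalPathToRead(String relativePath, boolean filterBadDirs) throws IOException {
    List<Path> candidates = filterBadDirs ? goodDirsAtStartup : configuredDirs;
    for (Path dir : candidates) {
      Path candidate = dir.resolve(relativePath);
      if (Files.exists(candidate)) {
        return candidate;
      }
    }
    throw new IOException("Could not find " + relativePath + " in any candidate dir");
  }
}
```

In this sketch a reader that passes `false` still finds data on a configured disk that only became usable after startup, which is the behavior the config flag is meant to give the ShuffleHandler.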