[ https://issues.apache.org/jira/browse/YARN-7244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16182678#comment-16182678 ]
Jason Lowe commented on YARN-7244:
----------------------------------

Thanks for the patch! The core issue here is that the NM is handing out directories to tasks that the shuffle manager is unaware of. This filtering-or-not approach doesn't completely solve the issue, since the ShuffleHandler will still attempt to visit disks that the NM has already determined are bad. That could cause performance problems if the ShuffleHandler tries to read a particularly problematic disk over and over as it searches for outputs to shuffle for every shuffle request. It would be better if the NM could convey to aux services what directories are in use. Then the ShuffleHandler and NM would be in sync with respect to what disks should or should not be used.

bq. Another way to handle this would have been to change the AuxiliaryServices to pass the NMContext or the LocalDirAllocator from the NM.

That would be nice, as there are probably other things in the NMContext that aux services may want to know about. However, we could always go with a much more direct route. We could add an API to AuxiliaryService that sets a callback object the service can use to retrieve the current list of paths that are good for reading or writing, or we can add an API to AuxiliaryService that the NM can call to update that service on the list of paths good for reading and writing (i.e., either a 'pull' or a 'push' model for exposing the current good directories to aux services).

The 'pull' model requires an interface or abstract class in yarn-api that defines the API aux services can call to retrieve the directories, and we would put the actual implementation of that interface in yarn-server-nodemanager. Ideally the interface would look a lot like the existing getLocalDirsForRead(), getLocalDirsForWrite(), etc. of the LocalDirsHandlerService so it's an easy pass-through to implement on the nodemanager side.

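To make the 'pull' idea concrete, here is a rough, dependency-free sketch. Everything in it except the two method names getLocalDirsForRead() and getLocalDirsForWrite() is an assumption made for illustration: LocalDirsProvider, StubNodeManagerDirs, StubShuffleService, and setLocalDirsProvider() are hypothetical names, not existing YARN APIs, and a real implementation would delegate to the LocalDirsHandlerService rather than keep its own list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical interface that would live in yarn-api (the name is an assumption).
interface LocalDirsProvider {
  List<String> getLocalDirsForRead();
  List<String> getLocalDirsForWrite();
}

// Stand-in for the nodemanager side. A real version would simply pass
// through to the LocalDirsHandlerService instead of tracking dirs itself.
class StubNodeManagerDirs implements LocalDirsProvider {
  private final CopyOnWriteArrayList<String> goodDirs = new CopyOnWriteArrayList<>();

  void markGood(String dir) { goodDirs.addIfAbsent(dir); }

  void markBad(String dir) { goodDirs.remove(dir); }

  @Override
  public List<String> getLocalDirsForRead() { return new ArrayList<>(goodDirs); }

  @Override
  public List<String> getLocalDirsForWrite() { return new ArrayList<>(goodDirs); }
}

// Stand-in for an aux service such as the ShuffleHandler.
class StubShuffleService {
  private volatile LocalDirsProvider dirs;

  // Hypothetical setter the NM would call once when it initializes the service.
  void setLocalDirsProvider(LocalDirsProvider provider) { this.dirs = provider; }

  // Ask the NM for the current good dirs on every request instead of
  // caching the list that was valid at startup.
  List<String> candidateDirsForShuffle() { return dirs.getLocalDirsForRead(); }
}

class PullModelDemo {
  public static void main(String[] args) {
    StubNodeManagerDirs nm = new StubNodeManagerDirs();
    nm.markGood("/disk1/nm-local");

    StubShuffleService shuffle = new StubShuffleService();
    shuffle.setLocalDirsProvider(nm);

    nm.markGood("/disk2/nm-local"); // disk added after startup
    nm.markBad("/disk1/nm-local");  // disk the NM later found to be bad

    System.out.println(shuffle.candidateDirsForShuffle());
  }
}
```

Because the service asks on each request, a disk added after startup becomes visible immediately and a disk the NM has marked bad is no longer visited. The cost is one extra call per shuffle request.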
The 'push' model requires adding a listener interface to LocalDirsHandlerService so we know when a disk is added or removed and can call back into each aux service to update them on the current list of dirs for reading and writing.

I haven't had a lot of time to figure out which would be better in practice in terms of ease of use and performance, but I think I'd rather see the aux services be more in sync with the rest of the NM with respect to the local dirs being actively used.

> ShuffleHandler is not aware of disks that are added
> ---------------------------------------------------
>
>                 Key: YARN-7244
>                 URL: https://issues.apache.org/jira/browse/YARN-7244
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Kuhu Shukla
>            Assignee: Kuhu Shukla
>         Attachments: YARN-7244.001.patch, YARN-7244.002.patch
>
>
> The ShuffleHandler permanently remembers the list of "good" disks on NM
> startup. If disks later are added to the node then map tasks will start using
> them but the ShuffleHandler will not be aware of them. The end result is that
> the data cannot be shuffled from the node leading to fetch failures and
> re-runs of the map tasks.