[ 
https://issues.apache.org/jira/browse/YARN-7244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16182678#comment-16182678
 ] 

Jason Lowe commented on YARN-7244:
----------------------------------

Thanks for the patch!

The core issue here is that the NM is handing out directories to tasks that the 
shuffle manager is unaware of.  This filtering-or-not approach doesn't 
completely solve the issue, since the ShuffleHandler will still attempt to 
visit disks that the NM has already determined are bad.  That could cause 
performance problems if the ShuffleHandler tries to read a particularly 
problematic disk over and over as it searches for outputs to shuffle for every 
shuffle request.

It would be better if the NM could convey to aux services which directories 
are in use.  Then the ShuffleHandler and NM would be in sync with respect to 
what disks should or should not be used.

bq. Another way to handle this would have been to change the AuxiliaryServices 
to pass the NMContext or the LocalDirAllocator from the NM.

That would be nice, as there are probably other things in the NMContext that 
aux services may want to know about.  However, we could always go with a much 
more direct route.  We could add an API to AuxiliaryService that sets a 
callback object the service can use to retrieve the current list of paths 
that are good for reading or writing, or we could add an API to 
AuxiliaryService that the NM can call to update that service on the list of 
paths good for reading and writing.  (i.e., either a 'pull' or 'push' model 
for exposing the current good directories to aux services).

The 'pull' model requires an interface or abstract class in yarn-api that 
defines the API aux services can call to retrieve the directories, and we would 
put the actual implementation of that interface in yarn-server-nodemanager.  
Ideally the interface would look a lot like the existing getLocalDirsForRead(), 
getLocalDirsForWrite(), etc. of the LocalDirsHandlerService so it's an easy 
pass-through to implement on the nodemanager side.
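
A minimal sketch of what the 'pull' model could look like.  Everything here is 
hypothetical: LocalDirsProvider, FakeDirsHandler, PullingShuffleService and 
setLocalDirsProvider() are illustrative names, not existing YARN APIs, and 
FakeDirsHandler only stands in for the NM-side implementation that would 
delegate to LocalDirsHandlerService.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical interface that would live in yarn-api (name is an assumption).
interface LocalDirsProvider {
  List<String> getLocalDirsForRead();
  List<String> getLocalDirsForWrite();
}

// Stand-in for the NM-side implementation; NOT the real
// LocalDirsHandlerService, just enough state to show the pass-through.
class FakeDirsHandler implements LocalDirsProvider {
  private final List<String> goodDirs = new CopyOnWriteArrayList<>();

  void addDir(String dir) { goodDirs.add(dir); }
  void removeDir(String dir) { goodDirs.remove(dir); }

  @Override
  public List<String> getLocalDirsForRead() {
    return new ArrayList<>(goodDirs);   // defensive copy of the current list
  }

  @Override
  public List<String> getLocalDirsForWrite() {
    return new ArrayList<>(goodDirs);
  }
}

// Hypothetical aux service that asks for the dirs on every lookup instead
// of remembering the list it saw at startup.
class PullingShuffleService {
  private LocalDirsProvider provider;

  // The setter an AuxiliaryService-style API might expose (assumption).
  void setLocalDirsProvider(LocalDirsProvider provider) {
    this.provider = provider;
  }

  List<String> dirsToSearch() {
    return provider.getLocalDirsForRead();
  }
}
```

With this shape the service never caches the list, so a disk added or marked 
bad after startup is reflected on the next shuffle request, at the cost of one 
call into the NM per lookup.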

The 'push' model requires adding a listener interface to 
LocalDirsHandlerService so we know when a disk is added or removed and can 
call back into each aux service to update them on the current list of dirs for 
reading and writing.
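
A rough sketch of the 'push' model, under the same caveat: 
LocalDirsChangeListener, DirsHandlerStub and PushedShuffleService are made-up 
names for illustration, and DirsHandlerStub is only a stand-in for 
LocalDirsHandlerService with listener support bolted on.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical listener the NM would invoke when the good-dir set changes.
interface LocalDirsChangeListener {
  void onLocalDirsChanged(List<String> readDirs, List<String> writeDirs);
}

// Stand-in for LocalDirsHandlerService with listener registration added.
class DirsHandlerStub {
  private final List<String> goodDirs = new ArrayList<>();
  private final List<LocalDirsChangeListener> listeners =
      new CopyOnWriteArrayList<>();

  synchronized void register(LocalDirsChangeListener listener) {
    listeners.add(listener);
    // Seed the new listener with the current state.
    listener.onLocalDirsChanged(snapshot(), snapshot());
  }

  synchronized void addDir(String dir) {
    if (!goodDirs.contains(dir)) {
      goodDirs.add(dir);
      fire();
    }
  }

  synchronized void removeDir(String dir) {
    if (goodDirs.remove(dir)) {
      fire();
    }
  }

  private List<String> snapshot() {
    return Collections.unmodifiableList(new ArrayList<>(goodDirs));
  }

  private void fire() {
    for (LocalDirsChangeListener listener : listeners) {
      listener.onLocalDirsChanged(snapshot(), snapshot());
    }
  }
}

// Hypothetical aux service that caches whatever list was pushed last.
class PushedShuffleService implements LocalDirsChangeListener {
  private volatile List<String> readDirs = Collections.emptyList();

  @Override
  public void onLocalDirsChanged(List<String> readDirs,
                                 List<String> writeDirs) {
    this.readDirs = readDirs;
  }

  List<String> dirsToSearch() {
    return readDirs;
  }
}
```

Here the per-request path is just a read of a cached list, and the cost moves 
to the (rare) disk add/remove events.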

I haven't had much time to figure out which would work better in practice 
in terms of ease of use and performance, but I think I'd rather see the aux 
services be more in sync with the rest of the NM with respect to the local 
dirs being actively used.


> ShuffleHandler is not aware of disks that are added
> ---------------------------------------------------
>
>                 Key: YARN-7244
>                 URL: https://issues.apache.org/jira/browse/YARN-7244
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Kuhu Shukla
>            Assignee: Kuhu Shukla
>         Attachments: YARN-7244.001.patch, YARN-7244.002.patch
>
>
> The ShuffleHandler permanently remembers the list of "good" disks on NM 
> startup. If disks later are added to the node then map tasks will start using 
> them but the ShuffleHandler will not be aware of them. The end result is that 
> the data cannot be shuffled from the node leading to fetch failures and 
> re-runs of the map tasks.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
