[ 
https://issues.apache.org/jira/browse/YARN-7244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16184319#comment-16184319
 ] 

Jason Lowe commented on YARN-7244:
----------------------------------

bq.  rather a new api as you mentioned in LocalDirAllocator named 
getLocalDirsForRead could enough to get valid dirs as it pulls all configured 
NM_LOCAL_DIRS and validates same.

LocalDirAllocator should not have the new API, IMHO.  That class is in 
hadoop-common and shouldn't be involved in solving this nodemanager-specific 
problem.

I'm thinking we go with the pull approach with something like the following.  
Note that I'm not stuck on the specific names of new interfaces/classes, 
they're just examples for reference.
# In AuxiliaryService add new methods to get and set the API object to interact 
with the NM's local dirs management, e.g.:
{code}
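  // Handler the auxiliary service uses to resolve paths in the NM's local dirs.
  public AuxiliaryLocalPathHandler getAuxiliaryLocalPathHandler();
  // Called by the NM to hand the service its handler.
  public void setAuxiliaryLocalPathHandler(AuxiliaryLocalPathHandler handler);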
{code}
# The new AuxiliaryLocalPathHandler object would be in hadoop-yarn-api and look 
something like this:
{code}
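  public interface AuxiliaryLocalPathHandler {
    // Locate an existing file under one of the NM's local dirs.
    Path getLocalPathForRead(String path);
    // Choose a local dir in which the given path can be created.
    Path getLocalPathForWrite(String path);
    // As above, for a file expected to need the given number of bytes.
    Path getLocalPathForWrite(String path, long size);
  }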
{code}
# AuxiliaryService would implement a LocalDirsHandler that maps the 
AuxiliaryLocalPathHandler calls to the NM's LocalDirsHandlerService.
# The ShuffleHandler can leverage the new AuxiliaryLocalPathHandler to find 
shuffle input files rather than manage its own LocalDirAllocator.
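
To make the pull approach concrete, here is a rough, self-contained sketch of the four steps above. It is illustrative only and is not taken from any patch: it uses java.nio.file.Path in place of Hadoop's Path, NodeLocalDirs is a hypothetical stand-in for the NM's LocalDirsHandlerService, and NmLocalPathHandler, SketchShuffleService, and findMapOutput are made-up names. Only the read path is sketched.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Assumption: simplified form of the proposed interface (read path only).
interface AuxiliaryLocalPathHandler {
  Path getLocalPathForRead(String relativePath) throws IOException;
}

// Hypothetical stand-in for the NM's LocalDirsHandlerService: the list of
// good local dirs can change while the NM is running.
class NodeLocalDirs {
  private final List<Path> goodDirs = new CopyOnWriteArrayList<>();

  void addDir(Path dir) {
    goodDirs.add(dir);
  }

  List<Path> getGoodDirs() {
    return goodDirs;
  }
}

// Step 3: NM-side adapter that answers every call from the current dir list.
class NmLocalPathHandler implements AuxiliaryLocalPathHandler {
  private final NodeLocalDirs dirs;

  NmLocalPathHandler(NodeLocalDirs dirs) {
    this.dirs = dirs;
  }

  @Override
  public Path getLocalPathForRead(String relativePath) throws IOException {
    for (Path dir : dirs.getGoodDirs()) {
      Path candidate = dir.resolve(relativePath);
      if (Files.exists(candidate)) {
        return candidate;
      }
    }
    throw new IOException("Could not find " + relativePath + " in any local dir");
  }
}

// Steps 1 and 4: the aux service keeps no dir list of its own and asks the
// handler on each lookup.
class SketchShuffleService {
  private AuxiliaryLocalPathHandler handler;

  void setAuxiliaryLocalPathHandler(AuxiliaryLocalPathHandler handler) {
    this.handler = handler;
  }

  AuxiliaryLocalPathHandler getAuxiliaryLocalPathHandler() {
    return handler;
  }

  Path findMapOutput(String relativePath) throws IOException {
    return handler.getLocalPathForRead(relativePath);
  }
}

class PullApproachSketch {
  // Returns true if a file on a disk added after "startup" is still found.
  static boolean demo() {
    try {
      NodeLocalDirs dirs = new NodeLocalDirs();
      dirs.addDir(Files.createTempDirectory("disk1"));

      SketchShuffleService shuffle = new SketchShuffleService();
      shuffle.setAuxiliaryLocalPathHandler(new NmLocalPathHandler(dirs));

      // A disk is added later and a map output lands on it.
      Path disk2 = Files.createTempDirectory("disk2");
      Path output = Files.createFile(disk2.resolve("file.out"));
      dirs.addDir(disk2);

      return shuffle.findMapOutput("file.out").equals(output);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("found output on late-added disk: " + demo());
  }
}
```

Because the service asks the handler on every lookup instead of caching the dir list at startup, the file on the second disk is found even though that disk was added after the service was wired up.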

> ShuffleHandler is not aware of disks that are added
> ---------------------------------------------------
>
>                 Key: YARN-7244
>                 URL: https://issues.apache.org/jira/browse/YARN-7244
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Kuhu Shukla
>            Assignee: Kuhu Shukla
>         Attachments: YARN-7244.001.patch, YARN-7244.002.patch
>
>
> The ShuffleHandler permanently remembers the list of "good" disks on NM 
> startup. If disks are later added to the node, map tasks will start using 
> them, but the ShuffleHandler will not be aware of them. The end result is that 
> data cannot be shuffled from the node, leading to fetch failures and 
> re-runs of the map tasks.



