[ 
https://issues.apache.org/jira/browse/HADOOP-10048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated HADOOP-10048:
--------------------------------
    Attachment: HADOOP-10048.006.patch

bq. Otherwise, accessing of disks could be aggregated on particular disk. 
Thoughts?

I'm not worried about dirNumLastAccessed being somewhat random -- it 
already is random if someone needs a write location without specifying a size.  
What is more concerning is the thundering herd problem, where a bunch of 
threads all need write locations with a size at the same time.  In theory, all 
or most of those threads could end up clustering on the same disk, which is 
less than ideal.  Attaching a new patch that uses an AtomicInteger to make sure 
that simultaneous threads won't get the same starting point when searching the 
directories.
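
To illustrate the idea (this is a minimal sketch, not the code from the 
attached patch), an atomic counter can hand each caller its own starting index 
without any lock.  The class name RoundRobinStart, the pickDir method, and the 
availableBytes array are hypothetical stand-ins; only the dirNumLastAccessed 
name and the use of AtomicInteger come from the discussion above.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class RoundRobinStart {
    // Shared counter; the name mirrors the field discussed above, the rest is illustrative.
    private final AtomicInteger dirNumLastAccessed = new AtomicInteger(0);
    // Hypothetical snapshot of free space per directory, standing in for real disk checks.
    private final long[] availableBytes;

    public RoundRobinStart(long[] availableBytes) {
        this.availableBytes = availableBytes.clone();
    }

    /** Returns the index of the first directory with at least {@code size} bytes, or -1. */
    public int pickDir(long size) {
        int numDirs = availableBytes.length;
        // getAndIncrement is atomic, so concurrent callers receive distinct counter
        // values; floorMod keeps the index valid even after the int overflows.
        int start = Math.floorMod(dirNumLastAccessed.getAndIncrement(), numDirs);
        for (int i = 0; i < numDirs; i++) {
            int dir = (start + i) % numDirs;
            if (availableBytes[dir] >= size) {
                return dir;
            }
        }
        return -1; // no directory can satisfy the request
    }
}
```

No lock is held while the directories are scanned, so the scan (which would 
involve filesystem access in a real allocator) never blocks other threads.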

The AtomicInteger approach doesn't completely solve the clustering issue when 
one or more disks become too full to satisfy the requests.  An alternative 
would be to use a random starting location, as is done when the size is not 
specified.  I went with the AtomicInteger approach since it is closer to the 
original semantics without adding the undesired locking necessary to guarantee 
them.
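
The residual clustering can be seen in a small simulation.  This is only a 
sketch under assumed conditions: three directories where the first is full, a 
fixed request size, and free space that is not decremented between requests.  
The SkewDemo class and countPicks method are hypothetical names used for 
illustration.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

public class SkewDemo {
    /** Counts how many requests land on each directory with a round-robin start. */
    public static int[] countPicks(long[] availableBytes, long size, int requests) {
        AtomicInteger next = new AtomicInteger(0);
        int numDirs = availableBytes.length;
        int[] counts = new int[numDirs];
        for (int r = 0; r < requests; r++) {
            int start = Math.floorMod(next.getAndIncrement(), numDirs);
            // Scan forward from the starting point until a directory has room.
            for (int i = 0; i < numDirs; i++) {
                int dir = (start + i) % numDirs;
                if (availableBytes[dir] >= size) {
                    counts[dir]++;
                    break;
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Directory 0 is full, so every request that starts there spills onto directory 1.
        int[] counts = countPicks(new long[] {0, 100, 100}, 10, 6);
        System.out.println(Arrays.toString(counts)); // prints [0, 4, 2]
    }
}
```

Requests that start at the full directory fall through to its neighbor, so 
that neighbor receives twice the load of the remaining directory.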

> LocalDirAllocator should avoid holding locks while accessing the filesystem
> ---------------------------------------------------------------------------
>
>                 Key: HADOOP-10048
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10048
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 2.3.0
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>         Attachments: HADOOP-10048.003.patch, HADOOP-10048.004.patch, 
> HADOOP-10048.005.patch, HADOOP-10048.006.patch, HADOOP-10048.patch, 
> HADOOP-10048.trunk.patch
>
>
> As noted in MAPREDUCE-5584 and HADOOP-7016, LocalDirAllocator can be a 
> bottleneck for multithreaded setups like the ShuffleHandler.  We should 
> consider moving to a lockless design or minimizing the critical sections to a 
> very small amount of time that does not involve I/O operations.


