[jira] [Assigned] (MAPREDUCE-6631) shuffle handler would benefit from per-local-dir threads

2017-10-20 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen reassigned MAPREDUCE-6631:
-

Assignee: (was: Haibo Chen)

> shuffle handler would benefit from per-local-dir threads
> 
>
> Key: MAPREDUCE-6631
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6631
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: nodemanager
> Affects Versions: 2.7.2, 3.0.0-alpha1
> Reporter: Nathan Roberts
>
> [~jlowe] and I discussed this while investigating the I/O starvation we have
> been seeing on our clusters lately (possibly amplified by increased Tez
> workloads).
> If a particular disk is slow, it is very likely that all of the shuffle
> handler's Netty threads will end up blocked on the read side of sendfile().
> (sendfile() is asynchronous on the outbound socket side, but not on the read
> side.) This slows down the entire shuffle subsystem.
> It seems we could make the Netty threads more asynchronous by introducing a
> small set of threads per local-dir that are responsible for the actual
> sendfile() invocations.
> This would improve not only shuffles that span drives, but also the case
> where a single large shuffle reads from a single local-dir. Other drives
> could continue serving shuffle requests, and we would avoid a large number
> of readers (2X number_of_cores by default) all fighting for the same drive,
> which is unfair to everything else on the system.
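
The proposal above can be sketched roughly as follows. This is only an illustration of the idea and not code from the Hadoop ShuffleHandler: the class name PerDirTransferPools, its methods, and the example paths are all hypothetical. Each configured local-dir gets its own small fixed thread pool, and the blocking transfer (for instance a FileChannel.transferTo call, which may be backed by sendfile() on Linux) is handed to the pool of the directory that holds the file, so a slow disk only ties up its own threads.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

// Hypothetical helper: one small thread pool per local-dir.
class PerDirTransferPools {
    private final Map<Path, ExecutorService> pools = new LinkedHashMap<>();

    PerDirTransferPools(List<String> localDirs, int threadsPerDir) {
        for (String d : localDirs) {
            Path dir = Paths.get(d).normalize();
            // Name the threads after the directory so they are easy to spot in a thread dump.
            pools.put(dir, Executors.newFixedThreadPool(threadsPerDir, r -> {
                Thread t = new Thread(r, "shuffle-io-" + dir);
                t.setDaemon(true);
                return t;
            }));
        }
    }

    // Returns the configured local-dir that contains the file, or null if none does.
    Path localDirFor(String file) {
        Path p = Paths.get(file).normalize();
        for (Path dir : pools.keySet()) {
            if (p.startsWith(dir)) {
                return dir;
            }
        }
        return null;
    }

    // Runs the blocking transfer on the pool that owns the file's local-dir,
    // so the caller (e.g. a network I/O thread) does not block on the disk read.
    <T> CompletableFuture<T> submit(String file, Supplier<T> transfer) {
        Path dir = localDirFor(file);
        if (dir == null) {
            throw new IllegalArgumentException("Not under a configured local-dir: " + file);
        }
        return CompletableFuture.supplyAsync(transfer, pools.get(dir));
    }

    void shutdown() {
        pools.values().forEach(ExecutorService::shutdown);
    }
}
```

With a layout like this, a large shuffle from one directory queues up behind that directory's few threads, while requests for files on other directories keep being served. The number of threads per directory would bound how many readers can compete for a single drive.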



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Assigned] (MAPREDUCE-6631) shuffle handler would benefit from per-local-dir threads

2016-07-20 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen reassigned MAPREDUCE-6631:
-

Assignee: Haibo Chen

> shuffle handler would benefit from per-local-dir threads
> 
>
> Key: MAPREDUCE-6631
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6631
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: nodemanager
> Affects Versions: 2.7.2, 3.0.0-alpha1
> Reporter: Nathan Roberts
> Assignee: Haibo Chen


