[ 
https://issues.apache.org/jira/browse/HADOOP-8198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13251709#comment-13251709
 ] 

Eli Collins commented on HADOOP-8198:
-------------------------------------

@Nathan, thanks for chiming in. Answers follow.

- Wiring up multiple interfaces does mean you need 2x the port count, more 
cable management issues, and potentially additional switch configuration. 
That's true today for people who use host-level bonding.
- For use case #1, supporting multiple interfaces is like supporting multiple of 
any host resource (e.g. disks). You get improved performance and the ability to 
tolerate more failures at the cost of additional code complexity. We already 
have to tolerate client <-> worker connection failures; we can leave the 
current behavior as is, or try to tolerate them better, e.g. by working around 
them (see HDFS-3149). As with tolerating disk failures, this means some hosts 
may have more resources than others (if by default only one interface is 
reported, then this only affects the multi-interface case). I'm also 
considering the impact on MR, where you'd want the shuffle to be able to take 
advantage of this as well; more importantly, if it didn't, you could 
potentially end up with more imbalanced network traffic.
- For use case #2, supporting multiple interfaces is simpler because clients 
don't necessarily get multiple interfaces; different clients just end up 
getting different interfaces. This is the same way the NN can bind to the 
wildcard today, which makes it available on multiple interfaces, and clients 
can access it via any of them. Note that the two use cases are independent: 
you can support #2 w/o #1 and vice versa.
- Wrt host-level bonding and 10GigE, see my comment above to Sanjay. Both help 
use case #1, but they don't address use case #2, the primary motivation.
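
To illustrate the wildcard point in use case #2, here is a minimal sketch using 
plain java.net sockets, not Hadoop code. The class and method names are made up 
for illustration, and it only checks the loopback address; on a multi-homed 
host the same listener would be reachable via each interface's address.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical illustration only; not part of Hadoop.
class WildcardBindSketch {

    /**
     * Binds a listener to the wildcard address on an ephemeral port and
     * reports whether a client can reach it via the loopback address.
     */
    public static boolean wildcardReachableViaLoopback() {
        try (ServerSocket server = new ServerSocket()) {
            // InetSocketAddress(int) selects the wildcard address;
            // port 0 asks the OS for any free port.
            server.bind(new InetSocketAddress(0));
            if (!server.getInetAddress().isAnyLocalAddress()) {
                return false;
            }
            try (Socket client = new Socket()) {
                // Connect through one specific local address (loopback here),
                // with a 1 second connect timeout.
                client.connect(new InetSocketAddress(
                        InetAddress.getLoopbackAddress(),
                        server.getLocalPort()), 1000);
                return client.isConnected();
            }
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("reachable via loopback: "
                + wildcardReachableViaLoopback());
    }
}
```

Different clients connecting to different local addresses of the host would all 
land on this one listener, which is the behavior use case #2 relies on.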
                
> Support multiple network interfaces
> -----------------------------------
>
>                 Key: HADOOP-8198
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8198
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: io, performance
>            Reporter: Eli Collins
>            Assignee: Eli Collins
>         Attachments: MultipleNifsv1.pdf, MultipleNifsv2.pdf, 
> MultipleNifsv3.pdf
>
>
> Hadoop does not currently utilize multiple network interfaces, which is a 
> common user request, and important in enterprise environments. This jira 
> covers a proposal for enhancements to Hadoop so it better utilizes multiple 
> network interfaces. The primary motivations are improved performance, 
> performance isolation, resource utilization and fault tolerance. The attached 
> design doc covers the high-level use cases, requirements, a proposal for 
> trunk/0.23, discussion on related features, and a proposal for Hadoop 1.x 
> that covers a subset of the functionality of the trunk/0.23 proposal.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
