[jira] [Commented] (HBASE-10827) Making HBase use multiple ethernet cards will improve the performance
[ https://issues.apache.org/jira/browse/HBASE-10827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13949050#comment-13949050 ]

zhaojianbo commented on HBASE-10827:

{quote}
Yes. Why not use that mechanism instead?
{quote}
As far as I know, that mechanism has some limitations. It seems that both ethernet cards need the same configuration, and in addition the network topology and switch configuration are complicated and require some operation and maintenance work. So I think a software implementation is more adaptable and flexible, and needs less operational work. Correct me if I am wrong. :-)

Making HBase use multiple ethernet cards will improve the performance

Key: HBASE-10827
URL: https://issues.apache.org/jira/browse/HBASE-10827
Project: HBase
Issue Type: New Feature
Affects Versions: 0.99.0
Reporter: zhaojianbo
Assignee: zhaojianbo
Attachments: HBASE-10827-0.98-branch.patch

In our online cluster, each machine usually has multiple ethernet cards, one for the outer network and one for the inner network. But the current version of HBase cannot use all of them, which wastes the network bandwidth of one ethernet card. If we make HBase use multiple ethernet cards concurrently, the performance of HBase will improve.
So I did the work and tested a simple scenario: 8 clients scan the same region data from a different machine with two ethernet cards (the regionserver machine also has two ethernet cards).
The environment is:
* An HBase cluster with a master, a regionserver, and a zookeeper, all started on one machine.
* An HDFS cluster with a Namenode, a datanode, and a secondary namenode, also started on the same machine.
* 8 clients run on a different machine.
* All data is local.
* 22GB data size
I measured the performance before and after the optimization. The results are:
||clients||time before optimization||time after optimization||
| 8 | 1665.07s | 1242.45s |
The patch is uploaded. What I did is the following:
# Create a new RPC, getAllServerAddress, which obtains all the addresses of the regionserver.
# The client calls the RPC to obtain the addresses, chooses one of them randomly, validates the address, and uses it as the regionLocation address.
# Add a cache, serverAddressMap, to avoid redundant RPCs.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
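
For readers who have not opened the patch, the three steps listed in the issue description could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from HBASE-10827-0.98-branch.patch: the class name `RegionServerAddressChooser`, the `choose` method, the fallback parameter, and the use of a `Function` and a `Predicate` as stand-ins for the RPC and the validation step are all hypothetical. Only the names `getAllServerAddress` and `serverAddressMap` come from the description.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;

/** Illustrative sketch only; none of these types are taken from the actual patch. */
class RegionServerAddressChooser {
    // Step 3: per-regionserver cache of known addresses (name borrowed from the issue text).
    private final Map<String, List<InetSocketAddress>> serverAddressMap = new ConcurrentHashMap<>();
    // Step 1: stand-in for the getAllServerAddress RPC; its real signature is not shown in the issue.
    private final Function<String, List<InetSocketAddress>> getAllServerAddress;
    // Step 2: stand-in for address validation; the issue does not say how the patch validates.
    private final Predicate<InetSocketAddress> validator;
    private final Random random;

    RegionServerAddressChooser(Function<String, List<InetSocketAddress>> getAllServerAddress,
                               Predicate<InetSocketAddress> validator,
                               Random random) {
        this.getAllServerAddress = getAllServerAddress;
        this.validator = validator;
        this.random = random;
    }

    /** Returns a randomly chosen address that passes validation, or the fallback if none does. */
    InetSocketAddress choose(String serverName, InetSocketAddress fallback) {
        // The RPC stand-in is invoked only on a cache miss.
        List<InetSocketAddress> all = serverAddressMap.computeIfAbsent(serverName, getAllServerAddress);
        if (all == null) {
            return fallback;
        }
        List<InetSocketAddress> valid = new ArrayList<>();
        for (InetSocketAddress address : all) {
            if (validator.test(address)) {
                valid.add(address);
            }
        }
        if (valid.isEmpty()) {
            return fallback;
        }
        return valid.get(random.nextInt(valid.size()));
    }
}
```

In this sketch, picking randomly among the validated addresses is what spreads client connections across the regionserver's network interfaces, and the cache means repeated lookups for the same regionserver do not trigger another RPC. Falling back to a caller-supplied address when nothing validates is a choice made for the example; the issue does not describe what the patch does in that case.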
[jira] [Commented] (HBASE-10827) Making HBase use multiple ethernet cards will improve the performance
[ https://issues.apache.org/jira/browse/HBASE-10827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13949523#comment-13949523 ]

Hadoop QA commented on HBASE-10827:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12636663/HBASE-10827-0.98-branch.patch
against trunk revision .
ATTACHMENT ID: 12636663

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:green}+1 tests included{color}. The patch appears to include 6 new or modified tests.

{color:red}-1 patch{color}. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/9113//console

This message is automatically generated.
[jira] [Commented] (HBASE-10827) Making HBase use multiple ethernet cards will improve the performance
[ https://issues.apache.org/jira/browse/HBASE-10827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13947945#comment-13947945 ]

Jonathan Hsieh commented on HBASE-10827:

Yes. Why not use that mechanism instead?
[jira] [Commented] (HBASE-10827) Making HBase use multiple ethernet cards will improve the performance
[ https://issues.apache.org/jira/browse/HBASE-10827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946613#comment-13946613 ]

Jonathan Hsieh commented on HBASE-10827:

Why not just use port bonding / LACP?
[jira] [Commented] (HBASE-10827) Making HBase use multiple ethernet cards will improve the performance
[ https://issues.apache.org/jira/browse/HBASE-10827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13947470#comment-13947470 ]

zhaojianbo commented on HBASE-10827:

{quote}
Why not just use port bonding / LACP?
{quote}
You mean that the two ethernet cards are bonded together and used, logically, as a single ethernet card?