[jira] [Updated] (HDFS-2243) DataXceiver per accept seems to be a bottleneck in HBase/YCSB test
     [ https://issues.apache.org/jira/browse/HDFS-2243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Caspole updated HDFS-2243:
-------------------------------
    Resolution: Won't Fix
        Status: Resolved  (was: Patch Available)

Doesn't show any benefit after HDFS-941 fix, closing.

> DataXceiver per accept seems to be a bottleneck in HBase/YCSB test
> ------------------------------------------------------------------
>
>                 Key: HDFS-2243
>                 URL: https://issues.apache.org/jira/browse/HDFS-2243
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node
>    Affects Versions: 0.23.0
>         Environment: Using Fedora 14 on a quad core phenom system
>            Reporter: Eric Caspole
>            Priority: Minor
>             Fix For: 0.24.0
>
>         Attachments: HDFS-2234-branch-0.20-append.patch,
>                      HDFS-2243-0.23-110909.patch, HDFS-2243-0.23-110909.txt,
>                      datanode-perf-110808.gif
>
>
> I am running the YCSB benchmark against HBase, sometimes against a single
> node, sometimes against a cluster of 6 systems. As the load increases into
> thousands of TPS, especially on the single node, I can see that the datanode
> runs very high system time and seems to be bottlenecked by how fast it can
> create the threads to handle the new connections in DataXceiverServer.run. By
> "perf top" I can see the process spends about 12% of all its time in
> pthread_create, and in hprof profiles I can see there are tens of thousands
> of threads created in just a few minutes of test execution.
> Does anyone else observe this bottleneck? Is there a major challenge to using
> a thread pool of DataXceivers in this situation?

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
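For readers unfamiliar with the pattern the report describes, here is a minimal sketch of a thread-per-accept loop. It is not the DataXceiverServer code: the class name ThreadPerAcceptSketch and the method serve are hypothetical, and a counter increment stands in for the real per-connection work. It only shows why the number of threads created tracks the number of connections one for one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class ThreadPerAcceptSketch {

    /**
     * Starts one new thread per simulated connection and returns how many
     * threads were started. Hypothetical stand-in for an accept loop.
     */
    static int serve(int connections) {
        AtomicInteger handled = new AtomicInteger();
        List<Thread> started = new ArrayList<>();
        for (int i = 0; i < connections; i++) {
            // Stand-in for "accept a socket, wrap it in a handler, start a thread".
            Thread t = new Thread(handled::incrementAndGet);
            t.start();
            started.add(t);
        }
        try {
            for (Thread t : started) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        if (handled.get() != connections) {
            throw new IllegalStateException("not every connection was handled");
        }
        return started.size();
    }

    public static void main(String[] args) {
        System.out.println("threads started: " + serve(1000));
    }
}
```

Every call to Thread.start() here ends up creating a native thread, which is the cost the reporter saw attributed to pthread_create in the perf output.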
[jira] [Updated] (HDFS-2243) DataXceiver per accept seems to be a bottleneck in HBase/YCSB test
     [ https://issues.apache.org/jira/browse/HDFS-2243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Caspole updated HDFS-2243:
-------------------------------
    Status: Patch Available  (was: Open)

The HDFS-2243-0.23-110909.patch should apply to trunk

> DataXceiver per accept seems to be a bottleneck in HBase/YCSB test
> ------------------------------------------------------------------
>
>                 Key: HDFS-2243
>                 URL: https://issues.apache.org/jira/browse/HDFS-2243
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node
>    Affects Versions: 0.23.0
>         Environment: Using Fedora 14 on a quad core phenom system
>            Reporter: Eric Caspole
>            Priority: Minor
>             Fix For: 0.24.0
>
>         Attachments: HDFS-2234-branch-0.20-append.patch,
>                      HDFS-2243-0.23-110909.patch, HDFS-2243-0.23-110909.txt,
>                      datanode-perf-110808.gif
[jira] [Updated] (HDFS-2243) DataXceiver per accept seems to be a bottleneck in HBase/YCSB test
     [ https://issues.apache.org/jira/browse/HDFS-2243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Caspole updated HDFS-2243:
-------------------------------
    Attachment: HDFS-2243-0.23-110909.patch

> DataXceiver per accept seems to be a bottleneck in HBase/YCSB test
> ------------------------------------------------------------------
>
>                 Key: HDFS-2243
>                 URL: https://issues.apache.org/jira/browse/HDFS-2243
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node
>    Affects Versions: 0.23.0
>         Environment: Using Fedora 14 on a quad core phenom system
>            Reporter: Eric Caspole
>            Priority: Minor
>             Fix For: 0.24.0
>
>         Attachments: HDFS-2234-branch-0.20-append.patch,
>                      HDFS-2243-0.23-110909.patch, HDFS-2243-0.23-110909.txt,
>                      datanode-perf-110808.gif
[jira] [Updated] (HDFS-2243) DataXceiver per accept seems to be a bottleneck in HBase/YCSB test
     [ https://issues.apache.org/jira/browse/HDFS-2243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Caspole updated HDFS-2243:
-------------------------------
    Status: Open  (was: Patch Available)

> DataXceiver per accept seems to be a bottleneck in HBase/YCSB test
> ------------------------------------------------------------------
>
>                 Key: HDFS-2243
>                 URL: https://issues.apache.org/jira/browse/HDFS-2243
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node
>    Affects Versions: 0.23.0
>         Environment: Using Fedora 14 on a quad core phenom system
>            Reporter: Eric Caspole
>            Priority: Minor
>             Fix For: 0.24.0
>
>         Attachments: HDFS-2234-branch-0.20-append.patch,
>                      HDFS-2243-0.23-110909.txt, datanode-perf-110808.gif
[jira] [Updated] (HDFS-2243) DataXceiver per accept seems to be a bottleneck in HBase/YCSB test
     [ https://issues.apache.org/jira/browse/HDFS-2243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Caspole updated HDFS-2243:
-------------------------------
    Fix Version/s: 0.24.0
           Status: Patch Available  (was: Open)

> DataXceiver per accept seems to be a bottleneck in HBase/YCSB test
> ------------------------------------------------------------------
>
>                 Key: HDFS-2243
>                 URL: https://issues.apache.org/jira/browse/HDFS-2243
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node
>    Affects Versions: 0.23.0
>         Environment: Using Fedora 14 on a quad core phenom system
>            Reporter: Eric Caspole
>            Priority: Minor
>             Fix For: 0.24.0
>
>         Attachments: HDFS-2234-branch-0.20-append.patch,
>                      HDFS-2243-0.23-110909.txt, datanode-perf-110808.gif
[jira] [Updated] (HDFS-2243) DataXceiver per accept seems to be a bottleneck in HBase/YCSB test
     [ https://issues.apache.org/jira/browse/HDFS-2243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Caspole updated HDFS-2243:
-------------------------------
    Attachment: HDFS-2243-0.23-110909.txt

Here is the patch merged into trunk. I switched to use j.u.c.ThreadPoolExecutor, which seemed a better way than my first version.
In YCSB I don't see any performance difference over the trunk; it just uses the available Java API to avoid launching our own threads.

> DataXceiver per accept seems to be a bottleneck in HBase/YCSB test
> ------------------------------------------------------------------
>
>                 Key: HDFS-2243
>                 URL: https://issues.apache.org/jira/browse/HDFS-2243
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node
>    Affects Versions: 0.23.0
>         Environment: Using Fedora 14 on a quad core phenom system
>            Reporter: Eric Caspole
>            Priority: Minor
>         Attachments: HDFS-2234-branch-0.20-append.patch,
>                      HDFS-2243-0.23-110909.txt, datanode-perf-110808.gif
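As a rough illustration of the switch to java.util.concurrent.ThreadPoolExecutor mentioned above, the sketch below hands simulated connections to a fixed-size pool and counts how many distinct worker threads ran them. It is not taken from the attached patch: the class name PooledXceiverSketch, the method distinctThreads, and the unbounded LinkedBlockingQueue are assumptions chosen to keep the example short.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PooledXceiverSketch {

    /**
     * Runs one small task per simulated connection on a fixed-size pool and
     * returns the number of distinct worker threads that executed them.
     */
    static int distinctThreads(int connections, int poolSize) {
        Set<Thread> workers = ConcurrentHashMap.newKeySet();
        // Core size == maximum size: the pool never grows past poolSize threads.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                poolSize, poolSize, 60L, TimeUnit.SECONDS,
                new LinkedBlockingQueue<>());
        for (int i = 0; i < connections; i++) {
            // Stand-in for handling one accepted connection.
            pool.execute(() -> workers.add(Thread.currentThread()));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("pool did not drain in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return workers.size();
    }

    public static void main(String[] args) {
        System.out.println("distinct worker threads: " + distinctThreads(1000, 4));
    }
}
```

However many connections arrive, the count never exceeds poolSize, because finished workers pick up queued tasks instead of new threads being created.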
[jira] [Updated] (HDFS-2243) DataXceiver per accept seems to be a bottleneck in HBase/YCSB test
     [ https://issues.apache.org/jira/browse/HDFS-2243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Caspole updated HDFS-2243:
-------------------------------
    Attachment: HDFS-2234-branch-0.20-append.patch

Here is a thread pool patch based on branch-0.20-append, since that is what is documented to work with HBase 0.90.3. I reused the dfs.datanode.max.xcievers to indicate the thread pool size, which is probably cheating.

I tested this patch with YCSB workloadb on 3 AMD and 2 Intel systems I have handy. It gives at least equal if not better response times and reduces cpu consumption of the datanode process by 5-10% depending on the system and load.

Is this a viable approach? If so, I'll forward port it to 0.23 and I can retest if I can find out which HBase works with 0.23 hdfs.

> DataXceiver per accept seems to be a bottleneck in HBase/YCSB test
> ------------------------------------------------------------------
>
>                 Key: HDFS-2243
>                 URL: https://issues.apache.org/jira/browse/HDFS-2243
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node
>    Affects Versions: 0.23.0
>         Environment: Using Fedora 14 on a quad core phenom system
>            Reporter: Eric Caspole
>            Priority: Minor
>         Attachments: HDFS-2234-branch-0.20-append.patch,
>                      datanode-perf-110808.gif
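The idea of reusing dfs.datanode.max.xcievers as the pool size could look roughly like the sketch below. It is a guess at the shape of such a change, not the attached patch: java.util.Properties stands in for Hadoop's configuration object, the class and method names are hypothetical, the fallback of 256 is an illustrative value only, and the SynchronousQueue (which makes the pool reject work once every thread is busy) is an assumption about how a hard limit might be kept.

```java
import java.util.Properties;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class ConfiguredPoolSketch {
    static final String KEY = "dfs.datanode.max.xcievers";
    static final int FALLBACK = 256; // illustrative assumption, not a documented default

    /** Builds a pool whose size comes from the (stand-in) configuration. */
    static ThreadPoolExecutor newPool(Properties conf) {
        int size = Integer.parseInt(conf.getProperty(KEY, String.valueOf(FALLBACK)));
        // SynchronousQueue holds no tasks, so work beyond `size` busy threads is rejected.
        return new ThreadPoolExecutor(size, size, 60L, TimeUnit.SECONDS,
                new SynchronousQueue<>());
    }

    /** Returns true if one connection beyond the configured limit is rejected. */
    static boolean rejectsBeyondLimit(int limit) {
        Properties conf = new Properties();
        conf.setProperty(KEY, Integer.toString(limit));
        ThreadPoolExecutor pool = newPool(conf);
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < limit; i++) {
                pool.execute(() -> awaitQuietly(release)); // occupy every worker
            }
            try {
                pool.execute(() -> { });
                return false;
            } catch (RejectedExecutionException expected) {
                return true;
            }
        } finally {
            release.countDown(); // let the blocked workers finish
            pool.shutdown();
        }
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("rejected beyond limit: " + rejectsBeyondLimit(2));
    }
}
```

A queue-backed pool would instead make excess connections wait, so the choice of queue decides whether the setting acts as a hard cap or only as a concurrency level.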
[jira] [Updated] (HDFS-2243) DataXceiver per accept seems to be a bottleneck in HBase/YCSB test
     [ https://issues.apache.org/jira/browse/HDFS-2243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Caspole updated HDFS-2243:
-------------------------------
    Attachment: datanode-perf-110808.gif

perf top of datanode at 1000 tps. This was using hdfs from the branch-0.20-append.

> DataXceiver per accept seems to be a bottleneck in HBase/YCSB test
> ------------------------------------------------------------------
>
>                 Key: HDFS-2243
>                 URL: https://issues.apache.org/jira/browse/HDFS-2243
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node
>    Affects Versions: 0.23.0
>         Environment: Using Fedora 14 on a quad core phenom system
>            Reporter: Eric Caspole
>            Priority: Minor
>         Attachments: datanode-perf-110808.gif