[jira] [Updated] (HDFS-2243) DataXceiver per accept seems to be a bottleneck in HBase/YCSB test

2011-09-26 Thread Eric Caspole (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Caspole updated HDFS-2243:
---

Resolution: Won't Fix
Status: Resolved  (was: Patch Available)

This change doesn't show any benefit after the HDFS-941 fix, so I am closing 
the issue.

> DataXceiver per accept seems to be a bottleneck in HBase/YCSB test
> --
>
> Key: HDFS-2243
> URL: https://issues.apache.org/jira/browse/HDFS-2243
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: data-node
> Affects Versions: 0.23.0
> Environment: Using Fedora 14 on a quad core phenom system
> Reporter: Eric Caspole
> Priority: Minor
> Fix For: 0.24.0
>
> Attachments: HDFS-2234-branch-0.20-append.patch, 
> HDFS-2243-0.23-110909.patch, HDFS-2243-0.23-110909.txt, 
> datanode-perf-110808.gif
>
>
> I am running the YCSB benchmark against HBase, sometimes against a single 
> node, sometimes against a cluster of 6 systems. As the load increases into 
> thousands of TPS, especially on the single node, I can see that the datanode 
> shows very high system time and seems to be bottlenecked by how fast it can 
> create the threads to handle new connections in DataXceiverServer.run. With 
> "perf top" I can see the process spends about 12% of its time in 
> pthread_create, and in hprof profiles I can see that tens of thousands of 
> threads are created in just a few minutes of test execution.
> Does anyone else observe this bottleneck? Is there a major challenge to using 
> a thread pool of DataXceivers in this situation?
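
The thread churn described above can be illustrated with a minimal Java sketch. 
This is not DataXceiverServer code: ThreadReuseSketch, CountingFactory, and the 
two methods are hypothetical names, and the counting ThreadFactory only stands 
in for what hprof reported. It counts how many threads get created when every 
task gets its own thread, and how many when the same tasks run on a small 
fixed pool.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class ThreadReuseSketch {

    /** ThreadFactory that records how many threads it was asked to create. */
    static final class CountingFactory implements ThreadFactory {
        final AtomicInteger created = new AtomicInteger();

        @Override
        public Thread newThread(Runnable r) {
            created.incrementAndGet();
            return new Thread(r);
        }
    }

    /** One new thread per task, as in a thread-per-accept server. */
    public static int threadsCreatedPerTask(int tasks) {
        CountingFactory factory = new CountingFactory();
        try {
            for (int i = 0; i < tasks; i++) {
                Thread t = factory.newThread(() -> { });
                t.start();
                t.join(); // keep the sketch simple: one live thread at a time
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return factory.created.get();
    }

    /** The same number of tasks on a fixed pool, where workers are reused. */
    public static int threadsCreatedPooled(int tasks, int poolSize) {
        CountingFactory factory = new CountingFactory();
        ExecutorService pool = Executors.newFixedThreadPool(poolSize, factory);
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(() -> { });
            }
            pool.shutdown();
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return factory.created.get();
    }

    public static void main(String[] args) {
        System.out.println("thread-per-task created: " + threadsCreatedPerTask(1000));
        System.out.println("pooled created:          " + threadsCreatedPooled(1000, 8));
    }
}
```

In this sketch the first count equals the number of tasks, while the second 
never exceeds the pool size, which is the saving a pool of DataXceivers would 
be aiming for.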

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-2243) DataXceiver per accept seems to be a bottleneck in HBase/YCSB test

2011-09-16 Thread Eric Caspole (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Caspole updated HDFS-2243:
---

Status: Patch Available  (was: Open)

The HDFS-2243-0.23-110909.patch attachment should apply to trunk.

> DataXceiver per accept seems to be a bottleneck in HBase/YCSB test
> --
>
> Key: HDFS-2243
> URL: https://issues.apache.org/jira/browse/HDFS-2243
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: data-node
> Affects Versions: 0.23.0
> Environment: Using Fedora 14 on a quad core phenom system
> Reporter: Eric Caspole
> Priority: Minor
> Fix For: 0.24.0
>
> Attachments: HDFS-2234-branch-0.20-append.patch, 
> HDFS-2243-0.23-110909.patch, HDFS-2243-0.23-110909.txt, 
> datanode-perf-110808.gif




[jira] [Updated] (HDFS-2243) DataXceiver per accept seems to be a bottleneck in HBase/YCSB test

2011-09-16 Thread Eric Caspole (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Caspole updated HDFS-2243:
---

Attachment: HDFS-2243-0.23-110909.patch

> DataXceiver per accept seems to be a bottleneck in HBase/YCSB test
> --
>
> Key: HDFS-2243
> URL: https://issues.apache.org/jira/browse/HDFS-2243
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: data-node
> Affects Versions: 0.23.0
> Environment: Using Fedora 14 on a quad core phenom system
> Reporter: Eric Caspole
> Priority: Minor
> Fix For: 0.24.0
>
> Attachments: HDFS-2234-branch-0.20-append.patch, 
> HDFS-2243-0.23-110909.patch, HDFS-2243-0.23-110909.txt, 
> datanode-perf-110808.gif




[jira] [Updated] (HDFS-2243) DataXceiver per accept seems to be a bottleneck in HBase/YCSB test

2011-09-16 Thread Eric Caspole (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Caspole updated HDFS-2243:
---

Status: Open  (was: Patch Available)

> DataXceiver per accept seems to be a bottleneck in HBase/YCSB test
> --
>
> Key: HDFS-2243
> URL: https://issues.apache.org/jira/browse/HDFS-2243
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node
>Affects Versions: 0.23.0
> Environment: Using Fedora 14 on a quad core phenom system
>Reporter: Eric Caspole
>Priority: Minor
> Fix For: 0.24.0
>
> Attachments: HDFS-2234-branch-0.20-append.patch, 
> HDFS-2243-0.23-110909.txt, datanode-perf-110808.gif




[jira] [Updated] (HDFS-2243) DataXceiver per accept seems to be a bottleneck in HBase/YCSB test

2011-09-16 Thread Eric Caspole (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Caspole updated HDFS-2243:
---

Fix Version/s: 0.24.0
Status: Patch Available  (was: Open)

> DataXceiver per accept seems to be a bottleneck in HBase/YCSB test
> --
>
> Key: HDFS-2243
> URL: https://issues.apache.org/jira/browse/HDFS-2243
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: data-node
> Affects Versions: 0.23.0
> Environment: Using Fedora 14 on a quad core phenom system
> Reporter: Eric Caspole
> Priority: Minor
> Fix For: 0.24.0
>
> Attachments: HDFS-2234-branch-0.20-append.patch, 
> HDFS-2243-0.23-110909.txt, datanode-perf-110808.gif




[jira] [Updated] (HDFS-2243) DataXceiver per accept seems to be a bottleneck in HBase/YCSB test

2011-09-12 Thread Eric Caspole (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Caspole updated HDFS-2243:
---

Attachment: HDFS-2243-0.23-110909.txt

Here is the patch merged into trunk. I switched to 
java.util.concurrent.ThreadPoolExecutor, which seemed a better approach than my 
first version. In YCSB I don't see any performance difference from trunk; the 
patch just uses the available Java API to avoid launching our own threads.
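
As a rough illustration of the ThreadPoolExecutor approach, here is a minimal, 
self-contained sketch of an accept loop that hands each socket to a pool 
instead of starting a thread per connection. It is not the attached patch: 
PooledAcceptLoop, roundTrip, the pool size of 4, and the one-line "ack" 
protocol are all made up for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for an accept loop such as DataXceiverServer.run. */
final class PooledAcceptLoop implements Runnable, AutoCloseable {
    private final ServerSocket server;
    private final ThreadPoolExecutor pool;

    PooledAcceptLoop(int workers) throws IOException {
        // Port 0 asks the OS for any free port; loopback keeps the demo local.
        this.server = new ServerSocket(0, 50, InetAddress.getLoopbackAddress());
        this.pool = new ThreadPoolExecutor(workers, workers, 60, TimeUnit.SECONDS,
                new LinkedBlockingQueue<Runnable>());
    }

    int port() {
        return server.getLocalPort();
    }

    @Override
    public void run() {
        while (!server.isClosed()) {
            try {
                Socket socket = server.accept();
                try {
                    // Reuse a pooled worker instead of calling new Thread(...).start().
                    pool.execute(() -> handle(socket));
                } catch (RejectedExecutionException e) {
                    socket.close(); // the pool was already shut down
                }
            } catch (IOException e) {
                // accept() fails once close() is called; the loop condition ends the loop.
            }
        }
    }

    /** Reads one line and answers with "ack <line>", then closes the socket. */
    private static void handle(Socket socket) {
        try (Socket s = socket;
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8));
             PrintWriter out = new PrintWriter(
                     new OutputStreamWriter(s.getOutputStream(), StandardCharsets.UTF_8), true)) {
            out.println("ack " + in.readLine());
        } catch (IOException e) {
            // A real server would log this; the sketch just drops the connection.
        }
    }

    @Override
    public void close() throws IOException {
        server.close();
        pool.shutdown();
    }

    /** Starts a server, sends one line from a client, and returns the reply. */
    public static String roundTrip(String line) {
        try (PooledAcceptLoop loop = new PooledAcceptLoop(4)) {
            Thread acceptor = new Thread(loop, "acceptor");
            acceptor.setDaemon(true);
            acceptor.start();
            try (Socket c = new Socket(InetAddress.getLoopbackAddress(), loop.port());
                 PrintWriter out = new PrintWriter(
                         new OutputStreamWriter(c.getOutputStream(), StandardCharsets.UTF_8), true);
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(c.getInputStream(), StandardCharsets.UTF_8))) {
                out.println(line);
                return in.readLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello"));
    }
}
```

The unbounded LinkedBlockingQueue here means extra connections wait for a free 
worker; whether that is acceptable for a datanode is a separate design 
question that this sketch does not try to answer.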

> DataXceiver per accept seems to be a bottleneck in HBase/YCSB test
> --
>
> Key: HDFS-2243
> URL: https://issues.apache.org/jira/browse/HDFS-2243
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: data-node
> Affects Versions: 0.23.0
> Environment: Using Fedora 14 on a quad core phenom system
> Reporter: Eric Caspole
> Priority: Minor
> Attachments: HDFS-2234-branch-0.20-append.patch, 
> HDFS-2243-0.23-110909.txt, datanode-perf-110808.gif




[jira] [Updated] (HDFS-2243) DataXceiver per accept seems to be a bottleneck in HBase/YCSB test

2011-08-23 Thread Eric Caspole (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Caspole updated HDFS-2243:
---

Attachment: HDFS-2234-branch-0.20-append.patch

Here is a thread pool patch based on branch-0.20-append, since that is what is 
documented to work with HBase 0.90.3. I reused the dfs.datanode.max.xcievers 
setting to indicate the thread pool size, which is probably cheating.
I tested this patch with YCSB workloadb on 3 AMD and 2 Intel systems I have 
handy. It gives response times at least equal to, if not better than, the 
unpatched build, and it reduces CPU consumption of the datanode process by 
5-10%, depending on the system and load.
Is this a viable approach? If so, I'll forward-port it to 0.23, and I can 
retest if I can find out which HBase works with 0.23 HDFS.
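
One possible reading of the "cheating" remark: dfs.datanode.max.xcievers is a 
cap on concurrent xceivers, whereas a pool size usually implies that extra work 
queues up. The sketch below is not the code in the patch. It shows one way a 
pool sized from that key could keep hard-cap behavior, by using a 
SynchronousQueue so that work beyond the limit is rejected rather than queued. 
XceiverPoolSizing and its methods are hypothetical, java.util.Properties stands 
in for Hadoop's Configuration class, and the fallback of 256 is an assumption.

```java
import java.util.Properties;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class XceiverPoolSizing {
    static final String KEY = "dfs.datanode.max.xcievers";
    static final int FALLBACK = 256; // assumption, used only when the key is unset

    /** Builds a pool whose worker count comes from the xceiver limit. */
    public static ThreadPoolExecutor poolFrom(Properties conf) {
        int max = Integer.parseInt(conf.getProperty(KEY, String.valueOf(FALLBACK)));
        // SynchronousQueue holds no tasks: a task either goes straight to a
        // worker or is rejected by the default AbortPolicy handler.
        return new ThreadPoolExecutor(max, max, 60, TimeUnit.SECONDS,
                new SynchronousQueue<Runnable>());
    }

    /** Submits blocking tasks and returns how many were rejected. */
    public static int rejectedWhenSaturated(int poolSize, int attempts) {
        Properties conf = new Properties();
        conf.setProperty(KEY, Integer.toString(poolSize));
        ThreadPoolExecutor pool = poolFrom(conf);
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        try {
            for (int i = 0; i < attempts; i++) {
                try {
                    // Each task parks until released, so workers stay busy.
                    pool.execute(() -> {
                        try {
                            release.await();
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    rejected++;
                }
            }
        } finally {
            release.countDown(); // let the parked workers finish
            pool.shutdown();
        }
        return rejected;
    }

    public static void main(String[] args) {
        System.out.println("rejected: " + rejectedWhenSaturated(2, 5));
    }
}
```

With a pool of 2 and 5 attempts, the 3 submissions made while both workers are 
busy are rejected, which resembles a connection limit more than a work queue.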

> DataXceiver per accept seems to be a bottleneck in HBase/YCSB test
> --
>
> Key: HDFS-2243
> URL: https://issues.apache.org/jira/browse/HDFS-2243
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: data-node
> Affects Versions: 0.23.0
> Environment: Using Fedora 14 on a quad core phenom system
> Reporter: Eric Caspole
> Priority: Minor
> Attachments: HDFS-2234-branch-0.20-append.patch, 
> datanode-perf-110808.gif




[jira] [Updated] (HDFS-2243) DataXceiver per accept seems to be a bottleneck in HBase/YCSB test

2011-08-10 Thread Eric Caspole (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Caspole updated HDFS-2243:
---

Attachment: datanode-perf-110808.gif

Attached is the perf top output for the datanode at 1000 TPS. This was using 
HDFS from branch-0.20-append.

> DataXceiver per accept seems to be a bottleneck in HBase/YCSB test
> --
>
> Key: HDFS-2243
> URL: https://issues.apache.org/jira/browse/HDFS-2243
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: data-node
> Affects Versions: 0.23.0
> Environment: Using Fedora 14 on a quad core phenom system
> Reporter: Eric Caspole
> Priority: Minor
> Attachments: datanode-perf-110808.gif