[
https://issues.apache.org/jira/browse/HADOOP-2144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Runping Qi updated HADOOP-2144:
-------------------------------
Component/s: dfs
Description:
I ran a test on DFS read throughput and found that the data node
process consumes up to 180% CPU when it is under heavy load. Here are the
details:
The cluster has 380+ machines, each with 3GB of memory, 4 CPUs, and 4 disks.
I copied a 10GB file to DFS from one machine that also runs a data node.
Based on the DFS block placement policy, that machine has one replica of each
block of the file.
Then I ran 4 of the following commands in parallel:
hadoop dfs -cat thefile > /dev/null &
Since all the blocks have a local replica, all the read requests went to the
local data node.
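A minimal sketch of driving this test from one shell (thefile is the 10GB file
above; the top/iostat monitoring commands are only one way to watch the load,
not necessarily how it was measured):

    # start 4 concurrent local reads of the same 10GB DFS file
    for i in 1 2 3 4; do
      hadoop dfs -cat thefile > /dev/null &
    done

    # in a second shell, watch the data node JVM and the disks while the reads run;
    # the pgrep pattern assumes a single DataNode process on the host
    #   top -p "$(pgrep -f -d, DataNode)"
    #   iostat -x 5

    # back in the first shell, wait for all 4 readers to finish
    wait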
I observed that:
The data node process's CPU usage was around 180% for most of the time.
The clients' CPU usage was moderate (as it should be).
All four disks were working concurrently with comparable read
throughput.
The total read throughput peaked at 90MB/sec, about 60% of the expected
aggregate maximum read throughput of the 4 disks (160MB/sec). Thus the disks
were not the bottleneck in this case.
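(Working the numbers: 4 disks at roughly 40MB/sec each gives the expected
~160MB/sec aggregate, and 90 / 160 is about 56%, i.e. roughly the 60% figure
above.)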
The data node's CPU usage seems unreasonably high.
was:
I ran a test on DFS read throughput and found that the data node process
consumes up to 180% CPU when it is under heavy load. Here are the details:
The cluster has 380+ machines, each with 3GB of memory, 4 CPUs, and 4 disks.
I copied a 10GB file to DFS from one machine that also runs a data node.
Based on the DFS block placement policy, that machine has one replica of each
block of the file.
Then I ran 4 of the following commands in parallel:
hadoop dfs -cat thefile > /dev/null &
Since all the blocks have a local replica, all the read requests went to the
local data node.
I observed that:
The data node process's CPU usage was around 180% for most of the time.
The clients' CPU usage was moderate (as it should be).
All four disks were working concurrently with comparable read
throughput.
The total read throughput peaked at 90MB/sec, about 60% of the expected
aggregate maximum read throughput of the 4 disks (160MB/sec).
The data node's CPU usage seems unreasonably high.
> Data node process consumes 180% cpu
> ------------------------------------
>
> Key: HADOOP-2144
> URL: https://issues.apache.org/jira/browse/HADOOP-2144
> Project: Hadoop
> Issue Type: Improvement
> Components: dfs
> Reporter: Runping Qi
>
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.