Re: extreme imbalance in the HDFS cluster

2011-06-29 Thread 茅旭峰
Thanks, Edward! It seems we will just have to live with this issue.



Re: extreme imbalance in the HDFS cluster

2011-06-29 Thread Edward Capriolo
We have run into this issue as well. Since Hadoop writes round-robin, different-sized disks really screw things up royally, especially if you are running at high capacity. We have found that decommissioning hosts for stretches of time is more effective than the balancer in extreme situations. Another hokey trick relies on the fact that a job launched on a datanode places the first replica on that node: you can leverage that by launching jobs from your bigger machines, which makes data more likely to land there. The super-hokey solution is moving blocks around with rsync! (Block reports later happen and deal with this; I do not suggest it.)
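
Roughly, the decommissioning route goes through the namenode's exclude list (this assumes dfs.hosts.exclude is already configured; the excludes path and the IP below are only placeholders):

# add the over-full datanode to the excludes file named by dfs.hosts.exclude
echo "10.150.161.76" >> /etc/hadoop/conf/dfs.exclude
# tell the namenode to re-read its include/exclude lists and begin decommissioning
hadoop dfsadmin -refreshNodes
# the node reports "Decommission in progress" while its blocks are re-replicated elsewhere
hadoop dfsadmin -report
# once it finishes, remove the entry and run -refreshNodes again to bring the node back into service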

Hadoop really does need a more intelligent scheme than round-robin writing for heterogeneous clusters; there might be a JIRA open on this somewhere. But if you are on 0.20.x, you have to work with it.

Edward



extreme imbalance in the HDFS cluster

2011-06-29 Thread 茅旭峰
Hi,

I'm running a 37-datanode HDFS cluster. Twelve of the nodes have 20 TB of capacity each, and the other 25 nodes have 24 TB each. Unfortunately, several nodes hold much more data than the others, and I can still see their data growing like crazy. 'dstat' shows:

dstat -ta 2
----time---- ----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
  date/time   |usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw
24-06 00:42:43|  1   1  95   2   0   0|  25M   62M|   0     0 |   0   0.1 |3532  5644
24-06 00:42:45|  7   1  91   0   0   0|  16k  176k|8346B 1447k|   0     0 |1201   365
24-06 00:42:47|  7   1  91   0   0   0|  12k  172k|9577B 1493k|   0     0 |1223   334
24-06 00:42:49| 11   3  83   1   0   1|  26M   11M|  78M   66M|   0     0 | 12k   18k
24-06 00:42:51|  4   3  90   1   0   2|  17M  181M| 117M   53M|   0     0 | 15k   26k
24-06 00:42:53|  4   3  87   4   0   2|  15M  375M| 117M   55M|   0     0 | 16k   26k
24-06 00:42:55|  3   2  94   1   0   1|  15M   37M|  80M   17M|   0     0 | 10k   15k
24-06 00:42:57|  0   0  98   1   0   0|  18M   23M|7259k 5988k|   0     0 |1932  1066
24-06 00:42:59|  0   0  98   1   0   0|  16M  132M| 708k  106k|   0     0 |1484   491
24-06 00:43:01|  4   2  91   2   0   1|  23M   64M|  76M   41M|   0     0 |8441   13k
24-06 00:43:03|  4   3  88   3   0   1|  17M  207M|  91M   48M|   0     0 | 11k   16k

From the dstat output, we can see that write throughput is much higher than read throughput. I've started a balancer process and set dfs.balance.bandwidthPerSec (the value is in bytes per second). From the balancer log, I can see the balancer is working, but the balancing cannot keep up with the writes.
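
(For context, the balancer on 0.20 is typically driven like this; the threshold below is just an example value, not necessarily the one I used:)

# dfs.balance.bandwidthPerSec is read from hdfs-site.xml by the datanodes (bytes per second),
# so on 0.20 a change only takes effect after the datanodes are restarted.
# The balancer then moves blocks until every datanode is within the given percentage
# of the cluster-wide average utilization:
hadoop balancer -threshold 5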

Right now the only way I can stop the runaway growth is to stop the datanode, set dfs.datanode.du.reserved to 300 GB, and then start the datanode again. The growth only stops once the total size hits the 300 GB reservation line.
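
(In hdfs-site.xml the reservation is given in bytes, so a 300 GB reservation looks roughly like this, assuming 1 GB = 1024^3 bytes:)

<!-- hdfs-site.xml on the affected datanode; dfs.datanode.du.reserved is in bytes -->
<property>
  <name>dfs.datanode.du.reserved</name>
  <!-- 300 * 1024^3 = 322122547200 bytes (300 GB) -->
  <value>322122547200</value>
</property>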

The output of 'hadoop dfsadmin -report' for the runaway nodes shows:

Name: 10.150.161.88:50010
Decommission Status : Normal
Configured Capacity: 20027709382656 (18.22 TB)
DFS Used: 14515387866480 (13.2 TB)
Non DFS Used: 0 (0 KB)
DFS Remaining: 5512321516176(5.01 TB)
DFS Used%: 72.48%
DFS Remaining%: 27.52%
Last contact: Wed Jun 29 21:03:01 CST 2011


Name: 10.150.161.76:50010
Decommission Status : Normal
Configured Capacity: 20027709382656 (18.22 TB)
DFS Used: 16554450730194 (15.06 TB)
Non DFS Used: 0 (0 KB)
DFS Remaining: 3473258652462(3.16 TB)
DFS Used%: 82.66%
DFS Remaining%: 17.34%
Last contact: Wed Jun 29 21:03:02 CST 2011

while the other, normal datanodes look like this:

Name: 10.150.161.65:50010
Decommission Status : Normal
Configured Capacity: 23627709382656 (21.49 TB)
DFS Used: 5953984552236 (5.42 TB)
Non DFS Used: 1200643810004 (1.09 TB)
DFS Remaining: 16473081020416(14.98 TB)
DFS Used%: 25.2%
DFS Remaining%: 69.72%
Last contact: Wed Jun 29 21:03:01 CST 2011


Name: 10.150.161.80:50010
Decommission Status : Normal
Configured Capacity: 23627709382656 (21.49 TB)
DFS Used: 5982565373592 (5.44 TB)
Non DFS Used: 1202701691240 (1.09 TB)
DFS Remaining: 16442442317824(14.95 TB)
DFS Used%: 25.32%
DFS Remaining%: 69.59%
Last contact: Wed Jun 29 21:03:02 CST 2011

Any hints on this issue? We are using 0.20.2-cdh3u0.

Thanks and regards,

Mao Xu-Feng