2suresh:
> When you brought down the DN, the blocks in it were
> replicated to the remaining DNs. When the DN was
> added back, the blocks in it were over replicated, resulting
> in deletion of the extra replica.

Hm, this makes sense if, after the DN starts back up with its old block data
still on disk, those existing blocks get reported to the NN again (and thus
become over-replicated).
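
If it happens again, I guess one could confirm this directly by looking at
where the replicas of a particular file live (the path below is just an
example, not from my cluster):

    # show each block of a file and the DNs holding its replicas
    hadoop fsck /hbase/mytable/somefile -files -blocks -locations

An over-replicated block should show more locations than the configured
replication factor.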

2Stack:
> Could it be that the du was counting the downed DNs blocks for a
> while.

Do you mean that du may still count blocks on a DN which the NN already
considers "dead" (and for which the NN has already started re-replicating the
under-replicated blocks)? Sounds weird: if the NN sees blocks as
under-replicated, then du should see them the same way (i.e. not count the
replicas sitting on the dead DN), right?
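
For what it's worth, I assume the NN's view can be checked directly with
fsck; its summary lists the over-/under-replicated block counts:

    # summary includes "Under-replicated blocks" and "Over-replicated blocks"
    hadoop fsck / | grep -i replicated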

> When you brought back the old DN, NN told it
> clean up blocks it had replicated elsewhere?

Not sure about this. I watched the DN status and the blocks on it for a bit
via the NN web UI: there's a table for that (it lists the DNs with the number
of blocks on each). I *think* that after the DN came back the table showed 0
blocks for it. Although, at some point after the DN was stopped (or was it
after it had already started back up?) I noticed that the total number of
blocks was bigger than before I stopped it. In the long run an hdfs status
check told me that all blocks are replicated exactly 2 times.
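
Next time I'll also grab this info from the command line instead of
eyeballing the web UI; I believe something like this gives a per-DN report
(not sure it includes the per-DN block counts the UI table shows, though):

    # per-DN capacity/usage report from the NN
    hadoop dfsadmin -report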

Unfortunately I didn't watch the status & stats closely during the DN
reconfiguration, as I didn't expect anything to go wrong. Will watch more
closely next time.
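
Roughly, I'm thinking of something like the following (the /hbase path and
the 60s interval are just my guesses, adjust to taste):

    # poll the table size and the replication health every minute
    while true; do
      date
      hadoop fs -dus /hbase                        # total size under the HBase root
      hadoop fsck / | grep -i 'replicated blocks'  # under-/over-replicated counts
      sleep 60
    done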

Alex.

On Tue, Mar 15, 2011 at 10:32 AM, suresh srinivas <srini30...@gmail.com> wrote:

> When you brought down the DN, the blocks in it were replicated to the
> remaining DNs. When the DN was added back, the blocks in it were over
> replicated, resulting in deletion of the extra replica.
>
> On Mon, Mar 14, 2011 at 7:34 AM, Alex Baranau <alex.barano...@gmail.com> wrote:
>
>> Hello,
>>
>> As far as I understand, since the "hadoop fs -du" command uses Linux's "du"
>> internally, this means that the number of replicas (at the moment the
>> command is run) affects the result. Is that correct?
>>
>> I have the following case.
>> I have a small (1 master + 5 slaves, each with DN, TT & RS) test HBase
>> cluster with replication set to 2. The tables' data size is monitored with
>> the help of the "hadoop fs -du" command. There's a table which is constantly
>> written to: data is only ever added to it.
>> At some point I decided to reconfigure one of the slaves and shut it down.
>> After reconfiguration (by which time HBase had already marked it as dead) I
>> brought it up again. Things went smoothly. However, on the table size graph
>> (which I drew from data fetched with the "hadoop fs -du" command) I noticed
>> a little spike up in data size, which then went back down to the
>> normal/expected values. Could it be that at some point during the
>> take-out/reconfigure/add-back procedure the blocks were over-replicated?
>> I'd expect them to be under-replicated for some time (while the DN is down)
>> and I'd expect to see the inverted spike: a small decrease in data amount
>> and then back to the "expected" rate (after all blocks got replicated
>> again). Any ideas?
>>
>> Thank you,
>>
>> Alex Baranau
>> ----
>> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - Hadoop -
>> HBase
>>
>
>
>
> --
> Regards,
> Suresh
>
>
