> Hi. I think you should file a jira on this. Most likely this is what is
> happening:
Will do - this goes to the DFS section, correct?

> * Two out of 3 dns cannot take any more blocks.
> * While picking nodes for a new block, the NN mostly skips the third dn
>   as well, since '# active writes' on it is larger than '2 * avg'.
> * Even if only one other block is being written on the 3rd, that is
>   still greater than (2 * 1/3).

Frankly, I'm not familiar enough with Hadoop's inner workings to understand this completely, but from what I can digest, the NN doesn't like the 3rd DN because there are too many blocks being written on it compared to the other servers? (I tried to sketch my reading of this check at the end of this message.)

> To test this, if you write just one block to an idle cluster it should
> succeed.

What exactly is an "idle cluster"? One that nothing is being written to (including the 3rd DN)?

> Writing from the client on the 3rd dn succeeds since the local node is
> always favored.

Makes sense.

> This particular problem is not that severe on a large cluster, but HDFS
> should do the sensible thing.

Yes, I agree that this is a non-standard situation, but IMHO the best course of action would be to write anyway and throw a warning. There is already one that appears when there is not enough space for replication, and it explains the matter quite well. A similar one here would be great.
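
For my own understanding, here is a minimal, self-contained sketch of the placement heuristic as I read it from your bullets. Everything in it (the DataNode class, the activeWrites/hasSpace fields, chooseTargets) is a hypothetical illustration of the described check, not Hadoop's actual NameNode code:

import java.util.ArrayList;
import java.util.List;

/** Sketch of the load check described above (not Hadoop's real code). */
class BlockPlacementSketch {

    /** Hypothetical stand-in for a datanode's state as the NN sees it. */
    static class DataNode {
        final String name;
        final int activeWrites;   // blocks currently being written to this node
        final boolean hasSpace;   // whether the node can accept another block

        DataNode(String name, int activeWrites, boolean hasSpace) {
            this.name = name;
            this.activeWrites = activeWrites;
            this.hasSpace = hasSpace;
        }
    }

    /** Pick eligible targets for a new block, skipping overloaded nodes. */
    static List<DataNode> chooseTargets(List<DataNode> nodes) {
        // Cluster-wide average of active writes per node.
        double avg = nodes.stream()
                          .mapToInt(n -> n.activeWrites)
                          .average()
                          .orElse(0);

        List<DataNode> targets = new ArrayList<>();
        for (DataNode n : nodes) {
            if (!n.hasSpace) {
                continue;                 // full nodes are out regardless
            }
            if (n.activeWrites > 2 * avg) {
                continue;                 // "too busy" relative to the cluster
            }
            targets.add(n);
        }
        return targets;
    }

    public static void main(String[] args) {
        // The 3-node scenario from this thread: two full nodes, one node
        // carrying all the write traffic. avg = (0 + 0 + 1) / 3 = 1/3,
        // so even a single active write on dn3 exceeds 2 * avg = 2/3.
        List<DataNode> cluster = List.of(
                new DataNode("dn1", 0, false),
                new DataNode("dn2", 0, false),
                new DataNode("dn3", 1, true));

        System.out.println("eligible targets: " + chooseTargets(cluster).size()); // 0
    }
}

If I got it right, running this prints zero eligible targets: with two full nodes the average drops to 1/3, so even a single active write on dn3 exceeds 2 * avg, and the NN rejects the only node that still has space. Please correct me if that misstates the check.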