Hi all,

I remember there is a parameter that turns this behavior off. I mean one
that stops a tasktracker from keeping blocks from other datanodes after a
MapReduce job has finished.

I ran into a problem while using hadoop-0.21.0.

First of all, I balanced the cluster by the number of blocks on each
datanode. For example, under "/user/test/" I have 100 blocks of data with a
replication factor of 2, so there are 200 block replicas in total under
"/user/test". I have 10 datanodes, so I arranged for every datanode to hold
20 of those replicas.
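
To make the arithmetic above concrete, here is a minimal Python sketch. The
function names and the sample per-node counts are hypothetical and chosen
only for illustration; nothing here is part of Hadoop itself.

```python
def expected_replicas_per_node(num_blocks, replication, num_datanodes):
    """Replicas each datanode holds if all replicas are spread evenly."""
    total_replicas = num_blocks * replication
    if total_replicas % num_datanodes != 0:
        raise ValueError("replicas cannot be spread evenly across datanodes")
    return total_replicas // num_datanodes


def nodes_off_target(counts, target):
    """Map each node whose count differs from target to its deviation."""
    return {node: c - target for node, c in counts.items() if c != target}


# 100 blocks, replication 2, 10 datanodes
target = expected_replicas_per_node(100, 2, 10)
print(target)  # 20

# Hypothetical per-node counts, only to show how a drift would be reported
sample_counts = {"dn1": 21, "dn2": 19, "dn3": 20}
print(nodes_off_target(sample_counts, target))
```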

However, after about 300 MapReduce jobs had finished, I found that the
number of blocks on the datanodes had changed. It is no longer 20 on every
datanode: some have 21 and some have 19. I had turned off the Hadoop balancer.

What could have caused this? Any suggestions would be appreciated!

Best wishes!

Chen
