One more thing I forgot:
Another useful tool is 'bin/hadoop dfsadmin -metasave <filename>'.
It lists all the blocks that are under-replicated. It is not very
official and is not used much. It needs to be more user friendly
(currently it writes its output to a file on the NameNode rather than to
the console)... I might improve (and/or rename) it as a follow-up to
HADOOP-4103.
Stats like dfs.FSNamesystem.UnderReplicatedBlocks that Brian mentioned
are useful for telling whether there is a problem at all. Commands like
the one above then help find which specific blocks are in these states.
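
For remote monitoring in the meantime, here is a rough sketch (not
something that ships with Hadoop) of pulling the under-replicated count
out of fsck output. It assumes the fsck summary contains a line of the
form " Under-replicated blocks: N (x %)"; the function name
under_replicated_count is made up for illustration.

```shell
# Sketch only: read fsck output on stdin and print the under-replicated
# block count. Assumes a summary line such as:
#   " Under-replicated blocks:      12 (0.5 %)"
# Exits non-zero if no such line is found.
under_replicated_count() {
    awk -F: '
        /^[ \t]*Under-replicated blocks:/ {
            split($2, f, " ")   # f[1] is the count, f[2] the "(x" part
            print f[1]
            found = 1
        }
        END { if (!found) exit 1 }
    '
}

# Hypothetical usage on a live cluster, from the Hadoop install directory:
#   bin/hadoop fsck / | under_replicated_count
```

A monitoring script could compare the printed number against a threshold
and alert when it stays high for a while, since a briefly non-zero count
is normal while the NameNode is re-replicating.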
Raghu.
Bill Au wrote:
I am in the process of setting up remote monitoring of my Hadoop cluster. It
seems to me that the replication status can only be obtained from the
command line by the fsck command. Has anyone thought about adding
replication status to the NameNode web UI in dfshealth.jsp? Or is that
something that I really shouldn't worry about since Hadoop will fix things
all by itself?
Bill