On 2011-12-07 11:43, Erik Forsberg wrote:
Hi!

I'm facing the problem where datanodes are marked as down due to them
being too slow in doing block reports, which in turn is due to too many
blocks per node. I.e. https://issues.apache.org/jira/browse/HADOOP-4584,
but I can't easily upgrade to 0.21.

So I came up with a possible workaround - run multiple datanode
instances on each physical node, each handling a subset of the disks on
that node. Not sure it will work, but could be worth a try.

So I set up a second datanode on one of my nodes, running on a different
set of ports, with the two datanode instances using half of the disks
each.
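
For reference, here's roughly what the second instance's hdfs-site.xml
looks like. The paths and port numbers below are just examples (the
defaults plus one); the instance also gets its own pid and log
directories via hadoop-env.sh:

    <property>
      <name>dfs.data.dir</name>
      <value>/data/disk4/dfs,/data/disk5/dfs,/data/disk6/dfs</value>
    </property>
    <property>
      <name>dfs.datanode.address</name>
      <value>0.0.0.0:50011</value>
    </property>
    <property>
      <name>dfs.datanode.http.address</name>
      <value>0.0.0.0:50076</value>
    </property>
    <property>
      <name>dfs.datanode.ipc.address</name>
      <value>0.0.0.0:50021</value>
    </property>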

However, when starting up this configuration, I get the below exception
(UnregisteredDatanodeException) in the namenode log, and the datanode
then shuts down after reporting the same.

I found a way:

1) Configure second datanode with a set of fresh empty directories.
2) Start second datanode, let it register with namenode.
3) Shut down both datanodes, then move the blk* files and subdir* directories from the first datanode's data dirs into the second datanode's fresh data dirs (rough shell sketch below).
4) Start both datanodes.
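
In shell terms, step 3 is roughly the following (paths invented for the
example; say the first instance owned /data/disk[1-6]/dfs and the second
was freshly formatted with /data/disk[4-6]/dfs2):

    # With this many blocks a plain "mv blk_*" can overflow the shell's
    # argument list, so let find do the moving.
    for d in 4 5 6; do
      find /data/disk$d/dfs/current -maxdepth 1 \
        \( -name 'blk_*' -o -name 'subdir*' \) \
        -exec mv {} /data/disk$d/dfs2/current/ \;
    done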

This seems to work as intended. However, after some thinking I started to worry about replication: HDFS now considers the two datanode instances on the same host to be two different hosts, which may cause block placement to put two replicas of the same block on the same physical machine.

It's probably not going to happen very often given that there's some randomness involved. And in my case there's always a third copy on another rack.

Still, it's less than optimal. Is there any way to fool HDFS into always placing all replicas on different physical hosts in this rather messed-up configuration?
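
The only knob I know of is the rack awareness script
(topology.script.file.name), but as far as I can tell the namenode only
hands it IPs/hostnames, never ports, so the two instances on one box
resolve to the same location anyway. Just to illustrate, a script that
gives every physical host its own "rack":

    #!/bin/sh
    # Print one rack path per IP/hostname argument.
    # Caveat 1: both instances on a host share an IP, so this can't
    # tell them apart.
    # Caveat 2: the default placement policy puts replicas 2 and 3 in
    # the same remote rack, which with one-host racks means the same
    # physical machine again.
    while [ $# -gt 0 ]; do
      echo "/per-host-rack/$1"
      shift
    done

So that doesn't really help either.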

Thanks,
\EF
