Angel,
Much of what you're seeing is part of the replication problem.
1) The "Replicated " message is logged whenever a replication
succeeds, so it's not surprising that you see a lot of them.
2) The "Block XX is valid, and cannot be written to" message appears
when one node tries to replicate
Hi,
Great. Thanks for the tips.
I've tried the following startup sequences:
* Start NameNode. Wait until CPU goes to 0. Wait 2 extra minutes.
  Start all DataNodes.
* Start NameNode. Wait until CPU goes to 0. Wait 2 extra minutes.
  Start each DataNode with a 10-minute pause between them.
* Star
Hi,
This is very interesting, thanks, Angel.
Doug's right about the datanode startup and replication problem. I believe
there's a simple fix for the problem you describe when starting up all the
datanodes.
He's also probably right about the namenode startup. A namenode logs all its
Thanks for the report!
400,000 is a larger number of files than I have yet tested NDFS with,
and it looks like there are some issues caused by this. Mike has built
the largest NDFS systems that I know of (several terabytes spread over
around 20 machines) but these probably had fewer than a thous