Re: [Nutch-dev] Re: NameNode scalibility

2005-03-08 Thread Michael Cafarella
Angel, Much of what you're seeing is part of the replication problem.
1) The "Replicated " message is logged when a replication succeeds. It's not surprising that you see a lot of them.
2) The "Block XX is valid, and cannot be written to" message happens when one node tries to replicate

Re: [Nutch-dev] Re: NameNode scalibility

2005-03-08 Thread Angel Faus
Hi, Great. Thanks for the tips. I've tried the following startup sequences:
* Start NameNode. Wait until CPU goes to 0. Wait 2 extra minutes. Start all DataNodes.
* Start NameNode. Wait until CPU goes to 0. Wait 2 extra minutes. Start each DataNode with a 10-minute pause between them.
* Star
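
The first two startup sequences in this message differ only in the pause between DataNode starts, so they can be written down as one timing plan. The following is a rough sketch, not anything from Nutch or NDFS: the function name, the node names, and the `namenode_idle_seconds` parameter (how long the operator observed the NameNode taking to drop to 0 CPU) are all hypothetical. It only computes when each step would run and does not start any process.

```python
from typing import List, Tuple


def startup_schedule(
    datanodes: List[str],
    namenode_idle_seconds: int,
    settle_seconds: int = 120,   # the "2 extra minutes" from the message
    pause_seconds: int = 600,    # the "10 minutes pause"; 0 means start all at once
) -> List[Tuple[int, str]]:
    """Return (offset_in_seconds, step) pairs for a staggered startup.

    Offsets are measured from the moment the NameNode is started.
    namenode_idle_seconds is an assumption: the observed time for the
    NameNode's CPU usage to fall to 0.
    """
    schedule = [(0, "start NameNode")]
    # The first DataNode starts after the NameNode is idle plus the settle time.
    offset = namenode_idle_seconds + settle_seconds
    for name in datanodes:
        schedule.append((offset, f"start DataNode {name}"))
        # Each later DataNode waits a further pause_seconds.
        offset += pause_seconds
    return schedule


if __name__ == "__main__":
    for when, step in startup_schedule(["dn1", "dn2", "dn3"], namenode_idle_seconds=300):
        print(f"t+{when:>5}s  {step}")
```

Passing `pause_seconds=0` gives the first sequence (all DataNodes started together); the default of 600 gives the second.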

Re: [Nutch-dev] Re: NameNode scalibility

2005-03-07 Thread michael_cafarella
Hi, This is very interesting, thanks, Angel. Doug's right about the datanode startup and replication problem. I believe there's a simple fix for the problem you describe when starting up all the datanodes. He's also probably right about the namenode startup. A namenode logs all its

[Nutch-dev] Re: NameNode scalibility

2005-03-07 Thread Doug Cutting
Thanks for the report! 400,000 is a larger number of files than I have yet tested NDFS with, and it looks like there are some issues caused by this. Mike has built the largest NDFS systems that I know of (several terabytes spread over around 20 machines) but these probably had less than a thous