Re: Block reports: memory vs. file system, and Dividing offerService into 2 threads

2008-05-01 Thread Cagdas Gerede
As far as I understand, the current focus is on how to reduce the namenode's CPU time for processing block reports from a large number of datanodes. Aren't we missing another issue? Doesn't the way a block report is computed delay the master's startup time? I have to make sure the master is up as quickly as possible

RE: Block reports: memory vs. file system, and Dividing offerService into 2 threads

2008-04-30 Thread dhruba Borthakur
You bring up a good point. Creating and processing block reports does take a lot of resources, and it affects DFS scalability and performance to some extent. Here are some more details: http://issues.apache.org/jira/browse/HADOOP-1079 There is one thread in the Datanode that sends block

RE: Block reports: memory vs. file system, and Dividing offerService into 2 threads

2008-04-30 Thread dhruba Borthakur
dhruba Borthakur wrote: My current thinking is that block report processing should compare the blkxxx files on disk with the data structure in the Datanode memory. If and only if there is some discrepancy between these two
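
To make the comparison described above concrete, here is a minimal Java sketch. It is not the Datanode's actual code: the class `BlockReportDiff`, its methods, and the assumed file naming (`blk_<id>` for block files, with `.meta` companions skipped) are all hypothetical and only illustrate diffing the on-disk block files against an in-memory set of block IDs. What should happen once a discrepancy is found is left out, since the message above is cut off at that point.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of Hadoop: compares block files seen on disk
// with the block IDs the Datanode believes it holds in memory.
public class BlockReportDiff {

    /** Outcome of one comparison pass. */
    public static final class Diff {
        public final Set<Long> onDiskOnly;    // files present, but unknown in memory
        public final Set<Long> inMemoryOnly;  // known in memory, but file is missing

        Diff(Set<Long> onDiskOnly, Set<Long> inMemoryOnly) {
            this.onDiskOnly = onDiskOnly;
            this.inMemoryOnly = inMemoryOnly;
        }

        /** True when the two views disagree in either direction. */
        public boolean hasDiscrepancy() {
            return !onDiskOnly.isEmpty() || !inMemoryOnly.isEmpty();
        }
    }

    /**
     * Assumed naming: a block file is "blk_" followed by a (possibly negative)
     * long ID. Returns null for anything else, including ".meta" files.
     */
    public static Long parseBlockId(String fileName) {
        if (!fileName.startsWith("blk_") || fileName.endsWith(".meta")) {
            return null;
        }
        try {
            return Long.parseLong(fileName.substring("blk_".length()));
        } catch (NumberFormatException e) {
            return null;
        }
    }

    /** Diffs the file names found on disk against the in-memory block IDs. */
    public static Diff compare(Iterable<String> fileNames, Set<Long> inMemory) {
        Set<Long> onDisk = new HashSet<Long>();
        for (String name : fileNames) {
            Long id = parseBlockId(name);
            if (id != null) {
                onDisk.add(id);
            }
        }
        Set<Long> onDiskOnly = new TreeSet<Long>(onDisk);
        onDiskOnly.removeAll(inMemory);
        Set<Long> inMemoryOnly = new TreeSet<Long>(inMemory);
        inMemoryOnly.removeAll(onDisk);
        return new Diff(onDiskOnly, inMemoryOnly);
    }
}
```

In a real Datanode the file names would come from listing the data directories, for example via `java.io.File.list()`. The sketch takes them as a plain collection so the comparison can be exercised without touching the disk.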