Nigel Daley wrote:
As you realized below, the test was using raw methods before HADOOP-928. I don't understand your reference to "undocumented" and "unsupported", but I'm not sure it matters.

The 'raw' methods were only intended to be used by FileSystem implementations.

One of the design goals of the test is to remove the effects of DataNodes as much as possible, since this is a NameNode benchmark. That's why we used the raw methods (and therefore no CRCs). We run it with 1-byte files, 1-byte blocks, and a replication factor of 1, all chosen to maximize the load on the NameNode and minimize the effects of the DataNodes.

In this case, the current checksum implementation should simply double the number of both NameNode and DataNode calls relative to raw calls. So it should still be a fine benchmark; you just need to double the measured rates to make them comparable with the earlier raw-method numbers. The new checksum implementation should be nearly as fast as raw calls were.
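
As a rough illustration of that adjustment, here is a minimal sketch. The class and method names are hypothetical (nothing like this exists in Hadoop), the factor of 2 is only the estimate given above rather than a measured value, and the sample rate is made up.

```java
// Hypothetical helper for illustration only; not part of Hadoop.
public class RateNormalizer {
    // Assumption from the estimate above: each checksummed file operation
    // issues about twice as many NameNode calls as the same raw operation.
    static final double CALLS_PER_RAW_CALL = 2.0;

    // Scale a rate measured with checksummed files so it can be compared
    // with numbers collected earlier using the raw methods.
    static double comparableRate(double checksummedOpsPerSec) {
        return checksummedOpsPerSec * CALLS_PER_RAW_CALL;
    }

    public static void main(String[] args) {
        double measured = 500.0; // made-up figure, in file operations per second
        System.out.println("comparable rate: " + comparableRate(measured));
    }
}
```

If the real overhead turns out to differ from 2x, only the constant would need to change.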

Doug
