I agree with Jay B. Checksumming is usually the culprit for high CPU usage on clients and datanodes. Also, with a 4-byte checksum for every 512 bytes of data, a 64 MB block carries 512 KB of checksum data, i.e. 128 ext3 blocks (assuming 4 KB ext3 blocks). Changing this to generate one ext3 block of checksums per DFS block, which works out to one checksum per 64 KB of data, would speed up reads and writes without any loss of reliability.
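To make the arithmetic above easy to check, here is a rough sketch in Python. The function name `checksum_overhead` and its parameters are made up for illustration and are not part of any Hadoop API; the 4 KB ext3 block size is an assumption, implied by 512 KB coming out to 128 blocks.

```python
MB = 1024 * 1024

def checksum_overhead(block_size, bytes_per_checksum,
                      checksum_size=4, fs_block_size=4096):
    """Return (checksum bytes, filesystem blocks) needed for one DFS block.

    fs_block_size=4096 is an assumed ext3 block size, not a measured value.
    """
    # Number of checksummed chunks in the DFS block, rounded up.
    chunks = -(-block_size // bytes_per_checksum)
    checksum_bytes = chunks * checksum_size
    # Number of filesystem blocks needed to hold the checksums, rounded up.
    fs_blocks = -(-checksum_bytes // fs_block_size)
    return checksum_bytes, fs_blocks

# Current layout: 4 bytes of checksum per 512 bytes of data.
print(checksum_overhead(64 * MB, 512))        # (524288, 128)

# Proposed layout: one checksum per 64 KB, so all checksums fit in one block.
print(checksum_overhead(64 * MB, 64 * 1024))  # (4096, 1)
```

The second call shows where the 64 KB figure comes from: one 4 KB ext3 block holds 1024 four-byte checksums, and 64 MB / 1024 = 64 KB of data per checksum.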
- milind

---
Milind Bhandarkar (mbhandar...@linkedin.com) (650-776-3236)