If end-to-end integrity is a concern, we could let the client generate the checksums and send them to the data node following the block data.
Hairong

-----Original Message-----
From: Doug Cutting [mailto:[EMAIL PROTECTED]
Sent: Tuesday, January 23, 2007 8:26 PM
To: [email protected]
Subject: Re: inline checksums

Hairong Kuang wrote:
> Another option is to create a checksum file per block at the data node
> where the block is placed.

Yes, but then we'd need a separate checksum implementation for intermediate data, and for other distributed filesystems that don't already guarantee end-to-end data integrity.

Also, a checksum per block would not permit checksums on randomly accessed data without re-checksumming the entire block.

Finally, the checksum wouldn't be end-to-end. We really want to checksum data as close to its source as possible, then validate that checksum as close to its use as possible.

Doug
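
To make the random-access point concrete, here is a rough sketch of checksumming in fixed-size chunks rather than once per block, so a reader only re-checks the chunks that overlap the range it actually reads. Everything in it is an assumption for illustration: the class name ChunkedChecksums, the methods checksum and verifyRange, the bytesPerChunk parameter, and the choice of CRC32 are hypothetical and not the existing implementation.

```java
import java.util.zip.CRC32;

// Hypothetical helper for illustration only; not an existing Hadoop class.
final class ChunkedChecksums {

    // Computes one CRC32 per chunk of bytesPerChunk bytes.
    // The last chunk may be shorter than bytesPerChunk.
    static long[] checksum(byte[] data, int bytesPerChunk) {
        int chunks = (data.length + bytesPerChunk - 1) / bytesPerChunk;
        long[] sums = new long[chunks];
        CRC32 crc = new CRC32();
        for (int i = 0; i < chunks; i++) {
            int off = i * bytesPerChunk;
            int len = Math.min(bytesPerChunk, data.length - off);
            crc.reset();
            crc.update(data, off, len);
            sums[i] = crc.getValue();
        }
        return sums;
    }

    // Verifies only the chunks that overlap [offset, offset + length),
    // so a random read does not re-checksum the whole block.
    static boolean verifyRange(byte[] data, long[] sums, int bytesPerChunk,
                               int offset, int length) {
        if (length <= 0) {
            return true; // nothing was read, nothing to verify
        }
        int first = offset / bytesPerChunk;
        int last = (offset + length - 1) / bytesPerChunk;
        CRC32 crc = new CRC32();
        for (int i = first; i <= last; i++) {
            int off = i * bytesPerChunk;
            int len = Math.min(bytesPerChunk, data.length - off);
            crc.reset();
            crc.update(data, off, len);
            if (crc.getValue() != sums[i]) {
                return false; // stored and recomputed checksums disagree
            }
        }
        return true;
    }
}
```

In this sketch the writer would call checksum as close to the data's source as possible and ship the resulting array along with the data; the reader would call verifyRange just before using the bytes it read. With a 512-byte chunk, a 100-byte read at offset 600 touches only the second chunk.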
