Todd Lipcon has posted comments on this change. Change subject: WIP: KUDU-463. Add checksumming to cfile ......................................................................
Patch Set 15: (13 comments) http://gerrit.cloudera.org:8080/#/c/6630/15/docs/design-docs/cfile.md File docs/design-docs/cfile.md: Line 31: <header length>: 32-bit unsigned integer length delimiter specify whether this includes the checksum or just the following PB Line 33: <checksum>: An optional Crc32 checksum of the entire header does this include the magic? Line 43: <checksum>: An optional Crc32 checksum of the entire footer same - including the footer, magic, and pb len? Line 211: When checksums for the header, data, and footer are written in the CFile nit: add comma at end of line http://gerrit.cloudera.org:8080/#/c/6630/15/src/kudu/cfile/cfile-test.cc File src/kudu/cfile/cfile-test.cc: Line 223: opts.storage_attributes.compression = compression; oh, that's interesting, we forgot to actually use the compression here? http://gerrit.cloudera.org:8080/#/c/6630/15/src/kudu/cfile/cfile_reader.cc File src/kudu/cfile/cfile_reader.cc: Line 141: // Parse Footer first to find unsupported features. does this mean that an old server will have problems with checksummed headers, because it will fail while reading the header rather than seeing the incompatible feature flag in the footer first? Line 146: if (PREDICT_FALSE(footer_->incompatible_features() > IncompatibleFeatures::ALL)) { '>' strikes me as strange here vs doing & ~IncompatibleFeatures::ALL PS15, Line 150: ALL instead of ALL, maybe a better name would be 'SUPPORTED'? Line 239: if (footer_size >= file_size_ - kMagicAndLengthSize) { do we want to have some kind of max size here? am afraid a flipped bit in a large block would cause a huge stack alloc below. alernatively we could use faststring for the buffer below, which is still stack-allocated for small sizes but avoids blowing out the stack in the case of a corruption PS15, Line 416: uint32_t checksum_size = sizeof(uint32_t); you have this in a few places. Maybe better to just have a global constant kChecksumSize? http://gerrit.cloudera.org:8080/#/c/6630/15/src/kudu/cfile/cfile_writer.cc File src/kudu/cfile/cfile_writer.cc: PS15, Line 230: = maybe |= here so it doesn't need to change when we add some other feature PS15, Line 500: checksum data checksum typo? http://gerrit.cloudera.org:8080/#/c/6630/15/src/kudu/util/crc-test.cc File src/kudu/util/crc-test.cc: Line 50: const uint64_t expected_crc = 0xa9421b7; // Known value from crcutil usage test program. nit: kExpectedCrc -- To view, visit http://gerrit.cloudera.org:8080/6630 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6756834cd7f27af258797a3654a95244abeb0976 Gerrit-PatchSet: 15 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Grant Henke <granthe...@gmail.com> Gerrit-Reviewer: Adar Dembo <a...@cloudera.com> Gerrit-Reviewer: Grant Henke <granthe...@gmail.com> Gerrit-Reviewer: Jean-Daniel Cryans <jdcry...@apache.org> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy <mpe...@apache.org> Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon <t...@apache.org> Gerrit-HasComments: Yes