Hello Adar Dembo,
I'd like you to do a code review. Please visit
http://gerrit.cloudera.org:8080/2815
to review the following change.
Change subject: KUDU-1377 (part 2): Add version 2 of protobuf container file
......................................................................
KUDU-1377 (part 2): Add version 2 of protobuf container file
The protobuf container file format V1 was missing some features that
made particular corruption cases impossible to differentiate:
1. Since the length did not have its own checksum, a corrupted length
field (huge number) can look like a truncated file.
2. The file header itself, including the version field, did not have a
checksum, so a corrupted version field could go undetected and
compromise version-specific logic.
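
To illustrate the first case, here is a minimal, self-contained sketch of
how a checksum on the length field lets a reader tell a corrupted length
apart from a truncated file. It is not the code in this patch. The record
layout ([length][length checksum][payload][payload checksum]), the use of
plain CRC-32, and every name below (RecordState, EncodeRecord,
ClassifyRecord) are assumptions made for the example only.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical outcome of trying to read one record at a given offset.
enum class RecordState { kOk, kCleanEof, kIncomplete, kCorruption };

// Bitwise CRC-32 (reflected polynomial 0xEDB88320). A stand-in checksum
// for this sketch; the real format may use a different one.
inline uint32_t Crc32(const std::string& buf, size_t off, size_t len) {
  uint32_t crc = 0xFFFFFFFFu;
  for (size_t i = 0; i < len; ++i) {
    crc ^= static_cast<unsigned char>(buf[off + i]);
    for (int bit = 0; bit < 8; ++bit) {
      crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
  }
  return ~crc;
}

// Append a 32-bit value in little-endian byte order.
inline void PutFixed32(std::string* buf, uint32_t v) {
  for (int i = 0; i < 4; ++i) {
    buf->push_back(static_cast<char>((v >> (8 * i)) & 0xFF));
  }
}

// Read a little-endian 32-bit value; caller guarantees 4 bytes are present.
inline uint32_t GetFixed32(const std::string& buf, size_t off) {
  uint32_t v = 0;
  for (int i = 0; i < 4; ++i) {
    v |= static_cast<uint32_t>(static_cast<unsigned char>(buf[off + i])) << (8 * i);
  }
  return v;
}

// Assumed layout: [len:4][crc(len):4][payload:len][crc(payload):4].
inline std::string EncodeRecord(const std::string& payload) {
  std::string out;
  PutFixed32(&out, static_cast<uint32_t>(payload.size()));
  const uint32_t len_crc = Crc32(out, 0, 4);
  PutFixed32(&out, len_crc);
  out += payload;
  PutFixed32(&out, Crc32(payload, 0, payload.size()));
  return out;
}

inline RecordState ClassifyRecord(const std::string& file, size_t off) {
  if (off >= file.size()) return RecordState::kCleanEof;
  const size_t remaining = file.size() - off;
  // Not even a full length + length checksum: the tail was cut short.
  if (remaining < 8) return RecordState::kIncomplete;
  const uint32_t len = GetFixed32(file, off);
  // Verify the length BEFORE trusting it. Without this check, a corrupted
  // (huge) length would be indistinguishable from a truncated file.
  if (Crc32(file, off, 4) != GetFixed32(file, off + 4)) {
    return RecordState::kCorruption;
  }
  // The length is trustworthy, so missing bytes really mean truncation.
  if (remaining - 8 < static_cast<uint64_t>(len) + 4) {
    return RecordState::kIncomplete;
  }
  if (Crc32(file, off + 8, len) != GetFixed32(file, off + 8 + len)) {
    return RecordState::kCorruption;
  }
  return RecordState::kOk;
}
```

In this sketch, overwriting one byte of the length yields kCorruption,
while chopping bytes off the end of an otherwise valid record yields
kIncomplete. Without the length checksum, both would look like a record
that runs past the end of the file.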
Version 2 of the protobuf container file format adds checksums for both
the record length and the file header. While this patch changes the
default version of protobuf container files to version 2, support for
writing the original V1 format has not been dropped. The API for creating
new V1 format files is restricted to tests, but an existing V1 format file
may still be opened, read, and appended to.
Changes in this patch:
* Document and implement the version 2 file format.
* Return Status::Incomplete for partial record read at end of file
(for V2 format only).
* This allows the protobuf container file API to express the
difference between clean EOF and partial-record EOF.
* Rearrange code necessary for reading the file header and version in
order to share it between the Readable* and Writable* classes.
* Change from WritableFile to RWFile in the pb writer to allow reading
the version information from the file header before appending
additional records to the file.
* Add Reopen() method to the writer so it can detect its format version.
Callers that wish to append to an existing PB container file must now
call Reopen() first. Callers creating new files still call Init().
* Rename the reader Init() method to Open() since Init() in the writer
is for creating new files and this makes the classes more symmetric.
* Parameterize the pb_util unit tests to test writing and reading both
version 1 and version 2 of the file format.
* Add tests for partial record truncation and continuing to append when
using the PB container format in an append-only, log-style use case.
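
The append-only, log-style case in the last bullet can be sketched as
follows. This is only an illustration under stated assumptions, not the
pb_util API: the file is modeled as an in-memory string, records are
simply [length][payload] with checksums omitted for brevity, and the
names TailState, ScanLog and RecoverAndAppend are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical: how the scan of the file ended.
enum class TailState { kCleanEof, kPartialRecord };

struct ScanSummary {
  size_t num_records;       // complete records seen
  size_t last_good_offset;  // offset just past the last complete record
  TailState tail;
};

// Read a little-endian 32-bit length at 'off'.
inline uint32_t DecodeLen(const std::string& buf, size_t off) {
  assert(off + 4 <= buf.size());
  uint32_t v = 0;
  for (int i = 0; i < 4; ++i) {
    v |= static_cast<uint32_t>(static_cast<unsigned char>(buf[off + i])) << (8 * i);
  }
  return v;
}

// Append one record as [len:4][payload].
inline void AppendRecord(std::string* file, const std::string& payload) {
  const uint32_t len = static_cast<uint32_t>(payload.size());
  for (int i = 0; i < 4; ++i) {
    file->push_back(static_cast<char>((len >> (8 * i)) & 0xFF));
  }
  *file += payload;
}

// Walk the file record by record and report whether it ends cleanly or
// in the middle of a record.
inline ScanSummary ScanLog(const std::string& file) {
  ScanSummary s{0, 0, TailState::kCleanEof};
  size_t off = 0;
  while (off < file.size()) {
    const size_t remaining = file.size() - off;
    if (remaining < 4) { s.tail = TailState::kPartialRecord; break; }
    const uint32_t len = DecodeLen(file, off);
    if (remaining - 4 < len) { s.tail = TailState::kPartialRecord; break; }
    off += 4 + static_cast<size_t>(len);
    ++s.num_records;
    s.last_good_offset = off;
  }
  return s;
}

// Drop a partial trailing record, if any, then keep appending.
// Returns the number of complete records after the append.
inline size_t RecoverAndAppend(std::string* file, const std::string& payload) {
  const ScanSummary s = ScanLog(*file);
  if (s.tail == TailState::kPartialRecord) {
    file->resize(s.last_good_offset);
  }
  AppendRecord(file, payload);
  return s.num_records + 1;
}
```

The point of the sketch is the distinction in ScanLog: a reader that only
reported "end of file" could not tell the caller whether truncating back
to the last good offset is needed before appending.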
Change-Id: I239c7b99a55a74a6a658ff418830c1f9be13eaa6
---
M src/kudu/fs/log_block_manager.cc
M src/kudu/tools/pbc-dump.cc
M src/kudu/util/pb_util-test.cc
M src/kudu/util/pb_util.cc
M src/kudu/util/pb_util.h
5 files changed, 748 insertions(+), 282 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/15/2815/1
--
Gerrit-MessageType: newchange
Gerrit-Change-Id: I239c7b99a55a74a6a658ff418830c1f9be13eaa6
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>