Todd Lipcon has submitted this change and it was merged.

Change subject: KUDU-1600 (part 1): bump to CFile version 2
......................................................................


KUDU-1600 (part 1): bump to CFile version 2

This is preparatory yak-shaving work for KUDU-1600, an effort to prevent
double-compression of inherently-compressed encodings such as
bitshuffle. In order to enable or disable compression on a per-block
basis, we need to change the header of the compressed blocks to indicate
whether that block was compressed. Doing so, however, is obviously an
incompatible change that earlier versions of Kudu would be confused by.

This is not itself a blocker: we haven't guaranteed yet that we will
always support downgrade between successive versions of Kudu. However,
if someone _does_ downgrade, we would prefer not to crash in confusing
ways or silently return corrupt data. Instead, we would like to give an
error message that clearly traces back to an incompatible file format.

At first glance, it would seem that the route to make such an
incompatible change would be to bump the 'major_version' field present
in the CFileHeaderPB. However, it turns out, despite this field existing
since the very first ever commit to Kudu, we never implemented any code
that actually _reads_ the field. So, bumping the major version would not
actually prevent an incompatible (earlier) Kudu server from reading a
file that it can't understand. Woops!

So, in order to make the new CFiles incompatible with the old, the only
solution is to actually bump the CFile magic string itself. This commit
changes it from 'kuducfil' to 'kuducfl2'. An incompatible server trying
to read such a file will now result in a "bad magic" error message. It's
still not 100% obvious to the average user, but it's better than a weird
error about mismatched compression lengths or a crash.

While bumping the magic string solves the immediate issue at hand, this
is also a good opportunity to fix the structural problem and add a
better mechanism to make such changes in the future.

Rather than implement the original plan of major/minor version, this
patch takes a "feature flags" approach modeled after the 'ext' family of
file systems. The advantages of feature flags over a linear 'version
number' scheme are:

- simplifies non-linear backporting of features to minor releases. For
  example, if Kudu 2.0 adds feature X, and 2.1 adds feature Y, this
  makes it possible to backport Y without X to Kudu 1.8.
- allows granular enablement of new features. For example, imagine that
  we introduce an experimental feature (eg an optimization enabled by a
  flag) in Kudu 1.3, and a user upgrades to 1.3 without enabling that
  flag. We would then avoid writing the flag into any cfiles, and we
  would allow compatible downgrade back to 1.2.
- allows distinguishing behavior between incompatible and compatible
  changes

I manually validated that, if I wrote data with a new server, and tried
to read it with an old one, it gave me an error with "bad magic" as
expected.

Change-Id: I13999de3d406fb184cea3e6e8d52ffd80b8d8ee3
Reviewed-on: http://gerrit.cloudera.org:8080/5678
Reviewed-by: Alexey Serbin <aser...@cloudera.com>
Tested-by: Kudu Jenkins
---
M docs/design-docs/cfile.md
M src/kudu/cfile/cfile.proto
M src/kudu/cfile/cfile_reader.cc
M src/kudu/cfile/cfile_reader.h
M src/kudu/cfile/cfile_writer.cc
M src/kudu/cfile/cfile_writer.h
M src/kudu/tools/kudu-tool-test.cc
7 files changed, 81 insertions(+), 24 deletions(-)

Approvals:
  Alexey Serbin: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/5678
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I13999de3d406fb184cea3e6e8d52ffd80b8d8ee3
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <t...@apache.org>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Jean-Daniel Cryans <jdcry...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>

Reply via email to