On Dec 7, 2010, at 9:26 AM, Doug Cutting wrote:

On 12/07/2010 08:12 AM, Arun C Murthy wrote:
Blocking extensions to SequenceFile is unreasonable, as has been noted by
several folks; there is no *technical* reason to do that.

The change to SequenceFile is incompatible with older versions of Hadoop. It changes the file's version number so that older versions will not be able to read data written by newer versions. This is a technical issue.

The new code reads both the new and the old versions of SequenceFile seamlessly by auto-detecting the version. The old code fails with an explicit message saying that it can't read this version. This is the only mechanism available when upgrading a file format with a single version number, and it is the mechanism that we've used six times in the past.
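
To make the mechanism concrete, here is a minimal sketch of single-version-number detection. It is not Hadoop's actual code: the names (header_bytes, read_header, VersionMismatch), the version numbers 1 and 2, and the extra "flags" byte are all hypothetical. The only detail borrowed from SequenceFile is the idea of a short magic string followed by one version byte.

```python
import io

MAGIC = b"SEQ"  # magic bytes, followed by a single version byte


class VersionMismatch(Exception):
    """Raised when a file is newer than the reader understands."""


def header_bytes(version, flags=0):
    # Hypothetical layout: version 2 adds one extra "flags" byte.
    data = MAGIC + bytes([version])
    if version >= 2:
        data += bytes([flags])
    return data


def read_header(stream, max_supported):
    head = stream.read(4)
    if len(head) < 4 or head[:3] != MAGIC:
        raise ValueError("not a SEQ file")
    version = head[3]
    # An older reader cannot guess at a newer layout, so it fails
    # with an explicit message instead of misreading the data.
    if version > max_supported:
        raise VersionMismatch(
            f"file is version {version}, reader supports up to {max_supported}"
        )
    fields = {"version": version}
    # A newer reader branches on the detected version, so old files
    # are still read without any action from the caller.
    if version >= 2:
        fields["flags"] = stream.read(1)[0]
    return fields


if __name__ == "__main__":
    new_file = header_bytes(2, flags=5)
    print(read_header(io.BytesIO(new_file), max_supported=2))
    try:
        read_header(io.BytesIO(new_file), max_supported=1)
    except VersionMismatch as err:
        print("old reader:", err)
```

A reader built with max_supported=2 accepts both layouts, while one built with max_supported=1 rejects the newer file up front. With only one version number in the header, that refusal is the only signal an old reader can give.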

If we'd used ProtocolBuffers for the SequenceFile header, we'd have more options for backwards compatibility, but we didn't.

-- Owen
