Some additional details...

A few years ago, we moved from Protocol Buffers 2.4.1 to 2.5.0.  There
were some challenges with that upgrade, because 2.5.0 was not
backwards-compatible with 2.4.1.  We needed to coordinate carefully with
projects downstream of Hadoop that receive our protobuf classes through
a transitive dependency.  Here are a few issues with more background:

https://issues.apache.org/jira/browse/HADOOP-9845

https://issues.apache.org/jira/browse/HBASE-8165

https://issues.apache.org/jira/browse/HIVE-5112

It was important to complete this upgrade before Hadoop 2.x came out of
beta.  After that, we committed to a policy of backwards-compatibility
within the 2.x release line.  I can't find a statement about whether or
not Protocol Buffers 2.6.1 is backwards-compatible with 2.5.0 (both at
compile time and on the wire).  Do you know the answer?  If it's
backwards-incompatible, then we wouldn't be able to do this upgrade within
Hadoop 2.x, though we could consider it for 3.x (trunk).

In general, we upgrade dependencies when a new release offers a compelling
benefit, not solely to keep up with the latest.  In the case of 2.5.0,
there was a performance benefit.  Looking at the release notes for 2.6.0
and 2.6.1, I don't see anything particularly compelling.  (That's just my
opinion though, and others might disagree.)

https://github.com/google/protobuf/blob/master/CHANGES.txt

BTW, if anyone is curious, it's possible to try a custom build right now
linked against 2.6.1.  You'd pass -Dprotobuf.version=2.6.1 and
-Dprotoc.path=<path to protoc 2.6.1 binary> when you run the mvn command.
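
As a rough sketch of what that invocation might look like: only the two -D
properties come from the description above.  The protoc location and the
goals "clean install -DskipTests" are my assumptions, so adjust them for your
environment.  The snippet only prints the command so you can review it before
running it from the root of the Hadoop source tree.

```shell
# Assumed location of a locally built protoc 2.6.1 binary (hypothetical path).
PROTOC_PATH="${PROTOC_PATH:-/opt/protobuf-2.6.1/bin/protoc}"
PROTOBUF_VERSION="${PROTOBUF_VERSION:-2.6.1}"

# Compose the Maven command line; the goals and -DskipTests are assumptions,
# the two -D properties are the ones named in the text above.
build_cmd() {
  printf 'mvn clean install -DskipTests -Dprotobuf.version=%s -Dprotoc.path=%s\n' \
    "$PROTOBUF_VERSION" "$PROTOC_PATH"
}

# Print the command for review instead of executing it.
build_cmd
```

Before running it, it is worth confirming that the binary really is 2.6.1;
"protoc --version" reports the libprotoc version it was built with.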


--Chris Nauroth




On 5/13/15, 8:59 AM, "Allen Wittenauer" <a...@altiscale.com> wrote:

>
>On May 13, 2015, at 5:02 AM, Alan Burlison <alan.burli...@oracle.com>
>wrote:
>
>> The current version of Protocol Buffers is 2.6.1 but the current
>>version required by Hadoop is 2.5.0. Is there any reason for this, or
>>should I log a JIRA to get it updated?
>
>       The story of protocol buffers is part of a shameful past where Hadoop
>trusted Google.  This was a terrible mistake, based upon the last time
>the project upgraded.  2.4->2.5 required some source level, non-backward
>compatible, and completely-avoidable-but-G-made-us-do-it-anyway surgery
>to make work. This also ended up being a flag day for every single
>developer who not only worked with Hadoop but all of the downstream
>projects as well.  Big disaster.
>
>       The fact that when Google shut down Google Code, they didn't even tag
>previous releases in the GitHub source tree without a significant amount
>of pressure from the open source community was just adding insult to
>injury.  As a result, I believe the collective opinion is to just flat
>out avoid adding any more Google bits into the system.
>
>       See also: guava, which suffers from the same shortsightedness.
>
>       At some point, we'll either upgrade, switch to a different protocol
>serialization format, or fork protobuf.
>
