Re: Protocol Buffers version
On 16/06/2015 10:54, Steve Loughran wrote:

>> One reason at least: PB 2.5.0 has no support for Solaris SPARC. 2.6.1 does.
>
> to be ruthless, that's not enough reason to upgrade branch-2, due to the transitive pain it makes all the way down.

I completely get your point; however, we are faced with two pretty equally unpalatable options: either fork PB 2.5.0 and add support for Solaris SPARC, or switch to 2.6.1. Although, as I've found out, even though 2.6.1 claims to support Solaris SPARC, it doesn't, and it needs a patch (albeit a small one) to get it to work :-/

From what I can gather, cross-platform support in PB breaks fairly regularly.

--
Alan Burlison
--
Re: Protocol Buffers version
On Jun 16, 2015, at 2:54 AM, Steve Loughran wrote:

>> One reason at least: PB 2.5.0 has no support for Solaris SPARC. 2.6.1 does.
>
> to be ruthless, that's not enough reason to upgrade branch-2, due to the transitive pain it makes all the way down.

Not in branch-2, but certainly in trunk.
Re: Protocol Buffers version
> On 15 Jun 2015, at 22:31, Colin P. McCabe wrote:
>
> On Mon, Jun 15, 2015 at 7:24 AM, Allen Wittenauer wrote:
>>
>> On Jun 12, 2015, at 1:03 PM, Alan Burlison wrote:
>>
>>> On 14/05/2015 18:41, Chris Nauroth wrote:
>>>
>>>> As a reminder though, the community probably would want to see a strong justification for the upgrade in terms of features or performance or something else. Right now, I'm not seeing a significant benefit for us based on my reading of their release notes. I think it's worthwhile to figure this out first. Otherwise, there is a risk that any testing work turns out to be a wasted effort.
>>>
>>> One reason at least: PB 2.5.0 has no support for Solaris SPARC. 2.6.1 does.

to be ruthless, that's not enough reason to upgrade branch-2, due to the transitive pain it makes all the way down.

>> That's a pretty good reason.
>>
>> Some of us had a discussion at Summit about effectively forking protobuf and making it an Apache TLP. This would give us a chance to get out from under Google's blind spot, guarantee better compatibility across the ecosystem, etc, etc.
>>
>> It is sounding more and more like that's really what needs to happen.
>
> I agree that it would be nice if the protobuf project avoided making backwards-incompatible API changes within a minor release. But in practice, we have had the same issues with Jackson, Guava, jets3t, and other dependencies. Nearly every important Hadoop dependency has made backwards-incompatible API changes within a minor release of the dependency... and that's one reason we are using such old versions of everything. I don't think PB deserves to be singled out as much as it has been.

I think it does deserve it, as it was such an all-or-nothing change.

Guava, well, we may keep it at 11.0, but we've made sure there are no classes used which aren't in the latest versions. Even where we depend on artifacts which need later versions (curator-2.7.1) we've addressed the version problem by verifying that you can actually rebuild curator with guava<-11.0 with everything working (curator-x-discovery doesn't compile, but we don't use that). So we know that unless a bit of curator uses reflection, we can run it against 11.x. And if someone wants to use a later version of Guava + hadoop-common, they can swap it in and hadoop will still work. Which is important, as on Java 8u45+ you do need a recent Guava.

In contrast, protobuf needed a co-ordinated update across everything: every project which had checked in their generated protobuf files had to rebuild and check in, which guarantees they could no longer work with protobuf 2.4.

Jackson? Its broken-ness wasn't so obvious: if we'd known I wouldn't have let it go updated. It's now on the risk list and I don't see us updating that for a long time.

> I think the work going on now to implement CLASSPATH isolation in Hadoop will really be beneficial here because we will be able to upgrade without worrying about these problems.

+1
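A rough sketch of the kind of check described above, i.e. confirming that the classes a project relies on are still present in whichever version of a dependency ends up on the classpath. This is not how the Curator/Guava verification was actually done (that was a rebuild against Guava 11.0); it is only a runtime probe. The class `ClasspathProbe`, the method `missingClasses`, and the class names in `main` are hypothetical, and JDK classes stand in for Guava ones so the example runs without extra jars.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ClasspathProbe {

    /** Returns the names from 'required' that cannot be loaded from the current classpath. */
    public static List<String> missingClasses(List<String> required) {
        List<String> missing = new ArrayList<>();
        ClassLoader loader = ClasspathProbe.class.getClassLoader();
        for (String name : required) {
            try {
                // initialize=false: only locate the class, do not run its static initializers
                Class.forName(name, false, loader);
            } catch (ClassNotFoundException | LinkageError e) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Stand-in names: in practice this would be the list of dependency classes your code imports.
        List<String> required = Arrays.asList(
                "java.util.ArrayList",
                "example.hypothetical.RemovedInNewerRelease");
        System.out.println("Missing: " + missingClasses(required));
    }
}
```

A probe like this only catches classes that have disappeared; it says nothing about changed method signatures, which is where a full rebuild against the other version is still needed.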
Re: Protocol Buffers version
On Mon, Jun 15, 2015 at 7:24 AM, Allen Wittenauer wrote:
>
> On Jun 12, 2015, at 1:03 PM, Alan Burlison wrote:
>
>> On 14/05/2015 18:41, Chris Nauroth wrote:
>>
>>> As a reminder though, the community probably would want to see a strong justification for the upgrade in terms of features or performance or something else. Right now, I'm not seeing a significant benefit for us based on my reading of their release notes. I think it's worthwhile to figure this out first. Otherwise, there is a risk that any testing work turns out to be a wasted effort.
>>
>> One reason at least: PB 2.5.0 has no support for Solaris SPARC. 2.6.1 does.
>
> That's a pretty good reason.
>
> Some of us had a discussion at Summit about effectively forking protobuf and making it an Apache TLP. This would give us a chance to get out from under Google's blind spot, guarantee better compatibility across the ecosystem, etc, etc.
>
> It is sounding more and more like that's really what needs to happen.

I agree that it would be nice if the protobuf project avoided making backwards-incompatible API changes within a minor release. But in practice, we have had the same issues with Jackson, Guava, jets3t, and other dependencies. Nearly every important Hadoop dependency has made backwards-incompatible API changes within a minor release of the dependency... and that's one reason we are using such old versions of everything. I don't think PB deserves to be singled out as much as it has been.

I think the work going on now to implement CLASSPATH isolation in Hadoop will really be beneficial here because we will be able to upgrade without worrying about these problems.

cheers,
Colin
Re: Protocol Buffers version
On Mon, Jun 15, 2015 at 8:57 AM, Andrew Purtell wrote:
> I can't answer the original question but can point out the protostuff (https://github.com/protostuff/protostuff) folks have been responsive and friendly in the past when we (HBase) were curious about swapping in their stuff. Two significant benefits of protostuff, IMHO, is ASL 2 licensing and everything is implemented in Java including the compiler.

Big +1 to protostuff from community, licensing and implementation perspectives.

Thanks,
Roman.
Re: Protocol Buffers version
I can't answer the original question but can point out the protostuff (https://github.com/protostuff/protostuff) folks have been responsive and friendly in the past when we (HBase) were curious about swapping in their stuff. Two significant benefits of protostuff, IMHO, are ASL 2 licensing and that everything is implemented in Java, including the compiler.

On Mon, Jun 15, 2015 at 8:49 AM, Sean Busbey wrote:

> Anyone have a read on how the protobuf folks would feel about that? Apache has a history of not accepting projects that are non-amicable forks.
>
> On Mon, Jun 15, 2015 at 9:24 AM, Allen Wittenauer wrote:
>
>> On Jun 12, 2015, at 1:03 PM, Alan Burlison wrote:
>>
>>> On 14/05/2015 18:41, Chris Nauroth wrote:
>>>
>>>> As a reminder though, the community probably would want to see a strong justification for the upgrade in terms of features or performance or something else. Right now, I'm not seeing a significant benefit for us based on my reading of their release notes. I think it's worthwhile to figure this out first. Otherwise, there is a risk that any testing work turns out to be a wasted effort.
>>>
>>> One reason at least: PB 2.5.0 has no support for Solaris SPARC. 2.6.1 does.
>>
>> That's a pretty good reason.
>>
>> Some of us had a discussion at Summit about effectively forking protobuf and making it an Apache TLP. This would give us a chance to get out from under Google's blind spot, guarantee better compatibility across the ecosystem, etc, etc.
>>
>> It is sounding more and more like that's really what needs to happen.
>
> --
> Sean

--
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
Re: Protocol Buffers version
Anyone have a read on how the protobuf folks would feel about that? Apache has a history of not accepting projects that are non-amicable forks.

On Mon, Jun 15, 2015 at 9:24 AM, Allen Wittenauer wrote:

> On Jun 12, 2015, at 1:03 PM, Alan Burlison wrote:
>
>> On 14/05/2015 18:41, Chris Nauroth wrote:
>>
>>> As a reminder though, the community probably would want to see a strong justification for the upgrade in terms of features or performance or something else. Right now, I'm not seeing a significant benefit for us based on my reading of their release notes. I think it's worthwhile to figure this out first. Otherwise, there is a risk that any testing work turns out to be a wasted effort.
>>
>> One reason at least: PB 2.5.0 has no support for Solaris SPARC. 2.6.1 does.
>
> That's a pretty good reason.
>
> Some of us had a discussion at Summit about effectively forking protobuf and making it an Apache TLP. This would give us a chance to get out from under Google's blind spot, guarantee better compatibility across the ecosystem, etc, etc.
>
> It is sounding more and more like that's really what needs to happen.

--
Sean
Re: Protocol Buffers version
On Jun 12, 2015, at 1:03 PM, Alan Burlison wrote:

> On 14/05/2015 18:41, Chris Nauroth wrote:
>
>> As a reminder though, the community probably would want to see a strong justification for the upgrade in terms of features or performance or something else. Right now, I'm not seeing a significant benefit for us based on my reading of their release notes. I think it's worthwhile to figure this out first. Otherwise, there is a risk that any testing work turns out to be a wasted effort.
>
> One reason at least: PB 2.5.0 has no support for Solaris SPARC. 2.6.1 does.

That's a pretty good reason.

Some of us had a discussion at Summit about effectively forking protobuf and making it an Apache TLP. This would give us a chance to get out from under Google's blind spot, guarantee better compatibility across the ecosystem, etc, etc.

It is sounding more and more like that's really what needs to happen.
Re: Protocol Buffers version
On 14/05/2015 18:41, Chris Nauroth wrote:

> As a reminder though, the community probably would want to see a strong justification for the upgrade in terms of features or performance or something else. Right now, I'm not seeing a significant benefit for us based on my reading of their release notes. I think it's worthwhile to figure this out first. Otherwise, there is a risk that any testing work turns out to be a wasted effort.

One reason at least: PB 2.5.0 has no support for Solaris SPARC. 2.6.1 does.

--
Alan Burlison
--
Re: Protocol Buffers version
> On 19 May 2015, at 17:59, Colin P. McCabe wrote:
>
> I agree that the protobuf 2.4.1 -> 2.5.0 transition could have been handled a lot better by Google. Specifically, since it was an API-breaking upgrade, it should have been a major version bump for the Java library version. I also feel that removing the download links for the old versions of the native libraries was careless, and certainly burned some of our Hadoop users.
>
> However, I don't see any reason to believe that protobuf 2.6 will not be wire-compatible with earlier versions. Google has actually been pretty good about preserving wire-compatibility... just not about API compatibility. If we want to get a formal statement from the project, we can, but I would be pretty shocked if they decided to change the protocol in a backwards-incompatible way in a minor version release.

That's what they have done well: wire formats don't break (though you have the freedom to do that by adding new non-optional fields). Of course, they do have the standard service problems then of (a) downgrading if optional fields are omitted and (b) maintaining semantics over time. They just have that at a bigger scale than the rest of us.

The 2.4/2.5 switch showed the trouble of using code from a company capable of doing a whole-stack rebuild overnight. They can update a dependency (protobuf.jar, guava.jar) and have it picked up in the binaries. We don't have that luxury.

> I do think there are some potential issues for our users of bumping the library version in a minor Hadoop release. Until we implement full dependency isolation for Hadoop, there may be some disruptions to end-users from changing Java dependency versions. Similarly, users will need to install a new native protobuf library version as well. So I think we should bump the protobuf versions in Hadoop 3.0, but not in 2.x.

+1, though I do fear the more things we put off until "3.0", the bigger that switch and so the harder the adoption.

FWIW, one area I do find hard with protobuf is trying to set message fields through reflection. That is, I want code that will link against, say, the Hadoop 2.6 binaries, but if there are the extra fields for a 2.7 message, to use them. Deep down in the internals, protobuf should let me do this, but not at the Java API level.
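To make the "use the newer field only if it is there" idea concrete, here is a minimal sketch using plain java.lang.reflect rather than any protobuf API. Everything named here is an assumption for illustration: `OptionalSetter`, `setIfPresent`, and the `RequestBuilder` class (a hand-written stand-in for a generated message builder, with `setPriority` playing the role of a field that only exists in the newer release).

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

public class OptionalSetter {

    /** Hypothetical stand-in for a generated message builder; not a real protobuf or Hadoop class. */
    public static class RequestBuilder {
        private String name;
        private int priority = -1;

        public RequestBuilder setName(String name) { this.name = name; return this; }
        // Pretend this setter only exists in the newer release of the message.
        public RequestBuilder setPriority(int priority) { this.priority = priority; return this; }
        public String getName() { return name; }
        public int getPriority() { return priority; }
    }

    /**
     * Invokes the named public single-argument setter on the builder if it exists.
     * Returns true if the setter was found and invoked, false if it is absent.
     */
    public static boolean setIfPresent(Object builder, String setterName,
                                       Class<?> paramType, Object value) {
        try {
            Method setter = builder.getClass().getMethod(setterName, paramType);
            setter.invoke(builder, value);
            return true;
        } catch (NoSuchMethodException e) {
            return false; // older binaries: the field is not there, so skip it
        } catch (IllegalAccessException | InvocationTargetException e) {
            throw new IllegalStateException("Setter " + setterName + " failed", e);
        }
    }

    public static void main(String[] args) {
        RequestBuilder builder = new RequestBuilder().setName("job-1");
        boolean hasPriority = setIfPresent(builder, "setPriority", int.class, 5);
        boolean hasDeadline = setIfPresent(builder, "setDeadline", long.class, 100L);
        System.out.println(hasPriority + " " + builder.getPriority() + " " + hasDeadline);
        // prints "true 5 false"
    }
}
```

The calling code compiles against the older class shape because the newer setter is only ever referenced by name; the cost is that a typo in the setter name fails silently, so the names are best kept in one place and covered by a test.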
Re: Protocol Buffers version
I pushed it out to a github fork: https://github.com/sjlee/protobuf/tree/2.5.0-incompatibility

We haven't observed other compatibility issues than these.

On Tue, May 19, 2015 at 10:05 PM, Chris Nauroth wrote:

> Thanks, Sangjin. I'd be interested in taking a peek at a personal GitHub repo or even just a patch file of those changes. If there were incompatibilities, then that doesn't bode well for an upgrade to 2.6.
>
> --Chris Nauroth
>
> On 5/19/15, 8:40 PM, "Sangjin Lee" wrote:
>
>> When we moved to Hadoop 2.4, the associated protobuf upgrade (2.4.1 -> 2.5.0) proved to be one of the bigger problems. In our case, most of our users were using protobuf 2.4.x or earlier.
>>
>> We identified a couple of places where the backward compatibility was broken, and patched for those issues. We've been running with that patched version of protobuf 2.5.0 since. I can push out those changes to github or something if others are interested FWIW.
>>
>> Regards,
>> Sangjin
>>
>> On Tue, May 19, 2015 at 9:59 AM, Colin P. McCabe wrote:
>>
>>> I agree that the protobuf 2.4.1 -> 2.5.0 transition could have been handled a lot better by Google. Specifically, since it was an API-breaking upgrade, it should have been a major version bump for the Java library version. I also feel that removing the download links for the old versions of the native libraries was careless, and certainly burned some of our Hadoop users.
>>>
>>> However, I don't see any reason to believe that protobuf 2.6 will not be wire-compatible with earlier versions. Google has actually been pretty good about preserving wire-compatibility... just not about API compatibility. If we want to get a formal statement from the project, we can, but I would be pretty shocked if they decided to change the protocol in a backwards-incompatible way in a minor version release.
>>>
>>> I do think there are some potential issues for our users of bumping the library version in a minor Hadoop release. Until we implement full dependency isolation for Hadoop, there may be some disruptions to end-users from changing Java dependency versions. Similarly, users will need to install a new native protobuf library version as well. So I think we should bump the protobuf versions in Hadoop 3.0, but not in 2.x.
>>>
>>> cheers,
>>> Colin
>>>
>>> On Fri, May 15, 2015 at 4:55 AM, Alan Burlison wrote:
>>>> On 15/05/2015 09:44, Steve Loughran wrote:
>>>>
>>>>> Now: why do you want to use a later version of protobuf.jar? Is it because "it is there"? Or is there a tangible need?
>>>>
>>>> No, it's because I'm looking at this from a platform perspective: We have other consumers of ProtoBuf beside Hadoop and we'd obviously like to minimise the versions of PB that we ship, and preferably just ship the latest version. The fact that PB seems to often be incompatible across releases is an issue as it makes upgrading and dropping older versions problematic.
>>>>
>>>> --
>>>> Alan Burlison
>>>> --
Re: Protocol Buffers version
Thanks, Sangjin. I'd be interested in taking a peek at a personal GitHub repo or even just a patch file of those changes. If there were incompatibilities, then that doesn't bode well for an upgrade to 2.6.

--Chris Nauroth

On 5/19/15, 8:40 PM, "Sangjin Lee" wrote:

> When we moved to Hadoop 2.4, the associated protobuf upgrade (2.4.1 -> 2.5.0) proved to be one of the bigger problems. In our case, most of our users were using protobuf 2.4.x or earlier.
>
> We identified a couple of places where the backward compatibility was broken, and patched for those issues. We've been running with that patched version of protobuf 2.5.0 since. I can push out those changes to github or something if others are interested FWIW.
>
> Regards,
> Sangjin
>
> On Tue, May 19, 2015 at 9:59 AM, Colin P. McCabe wrote:
>
>> I agree that the protobuf 2.4.1 -> 2.5.0 transition could have been handled a lot better by Google. Specifically, since it was an API-breaking upgrade, it should have been a major version bump for the Java library version. I also feel that removing the download links for the old versions of the native libraries was careless, and certainly burned some of our Hadoop users.
>>
>> However, I don't see any reason to believe that protobuf 2.6 will not be wire-compatible with earlier versions. Google has actually been pretty good about preserving wire-compatibility... just not about API compatibility. If we want to get a formal statement from the project, we can, but I would be pretty shocked if they decided to change the protocol in a backwards-incompatible way in a minor version release.
>>
>> I do think there are some potential issues for our users of bumping the library version in a minor Hadoop release. Until we implement full dependency isolation for Hadoop, there may be some disruptions to end-users from changing Java dependency versions. Similarly, users will need to install a new native protobuf library version as well. So I think we should bump the protobuf versions in Hadoop 3.0, but not in 2.x.
>>
>> cheers,
>> Colin
>>
>> On Fri, May 15, 2015 at 4:55 AM, Alan Burlison wrote:
>>> On 15/05/2015 09:44, Steve Loughran wrote:
>>>
>>>> Now: why do you want to use a later version of protobuf.jar? Is it because "it is there"? Or is there a tangible need?
>>>
>>> No, it's because I'm looking at this from a platform perspective: We have other consumers of ProtoBuf beside Hadoop and we'd obviously like to minimise the versions of PB that we ship, and preferably just ship the latest version. The fact that PB seems to often be incompatible across releases is an issue as it makes upgrading and dropping older versions problematic.
>>>
>>> --
>>> Alan Burlison
>>> --
Re: Protocol Buffers version
When we moved to Hadoop 2.4, the associated protobuf upgrade (2.4.1 -> 2.5.0) proved to be one of the bigger problems. In our case, most of our users were using protobuf 2.4.x or earlier.

We identified a couple of places where the backward compatibility was broken, and patched for those issues. We've been running with that patched version of protobuf 2.5.0 since. I can push out those changes to github or something if others are interested FWIW.

Regards,
Sangjin

On Tue, May 19, 2015 at 9:59 AM, Colin P. McCabe wrote:

> I agree that the protobuf 2.4.1 -> 2.5.0 transition could have been handled a lot better by Google. Specifically, since it was an API-breaking upgrade, it should have been a major version bump for the Java library version. I also feel that removing the download links for the old versions of the native libraries was careless, and certainly burned some of our Hadoop users.
>
> However, I don't see any reason to believe that protobuf 2.6 will not be wire-compatible with earlier versions. Google has actually been pretty good about preserving wire-compatibility... just not about API compatibility. If we want to get a formal statement from the project, we can, but I would be pretty shocked if they decided to change the protocol in a backwards-incompatible way in a minor version release.
>
> I do think there are some potential issues for our users of bumping the library version in a minor Hadoop release. Until we implement full dependency isolation for Hadoop, there may be some disruptions to end-users from changing Java dependency versions. Similarly, users will need to install a new native protobuf library version as well. So I think we should bump the protobuf versions in Hadoop 3.0, but not in 2.x.
>
> cheers,
> Colin
>
> On Fri, May 15, 2015 at 4:55 AM, Alan Burlison wrote:
>> On 15/05/2015 09:44, Steve Loughran wrote:
>>
>>> Now: why do you want to use a later version of protobuf.jar? Is it because "it is there"? Or is there a tangible need?
>>
>> No, it's because I'm looking at this from a platform perspective: We have other consumers of ProtoBuf beside Hadoop and we'd obviously like to minimise the versions of PB that we ship, and preferably just ship the latest version. The fact that PB seems to often be incompatible across releases is an issue as it makes upgrading and dropping older versions problematic.
>>
>> --
>> Alan Burlison
>> --
Re: Protocol Buffers version
I agree that the protobuf 2.4.1 -> 2.5.0 transition could have been handled a lot better by Google. Specifically, since it was an API-breaking upgrade, it should have been a major version bump for the Java library version. I also feel that removing the download links for the old versions of the native libraries was careless, and certainly burned some of our Hadoop users.

However, I don't see any reason to believe that protobuf 2.6 will not be wire-compatible with earlier versions. Google has actually been pretty good about preserving wire-compatibility... just not about API compatibility. If we want to get a formal statement from the project, we can, but I would be pretty shocked if they decided to change the protocol in a backwards-incompatible way in a minor version release.

I do think there are some potential issues for our users of bumping the library version in a minor Hadoop release. Until we implement full dependency isolation for Hadoop, there may be some disruptions to end-users from changing Java dependency versions. Similarly, users will need to install a new native protobuf library version as well. So I think we should bump the protobuf versions in Hadoop 3.0, but not in 2.x.

cheers,
Colin

On Fri, May 15, 2015 at 4:55 AM, Alan Burlison wrote:
> On 15/05/2015 09:44, Steve Loughran wrote:
>
>> Now: why do you want to use a later version of protobuf.jar? Is it because "it is there"? Or is there a tangible need?
>
> No, it's because I'm looking at this from a platform perspective: We have other consumers of ProtoBuf beside Hadoop and we'd obviously like to minimise the versions of PB that we ship, and preferably just ship the latest version. The fact that PB seems to often be incompatible across releases is an issue as it makes upgrading and dropping older versions problematic.
>
> --
> Alan Burlison
> --
Re: Protocol Buffers version
On 15/05/2015 09:44, Steve Loughran wrote:

> Now: why do you want to use a later version of protobuf.jar? Is it because "it is there"? Or is there a tangible need?

No, it's because I'm looking at this from a platform perspective: We have other consumers of ProtoBuf beside Hadoop and we'd obviously like to minimise the versions of PB that we ship, and preferably just ship the latest version. The fact that PB seems to often be incompatible across releases is an issue as it makes upgrading and dropping older versions problematic.

--
Alan Burlison
--
Re: Protocol Buffers version
On 14 May 2015, at 15:23, Alan Burlison wrote:

> I think bundling or forking is the only practical option. I was looking to see if we could provide ProtocolBuffers as an installable option on our platform, if it's a version-compatibility nightmare as you say, that's going to be difficult as we really don't want to have to provide multiple versions.

The problem Hadoop has is that its code, especially the HDFS client code, is used in a lot of other applications, and they end up having to be in sync at the Java level.

Hopefully the protobuf wire format is compatible (that is the whole point of the format, after all), but we know from experience that at the JAR level it isn't. Having to rebuild every single .proto-derived Java class and then switch across the entire dependency tree was the upgrade path there, with about a month where getting the trunk versions of two apps to link was pretty hit and miss. I think everyone came out burned from that: scared and unwilling to repeat the experience, and not believing any further Google assertions of library compatibility (see also: guava).

What to do?

1. Leave alone and it slowly ages; when an upgrade happens it can be more traumatic. But until that time: nothing breaks.
2. Upgrade regularly and you can dramatically break things, so people don't upgrade Hadoop itself, they stick with old versions (with issues already fixed in the later releases), they keep on requesting backported fixes into the "working" branch and you end up with two branches of your code to maintain.
3. Fork and you take on maintenance costs of your forked library forever; it will implicitly age and there's the opportunity cost of that work, i.e. better things to waste your time on.
4. Rip out protobuf entirely and switch to something else (thrift) that has better stability, tag the proto channels as deprecated, etc, etc. You'd better trust the successor's stability and security features before going to that effort.

Hadoop 2.x has defaulted to option (1).

Now: why do you want to use a later version of protobuf.jar? Is it because "it is there"? Or is there a tangible need?

-steve
Re: Protocol Buffers version
Thanks for that link, Alan. That looks like a useful site!

Ideally, the Protocol Buffers project would give a clear statement about wire compatibility between 2.5.0 and 2.6.1. Unfortunately, I can't find that anywhere. If it's not documented, then it's probably worth following up on the Protocol Buffers support lists to ask them.

One thing we could try is starting up a mix of Hadoop processes using 2.5.0 and 2.6.1 to see how it goes. We've made a commitment to both forward and backward compatibility within Hadoop 2.x, so we'd need a 2.5.0 client to be able to talk to a 2.6.1 server, and we'd need a 2.6.1 client to be able to talk to a 2.5.0 server. Even if this appears to go well, I wouldn't consider it a substitute for a formal statement of the compatibility policy from the Protocol Buffers project. Otherwise, there might be some subtle lurking issue that we miss in our initial testing.

As a reminder though, the community probably would want to see a strong justification for the upgrade in terms of features or performance or something else. Right now, I'm not seeing a significant benefit for us based on my reading of their release notes. I think it's worthwhile to figure this out first. Otherwise, there is a risk that any testing work turns out to be a wasted effort.

--Chris Nauroth

On 5/14/15, 7:23 AM, "Alan Burlison" wrote:

> On 13/05/2015 17:13, Chris Nauroth wrote:
>
>> It was important to complete this upgrade before Hadoop 2.x came out of beta. After that, we committed to a policy of backwards-compatibility within the 2.x release line. I can't find a statement about whether or not Protocol Buffers 2.6.1 is backwards-compatible with 2.5.0 (both at compile time and on the wire). Do you know the answer? If it's backwards-incompatible, then we wouldn't be able to do this upgrade within Hadoop 2.x, though we could consider it for 3.x (trunk).
>
> I'm not sure about the wire format, what's the best way of checking for wire format issues?
>
> http://upstream-tracker.org/versions/protobuf.html suggests there are some source-level issues which will require investigation.
>
>> In general, we upgrade dependencies when a new release offers a compelling benefit, not solely to keep up with the latest. In the case of 2.5.0, there was a performance benefit. Looking at the release notes for 2.6.0 and 2.6.1, I don't see anything particularly compelling. (That's just my opinion though, and others might disagree.)
>
> I think bundling or forking is the only practical option. I was looking to see if we could provide ProtocolBuffers as an installable option on our platform, if it's a version-compatibility nightmare as you say, that's going to be difficult as we really don't want to have to provide multiple versions.
>
>> BTW, if anyone is curious, it's possible to try a custom build right now linked against 2.6.1. You'd pass -Dprotobuf.version=2.6.1 and -Dprotoc.path= when you run the mvn command.
>
> Once I have fixed all the other source portability issues I'll circle back around and take a look at this.
>
> --
> Alan Burlison
> --
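For readers wondering what "compatible on the wire" means mechanically, the following is a small self-contained sketch of the general Protocol Buffers encoding rule that a reader skips field numbers it does not know. It is a hand-rolled toy, not the protobuf library, and it says nothing about whether 2.5.0 and 2.6.1 actually interoperate; the mixed-version cluster test described above would still be needed for that. The class `WireCompatSketch`, the three-field message layout (1 = id, 2 = name, 3 = priority, with 3 only known to the "newer" side), and all method names are assumptions for illustration, and only the varint and length-delimited wire types are handled.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class WireCompatSketch {

    /** Base-128 varint: 7 payload bits per byte, high bit set on all but the last byte. */
    static void writeVarint(ByteArrayOutputStream out, long value) {
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
    }

    static long readVarint(byte[] data, int[] pos) {
        long result = 0;
        int shift = 0;
        while (true) {
            byte b = data[pos[0]++];
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) return result;
            shift += 7;
        }
    }

    /** Encodes id (field 1) and name (field 2); priority (field 3) is written only if non-null. */
    public static byte[] encode(long id, String name, Long priority) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVarint(out, (1 << 3) | 0);          // key = (field number << 3) | wire type 0 (varint)
        writeVarint(out, id);
        byte[] bytes = name.getBytes(StandardCharsets.UTF_8);
        writeVarint(out, (2 << 3) | 2);          // wire type 2 = length-delimited
        writeVarint(out, bytes.length);
        out.write(bytes, 0, bytes.length);
        if (priority != null) {
            writeVarint(out, (3 << 3) | 0);
            writeVarint(out, priority);
        }
        return out.toByteArray();
    }

    /** Decodes only the field numbers in 'known'; any other field is skipped over. */
    public static Map<Integer, Object> read(byte[] data, Set<Integer> known) {
        Map<Integer, Object> fields = new TreeMap<>();
        int[] pos = {0};
        while (pos[0] < data.length) {
            long key = readVarint(data, pos);
            int field = (int) (key >>> 3);
            int wireType = (int) (key & 7);
            if (wireType == 0) {
                long v = readVarint(data, pos);
                if (known.contains(field)) fields.put(field, v);
            } else if (wireType == 2) {
                int len = (int) readVarint(data, pos);
                if (known.contains(field)) {
                    fields.put(field, new String(data, pos[0], len, StandardCharsets.UTF_8));
                }
                pos[0] += len; // advance past the payload whether or not it was decoded
            } else {
                throw new IllegalArgumentException("wire type not handled in this sketch: " + wireType);
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        Set<Integer> oldReader = new HashSet<>(Arrays.asList(1, 2));
        Set<Integer> newReader = new HashSet<>(Arrays.asList(1, 2, 3));
        byte[] fromNewWriter = encode(42, "job", 5L);
        byte[] fromOldWriter = encode(42, "job", null);
        System.out.println(read(fromNewWriter, oldReader)); // {1=42, 2=job}
        System.out.println(read(fromOldWriter, newReader)); // {1=42, 2=job}
        System.out.println(read(fromNewWriter, newReader)); // {1=42, 2=job, 3=5}
    }
}
```

The first two lines of output are the two directions of the compatibility commitment mentioned above (newer sender to older receiver, and older sender to newer receiver). What this toy cannot show is the semantic side: whether the receiver behaves sensibly when the extra field is missing or ignored.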
Re: Protocol Buffers version
On 13/05/2015 17:13, Chris Nauroth wrote:

> It was important to complete this upgrade before Hadoop 2.x came out of beta. After that, we committed to a policy of backwards-compatibility within the 2.x release line. I can't find a statement about whether or not Protocol Buffers 2.6.1 is backwards-compatible with 2.5.0 (both at compile time and on the wire). Do you know the answer? If it's backwards-incompatible, then we wouldn't be able to do this upgrade within Hadoop 2.x, though we could consider it for 3.x (trunk).

I'm not sure about the wire format. What's the best way of checking for wire format issues?

http://upstream-tracker.org/versions/protobuf.html suggests there are some source-level issues which will require investigation.

> In general, we upgrade dependencies when a new release offers a compelling benefit, not solely to keep up with the latest. In the case of 2.5.0, there was a performance benefit. Looking at the release notes for 2.6.0 and 2.6.1, I don't see anything particularly compelling. (That's just my opinion though, and others might disagree.)

I think bundling or forking is the only practical option. I was looking to see if we could provide ProtocolBuffers as an installable option on our platform; if it's a version-compatibility nightmare as you say, that's going to be difficult as we really don't want to have to provide multiple versions.

> BTW, if anyone is curious, it's possible to try a custom build right now linked against 2.6.1. You'd pass -Dprotobuf.version=2.6.1 and -Dprotoc.path= when you run the mvn command.

Once I have fixed all the other source portability issues I'll circle back around and take a look at this.

--
Alan Burlison
--
Re: Protocol Buffers version
Some additional details...

A few years ago, we moved from Protocol Buffers 2.4.1 to 2.5.0. There were some challenges with that upgrade, because 2.5.0 was not backwards-compatible with 2.4.1. We needed to coordinate carefully with projects downstream of Hadoop that receive our protobuf classes through transitive dependency. Here are a few issues with more background:

https://issues.apache.org/jira/browse/HADOOP-9845
https://issues.apache.org/jira/browse/HBASE-8165
https://issues.apache.org/jira/browse/HIVE-5112

It was important to complete this upgrade before Hadoop 2.x came out of beta. After that, we committed to a policy of backwards-compatibility within the 2.x release line. I can't find a statement about whether or not Protocol Buffers 2.6.1 is backwards-compatible with 2.5.0 (both at compile time and on the wire). Do you know the answer? If it's backwards-incompatible, then we wouldn't be able to do this upgrade within Hadoop 2.x, though we could consider it for 3.x (trunk).

In general, we upgrade dependencies when a new release offers a compelling benefit, not solely to keep up with the latest. In the case of 2.5.0, there was a performance benefit. Looking at the release notes for 2.6.0 and 2.6.1, I don't see anything particularly compelling. (That's just my opinion though, and others might disagree.)

https://github.com/google/protobuf/blob/master/CHANGES.txt

BTW, if anyone is curious, it's possible to try a custom build right now linked against 2.6.1. You'd pass -Dprotobuf.version=2.6.1 and -Dprotoc.path= when you run the mvn command.

--Chris Nauroth

On 5/13/15, 8:59 AM, "Allen Wittenauer" wrote:

> On May 13, 2015, at 5:02 AM, Alan Burlison wrote:
>
>> The current version of Protocol Buffers is 2.6.1 but the current version required by Hadoop is 2.5.0. Is there any reason for this, or should I log a JIRA to get it updated?
>
> The story of protocol buffers is part of a shameful past where Hadoop trusted Google. This was a terrible mistake, based upon the last time the project upgraded. 2.4->2.5 required some source level, non-backward compatible, and completely-avoidable-but-G-made-us-do-it-anyway surgery to make work. This also ended up being a flag day for every single developer who not only worked with Hadoop but all of the downstream projects as well. Big disaster.
>
> The fact that when Google shut down Google Code, they didn't even tag previous releases in the github source tree without significant amount of pressure from the open source community was just adding insult to injury. As a result, I believe the collective opinion is to just flat out avoid adding any more Google bits into the system.
>
> See also: guava, which suffers from the same shortsightedness.
>
> At some point, we'll either upgrade, switch to a different protocol serialization format, or fork protobuf.
Re: Protocol Buffers version
On May 13, 2015, at 5:02 AM, Alan Burlison wrote:

> The current version of Protocol Buffers is 2.6.1 but the current version required by Hadoop is 2.5.0. Is there any reason for this, or should I log a JIRA to get it updated?

The story of protocol buffers is part of a shameful past where Hadoop trusted Google. This was a terrible mistake, based upon the last time the project upgraded. 2.4->2.5 required some source level, non-backward compatible, and completely-avoidable-but-G-made-us-do-it-anyway surgery to make work. This also ended up being a flag day for every single developer who not only worked with Hadoop but all of the downstream projects as well. Big disaster.

The fact that when Google shut down Google Code, they didn't even tag previous releases in the github source tree without significant amount of pressure from the open source community was just adding insult to injury. As a result, I believe the collective opinion is to just flat out avoid adding any more Google bits into the system.

See also: guava, which suffers from the same shortsightedness.

At some point, we'll either upgrade, switch to a different protocol serialization format, or fork protobuf.