I really like the http://semver.org/ approach, which is very simple to understand and described within a few lines. Nobody likes to search the mailing list or JIRA's generated changelog to find out what the relevant changes are.

Probably two wiki pages would be very helpful for users and developers:
- http://wiki.apache.org/thrift/VersioningSystem => describes the overall concept
- http://wiki.apache.org/thrift/RoadMap => the plans for the next releases, just a few words about the overall target...JIRA is just a tool ;-)

I absolutely agree with David's position about 1.0 versions.
...any version might be useful, even a 0.1.2.

--roger

On 26.08.2010 22:40, Anthony Molinaro wrote:
I'd like to second David's opinion that 1.0 is a meaningless construct.
I like the philosophy outlined here about version numbers

http://semver.org/

In a nutshell, it's MAJOR.MINOR.PATCH, where MAJOR signals a
binary-incompatible change, MINOR a backward-compatible new feature, and
PATCH a bugfix of an existing feature.
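For example, starting from a hypothetical release 1.4.2:

  1.4.2 -> 2.0.0   incompatible API change (bump MAJOR)
  1.4.2 -> 1.5.0   backward-compatible new feature (bump MINOR)
  1.4.2 -> 1.4.3   bugfix of existing behavior (bump PATCH)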

Now with Thrift, since we are dealing with two things (a serialization
protocol and libraries to interact with it), these version numbers
are not quite right.  I would say that Thrift should adopt a four-part
version number:

PROTO_VERSION.MAJOR.MINOR.PATCH

where PROTO_VERSION is the serialization format version.  Any
version of Thrift released with a PROTO_VERSION of 0 is compatible with
all other 0 versions at the serialization level.
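So, with made-up version numbers for illustration:

  0.4.0.0  <-->  0.5.2.1   wire-compatible (same PROTO_VERSION of 0)
  0.5.2.1  <-/-> 1.0.0.0   not wire-compatible (PROTO_VERSION changed)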

MAJOR would be for binary-incompatible changes to the generated code.
Since we are still pre-"1.0", these tend to happen anyway.  What this means
is that if 0.5.0.0 is released, you'll most likely have to recompile
any code which linked against 0.4.x.x if you want some new features
in the client or server library.  However, since the PROTO_VERSION
has not changed, you'd be able to interoperate with older servers, and
your server could take calls from older clients.

MINOR would be for new features which don't require you to rebuild if
you aren't using them.  This could involve things like adding a new
server type.  If it's a completely new server type, you haven't broken
any existing code, so you are okay.

PATCH would be for bugfix releases.  You probably wouldn't see a lot of
these, but they allow for quick fixes: if, say, a 0.5.0.0 release had a
major bug which was fixed quickly, you release a 0.5.0.1 and move on.

Within such a framework, I think you then make decisions about when to
include new things and when to release according to the criteria that
David laid out.  Also, if this policy is spelled out explicitly on the
website, then questions about compatibility can be answered there (via
a pointer from a FAQ).

Also, this framework allows for quicker turnaround of certain features
and bugfixes.  I would expect to see several MINOR releases between MAJOR
releases.  It is release early, release often, right?  I also follow
Cassandra pretty closely, and they seem to have MINOR releases (although
they use the PATCH number) every month or six weeks or so, and MAJOR
releases every six months or so.  Of course, a faster release cycle is
more of a headache for Bryan, so maybe that's not a good idea :)

Anyway, just my opinion, I wouldn't expect any action on this unless
others thought it was a good idea.

-Anthony

On Thu, Aug 26, 2010 at 12:42:27PM -0700, David Reiss wrote:
My opinion is that any time we face an incompatible change, the factors we
should consider are:

- How significant will the benefit be?  (In this case, it was huge
   because the byte[] fields were inherently broken.)
- How many users do we expect it to affect (e.g., Java vs. OCaml
   changes).
- How difficult is the upgrade path?  (Changing the code to create a
   server is relatively simple.  Changing all code that works with
   generated structures is harder.  Changing the wire protocol is
   prohibitive.)
- What is the effect if client code is not updated?  (In C++ or Java,
   you probably get a nice, safe compile error.  In Python, you might get
   difficult-to-explain behavior after your app has been running for a
   while.)

It's possible that I've left something off the list, but these are the
ones that come to mind.

This might be a controversial opinion, but I place no weight on version
numbers at all.  You'll notice that I intentionally left "whether we
have released version 1.0" off of my list.  Of course, others might
place weight on a 1.0 release, so it could indirectly affect my second
point.

> At some point in the project's lifecycle, I think that Thrift should shift
> to a "don't break existing code unless absolutely, positively necessary for
> performance/security/etc."
> But perhaps there's an explicit policy in place now that "Since we're only
> at version 0.4, all bets are off and any marginal change justifies breaking
> existing code. We'll stabilize at 1.0."
That's a reasonable position, but as both a developer and user of
Thrift, I personally would prefer not to do that.  When upgrade paths
are easy and upgrade errors are easily detected, I would *never* want to
be prevented from making a beneficial change because it wasn't
"absolutely, positively necessary".

If this means never releasing version 1.0, that's fine with me.  I
recently learned that Hadoop (an incredibly successful project) is still
hanging out at 0.22.  I'd be fine with Thrift doing the same.

If we do decide to release a version 1.0 and refrain from making
unnecessary breaking changes even if they are very beneficial and easy
to upgrade to, I would want to immediately create a 2.0 branch that
would accept more aggressive changes.

--David

On 08/24/2010 12:12 PM, Bryan Duxbury wrote:
In general, I don't aim to break things. I hate updating my code as much as
the next guy. However, we are still pre-1.0, which means if we're going to
break code, now is the time to do it. In fact, we should get as much of it
out of the way as we can before our user base and feature set become so
entrenched that we're permanently committed to given interfaces.

That said, I reject your premise that we "break compatibility for virtually
no benefit". In fact, this particular change was effected specifically to
give us a substantial performance benefit for structs with many binary
fields. (It also makes a lot of operations easier, including use as map
keys and set elements, comparison and equality, etc.; see the sketch below.)
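To illustrate that last point (a minimal sketch in plain Java, independent
of any Thrift code): byte[] inherits identity-based equals() and hashCode()
from Object, so two arrays with identical contents don't match as map keys
or in equality checks, while ByteBuffer compares contents:

  import java.nio.ByteBuffer;

  public class BinaryEqualityDemo {
    public static void main(String[] args) {
      byte[] a = {1, 2, 3};
      byte[] b = {1, 2, 3};
      System.out.println(a.equals(b));  // false: identity equality only
      System.out.println(ByteBuffer.wrap(a).equals(ByteBuffer.wrap(b)));  // true: content equality
      // Likewise, a.hashCode() != b.hashCode() in general, so byte[]
      // keys are effectively unusable in a HashMap or HashSet.
    }
  }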

Finally, I would *gladly* accept a patch to the Java generator that
generates accessors with the old-style signatures that wrap the new-style
ones. There's no reason for us not to have a 0.4.1 if it would ease folks'
lives.
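A minimal sketch of what such wrappers might look like (the method names
here are assumptions for illustration, not actual generator output; note
that Java can't overload on return type alone, so the ByteBuffer-returning
accessor would need a distinct name, e.g. a hypothetical getImageBuffer()):

  import java.nio.ByteBuffer;

  // Fragment of a generated struct; getImageBuffer() and
  // setImage(ByteBuffer) stand in for the new-style 0.4.0 accessors.
  public byte[] getImage() {
    ByteBuffer buf = getImageBuffer();
    if (buf == null) {
      return null;
    }
    byte[] copy = new byte[buf.remaining()];
    buf.duplicate().get(copy);  // duplicate() leaves the buffer's position untouched
    return copy;
  }

  public void setImage(byte[] image) {
    setImage(image == null ? null : ByteBuffer.wrap(image));
  }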

-Bryan

On Tue, Aug 24, 2010 at 11:44 AM, Dave Engberg <[email protected]> wrote:

  [ Caveat:  I should obviously spend more time keeping up with the daily
email traffic to weigh in on proposed changes before full releases, but you
know how it goes at a start-up.  Anyhoo ... ]

I'm starting to go through the process of updating our API to Thrift 0.4,
which hits around a dozen apps that we maintain (clients and server) and
around a hundred implemented by third-party developers.

I tend to see changes in Thrift releases that break the external interfaces
of generated code.  In some cases, this may be the result of a deliberate,
carefully weighed decision, but other changes seem to casually break
compatibility for virtually no benefit.

For example, in 0.4.0, the generated Java interfaces for 'binary' fields
all changed from the built-in 'byte[]' Java type to the
java.nio.ByteBuffer class.  This means that all relevant method signatures
and fields were changed, and all instance setters and getters went from:
  public byte[] getImage() ...
  public void setImage(byte[] image) ...
to:
  public ByteBuffer getImage() ...
  public void setImage(ByteBuffer image) ...

That means that we'll need to change hundreds of lines of code from:
   foo.setImage(bytes);
to:
  foo.setImage(ByteBuffer.wrap(bytes));
and from:
  byte[] bar = foo.getImage();
to:
  ByteBuffer image = foo.getImage();
  byte[] bar = new byte[image.remaining()];
  image.get(bar, 0, bar.length);
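If the churn is unavoidable, one mitigation (a hypothetical helper of our
own, not anything shipped with Thrift) is to centralize the conversion:

  import java.nio.ByteBuffer;

  public final class Bytes {
    private Bytes() {}

    // Copy a ByteBuffer's remaining contents into a fresh byte[]
    // without disturbing the buffer's position.
    public static byte[] toArray(ByteBuffer buf) {
      if (buf == null) {
        return null;
      }
      byte[] out = new byte[buf.remaining()];
      buf.duplicate().get(out);
      return out;
    }
  }

so that call sites shrink back to:

  byte[] bar = Bytes.toArray(foo.getImage());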

(This particular change seems especially gratuitous, since it actually
increases the cost and overhead of constructing objects and marshaling
data, while replacing a core Java type with an odd dependency on a
low-level Java I/O package that's not really designed for this sort of
thing.)

So my meta-question is:  what's the Thrift project's philosophy about
maintaining interface consistency and backward compatibility between
releases?

At some point in the project's lifecycle, I think that Thrift should shift
to a "don't break existing code unless absolutely, positively necessary for
performance/security/etc."
But perhaps there's an explicit policy in place now that "Since we're only
at version 0.4, all bets are off and any marginal change justifies breaking
existing code. We'll stabilize at 1.0."


