I agree with Rob here, and I think we’re still missing a key upgrade problem 
that this proposal creates.  Consider the following:


  1.  Beginning state
     *   TO version Q serving API version 12.0 and “unstable”
     *   Component version Q consuming version 12.0
  2.  A bug is found and a fix is backported, or a new feature is required in 
      production, either of which entails an API change
     *   TO version Q.1 serving API version 12.0 and “unstable”
     *   Component version Q.1 now consumes “unstable”
  3.  Time to upgrade to TO version W
     *   Upgrade TO first
         i.    TO version W serving API version 12.0, 13.0, and “unstable”
         ii.   Component on Q.1, previously consuming “unstable”, now gets a
               very different payload because “unstable” represents not only
               the bugfix that was previously needed but also all unrelated
               API changes
         iii.  Therefore, mid-upgrade, Component Q.1 will be broken until it’s
               also updated to W and leveraging API 13.0 presumably, or W.-1
               and some new payload expectation of “unstable”
     *   Upgrade Component first
         i.    TO version Q.1 serving API version 12.0 and “unstable” as in 2.a
         ii.   Component version W expects TO to provide either version 13.0
               or “unstable” using a definition based on TO version W, which
               is different from what it had been serving in Q.1
         iii.  Therefore, mid-upgrade, Component W will be broken until TO is
               also upgraded to TO version W.
     *   Downgrade Component first
         i.    This doesn’t work because Component version Q has a known issue
               or lacks a required feature, with enough impact to justify the
               work to create and upgrade to Q.1, which requires TO serving
               API “unstable” as the fix wasn’t in API 12.0.

Whichever one you upgrade first, you can’t safely move forward without 
stopping both TO and Component at the same time.  Depending on what Component 
is, that may not be possible without creating a customer impact, and it doesn’t 
matter whether it’s an officially supported ATC component or a 3rd party tool.  
Introducing the idea of an unversioned version here opens the door to a new 
category of problems.
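
To make the scenario above concrete, here is a minimal sketch in Python. It is 
only an illustration under stated assumptions: the release names Q, Q.1, and W 
come from the steps above, but the payload "shape" labels are invented, and I 
am assuming Component W requests 13.0 rather than “unstable”.

```python
# Hypothetical model of the upgrade scenario. Each TO release maps the API
# version names it serves to a label for the payload shape it returns.
# The shape labels are made up for illustration.
TO_RELEASES = {
    "Q":   {"12.0": "shape-12", "unstable": "shape-12"},
    "Q.1": {"12.0": "shape-12", "unstable": "shape-12+fix"},
    "W":   {"12.0": "shape-12", "13.0": "shape-13", "unstable": "shape-13+new"},
}

# For each Component release: the API version it requests and the payload
# shape it was built to parse. Component W requesting 13.0 is an assumption.
COMPONENT_RELEASES = {
    "Q":   ("12.0", "shape-12"),
    "Q.1": ("unstable", "shape-12+fix"),
    "W":   ("13.0", "shape-13"),
}


def compatible(to_release, component_release):
    """Return True if TO serves the requested version with the expected shape."""
    api_version, expected_shape = COMPONENT_RELEASES[component_release]
    served_shape = TO_RELEASES[to_release].get(api_version)  # None if not served
    return served_shape == expected_shape


# The states an in-place upgrade from Q.1 to W passes through.
for to_release, component_release in [
    ("Q.1", "Q.1"),  # state after step 2
    ("W", "Q.1"),    # upgrade TO first
    ("Q.1", "W"),    # upgrade Component first
    ("W", "W"),      # both upgraded
]:
    state = "ok" if compatible(to_release, component_release) else "BROKEN"
    print(f"TO {to_release:<3} + Component {component_release:<3}: {state}")
```

Under these assumptions both intermediate states are broken, and only the two 
endpoints work. The model also reports TO Q.1 with Component Q as compatible 
at the API level; the downgrade path fails for the separate reason given above, 
namely that Component Q lacks the fix.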

Jonathan G


From: Robert O Butts <r...@apache.org>
Date: Tuesday, September 7, 2021 at 9:58 AM
To: dev@trafficcontrol.apache.org <dev@trafficcontrol.apache.org>
Subject: Re: [EXTERNAL] Proposal: stable vs unstable TO API versions
I'm concerned that using this "unstable" version makes it impossible to
upgrade in-place.

If a client (cache config, Traffic Monitor, random ops scripts, etc.) uses
it and a breaking change is made, then upgrading Traffic Ops first breaks
all clients, and upgrading clients first means they'll try to talk to TO
and get 200s, but the data will be malformed.
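
As a rough sketch of the "200 but malformed" case in Python: the field names
and payloads below are invented for illustration and are not taken from the
real TO API. The client is assumed to read optional fields leniently, which
is what lets the change slip through without an error.

```python
import json

# Hypothetical "unstable" payloads before and after a breaking change.
# The field names (hostName, status, statusId) are made up for this example.
OLD_UNSTABLE_BODY = json.dumps(
    {"response": [{"hostName": "edge-01", "status": "REPORTED"}]}
)
NEW_UNSTABLE_BODY = json.dumps(
    {"response": [{"hostName": "edge-01", "statusId": 3}]}
)


def read_statuses(http_status, body):
    """Parse a server list the way a client built against the old shape would."""
    if http_status != 200:
        raise RuntimeError(f"unexpected HTTP status {http_status}")
    servers = json.loads(body)["response"]
    # The client only knows the old field name; a missing field becomes None.
    return {server["hostName"]: server.get("status") for server in servers}


print(read_statuses(200, OLD_UNSTABLE_BODY))
print(read_statuses(200, NEW_UNSTABLE_BODY))
```

Both calls pass the status-code check, but the second one yields a server with
no status at all, and nothing in the HTTP exchange signals that anything
changed.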

You could theoretically downgrade all clients to the previous version,
starting from the most-downstream, and then upgrade. But if a production
CDN is using a new feature, the CDN will almost certainly have things
relying on it that will break, either CDN operations or clients using new
features.

Worse, this isn't obvious, which makes it a pretty big footgun if ATC
operators use the "beta" API in their production CDN without realizing
they just made it impossible to upgrade.

On the other hand, I'm not seeing the big development savings. What
features have we added to the API in the past, and then one version later
decided we did it wrong and wanted to make a breaking change? Since using
the "unstable" version makes it impossible to upgrade,
this means all production CDNs will have to wait 2 major versions for new
features. Underlying data changes that require two major versions to add
(like Layered Profiles) are pretty rare; this means for every small,
compatible change, users will have to wait two major versions to use a new
feature in production. That seems like a pretty high cost.


On Tue, Aug 31, 2021 at 10:27 AM Rawlin Peters <raw...@apache.org> wrote:

> For your 1st reason, that all hinges on whether or not the software
> needs to use the unstable version of the API. That is why you also
> have the choice to stay on the stable version and not have to worry
> about coordinating upgrades. Mind you, upgrades would only need to be
> coordinated in the cases where a component actually uses one of the
> broken APIs in the unstable version. We can easily keep track of
> breaking changes in the changelog in order to call out certain
> upgrades that would need to be coordinated (for any components that
> use the unstable API). Just because that process might be more
> error-prone than keeping the latest API version stable doesn't mean we
> shouldn't do it. It's a small risk that has a huge reward in time
> saved by not having to deal with so many API upgrades.
>
> I think your 2nd reason is actually supporting this proposal:
>
> > The removal of the 1.x API is showing how expensive it truly is to
> > safely remove API versions, and that’s something to be weighed in addition
> > to maintenance cost to the project for those versions.
>
> The 1.x API removal was a prime example of just how much code was able
> to stay on the stable API version until we decided to remove it. With
> this proposal, all of that code would still be able to remain
> unchanged for a longer period of time than without this proposal,
> saving much unnecessary toil. It also reduces the maintenance cost of
> prior versions because, by creating fewer new major versions, we will
> have fewer of them to support over time.
>
> > I think the million-dollar question revolves more around how much/far
> > back we are willing to support. If it’s only one release at a time, that’s
> > going to drive those 3rd party code maintenance costs up significantly
> > higher as part of just doing business which will slow down deployments even
> > if releases are moving faster.
>
> I don't think so, because we'd be creating fewer major versions to
> remove in the first place, so we wouldn't have to worry about
> upgrading 3rd party code that stays on the stable API version. From
> the lessons learned with the API 1.x removal, the vast majority of 3rd
> party code stays on the stable API version until that version is
> getting removed. So we would be releasing faster *and* deploying
> faster.
>
> For your 3rd reason, developers working on the same route generally
> have to coordinate changes in some way, and we are usually very
> good about that. That is how it's always been done and will continue
> to be done, unaffected by this proposal. It's not really the release
> manager's responsibility to figure out what has been broken and what
> upgrades need to be coordinated. That is a collective responsibility
> of all ATC developers when making breaking changes. Breaking changes
> should be called out in the changelog, along with any prescribed
> upgrade orders. If this proposal is accepted, I think we should give
> these types of changes their own specific section in the changelog.
>
> For your 4th reason, I don't think we've ever decided to merge
> something that was half-baked just to avoid API versioning issues. A
> PR is already a feature branch and can remain open until ready to
> merge. The problem this proposal solves is when a developer starts
> developing a feature towards e.g. API 4.0, but we just cut a release
> and are now on API 5.0, so that developer then needs to *rework* their
> PR to now target API 5.0. Unnecessary rework decreases productivity
> and makes the feature take longer to get to production and produce
> value for us. This proposal basically extends the runway, so that we
> don't have to make the decision to delay the release if the feature is
> nearly complete in order to avoid that unnecessary rework. We can
> simply cut the release on time and have the new feature land in the
> subsequent release (with no unnecessary rework for the developer).
> Additionally, it is always somewhat disappointing when we have to
> *wait* to start developing a new feature because a release is about to
> be cut in order to avoid unnecessary rework caused by API versioning.
> This proposal would allow that work to start at any point in time
> without adding any unnecessary rework.
>
> For your last point, I know you keep linking to Rob's
> https://github.com/rob05c/apiver library whenever conversations
> related to API versioning come up, but this proposal is mainly
> concerned with major version changes, for which that library was not
> made. Also, I'm not really sure how Elixir would help solve this
> problem.
>
> - Rawlin
>
