Re: [EXTERNAL] Re: Traffic Ops API versioning issues

Hank Beatty Thu, 14 Mar 2019 05:23:15 -0700

Hello Rawlin,

I still do not completely understand the sticking point on having theAPI version match the TC version. Especially now that TO is basicallythe API. However, what you describe below sounds fine with me.


Thank you for taking the time to go over this.

Regards,
Hank

On 3/13/19 3:21 PM, Rawlin Peters wrote:

You would be free to keep your scripts globally pointed at /api/1/*
until we have /api/2/* for all endpoints. We would have at least one
major release where both v1 and v2 would exist, then we'd remove the
v1 API entirely in the release after that. So you'd be able to keep
your scripts/clients/etc globally pointed at /api/1/*, then fully
rework them to use /api/2/* globally before upgrading to the release
where v1 has been removed. However, someone would still be able to use
a mix of v1 and v2 endpoints in the same script/client/etc if they so
desired.

In other words, we could keep _all_ v1 endpoints around until they
have all been deprecated by corresponding v2 endpoints. Would that be
sufficient?

Also, a perk of keeping the TO API separate from the TC version is
that scripts would work for as long as possible, rather than at most 2
major TC releases if the API version were to match the TC version. It
might take us any number of major releases to get a full v2 API in
place to where all v1 endpoints are deprecated, during which time all
v1 clients should continue to work.

- Rawlin

On Wed, Mar 13, 2019 at 11:32 AM Hank Beatty <hbea...@apache.org> wrote:


Hello Rawlin,

It is complicated and confusing to me when I write scripts that have to
use multiple versions of the same API in a major TC release.

Regards,
Hank

On 3/13/19 1:01 PM, Rawlin Peters wrote:

Hey Hank,

Thanks for reading the email. I think our TO API versioning strategy
will have many ramifications in terms of development, stability, and
usability, so it's important to start heading in the right direction.
My worry about tying the TO API version to the major version of TC is
that we should be able to release major versions of TC without having
to also rev the TO API version. We are not likely to deprecate an
entire major API version in the span of one major release, so I don't
think it would be wise to rev the TO API version with every major TC
release.

A much more likely scenario is that we will implement a v2 for a
specific endpoint (or set of related endpoints, like the delivery
service sslkeys endpoints for example), deprecating the v1 of those
endpoints at that time. The TC release would then support both v1 and
v2 at the same time, giving everyone some time to test out and migrate
to the v2 version of endpoints. The following major release would
remove the v1 endpoints that have been replaced with a v2.

Basically, unless we make a breaking change to an endpoint, I don't
think there's ever a good reason to rev an endpoint's major version.
Also, I don't want us to have the major API version be global. That
is, we shouldn't have to reimplement all of v1 with a v2 in the span
of a major release. We should be allowed to iterate over API v1
endpoints one by one to replace them with v2 endpoints. That might
mean having to specify two different major versions in
scripts/clients/etc for different endpoints, but doesn't seem all that
unreasonable to me.

- Rawlin




On Wed, Mar 13, 2019 at 9:14 AM Hank Beatty <hbea...@apache.org> wrote:


Hello Rawlin,

Wow! I know that you are obviously really passionate about this because
that was a really long email. :)

Here are my thoughts on API versioning:

1. The URL for the API should follow the TC major version (e.g.
http://.../api/v3/...)
2. TC 3.x.x would support v2 (with a warning message that it is
obsolete) and v3 of the API.
3. TC 4.x.x would support v3 (with a warning message that it is
obsolete) and v4 of the API.
4. A user of the API should not have to call v1.2, v1.3, and v1.4 in the
same script. This would be taken care of with 1, 2, and 3 above.
5. In the documentation, define "backward compatible" as new fields will
be introduced within a major version but, will always be optional for
both GET and POST.
6. If a user really wants to know the specific version (x.x.x) of the
API there should be an API route (e.g. https://.../api/v3/version).

Yes, I know that #2 is not how it is today.

Traffic Control does not do enough minor and bug fix releases that this
should cause that big of an impact.

Regards,
Hank

On 3/11/19 5:55 PM, Rawlin Peters wrote:

Hey Hank,

I don't think we are planning on releasing TO separately from the rest
of TC, nor do I think we should. The TO API version should not be the
same as the overarching TC version, because that would make it unclear
whether or not an old script/client/whatever is expected to continue
to function properly with a new version of TC.

The TO API version should provide some level of confidence that
scripts/clients/etc that have been coded against a particular API
major version will always work against that particular API major
version. Historically, we have inadvertently broken this "promise" due
to making backwards-incompatible changes to the 1.x API, but I believe
we are getting better at avoiding those breaking changes where its
reasonable.

For example, let's say you were working with TC 3.0 and wrote a script
against TO API V1 that GETs a bunch of servers, makes some changes,
then PUTs those changes back. Then you upgrade to TC 4.0 later. Should
your script continue to work as-is? As long as V1 of the API is still
supported in TC 4.0, you should have some level of confidence that
your script will still work with TC 4.0.

If the TO API was left un-versioned, every time you upgrade TC you
will have no real idea whether or not your old script is still going
to work. Some day, the script _won't_ work due to
backwards-incompatible changes made to the API. However, in a
major-versioned world, this would be in API V2, and your script would
stop working once API V1 is no longer supported, which hopefully would
not be a surprise since we'd have to follow a schedule like this:

- TC 3.0 released (supports TO API v1)
- TC 4.0 released 6 months later (supports TO API v1 _and_ v2, but v1
is now "obsolete")
At this point, you should convert any scripts/clients/whatever that
have been written against API v1 to work against the new API v2.
- TC 5.0 released 6 months later (now only supports TO API v2 --
requests made to the v1 API will fail immediately saying the v1 API is
no longer supported)
At this point, you should not upgrade TO to 5.0 until you are
reasonably confident that you've gotten all your old v1
scripts/clients/etc to work against TO API v2, which you've had 6
months notice to complete already.

If we left the TO API un-versioned or simply matched it to the TC
version, I think we'd fall into the trap of constantly making breaking
API changes with no notice, which is bad for anything that uses the
API. It would make upgrading major TC versions a nightmare since you'd
never really know if some script/client/etc was going to break after
the upgrade is completed. Using a major-versioned approach for the TO
API will make the upgrade process much smoother in the long run and
allow us to introduce breaking-API improvements without completely
blindsiding users.

This approach would also help in-repo clients like Traffic Portal.
Today, Traffic Portal seems to be pretty much all-in on a single API
version. When we implement a TO API v2, we wouldn't immediately also
have to go fix Traffic Portal to work with the TO API v2 version. This
means we can work on the API and UI separately, leading to more
focuses PRs and therefore a quicker PR review process, which is better
for contributors and reviewers.

On a related note, the TO API major version doesn't have to be an "all
or nothing" kind of thing, so we shouldn't make our clients be
"all-in" on a single API major version. That is, we shouldn't have a
single global constant in the client that controls the API version it
requests across all endpoints. I imagine we will introduce v2
endpoints on an endpoint-by-endpoint basis, deprecating v1 endpoints
as we go, so clients should be designed to support this kind of mixed
versioning eventually.

- Rawlin

On Mon, Mar 11, 2019 at 1:17 PM Hank Beatty <hbea...@apache.org> wrote:


Hello Rawlin,

What is your reasoning to keep the API version separate from the TC version?

The only reason that I can think of is that the Traffic Control Project
team would plan on releasing an API (TO) version separately from the
rest of the components. Is this something that we are planning on doing
in the future? From what I know we have not done that up to this point.

Regards,
Hank

On 2/14/19 12:53 PM, Rawlin Peters wrote:

There is a lot of stuff to digest here, and I agree that the v1 API is
desperately in need of a good v2. However, I'm not sure I agree that
the API should be versioned alongside Traffic Ops. The API version
should be separate from the Traffic Ops/Traffic Control release
versions. We should do our best to not break a major API version, so
in theory a script written against TO API v1 should always work with
TO API v1. When TO API v2 is introduced, the script written against v1
would not be guaranteed to work against v2 (although, we should still
support v1 for some time before deprecating in favor of v2).

- Rawlin

On Thu, Feb 14, 2019 at 8:07 AM Fieck, Brennan
<brennan_fi...@comcast.com> wrote:


I'd take it further, though in my opinion it shouldn't be done until we're 
ready to do away with Perl entirely.
I've been working on this for a little bit, and I sorta wanted to make a wiki 
page for a proposed API v2 design before sending it, but what the heck. The 
conversation is here now.


I don't like the way our API is versioned. Let me be clear, it _should_ be 
versioned, and the
versions _should_ be semantic, but it is my opinion that the API should be 
versioned alongside
Traffic Ops rather than at its own pace. I've written essentially an article on 
why this should be
the case, and it follows:

** TIMING **
The first issue I should address is timing - this needn't happen immediately. 
For the time being, I
think a radical change would be much more harmful than continuing along our 
current versioning
scheme. This change should be implemented with the advent of what we currently refer to 
as "API v2".
Presumably, at this time all of the Perl code has been removed and we are 
looking to re-tool the API
to be more sensible and standards-conforming. It's a time when we'll already be 
making big, breaking
changes. I personally would love for this to be ATC 4.0, but that may not be a 
realistic goal.

** REDUNDANCY **
The easiest to see - albiet simultaneously most trivial - issue with the 
current API versioning
scheme is how every request path must start with '/api/1.x/'. This is annoying, 
to be frank;
especially when one considers that the current plan for Traffic Ops is to 
reduce it entirely to an
API, so literally every endpoint will have '/api/' in common. We _know_ the 
endpoint we are trying
to query is an API endpoint because _all_ endpoints are API endpoints. When we reach 
"API v2" the
'/api/' part of request paths will have lost all value entirely.
Even with that gone we are left with '/1.' (or '/2.' as the case may become) as 
a common prefix,
again annoying although not totally useless in this case. However, the vast 
majority of API
endpoints see no changes between minor versions, so really '/1.x' just becomes 
a static, constant
prefix where 'x' is the latest version of the API.
In any case, versioning the API alongside Traffic Ops solves this problem 
because our Traffic Ops
server(s) emit HTTP headers that name the server. Once Perl is gone, we'll be 
free to use the HTTP
`Server:` header to name the server e.g. `Server: Traffic Ops/3.2.1`. At this 
point, we could
either implement an `OPTIONS` method request at the server's root that would 
just return some
headers - including `Server:` or just let people to a `GET` (or better yet 
`HEAD`) request to
`/ping` and pull the server version out of the headers. The client then has all 
the information it
needs to communicate effectively with the server. The alternative to this 
within our current
versioning schema is to implement an unversioned API endpoint such that we have 
a hundred `/api/1.x`
endpoints and one that has some other - or possibly no - prefix, or have a 
versioned API version API
endpoint, which is confusing even just to say.

** TRAFFIC OPS _IS_ THE API **
As mentioned previously, the endgame of our transition from Perl to Go is that 
Traffic Portal is the
only UI for ATC - and in fact that's already somewhat true. That leaves Traffic 
Ops as a database
and a REST API for interacting with said database. In fact, the database is 
often referred to as its
own service: "ToDb", "Traffic Ops DB" etc. That means that we have the single 
most important Traffic
Control component split in two - half of its functionality is versioned sanely 
with Traffic Control
while the other half is versioned separately from anything else in the world. 
That's crazy, because
if you have a program that only does two things, then surely a breaking change 
to one of those
things means increasing its major version? If that's the case, then why are we 
not versioning the
API alongside Traffic Ops?
It may be argued (incorrectly, in my opinion) that Traffic Ops does more than 
serve an API to
interact with a database. It generates configuration files and system images, 
it combines data and
does heavy processing on it. But really those things shouldn't be taken into 
account in a versioning
scheme except insofar as they affect the experience of some user, 
administrator, or application
interfacing with Traffic Ops. If the API responses don't change in form or 
content, and if the
process of setting up or maintaining the application haven't changed, then any 
code changes you've
made are a patch, not a version change. Traffic Ops does big things, but at the 
end of the day it
all just boils down to API inputs and outputs as far as anything and anyone 
else is concerned.

** CONFUSION **
We currently live in a world where I can run a script using the Traffic Ops API 
that works perfectly
fine against Traffic Ops version 3.2.1, but then if I again test it against 
version 3.2.1 at some
point in the future it breaks because breaking changes were made in the API. It's 
easy to say, "oh,
that just means that when we make breaking API changes we should increment the 
version
appropriately," but realize that this _is versioning the API alongside Traffic 
Ops_. If we're not
doing that, we're saying there is unpredictability with the behavior of our 
system within releases,
and if we _are_ doing that then the only difference between the API version and 
the Traffic Ops
version is that the API version is confusingly behind by about 2 major 
revisions. It should just be
the same for simplicity's sake.

** THE API "PROMISE" **
The single most common argument I hear in favor of our current API versioning scheme 
is "well we've
said an API version 1.x behaves in this way, and so we must uphold that promise to 
the user base".
Not only do we routinely break that promise already,
(e.g. PR #3110 [https://github.com/apache/trafficcontrol/pull/3110]) but I'm 
certainly not
suggesting that we don't uphold this promise. Yes, this does mean that making 
breaking API changes
in TO would require a new major release and adding features/fields to the API 
would require a new
minor release. I don't see that as a big deal, especially if implemented at the 
time I'm suggesting
when we'd be re-designing the API - and it does sorely need to be redesigned.

* The API Needs to be Redesigned *
I'm going to go down this rabbit hole for a second, if you're already convinced 
the TO API needs a
re-design then feel free to skip this section. I'm not going to touch on any 
problems caused in the
API as currently implemented by the use of a standalone API version - that's 
what the entire article
is for.
Currently, our API currently has three huge problems:

1. Rampant oversaturation of endpoints
            We have a systemic issue of re-implementing behaviour in multiple 
endpoints. This is due in part
            to a lack of good documentation - so developers aren't aware of the 
endpoints available to them
            - and partly because of the "Mojolicious Mindset" that plagues our 
oldest endpoints. The
            "Mojolicious Mindset" refers to the use of URL path fragments as 
request parameters, e.g.
            '/users/{{ID}}' instead of/in addition to '/users?id={{ID}}'. From 
the perspective of someone
            who is just writing the back-end for these endpoints, there's no 
clear advantage to one over the
            other except that the former seems to more clearly reflect the 
intent of requesting a specific
            object whereas the latter could be seen as more of a "filter" on a 
larger collection. That's not
            incorrect, necessarily, but the two are totally separate request 
paths, so having '/users' and
            '/users/{{ID}}' means documenting two endpoints instead of one, and 
it means two lines in the
            route definitions, and it means two seperate handlers instead of 
one (albiet a more complex
            one).
            Consider also that we have all of the following endpoints for 
manipulating Cache Groups:

            * /api/1.x/cachegroup/{{parameter ID}}/parameter
            * /api/1.x/cachegroup_fallbacks
            * /api/1.x/cachegroupparameters
            * /api/1.x/cachegroupparameters/{{ID}}/{{parameter ID}}
            * /api/1.x/cachegroups
            * /api/1.x/cachegroups/{{ID}}
            * /api/1.x/cachegroups/{{ID}}/deliveryservices
            * /api/1.x/cachegroups/{{ID}}/parameters
            * /api/1.x/cachegroups/{{ID}}/queue_updates
            * /api/1.x/cachegroups/{{ID}}/unassigned_parameters
            * /api/1.x/cachegroups/{{parameter ID}}/parameter_available
            * /api/1.x/cachegroups_trimmed

        These could all be collapsed neatly into one or two endpoints, but were 
instead implemented
        separately for whatever reason(s)
        (see Issue #2934 [https://github.com/apache/trafficcontrol/issues/2934] 
for details).

2. Improper/Non-existent standards conformity
            We have requests that should be read-only that make server-side 
state changes (Issue #3054
            [https://github.com/apache/trafficcontrol/issues/3054]), we have 
endpoints returning success
            responses on failure (Issue #3003 
[https://github.com/apache/trafficcontrol/issues/3003]) and
            our "CRUDER" has failed us. `PUT` should be used to _create_ 
objects (or possibly update them by
            creating an alternative representation with the same identifiying 
information) but is instead
            used as the primary "edit this" method. `POST` is for processing 
entities in a data payload, but
            is instead used for object creation. `PATCH` languishes, totally 
unimplemented. These problems
            are systemic and stem partially from the "Mojolicious Mindset" 
whereby new functionality is
            introduced into the API by first considering what request method is 
appropriate and then
            deciding on a request path that names the operation being done. 
Request methods are meant to be
            the different ways in which a client interacts with a resource on 
the server, and thus the
            resources themselves should be considered primary. The "CRUDER" 
hampers this mindset, because it
            makes treating payloads and query parameters generic and isn't 
receptive to injection of new
            behaviour.
            The "CRUDER" is a great innovation, to be sure, as it saves us 
quite a bit of time and prevents
            duplicating code, but the problem it solves is an emergent property 
of of API problem #1 above.
            With fewer endpoints we'd have much more specific handling, and the 
need for and advantages of
            the "CRUDER" will vanish.

3. Needing multiple queries to obtain a single piece of information
            This issue is pretty deeply rooted, and is related to the way our 
database is structured. But
            that's part of the problem - an API needn't replicate the database, 
and is therefore free from
            some of the constraints that bind a database.
            The way things are now, in order to e.g. create a server definition 
I must make several
            preliminary requests to determine the integral, unique, 
non-deterministic identifiers of other
            objects. I think we all agree that ideally these integral IDs would 
have no part in the output
            of the API, and many people I've spoken to would support wiping 
them from the database entirely
            given enough time and manpower.

________________________________________
From: Robert Butts <r...@apache.org>
Sent: Wednesday, February 13, 2019 6:11 PM
To: dev@trafficcontrol.apache.org
Subject: Re: [EXTERNAL] Re: Traffic Ops API versioning issues

It doesn't seem like the idea of full TO API SemVer was ever fully

discussed and voted on

Since it seems like we never truly committed to SemVer with minor versions

for the TO API

There was consensus:
https://lists.apache.org/thread.html/8f8a850c68424021a0fe06967894383a24e463f1b0cee4d652d04590@%3Cdev.trafficcontrol.apache.org%3E
https://lists.apache.org/thread.html/1a42a2192a81fc4d76639ccd10761b6b73c31345a63715bb8aa86e4e@%3Cdev.trafficcontrol.apache.org%3E

We didn't do an official [VOTE], but we rarely do that as a project, unless
there's difficulty reaching unofficial consensus. That said, there's
nothing stopping another discussion or vote, if we want to change things.

as long as we set the expectations for the user


That's precisely my point. We can always say "we set the expectations, it's
the user's fault for misreading, or misunderstanding, or not noticing a
changelog." It's not about fault, it's about providing a better user
experience.

that support has to be baked into traffic_ops_golang such that it's easy

and maintainable to support tons of minor versions. If we keep heading down
our current path we are going to be left with a giant mess.

I don't think anyone disagrees. Traffic Ops versioning has had considerable
technical debt for some time now. We just haven't been willing to spend the
time to fix it. I agree, that needs to change.

I'd also like to note, there's another option: code generation. We can keep
the structs and handlers largely the same, and generate the "duplicate"
code when a new minor version is added. There's a strong argument that this
is the "idomatic" way to solve this problem, for the Go language. I
personally feel Reflection is better here. But, I wanted everyone to be
aware that's an option, if the consensus is that it's the best option. In a
nutshell, we could generate code with a script (with a spectrum of
precision and complexity, from `cp && sed` to parsing the AST), and
`routes.go` would have a `//go:generate` header. For examples, see:

https://blog.golang.org/generate
https://godoc.org/golang.org/x/tools/cmd/stringer

I did some prototyping of that approach, if anyone is interested, just let
me know and I can provide examples of what that would look like.


On Wed, Feb 13, 2019 at 5:09 PM Rawlin Peters <rawlin.pet...@gmail.com>
wrote:

On Wed, Feb 13, 2019 at 1:20 PM Gray, Jonathan
<jonathan_g...@comcast.com> wrote:


I'm +1 on keeping full API SemVer.

On 2/13/19, 12:16 PM, "Robert Butts" <r...@apache.org> wrote:

        We would be abandoning Semantic Versioning.


This is why I wanted to open this up for broader discussion within the
community. It doesn't seem like the idea of full TO API SemVer was
ever fully discussed and voted on (at least not to my knowledge),
which is why I haven't been abiding by it myself or enforcing it upon
other TO API devs either. If we're going to truly commit to SemVer
with minor versions for TO API then we should fully understand the
cost vs utility of doing so. Also, SemVer doesn't have to be an "all
or nothing" thing, as we currently choose to ignore the "patch"
version for the TO API. We could also choose to ignore the minor
version and just focus on the most important aspect of SemVer which is
not introducing backwards-incompatible changes in the same major
version.

Since it seems like we never truly committed to SemVer with minor
versions for the TO API, traffic_ops_golang wasn't designed to easily
support minor versions. So if we're going to truly commit to SemVer
with minor versions for TO API, then that support has to be baked into
traffic_ops_golang such that it's easy and maintainable to support
tons of minor versions. If we keep heading down our current path we
are going to be left with a giant mess.

The only way I can get behind supporting the "minor version promise"
is if we have a way to basically just tag fields as introduced in a
specific API version, with just a single struct per resource and a
single handler per major version of an endpoint. We should only
require a single implementation of an endpoint per major version. If a
custom JSON parser allows us to do that, then that's great. Without
that, I don't think supporting the "minor version promise" is even a
viable option.

A lot of those scenarios around user confusion due to supporting just
a major version are not really an issue as long as we set the
expectations for the user. I.e.:
- clients should request v1
- as the v1 API is enhanced in a backwards-compatible manner, clients
will begin to see new fields in the server responses
- if a client wants to modify a resource, they will typically do a GET
on the resource, modify the resource, then do a PUT back to the
server.
- in the GET response, if the client sees new fields, they can expect
to modify those fields and see it reflected on the server.
- if the client does NOT see new fields, then they cannot add fields
and expect to see that reflected on the server
- null values in optional fields will be interpreted by the server as
whatever the default for that optional field is. So if the default for
new optional field "foo" is 5, then a client sending `"foo": null`
will be interpreted by the server as `"foo": 5`, and `"foo": null`
should never have a different meaning than `"foo": 5` on the server.

The advantage of just supporting the "major version promise":
- jives better with Go's lack of metaprogramming
- no custom JSON parser in Go required
- less general overhead in development
- only have to worry about breaking changes
- no worrying about which fields belong to which minor versions
- no extra API testing to make sure fields introduced in API v1.N
aren't returned to v1.N-1 clients
- don't have to update every single client in the repo with the new
minor version every time a minor version is incremented (since clients
would just specify the major version)

So, as a community, we need to weigh these options and decide whether
or not we want to take the "major version only" route or the "major
plus minor version" route. Personally, I prefer the "major version
only" route because it means less code, less overhead, less
coordination, and less things to potentially go wrong. However, if a
relatively small custom JSON parser is all we really need to
reasonably support the "minor version promise", then I can't say I'm
completely against that route either (just that I wouldn't prefer it).

- Rawlin

Re: [EXTERNAL] Re: Traffic Ops API versioning issues

Reply via email to