Replies below:

On Tue, Sep 7, 2021 at 9:37 AM Dave Neuman <neu...@apache.org> wrote:
> I do think that we will have to probably put some thought into when we
> determine an API is "stable" and what that process looks like. It is a
> little uncomfortable to just leave that as a gut feel type thing, but I
> understand that it is also very hard to put more rules/processes around
> something that is pretty subjective.
I think if our general guideline is to only make breaking changes when absolutely necessary for a new feature being added (i.e. for some reason we can't just add a new optional field with a default, or adding new routes that tie into existing routes would make the API too unwieldy), then we should just look at what we have planned on our roadmaps for the next 6-12 months or so. If there is anything that sticks out as needing a breaking API change, then perhaps we hold off on stabilizing until we get that breaking change into the unstable API. Or, if the API version has already been unstable for a certain amount of time, perhaps we would stabilize it even if we have breaking changes on the roadmap.

On Tue, Sep 7, 2021 at 9:58 AM Robert O Butts <r...@apache.org> wrote:
>
> I'm concerned that using this "unstable" version makes it impossible to
> upgrade in-place.
>
> Because if a client (cache config, Traffic Monitor, random ops scripts,
> etc) uses it, and a breaking change is made, if you upgrade Traffic Ops
> first you'll break all clients, and if you upgrade clients first, they'll
> try to talk to TO and get 200's but the data will be malformed.

I understand your concern about upgrading, but in reality it's still possible to upgrade components that use the unstable API version. It will just require more coordination than upgrading components that use the stable API. Plus, keep in mind, it's not like every single breaking change to the unstable API automatically breaks every client of the unstable API. Only clients using the particular route(s) being broken in the unstable API would require coordination to upgrade.

> Worse, it seems like this isn't obvious. Which makes it a pretty big
> footgun, if ATC operators use the "beta" API in their production CDN
> without realizing they just made it impossible to upgrade.
If we declare a certain API version unstable, ATC operators should understand the risks of using it, just like there are risks involved in using the API in general. Using the API to make changes is generally a last-resort option when making the same changes in the UI would take much longer. Using the UI is generally the much safer option since it has a lot more built-in safeties (confirmations, form validation, etc) than the API, but in the case where ATC operators absolutely need the new features in the unstable API and can't use the UI instead, they will have to take that risk.

> On the other hand, I'm not seeing the big development savings.

https://github.com/apache/trafficcontrol/pull/6145 -- avoiding 60,000 lines of code just to add a new major TO API version is a pretty big savings, and that is not even counting all of the "if version == x" conditionals that have to clutter the code to handle multiple API versions. The fewer version-specific conditionals we have to deal with in the code, the easier it is to develop and the less bug-prone it is.

> Since using it makes it impossible to upgrade,
> this means all production CDNs will have to wait 2 major versions for new
> features.

Again, this is a false statement. CDNs will have access to unstable features via the API immediately upon release, and if certain components need new changes in the unstable API, their upgrades may need to be coordinated with the TO upgrade.

Since `t3c` uses a large percentage of the API currently and will most likely need to use the unstable API, most of its upgrade concerns will be alleviated by the addition of Cache Config Snapshots. The Cache Config Snapshots API will generally be stable in that the JSON snapshot will only have fields added in a backwards-compatible manner. We should never make a breaking change to a snapshot, and in general we never really have (at least for the CRConfig snapshot that I know of).
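To make the "fields added in a backwards-compatible manner" point concrete, here is a minimal Python sketch. It is not ATC code: the `ServerV1`/`ServerV2` types, the `decode` helper, and the `protocol` field are all hypothetical and only illustrate the idea that an older client can ignore a field it doesn't know about, and a newer client can fall back to a default when the field is missing.

```python
import json
from dataclasses import dataclass, fields

# Hypothetical snapshot payloads; the field names are illustrative only.
OLD_SNAPSHOT = '{"name": "edge-01", "port": 80}'
NEW_SNAPSHOT = '{"name": "edge-01", "port": 80, "protocol": "https"}'


@dataclass
class ServerV1:
    """Shape known to an older client."""
    name: str
    port: int


@dataclass
class ServerV2:
    """Shape known to a newer client; the added field is optional."""
    name: str
    port: int
    protocol: str = "http"  # default used when the field is absent


def decode(cls, raw):
    """Decode JSON into cls, dropping any fields cls does not declare."""
    data = json.loads(raw)
    known = {f.name for f in fields(cls)}
    return cls(**{k: v for k, v in data.items() if k in known})


# An older client reading newer data simply ignores the added field.
old_client_new_data = decode(ServerV1, NEW_SNAPSHOT)

# A newer client reading older data falls back to the default.
new_client_old_data = decode(ServerV2, OLD_SNAPSHOT)
```

Neither direction requires a coordinated upgrade, which is the property that removing or renaming a field would break.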
So with Cache Config Snapshots, `t3c` will always have access to new features right away and won't have to use the unstable API. Hopefully that alleviates some of your upgrade concerns with respect to `t3c`. Most other ATC components use a much smaller percentage of the API and generally don't need to use the latest API version.

- Rawlin