Slow down a little bit. Do people believe things in my summary are untrue/inaccurate (it's ok if that's the case)?
Jonathan G

On 4/19/19, 3:42 PM, "Jeremy Mitchell" <mitchell...@gmail.com> wrote:

assuming that everyone is ok with ditching minor version support for the api (not really sure if rob is yet), what about this release plan?

TC 3.0 / supports api 1.3, 1.2, 1.1 (this is already out the door)
TC 4.0 / supports api 1.4, 1.3, 1.2, 1.1 (cut this branch asap)
TC 5.0 / supports api v2, treat all calls to 1.x as 1.4 (this will be master after the 4.0 branch is cut)

note: this leaves the existing data loss (https://github.com/apache/trafficcontrol/issues/3497) problem in 3.0 and 4.0 that needs to be addressed.

On Fri, Apr 19, 2019 at 3:39 PM Gray, Jonathan <jonathan_g...@comcast.com> wrote:

> Also to be sure I'm tracking properly the complex subject here:
>
> 1. Today there are two promises:
> 1a. One major rev back at the project level for all code/api/etc
> 1b. API versions independently (but probably not stable enough on whatever the latest is due to self-discipline/definition)
>
> 2. Primary issues around API stability are the addition and removal of routes/fields
> 2a. This relates to the discussion around 1b
> 2b. A significant portion of this could be mitigated through the support of a PATCH operation as opposed to passing full objects around which already has inherent race conditions in a multi-user system.
> 2c. This is where maintenance costs are potentially very high, but might be mitigated somewhat through rob's apiver library.
>
> 3. Secondary issues are around strong versus weak typing
> 3a. This is the primary source of breakage lately and is the result of conversions from perl to go
> 3b. This breaks not only promise 1b, but also promise 1a.
> 3c. The solution to this is simply to be more diligent in the use of existing or new datatypes until such time that a new major revision of either 1a or 1b is made. Example use case issue #3304.
>
> 4. Better support for discoverability and compatibility by and between API components/consumers
> 4a.
At the moment there is no way as a client to know which API versions are supported by an arbitrary TO instance. Documented in issue #2872
> 4b. This leads to failure scenarios wherein TO isn't upgraded first or newer clients exist. I stopped counting when 5 different people internally ran into this with the addition of the 1.4 API for us.
>
> 5. There are additional concerns with how we handle 1b with regard to master as opposed to OSS releases and promise 1a
> 5a. There is a lack of formality when it comes to whether component X on master changeset A works with component Y on master changeset B before it lands in OSS Release Q.
> 5b. Presently this entails SME reviewing changelogs of component X & Y.
> 5c. This is the question SemVer helps with by better defining that formality at the API layer.
> 5c.i. It's not a total solution because unversioned payloads/workflows such as the CRConfig can still cause additional issues.
> 5d. This is where today we're relying on our existing monolithic repository and one version to rule them all stance so that in theory we never ask 5a
> 5d.i. This is backed by our current testing procedures. When running TO API tests today it's presumed to use one version inside the CIAB environment.
>
> I think if we can clarify and agree on this, the questions around how to version technically and ATC-based clients versus 3rd-party clients are mostly mitigated.
>
> Jonathan G
>
> On 4/19/19, 1:26 PM, "Rawlin Peters" <rawlin.pet...@gmail.com> wrote:
>
> How about this:
> 1. At the Go struct level, every struct for an API endpoint gets a "Version" field.
> 2. Every time a backwards-compatible field is added to an API endpoint, the Version is incremented for that struct (even if a TC release hasn't been made between increments).
> 3. This Version field is read-only and included in all GET/PUT/POST JSON responses.
> > For the Go TO client, this would allow the client to GET a resource > and check if the resource's version matches the version that the > client is currently using. If the client is just reading certain > fields and only cares to read the fields it already knows about, it > wouldn't have to check the version field at all. If the client wants > to update a resource, it could GET the resource, compare the resource > versions, and make a determination about whether or not the update > would be safe. If the versions match, it can safely update the > resource without risk of data loss. If the versions do NOT match, the > client might choose to error out instead and/or send an email to the > maintainer that it should be recompiled for safety. > > This would allow classes of clients that handle unknown fields > properly (e.g. Traffic Portal, Python+Java TO clients) to continue > working as they do today, only having to worry about the API major > version. > > For clients that might be more susceptible to data loss due to > addition of unknown fields (e.g. Go TO client), they would have enough > information returned to them by the API in order to know if they can > safely make updates to resources. If a Go TO client is only reading > certain fields and not making updates, it would probably never need to > be recompiled for the entire life of the API major version. If a Go TO > client is updating resources, it only needs recompiled as often as the > APIs its actually using have been updated. > > This would allow random, non-standard TO clients to be written and > used for as long as possible before recompilation is required, and > allow the client to build in proper safeguards only where necessary. > > - Rawlin > > On Fri, Apr 19, 2019 at 12:01 PM Robert Butts <r...@apache.org> wrote: > > > > I'm trying really hard to come up with a solution that addresses > everyone's > > major concerns. 
I think we'll have a better product, that everyone > can live > > with, if we all try to think of solutions and are willing to > compromise, > > rather than take hard-line approach and refuse to compromise, and > argue > > until we're all just unhappy, and whatever gets pushed through meets > a few > > people's concerns and nobody else's. I'd definitely appreciate any > help in > > that regard. > > > > Sometimes there really are only two options, A or B. But this > particular > > issue has countless possibilities. We're all smart people, we can > figure > > something out that addresses everyone's needs and concerns. > > > > What about this idea: > > > > Along the lines of @hbeatty 's suggestion, what if we: > > > > 1. Make the API version match the TC version. > > 2. Always release new TC major versions, never do TC minor version > releases. > > 3. Support one major version back, in the API and clients. > > 4. New backward-compatible changes require a TC=API major version > increase. > > 5. OPTIONAL: There should be an API route to get the exact TC > version (e.g. > > https://.../api/v3/version). (This isn't strictly necessary, but > it's > > on @hbeatty > > 's list, and I know it's on @jonathan_gray 's, and it's super-easy > and > > there's no reason not to.) > > > > This: > > 1. Addresses the client version bugs concern: older clients simply > don't > > work because we don't support them, and newer clients will get the > "please > > downgrade" response. > > 2. Addresses the code ease-of-writing concern: We only ever have to > > maintain 1 older version in the API, which will typically only be a > few > > fields on a few endpoints. > > 3. Partially addresses the ease-of-use concern (Ops/@jonathan_gray). > It > > addresses the scripts-breaking-things problem, but it does make user > > scripts have a hard upgrade deadline. 
I see this as the biggest > weakness of > > this idea, and unfortunately I don't see a remedy; if the user-side > people > > are willing to live with that? > > 4. Patch versions are still ok. This doesn't prevent e.g. 4.0.1 when > we > > find a major bug in a release; just adding new things that would be a > > SemVer Minor Version. > > > > Some points: > > 1. Only doing major versions, we'll obviously quickly reach Traffic > Control > > Version 47. I think that's ok. There's precedent for this, Chrome and > > Firefox both do this, Chrome's latest version is 68 and Firefox is > 66. It > > might seem odd, but I don't think there are any big downsides. > > 2. This will make @jonathan_gray 's (/ Ops) life slightly harder, > having to > > upgrade script clients more frequently. But it prevents the data > loss risks > > (which I know everyone here doesn't agree with, but some of us do, > so bear > > with me), and upgrading our maintained clients should be relatively > simple. > > 2.1 As @hbeatty points out, if we release at a 6-month interval, > this would > > mean scripts using old clients would be supported for 1 year. We > could > > optionally support 2 major versions back, if we were willing to live > with a > > little more server work, to support 2-year-old clients. > > > > Just to be clear, I personally don't like making the API version > match the > > TC version, for reasons I won't get into here. I also loathe > Reflection. > > But I can live with those things, if it addresses everyone else's > concerns. > > This proposal isn't perfect; there is no perfect solution that will > fulfill > > everyone's ideal. But, is this something we could live with? If not, > is > > there a way to modify it to address whatever is unacceptable, while > still > > addressing the major concerns others have? Or is this just right out? 
> > > > > > On Fri, Apr 19, 2019 at 10:43 AM Jeff Elsloo <els...@apache.org> > wrote: > > > > > Without actually seeing how that would look across the code base, > the > > > best I can say is maybe. On the surface your proposal seems to > improve > > > the areas I'm concerned about, but we still have this implicit > model > > > where the server is responsible for dealing with older clients that > > > might not submit all data as expected. This implicitly requires us > to > > > handle the absence of that data in future APIs and think about how > any > > > change might impact all client versions across versions of ATC. > > > > > > My concern really amounts to the investment of time required to > think > > > through and implement changes that may affect the myriad of > different > > > client/server version combinations. If we remove that from the > > > equation entirely, we have a much simpler API that has a 1:1 > > > correspondence with the route and function, and only one way to > > > create/update a $thing (i.e.: a delivery service). I think having > only > > > one way to create/update a $thing is a much safer way of doing > > > business than continuing to support multiple versions of clients, > > > regardless of how easy that might be with this proposed approach. > > > Unless I'm missing something, the implementation might be > simplified > > > using this approach but the complexity of solving for the > combination > > > of client versions still exists which makes it harder to do > anything > > > when writing API code. > > > > > > So, it isn't a matter of whether this approach is simple enough > for us > > > to continue with semantic versioning. It's a matter of whether we > want > > > to have to continue to deal with older clients that prevent us from > > > making certain changes in the API because we are afraid of breaking > > > that client. I think that's a lot of burden for our small > development > > > team to shoulder for questionable utility. 
Viewed from another lens, with the semantic versioning approach we are enabling clients to be lazy about updating their _unknown and custom_ client code at the cost of developer productivity and progress on our project.
> > >
> > > I'm not saying that semantic versioning is solely to blame for our lack of progress on our migration to Golang, but it's one more thing that is slowing us down and definitely hasn't helped improve progress.
> > > --
> > > Thanks,
> > > Jeff
> > >
> > > On Thu, Apr 18, 2019 at 3:23 PM Robert Butts <r...@apache.org> wrote:
> > > >
> > > > >This is about simplifying our code in the API
> > > >
> > > > @jeff.elsloo That's what the tag solution I proposed does. The only difference from not versioning is that fields will have a new tag, "NewField *int `json:"newField" db:"new_field" api:"1.5"`", and endpoints will have an extra line, "json := api.NewJSON("1.4")". That's it. That would be the entirety of the API code (or very nearly, Rawlin is right, I haven't implemented it to be 100% sure). The library itself is also tiny, it's ~250 lines of logic in a single file. 500 lines including comments and boilerplate.
> > > >
> > > > How do you feel about that? Would that be simple enough?
> > > >
> > > > On Thu, Apr 18, 2019 at 3:15 PM Fieck, Brennan <brennan_fi...@comcast.com> wrote:
> > > >
> > > > > >If you're deploying the head of master, API minor versioning doesn't really solve that consistent API problem unless we start saying that every single new field added to an API endpoint is a new minor version instead of just incrementing an API's version once per TC release.
> > > > >
> > > > > Yeah, you shouldn't expect an active development branch to be stable - it's the whole reason we have releases at all.
We can't support > > > something > > > > > that changes potentially a dozen times in a day. > > > > > > > > > > >If someone goes > > > > > to the trouble to understand how our APIs work and develops > their own > > > > > client code, why is it so unreasonable to expect them to also > > > > > understand how an update of Traffic Ops could impact their > _custom_ > > > > > tooling? > > > > > > > > > > I agree with this so hard. I'd love to just say "TO vX uses > the vX API, > > > > > major changes to the biggest TC component are a major change > to TC", > > > but at > > > > > any given time we support and provide bug/security fixes for > versions > > > X and > > > > > X-1. I'll settle for eliminating minor API versions, though. > > > Developers can > > > > > be expected to understand that changing versions of a thing > can change > > > > > aspects of the ways in which you can interact with said thing. > A major > > > > > version change means major changes and a minor version change > means > > > minor > > > > > changes. > > > > > ________________________________________ > > > > > From: Jeff Elsloo <jeff.els...@gmail.com> > > > > > Sent: Thursday, April 18, 2019 2:59 PM > > > > > To: dev@trafficcontrol.apache.org > > > > > Subject: Re: [EXTERNAL] Re: Traffic Ops API versioning issues > > > > > > > > > > > Maybe I'm the only one, and everyone else can vote me out, > but I > > > don't > > > > > see > > > > > that as acceptable. It's our responsibility as developers to > create a > > > safe > > > > > user experience, and unacceptable to declare real bugs to be > the user's > > > > > fault for not using it right. When our Production CDN goes down > > > because an > > > > > Ops person used an old client and didn't "just recompile," > it's not > > > that > > > > > Ops person's fault, it's our fault as Developers, for > designing a > > > dangerous > > > > > system. Our job is to prevent the CDN from going down, not to > shift the > > > > > blame when it does. 
> > > > >
> > > > > I don't think this is about shifting blame, safety, or the potential to crash a CDN. This is about simplifying our code in the API and making it more maintainable. If we simplify the API, we can accelerate development and get more things done, and maybe even complete this Golang migration. Another plus is that simplifying routes by eliminating versioning means less code and likely more stability and safety, easier testing, and less developer confusion, in the long run.
> > > > >
> > > > > I think it's unreasonable for us to shoulder the burden and cost to maintain various API versions because we're afraid we might break some client out in the wild that might or might not exist. If someone goes to the trouble to understand how our APIs work and develops their own client code, why is it so unreasonable to expect them to also understand how an update of Traffic Ops could impact their _custom_ tooling?
> > > > >
> > > > > Obviously we have to hold up our end of the deal and have good API documentation and change logs. I think the cost of maintaining that is much less than API versioning given our experience, especially after we simplify the APIs. We're already doing much of that today.
> > > > > --
> > > > > Thanks,
> > > > > Jeff
> > > > >
> > > > > On Thu, Apr 18, 2019 at 11:37 AM Robert Butts <r...@apache.org> wrote:
> > > > > >
> > > > > > >Without minor versions, #3497 would not even be an issue. It's only an issue because of the attempt to support minor versioning.
> > > > > >
> > > > > > That's simply not true. It's exactly the same issue. Removing minor versioning just hides the issue.
You have declared: > > > > > > > > > > > > >only certain clients that don't handle new unknown fields > would > > > > > > potentially be broken > > > > > > > > > > > > >all the client has to do is recompile > > > > > > > > > > > > Something doesn't cease to be an issue, because you redefine > it to > > > be the > > > > > > user's fault. It's exactly the same issue, removing minor > versions > > > just > > > > > > makes it much more difficult to debug. > > > > > > > > > > > > You're proposing not only removing minor versions, but > creating data > > > loss > > > > > > and version mismatch bugs, and declaring them to be the > user's fault. > > > > > > > > > > > > Maybe I'm the only one, and everyone else can vote me out, > but I > > > don't > > > > > see > > > > > > that as acceptable. It's our responsibility as developers to > create a > > > > > safe > > > > > > user experience, and unacceptable to declare real bugs to be > the > > > user's > > > > > > fault for not using it right. When our Production CDN goes > down > > > because > > > > > an > > > > > > Ops person used an old client and didn't "just recompile," > it's not > > > that > > > > > > Ops person's fault, it's our fault as Developers, for > designing a > > > > > dangerous > > > > > > system. Our job is to prevent the CDN from going down, not > to shift > > > the > > > > > > blame when it does. > > > > > > > > > > > > >Switching all the endpoints over to your "apiver" library > would not > > > be > > > > > as > > > > > > trivial to implement or remove as you make it sound. > > > > > > > > > > > > Maybe. I'm offering to do it. If you're sure, why don't you > let me > > > > > > demonstrate, and prove myself wrong? > > > > > > > > > > > > >It would require lots of added API test coverage > > > > > > > > > > > > Require? That would be ideal, but we have supported minor > versions > > > for > > > > > the > > > > > > history of Traffic Ops, and never had extensive version > tests. 
I > > > agree we > > > > > > should, but you're adding additional requirements to further > your > > > > > position, > > > > > > which doesn't seem fair. Notwithstanding, the tag library > already > > > has 90% > > > > > > test coverage and 3x as many lines of test code as logic; > and the API > > > > > Tests > > > > > > are actually pretty easy, I just added one in the > old-version-update > > > fix, > > > > > > and it was much easier than I expected: > > > > > > > > > > > > > > > https://github.com/apache/trafficcontrol/pull/3500/commits/16f2c96f086836f1d655fd62e673ee0a5e95e785 > > > > > > . > > > > > > > > > > > > >Certain UPDATE queries might be easy to generate from a > given > > > struct if > > > > > > the struct only uses a single table, but I don't think > something like > > > > > that > > > > > > would work for a field like `cachegroup.LocalizationMethods` > which > > > > > doesn't > > > > > > come from the cachegroups table > > > > > > > > > > > > I believe it is easy. The function to parse tags can use the > tags in > > > the > > > > > > primary object (Cachegroup), and the sub-objects > > > (LocalizationMethods) > > > > > will > > > > > > have their own version tags. I could be mistaken, I haven't > actually > > > > > > written the code yet, but I'm pretty sure sub-objects with > > > sub-updates > > > > > > won't be any more difficult or require much if any special > logic. > > > > > > > > > > > > > > > > > > On Thu, Apr 18, 2019 at 10:37 AM Gray, Jonathan < > > > > > jonathan_g...@comcast.com> > > > > > > wrote: > > > > > > > > > > > > > At the end of the day, what I want is a consistent API > that I can > > > code > > > > > > > against in the head of master that's treated like a > contract. As > > > an > > > > > API > > > > > > > user outside of the ATC repo it's incredibly frustrating > to have my > > > > > stuff > > > > > > > break all the time. 
It basically encourages never developing using the latest API versions (regardless of how they're defined and even then things still break retroactively) or a non-official OSS release altogether. It's a catch-22 to be forced to either not vendor the go/python/bash libraries, which leads to constant develop/recompile/deploys in lockstep with ATC, or vendor and still have to do these things when stuff breaks anyway in the API. Really debating the native client libraries at all is just a red herring because the root issue is the HTTP API itself which is the real thing to care about since not all integrations use one of the client libraries, nor can be forced to do so, and may require a rigid API definition.
> > > > > > >
> > > > > > > Jonathan G
> > > > > > >
> > > > > > > On 4/18/19, 10:12 AM, "Rawlin Peters" <rawlin.pet...@gmail.com> wrote:
> > > > > > >
> > > > > > > > The UPDATE statements need to be modified to fix #3497 even if we get rid of versioning. Unless we decide to permanently break all clients older than the newest server field, with every new server upgrade. The only other option is to fix the updates. Unless you know of a way to fix missing fields without changing the update statements, that I'm not seeing?
> > > > > > >
> > > > > > > By removing minor versioning, only certain clients that don't handle new unknown fields would potentially be broken, and I believe only the TO Go client has that problem in our repo.
However, the TO Go client happens to use the same Go structs as traffic_ops_golang, so whenever new fields are added to the API, all the client has to do is recompile with the up-to-date structs. Unless we made breaking changes to the client, in most cases all that would be needed for those clients is a recompile. Traffic Portal, the Python TO client, and I'm pretty sure the Java TO client all handle unknown fields properly.
> > > > > > >
> > > > > > > Without minor versions, #3497 would not even be an issue. It's only an issue because of the attempt to support minor versioning. If we just support the major version, all client requests would be treated as v1, and there would only ever be one SQL UPDATE statement per major version. We wouldn't need to "upgrade" 1.2 requests into a 1.4 struct (thus preventing the bug in #3497) by selecting and inserting all 1.4 values from the DB into the struct before handling the request or dynamically generating the SQL UPDATE statement to use based on the requested minor version.
> > > > > > >
> > > > > > > > So, this solution actually gives us this bug fix almost for free. All that's required is another small function to iterate over the object fields to create the update query. It's by far the easiest and simplest fix for #3497; unless we also permanently break all older clients on every server upgrade along with the minor version removal.
> > > > > > > > > > > > > > Switching all the endpoints over to your "apiver" > library would > > > > > not be > > > > > > > as trivial to implement or remove as you make it > sound. It > > > would > > > > > > > require lots of added API test coverage and a > non-trivial > > > amount of > > > > > > > code modifications to all API endpoints. Certain > UPDATE queries > > > > > might > > > > > > > be easy to generate from a given struct if the struct > only > > > uses a > > > > > > > single table, but I don't think something like that > would work > > > for > > > > > a > > > > > > > field like `cachegroup.LocalizationMethods` which > doesn't come > > > from > > > > > > > the cachegroups table and is updated separately from > the rest > > > of > > > > > the > > > > > > > cachegroup fields. > > > > > > > > > > > > > > - Rawlin > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
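For reference, a minimal Go sketch of the unknown-field problem discussed in this thread, under the assumption that a client decodes a response into an older struct and sends the re-encoded struct back. `CachegroupV12`, `roundTrip`, and the JSON payload are made up for illustration; the only library behavior relied on is that `encoding/json` ignores unknown fields by default when unmarshalling.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// CachegroupV12 is a hypothetical struct compiled into an older client.
// It predates the localizationMethods field that the server now returns.
type CachegroupV12 struct {
	Name string `json:"name"`
}

// roundTrip mimics an older client: decode the server's response into the
// old struct, then re-encode that struct as the body of a full-object update.
func roundTrip(serverJSON []byte) ([]byte, error) {
	var cg CachegroupV12
	// Unknown fields are silently dropped here.
	if err := json.Unmarshal(serverJSON, &cg); err != nil {
		return nil, err
	}
	return json.Marshal(cg)
}

func main() {
	serverJSON := []byte(`{"name":"edge-east","localizationMethods":["GEO"]}`)
	out, err := roundTrip(serverJSON)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out)) // prints {"name":"edge-east"}
}
```

The re-encoded body no longer carries `localizationMethods`, so a full-object update built from it would omit that field unless the server fills it back in.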