sorry, didn't see your summary until after i hit send On Fri, Apr 19, 2019 at 4:09 PM Gray, Jonathan <jonathan_g...@comcast.com> wrote:
> Slow down a little bit. Do people believe things in my summary are > untrue/inaccurate (it's ok if that's the case)? > > Jonathan G > > > On 4/19/19, 3:42 PM, "Jeremy Mitchell" <mitchell...@gmail.com> wrote: > > assuming that everyone is ok with ditching minor version support for > the > api (not really sure if rob is yet), what about this release plan? > > TC 3.0 / supports api 1.3, 1.2, 1.1 (this is already out the door) > TC 4.0 / supports api 1.4, 1.3, 1.2, 1.1 (cut this branch asap) > TC 5.0 / supports api v2, treat all calls to 1.x as 1.4 (this will be > master after the 4.0 branch is cut) > > note: this leaves the existing data loss ( > https://github.com/apache/trafficcontrol/issues/3497) problem in 3.0 > and > 4.0 that needs to be addressed. > > On Fri, Apr 19, 2019 at 3:39 PM Gray, Jonathan < > jonathan_g...@comcast.com> > wrote: > > > Also to be sure I'm tracking properly the complex subject here: > > > > 1. Today there are two promises: > > 1a. One major rev back at the project level for all code/api/etc > > 1b. API versions independently (but probably not stable enough on > > whatever the latest is due to self-discipline/definition) > > > > 2. Primary issues around API stability are the addition and removal > of > > routes/fields > > 2a. This relates to the discussion around 1b > > 2b. A significant portion of this could be mitigated through the > > support of a PATCH operation as opposed to passing full objects > around > > which already has inherent race conditions in a multi-user system. > > 2c. This is where maintenance costs are potentially very high, but > > might be mitigated somewhat through rob's apiver library. > > > > 3. Secondary issues are around strong versus weak typing > > 3a. This is the primary source or breakage lately and is the > result of > > conversions from perl to go > > 3b. This breaks not only promise 1b, but also promise 1a. > > 3c. The solution to this is simply to be more diligent in the use > of > > existing or new datatypes until such time that a new major revision > of > > either 1a or 1b is made. Example use case issue #3304. > > > > 4. Better support for discoverability and compatibility by and > between > > API components/consumers > > 4a. At the moment there is no way as a client to know which API > > versions are supported by an arbitrary TO instance. Documented in > issue > > #2872 > > 4b. This leads to failure scenarios wherein TO isn't upgraded > first or > > newer clients exist. I stopped counting when 5 different people > internally > > ran into this with the addition of the 1.4 API for us. > > > > 5. There are additional concerns with how we handle 1b with regard > to > > master as opposed to OSS releases and promise 1a > > 5a. There is a lack of formality when it comes to does component X > on > > master changeset A work with component Y on master changeset B > before it > > lands in OSS Release Q. > > 5b. Presently this entails SME reviewing changelogs of component > X & Y. > > 5c. This is the question SemVer helps with by better defining that > > formality at the API layer. > > 5c.i. It's not a total solution because unversioned > > payloads/workflows such as the CRConfig can still cause additional > issues. > > 5d. This is where today we're relying on our existing monolithic > > repository and one version to rule them all stance so that in theory > we > > never ask 5a > > 5d.i. This is backed by our current testing procedures. When > running > > TO API tests today it's presumed to use one version inside the CIAB > > environment. > > > > I think if we can clarify and agree on this, the questsions around > how to > > version technically and ATC-based clients versus 3rd-party clients is > > mostly mitigated. > > > > Jonathan G > > > > > > On 4/19/19, 1:26 PM, "Rawlin Peters" <rawlin.pet...@gmail.com> > wrote: > > > > How about this: > > 1. At the Go struct level, every struct for an API endpoint gets > a > > "Version" field. > > 2. Every time a backwards-compatible field is added to an API > > endpoint, the Version is incremented for that struct (even if a > TC > > release hasn't been made between increments). > > 3. This Version field is read-only and included in all > GET/PUT/POST > > JSON responses. > > > > For the Go TO client, this would allow the client to GET a > resource > > and check if the resource's version matches the version that the > > client is currently using. If the client is just reading certain > > fields and only cares to read the fields it already knows about, > it > > wouldn't have to check the version field at all. If the client > wants > > to update a resource, it could GET the resource, compare the > resource > > versions, and make a determination about whether or not the > update > > would be safe. If the versions match, it can safely update the > > resource without risk of data loss. If the versions do NOT > match, the > > client might choose to error out instead and/or send an email to > the > > maintainer that it should be recompiled for safety. > > > > This would allow classes of clients that handle unknown fields > > properly (e.g. Traffic Portal, Python+Java TO clients) to > continue > > working as they do today, only having to worry about the API > major > > version. > > > > For clients that might be more susceptible to data loss due to > > addition of unknown fields (e.g. Go TO client), they would have > enough > > information returned to them by the API in order to know if they > can > > safely make updates to resources. If a Go TO client is only > reading > > certain fields and not making updates, it would probably never > need to > > be recompiled for the entire life of the API major version. If a > Go TO > > client is updating resources, it only needs recompiled as often > as the > > APIs its actually using have been updated. > > > > This would allow random, non-standard TO clients to be written > and > > used for as long as possible before recompilation is required, > and > > allow the client to build in proper safeguards only where > necessary. > > > > - Rawlin > > > > On Fri, Apr 19, 2019 at 12:01 PM Robert Butts <r...@apache.org> > wrote: > > > > > > I'm trying really hard to come up with a solution that > addresses > > everyone's > > > major concerns. I think we'll have a better product, that > everyone > > can live > > > with, if we all try to think of solutions and are willing to > > compromise, > > > rather than take hard-line approach and refuse to compromise, > and > > argue > > > until we're all just unhappy, and whatever gets pushed through > meets > > a few > > > people's concerns and nobody else's. I'd definitely appreciate > any > > help in > > > that regard. > > > > > > Sometimes there really are only two options, A or B. But this > > particular > > > issue has countless possibilities. We're all smart people, we > can > > figure > > > something out that addresses everyone's needs and concerns. > > > > > > What about this idea: > > > > > > Along the lines of @hbeatty 's suggestion, what if we: > > > > > > 1. Make the API version match the TC version. > > > 2. Always release new TC major versions, never do TC minor > version > > releases. > > > 3. Support one major version back, in the API and clients. > > > 4. New backward-compatible changes require a TC=API major > version > > increase. > > > 5. OPTIONAL: There should be an API route to get the exact TC > > version (e.g. > > > https://.../api/v3/version). (This isn't strictly necessary, > but > > it's > > > on @hbeatty > > > 's list, and I know it's on @jonathan_gray 's, and it's > super-easy > > and > > > there's no reason not to.) > > > > > > This: > > > 1. Addresses the client version bugs concern: older clients > simply > > don't > > > work because we don't support them, and newer clients will get > the > > "please > > > downgrade" response. > > > 2. Addresses the code ease-of-writing concern: We only ever > have to > > > maintain 1 older version in the API, which will typically only > be a > > few > > > fields on a few endpoints. > > > 3. Partially addresses the ease-of-use concern > (Ops/@jonathan_gray). > > It > > > addresses the scripts-breaking-things problem, but it does > make user > > > scripts have a hard upgrade deadline. I see this as the biggest > > weakness of > > > this idea, and unfortunately I don't see a remedy; if the > user-side > > people > > > are willing to live with that? > > > 4. Patch versions are still ok. This doesn't prevent e.g. > 4.0.1 when > > we > > > find a major bug in a release; just adding new things that > would be a > > > SemVer Minor Version. > > > > > > Some points: > > > 1. Only doing major versions, we'll obviously quickly reach > Traffic > > Control > > > Version 47. I think that's ok. There's precedent for this, > Chrome and > > > Firefox both do this, Chrome's latest version is 68 and > Firefox is > > 66. It > > > might seem odd, but I don't think there are any big downsides. > > > 2. This will make @jonathan_gray 's (/ Ops) life slightly > harder, > > having to > > > upgrade script clients more frequently. But it prevents the > data > > loss risks > > > (which I know everyone here doesn't agree with, but some of us > do, > > so bear > > > with me), and upgrading our maintained clients should be > relatively > > simple. > > > 2.1 As @hbeatty points out, if we release at a 6-month > interval, > > this would > > > mean scripts using old clients would be supported for 1 year. > We > > could > > > optionally support 2 major versions back, if we were willing > to live > > with a > > > little more server work, to support 2-year-old clients. > > > > > > Just to be clear, I personally don't like making the API > version > > match the > > > TC version, for reasons I won't get into here. I also loathe > > Reflection. > > > But I can live with those things, if it addresses everyone > else's > > concerns. > > > This proposal isn't perfect; there is no perfect solution that > will > > fulfill > > > everyone's ideal. But, is this something we could live with? > If not, > > is > > > there a way to modify it to address whatever is unacceptable, > while > > still > > > addressing the major concerns others have? Or is this just > right out? > > > > > > > > > On Fri, Apr 19, 2019 at 10:43 AM Jeff Elsloo < > els...@apache.org> > > wrote: > > > > > > > Without actually seeing how that would look across the code > base, > > the > > > > best I can say is maybe. On the surface your proposal seems > to > > improve > > > > the areas I'm concerned about, but we still have this > implicit > > model > > > > where the server is responsible for dealing with older > clients that > > > > might not submit all data as expected. This implicitly > requires us > > to > > > > handle the absence of that data in future APIs and think > about how > > any > > > > change might impact all client versions across versions of > ATC. > > > > > > > > My concern really amounts to the investment of time required > to > > think > > > > through and implement changes that may affect the myriad of > > different > > > > client/server version combinations. If we remove that from > the > > > > equation entirely, we have a much simpler API that has a 1:1 > > > > correspondence with the route and function, and only one way > to > > > > create/update a $thing (i.e.: a delivery service). I think > having > > only > > > > one way to create/update a $thing is a much safer way of > doing > > > > business than continuing to support multiple versions of > clients, > > > > regardless of how easy that might be with this proposed > approach. > > > > Unless I'm missing something, the implementation might be > > simplified > > > > using this approach but the complexity of solving for the > > combination > > > > of client versions still exists which makes it harder to do > > anything > > > > when writing API code. > > > > > > > > So, it isn't a matter of whether this approach is simple > enough > > for us > > > > to continue with semantic versioning. It's a matter of > whether we > > want > > > > to have to continue to deal with older clients that prevent > us from > > > > making certain changes in the API because we are afraid of > breaking > > > > that client. I think that's a lot of burden for our small > > development > > > > team to shoulder for questionable utility. Viewed from > another > > lens, > > > > with the semantic versioning approach we are enabling > clients to be > > > > lazy about updating their _unknown and custom_ client code > at the > > cost > > > > of developer productivity and progress on our project. > > > > > > > > I'm not saying that semantic versioning is solely to blame > for our > > > > lack of progress on our migration to Golang, but it's one > more > > thing > > > > that is slowing us down and definitely hasn't helped improve > > progress. > > > > -- > > > > Thanks, > > > > Jeff > > > > > > > > On Thu, Apr 18, 2019 at 3:23 PM Robert Butts <r...@apache.org > > > > wrote: > > > > > > > > > > >This is about simplifying our code in the API > > > > > > > > > > @jeff.elsloo That's what the tag solution I proposed does. > The > > only > > > > > difference from not versioning, is that fields will have a > new > > tag, > > > > > "NewField *int `json:"newField, db:"new_field", > api:"1.5"`, and > > endpoints > > > > > will have an extra line, "json := api.NewJSON("1.4")". > That's > > it. That > > > > > would be the entirety of the API code (or very nearly, > Rawlin is > > right, I > > > > > haven't implemented it to be 100% sure). The library > itself is > > also tiny, > > > > > it's ~250 lines of logic in a single file. 500 lines > including > > comments > > > > and > > > > > boilerplate. > > > > > > > > > > How do you feel about that? Would that be simple enough? > > > > > > > > > > > > > > > On Thu, Apr 18, 2019 at 3:15 PM Fieck, Brennan < > > > > brennan_fi...@comcast.com> > > > > > wrote: > > > > > > > > > > > >If you're deploying the head of master, API minor > versioning > > doesn't > > > > > > really solve that consistent API problem unless we start > > saying that > > > > > > every single new field added to an API endpoint is a new > minor > > version > > > > > > instead of just incrementing an API's version once per TC > > release. > > > > > > > > > > > > Yeah, you shouldn't expect an active development branch > to be > > stable - > > > > > > it's the whole reason we have releases at all. We can't > support > > > > something > > > > > > that changes potentially a dozen times in a day. > > > > > > > > > > > > >If someone goes > > > > > > to the trouble to understand how our APIs work and > develops > > their own > > > > > > client code, why is it so unreasonable to expect them to > also > > > > > > understand how an update of Traffic Ops could impact > their > > _custom_ > > > > > > tooling? > > > > > > > > > > > > I agree with this so hard. I'd love to just say "TO vX > uses > > the vX API, > > > > > > major changes to the biggest TC component are a major > change > > to TC", > > > > but at > > > > > > any given time we support and provide bug/security fixes > for > > versions > > > > X and > > > > > > X-1. I'll settle for eliminating minor API versions, > though. > > > > Developers can > > > > > > be expected to understand that changing versions of a > thing > > can change > > > > > > aspects of the ways in which you can interact with said > thing. > > A major > > > > > > version change means major changes and a minor version > change > > means > > > > minor > > > > > > changes. > > > > > > ________________________________________ > > > > > > From: Jeff Elsloo <jeff.els...@gmail.com> > > > > > > Sent: Thursday, April 18, 2019 2:59 PM > > > > > > To: dev@trafficcontrol.apache.org > > > > > > Subject: Re: [EXTERNAL] Re: Traffic Ops API versioning > issues > > > > > > > > > > > > > Maybe I'm the only one, and everyone else can vote me > out, > > but I > > > > don't > > > > > > see > > > > > > that as acceptable. It's our responsibility as > developers to > > create a > > > > safe > > > > > > user experience, and unacceptable to declare real bugs > to be > > the user's > > > > > > fault for not using it right. When our Production CDN > goes down > > > > because an > > > > > > Ops person used an old client and didn't "just > recompile," > > it's not > > > > that > > > > > > Ops person's fault, it's our fault as Developers, for > > designing a > > > > dangerous > > > > > > system. Our job is to prevent the CDN from going down, > not to > > shift the > > > > > > blame when it does. > > > > > > > > > > > > I don't think this is about shifting blame, safety, or > the > > potential > > > > > > to crash a CDN. This is about simplifying our code in > the API > > and > > > > > > making it more maintainable. If we simplify the API, we > can > > accelerate > > > > > > development and get more things done, and maybe even > complete > > this > > > > > > Golang migration. Another plus is simplification of > routes by > > > > > > eliminating versioning means less code and likely more > > stability and > > > > > > safety, easier testing, and less developer confusion, in > the > > long run. > > > > > > > > > > > > I think it's unreasonable for us to shoulder the burden > and > > cost to > > > > > > maintain various API versions because we're afraid we > might > > break some > > > > > > client out in the wild that might or might not exist. If > > someone goes > > > > > > to the trouble to understand how our APIs work and > develops > > their own > > > > > > client code, why is it so unreasonable to expect them to > also > > > > > > understand how an update of Traffic Ops could impact > their > > _custom_ > > > > > > tooling? > > > > > > > > > > > > Obviously we have to hold up our end of the deal and > have good > > API > > > > > > documentation and change logs. I think the cost of > maintaining > > that is > > > > > > much less than API versioning given our experience, > especially > > after > > > > > > we simplify the APIs. We're already doing much of that > today. > > > > > > -- > > > > > > Thanks, > > > > > > Jeff > > > > > > > > > > > > On Thu, Apr 18, 2019 at 11:37 AM Robert Butts < > r...@apache.org> > > wrote: > > > > > > > > > > > > > > >Without minor versions, #3497 would not even an > issue. It's > > only an > > > > > > issue > > > > > > > because of the attempt to support minor versioning. > > > > > > > > > > > > > > That's simply not true. It's exactly the same issue. > > Removing minor > > > > > > > versioning just hides the issue. You have declared: > > > > > > > > > > > > > > >only certain clients that don't handle new unknown > fields > > would > > > > > > > potentially be broken > > > > > > > > > > > > > > >all the client has to do is recompile > > > > > > > > > > > > > > Something doesn't cease to be an issue, because you > redefine > > it to > > > > be the > > > > > > > user's fault. It's exactly the same issue, removing > minor > > versions > > > > just > > > > > > > makes it much more difficult to debug. > > > > > > > > > > > > > > You're proposing not only removing minor versions, but > > creating data > > > > loss > > > > > > > and version mismatch bugs, and declaring them to be the > > user's fault. > > > > > > > > > > > > > > Maybe I'm the only one, and everyone else can vote me > out, > > but I > > > > don't > > > > > > see > > > > > > > that as acceptable. It's our responsibility as > developers to > > create a > > > > > > safe > > > > > > > user experience, and unacceptable to declare real bugs > to be > > the > > > > user's > > > > > > > fault for not using it right. When our Production CDN > goes > > down > > > > because > > > > > > an > > > > > > > Ops person used an old client and didn't "just > recompile," > > it's not > > > > that > > > > > > > Ops person's fault, it's our fault as Developers, for > > designing a > > > > > > dangerous > > > > > > > system. Our job is to prevent the CDN from going down, > not > > to shift > > > > the > > > > > > > blame when it does. > > > > > > > > > > > > > > >Switching all the endpoints over to your "apiver" > library > > would not > > > > be > > > > > > as > > > > > > > trivial to implement or remove as you make it sound. > > > > > > > > > > > > > > Maybe. I'm offering to do it. If you're sure, why > don't you > > let me > > > > > > > demonstrate, and prove myself wrong? > > > > > > > > > > > > > > >It would require lots of added API test coverage > > > > > > > > > > > > > > Require? That would be ideal, but we have supported > minor > > versions > > > > for > > > > > > the > > > > > > > history of Traffic Ops, and never had extensive version > > tests. I > > > > agree we > > > > > > > should, but you're adding additional requirements to > further > > your > > > > > > position, > > > > > > > which doesn't seem fair. Notwithstanding, the tag > library > > already > > > > has 90% > > > > > > > test coverage and 3x as many lines of test code as > logic; > > and the API > > > > > > Tests > > > > > > > are actually pretty easy, I just added one in the > > old-version-update > > > > fix, > > > > > > > and it was much easier than I expected: > > > > > > > > > > > > > > > > > > > > https://github.com/apache/trafficcontrol/pull/3500/commits/16f2c96f086836f1d655fd62e673ee0a5e95e785 > > > > > > > . > > > > > > > > > > > > > > >Certain UPDATE queries might be easy to generate from > a > > given > > > > struct if > > > > > > > the struct only uses a single table, but I don't think > > something like > > > > > > that > > > > > > > would work for a field like > `cachegroup.LocalizationMethods` > > which > > > > > > doesn't > > > > > > > come from the cachegroups table > > > > > > > > > > > > > > I believe it is easy. The function to parse tags can > use the > > tags in > > > > the > > > > > > > primary object (Cachegroup), and the sub-objects > > > > (LocalizationMethods) > > > > > > will > > > > > > > have their own version tags. I could be mistaken, I > haven't > > actually > > > > > > > written the code yet, but I'm pretty sure sub-objects > with > > > > sub-updates > > > > > > > won't be any more difficult or require much if any > special > > logic. > > > > > > > > > > > > > > > > > > > > > On Thu, Apr 18, 2019 at 10:37 AM Gray, Jonathan < > > > > > > jonathan_g...@comcast.com> > > > > > > > wrote: > > > > > > > > > > > > > > > At the end of the day, what I want is a consistent > API > > that I can > > > > code > > > > > > > > against in the head of master that's treated like a > > contract. As > > > > an > > > > > > API > > > > > > > > user outside of the ATC repo it's incredibly > frustrating > > to have my > > > > > > stuff > > > > > > > > break all the time. It basically encourages never > > developing > > > > using the > > > > > > > > latest API versions (regardless of how they're > defined and > > even > > > > then > > > > > > things > > > > > > > > still break retroactively) or a non-official OSS > release > > > > alltogether. > > > > > > It's > > > > > > > > a catch22 to be forced to either not vendor the > > go/python/bash > > > > > > libraries > > > > > > > > which leads to constant develop/recompile/deploys in > > lockstep with > > > > ATC > > > > > > or > > > > > > > > vendor and still have to do these things when stuff > breaks > > anyway > > > > in > > > > > > the > > > > > > > > API. Really debating the native client libraries at > all > > is just a > > > > red > > > > > > > > herring because the root issue is the HTTP API itself > > which is the > > > > real > > > > > > > > thing to care about since not all integrations use > one of > > the > > > > client > > > > > > > > libraries, nor can be forced to do so, and may > require a > > rigid API > > > > > > > > definition. > > > > > > > > > > > > > > > > Jonathan G > > > > > > > > > > > > > > > > > > > > > > > > On 4/18/19, 10:12 AM, "Rawlin Peters" < > > rawlin.pet...@gmail.com> > > > > > > wrote: > > > > > > > > > > > > > > > > > The UPDATE statements need modified to fix > #3497 > > even if we > > > > get > > > > > > rid > > > > > > > > of > > > > > > > > > versioning. Unless we decide to permanently > break > > all clients > > > > > > older > > > > > > > > than > > > > > > > > > the newest server field, with every new server > > upgrade. The > > > > only > > > > > > > > other > > > > > > > > > option is to fix the updates. Unless you know > of a > > way to fix > > > > > > missing > > > > > > > > > fields without changing the update statements, > that > > I'm not > > > > > > seeing? > > > > > > > > > > > > > > > > By removing minor versioning, only certain > clients > > that don't > > > > > > handle > > > > > > > > new unknown fields would potentially be broken, > and I > > believe > > > > only > > > > > > the > > > > > > > > TO Go client has that problem in our repo. > However, > > the TO Go > > > > > > client > > > > > > > > happens to use the same Go structs as > > traffic_ops_golang, so > > > > > > whenever > > > > > > > > new fields are added to the API, all the client > has to > > do is > > > > > > recompile > > > > > > > > with the up-to-date structs. Unless we made > breaking > > changes > > > > to the > > > > > > > > client, in most cases all that would be needed > for > > those > > > > clients > > > > > > is a > > > > > > > > recompile. Traffic Portal, the Python TO client, > and > > I'm pretty > > > > > > sure > > > > > > > > the Java TO client all handle unknown fields > properly. > > > > > > > > > > > > > > > > Without minor versions, #3497 would not even an > issue. > > It's > > > > only an > > > > > > > > issue because of the attempt to support minor > > versioning. If we > > > > > > just > > > > > > > > support the major version, all client requests > would be > > > > treated as > > > > > > v1, > > > > > > > > and there would only ever be one SQL UPDATE > statement > > per major > > > > > > > > version. We wouldn't need to "upgrade" 1.2 > requests > > into a 1.4 > > > > > > struct > > > > > > > > (thus preventing the bug in #3497) by selecting > and > > inserting > > > > all > > > > > > 1.4 > > > > > > > > values from the DB into the struct before > handling the > > request > > > > or > > > > > > > > dynamically generating the SQL UPDATE statement > to use > > based > > > > on the > > > > > > > > requested minor version. > > > > > > > > > > > > > > > > > So, this solution actually gives us > > > > > > > > > this bug fix almost for free. All that's > required is > > another > > > > > > small > > > > > > > > function > > > > > > > > > to iterate over the object fields to create the > > update query. > > > > > > It's > > > > > > > > by far > > > > > > > > > the easiest and simplest fix for #3497; unless > we > > also > > > > > > permanently > > > > > > > > break > > > > > > > > > all older clients on every server upgrade > along with > > the > > > > minor > > > > > > > > version > > > > > > > > > removal. > > > > > > > > > > > > > > > > Switching all the endpoints over to your "apiver" > > library would > > > > > > not be > > > > > > > > as trivial to implement or remove as you make it > > sound. It > > > > would > > > > > > > > require lots of added API test coverage and a > > non-trivial > > > > amount of > > > > > > > > code modifications to all API endpoints. Certain > > UPDATE queries > > > > > > might > > > > > > > > be easy to generate from a given struct if the > struct > > only > > > > uses a > > > > > > > > single table, but I don't think something like > that > > would work > > > > for > > > > > > a > > > > > > > > field like `cachegroup.LocalizationMethods` which > > doesn't come > > > > from > > > > > > > > the cachegroups table and is updated separately > from > > the rest > > > > of > > > > > > the > > > > > > > > cachegroup fields. > > > > > > > > > > > > > > > > - Rawlin > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >