Also to be sure I'm tracking properly the complex subject here:

1. Today there are two promises:
  1a. One major rev back at the project level for all code/api/etc
  1b. API versions independently (but probably not stable enough on whatever 
the latest is due to self-discipline/definition)

2.  Primary issues around API stability are the addition and removal of 
routes/fields
  2a.  This relates to the discussion around 1b
  2b.  A significant portion of this could be mitigated through the support of 
a PATCH operation as opposed to passing full objects around which already has 
inherent race conditions in a multi-user system.
  2c.  This is where maintenance costs are potentially very high, but might be 
mitigated somewhat through rob's apiver library.

3.  Secondary issues are around strong versus weak typing 
  3a. This is the primary source or breakage lately and is the result of 
conversions from perl to go
  3b.  This breaks not only promise 1b, but also promise 1a.
  3c.  The solution to this is simply to be more diligent in the use of 
existing or new datatypes until such time that a new major revision of either 
1a or 1b is made.  Example use case issue #3304.

4.  Better support for discoverability and compatibility by and between API 
components/consumers
  4a.  At the moment there is no way as a client to know which API versions are 
supported by an arbitrary TO instance.  Documented in issue #2872
  4b. This leads to failure scenarios wherein TO isn't upgraded first or newer 
clients exist.  I stopped counting when 5 different people internally ran into 
this with the addition of the 1.4 API for us.

5.  There are additional concerns with how we handle 1b with regard to master 
as opposed to OSS releases and promise 1a
  5a. There is a lack of formality when it comes to does component X on master 
changeset A work with component Y on master changeset B before it lands in OSS 
Release Q.
  5b.  Presently this entails SME reviewing changelogs of component X & Y.
  5c.  This is the question SemVer helps with by better defining that formality 
at the API layer.
    5c.i.  It's not a total solution because unversioned payloads/workflows 
such as the CRConfig can still cause additional issues.
  5d.  This is where today we're relying on our existing monolithic repository 
and one version to rule them all stance so that in theory we never ask 5a
    5d.i.  This is backed by our current testing procedures.  When running TO 
API tests today it's presumed to use one version inside the CIAB environment.

I think if we can clarify and agree on this, the questsions around how to 
version technically and ATC-based clients versus 3rd-party clients is mostly 
mitigated.

Jonathan G


On 4/19/19, 1:26 PM, "Rawlin Peters" <rawlin.pet...@gmail.com> wrote:

    How about this:
    1. At the Go struct level, every struct for an API endpoint gets a
    "Version" field.
    2. Every time a backwards-compatible field is added to an API
    endpoint, the Version is incremented for that struct (even if a TC
    release hasn't been made between increments).
    3. This Version field is read-only and included in all GET/PUT/POST
    JSON responses.
    
    For the Go TO client, this would allow the client to GET a resource
    and check if the resource's version matches the version that the
    client is currently using. If the client is just reading certain
    fields and only cares to read the fields it already knows about, it
    wouldn't have to check the version field at all. If the client wants
    to update a resource, it could GET the resource, compare the resource
    versions, and make a determination about whether or not the update
    would be safe. If the versions match, it can safely update the
    resource without risk of data loss. If the versions do NOT match, the
    client might choose to error out instead and/or send an email to the
    maintainer that it should be recompiled for safety.
    
    This would allow classes of clients that handle unknown fields
    properly (e.g. Traffic Portal, Python+Java TO clients) to continue
    working as they do today, only having to worry about the API major
    version.
    
    For clients that might be more susceptible to data loss due to
    addition of unknown fields (e.g. Go TO client), they would have enough
    information returned to them by the API in order to know if they can
    safely make updates to resources. If a Go TO client is only reading
    certain fields and not making updates, it would probably never need to
    be recompiled for the entire life of the API major version. If a Go TO
    client is updating resources, it only needs recompiled as often as the
    APIs its actually using have been updated.
    
    This would allow random, non-standard TO clients to be written and
    used for as long as possible before recompilation is required, and
    allow the client to build in proper safeguards only where necessary.
    
    - Rawlin
    
    On Fri, Apr 19, 2019 at 12:01 PM Robert Butts <r...@apache.org> wrote:
    >
    > I'm trying really hard to come up with a solution that addresses 
everyone's
    > major concerns. I think we'll have a better product, that everyone can 
live
    > with, if we all try to think of solutions and are willing to compromise,
    > rather than take hard-line approach and refuse to compromise, and argue
    > until we're all just unhappy, and whatever gets pushed through meets a few
    > people's concerns and nobody else's. I'd definitely appreciate any help in
    > that regard.
    >
    > Sometimes there really are only two options, A or B. But this particular
    > issue has countless possibilities. We're all smart people, we can figure
    > something out that addresses everyone's needs and concerns.
    >
    > What about this idea:
    >
    > Along the lines of @hbeatty 's suggestion, what if we:
    >
    > 1. Make the API version match the TC version.
    > 2. Always release new TC major versions, never do TC minor version 
releases.
    > 3. Support one major version back, in the API and clients.
    > 4. New backward-compatible changes require a TC=API major version 
increase.
    > 5. OPTIONAL: There should be an API route to get the exact TC version 
(e.g.
    > https://.../api/v3/version). (This isn't strictly necessary, but it's
    > on @hbeatty
    > 's list, and I know it's on @jonathan_gray 's, and it's super-easy and
    > there's no reason not to.)
    >
    > This:
    > 1. Addresses the client version bugs concern: older clients simply don't
    > work because we don't support them, and newer clients will get the "please
    > downgrade" response.
    > 2. Addresses the code ease-of-writing concern: We only ever have to
    > maintain 1 older version in the API, which will typically only be a few
    > fields on a few endpoints.
    > 3. Partially addresses the ease-of-use concern (Ops/@jonathan_gray). It
    > addresses the scripts-breaking-things problem, but it does make user
    > scripts have a hard upgrade deadline. I see this as the biggest weakness 
of
    > this idea, and unfortunately I don't see a remedy; if the user-side people
    > are willing to live with that?
    > 4. Patch versions are still ok. This doesn't prevent e.g. 4.0.1 when we
    > find a major bug in a release; just adding new things that would be a
    > SemVer Minor Version.
    >
    > Some points:
    > 1. Only doing major versions, we'll obviously quickly reach Traffic 
Control
    > Version 47. I think that's ok. There's precedent for this, Chrome and
    > Firefox both do this, Chrome's latest version is 68 and Firefox is 66. It
    > might seem odd, but I don't think there are any big downsides.
    > 2. This will make @jonathan_gray 's (/ Ops) life slightly harder, having 
to
    > upgrade script clients more frequently. But it prevents the data loss 
risks
    > (which I know everyone here doesn't agree with, but some of us do, so bear
    > with me), and upgrading our maintained clients should be relatively 
simple.
    > 2.1 As @hbeatty points out, if we release at a 6-month interval, this 
would
    > mean scripts using old clients would be supported for 1 year. We could
    > optionally support 2 major versions back, if we were willing to live with 
a
    > little more server work, to support 2-year-old clients.
    >
    > Just to be clear, I personally don't like making the API version match the
    > TC version, for reasons I won't get into here. I also loathe Reflection.
    > But I can live with those things, if it addresses everyone else's 
concerns.
    > This proposal isn't perfect; there is no perfect solution that will 
fulfill
    > everyone's ideal. But, is this something we could live with? If not, is
    > there a way to modify it to address whatever is unacceptable, while still
    > addressing the major concerns others have? Or is this just right out?
    >
    >
    > On Fri, Apr 19, 2019 at 10:43 AM Jeff Elsloo <els...@apache.org> wrote:
    >
    > > Without actually seeing how that would look across the code base, the
    > > best I can say is maybe. On the surface your proposal seems to improve
    > > the areas I'm concerned about, but we still have this implicit model
    > > where the server is responsible for dealing with older clients that
    > > might not submit all data as expected. This implicitly requires us to
    > > handle the absence of that data in future APIs and think about how any
    > > change might impact all client versions across versions of ATC.
    > >
    > > My concern really amounts to the investment of time required to think
    > > through and implement changes that may affect the myriad of different
    > > client/server version combinations. If we remove that from the
    > > equation entirely, we have a much simpler API that has a 1:1
    > > correspondence with the route and function, and only one way to
    > > create/update a $thing (i.e.: a delivery service). I think having only
    > > one way to create/update a $thing is a much safer way of doing
    > > business than continuing to support multiple versions of clients,
    > > regardless of how easy that might be with this proposed approach.
    > > Unless I'm missing something, the implementation might be simplified
    > > using this approach but the complexity of solving for the combination
    > > of client versions still exists which makes it harder to do anything
    > > when writing API code.
    > >
    > > So, it isn't a matter of whether this approach is simple enough for us
    > > to continue with semantic versioning. It's a matter of whether we want
    > > to have to continue to deal with older clients that prevent us from
    > > making certain changes in the API because we are afraid of breaking
    > > that client. I think that's a lot of burden for our small development
    > > team to shoulder for questionable utility. Viewed from another lens,
    > > with the semantic versioning approach we are enabling clients to be
    > > lazy about updating their _unknown and custom_ client code at the cost
    > > of developer productivity and progress on our project.
    > >
    > > I'm not saying that semantic versioning is solely to blame for our
    > > lack of progress on our migration to Golang, but it's one more thing
    > > that is slowing us down and definitely hasn't helped improve progress.
    > > --
    > > Thanks,
    > > Jeff
    > >
    > > On Thu, Apr 18, 2019 at 3:23 PM Robert Butts <r...@apache.org> wrote:
    > > >
    > > > >This is about simplifying our code in the API
    > > >
    > > > @jeff.elsloo That's what the tag solution I proposed does. The only
    > > > difference from not versioning, is that fields will have a new tag,
    > > > "NewField *int `json:"newField, db:"new_field", api:"1.5"`, and 
endpoints
    > > > will have an extra line, "json := api.NewJSON("1.4")". That's it. That
    > > > would be the entirety of the API code (or very nearly, Rawlin is 
right, I
    > > > haven't implemented it to be 100% sure). The library itself is also 
tiny,
    > > > it's ~250 lines of logic in a single file. 500 lines including 
comments
    > > and
    > > > boilerplate.
    > > >
    > > > How do you feel about that? Would that be simple enough?
    > > >
    > > >
    > > > On Thu, Apr 18, 2019 at 3:15 PM Fieck, Brennan <
    > > brennan_fi...@comcast.com>
    > > > wrote:
    > > >
    > > > > >If you're deploying the head of master, API minor versioning 
doesn't
    > > > > really solve that consistent API problem unless we start saying that
    > > > > every single new field added to an API endpoint is a new minor 
version
    > > > > instead of just incrementing an API's version once per TC release.
    > > > >
    > > > > Yeah, you shouldn't expect an active development branch to be 
stable -
    > > > > it's the whole reason we have releases at all. We can't support
    > > something
    > > > > that changes potentially a dozen times in a day.
    > > > >
    > > > > >If someone goes
    > > > > to the trouble to understand how our APIs work and develops their 
own
    > > > > client code, why is it so unreasonable to expect them to also
    > > > > understand how an update of Traffic Ops could impact their _custom_
    > > > > tooling?
    > > > >
    > > > > I agree with this so hard. I'd love to just say "TO vX uses the vX 
API,
    > > > > major changes to the biggest TC component are a major change to TC",
    > > but at
    > > > > any given time we support and provide bug/security fixes for 
versions
    > > X and
    > > > > X-1. I'll settle for eliminating minor API versions, though.
    > > Developers can
    > > > > be expected to understand that changing versions of a thing can 
change
    > > > > aspects of the ways in which you can interact with said thing. A 
major
    > > > > version change means major changes and a minor version change means
    > > minor
    > > > > changes.
    > > > > ________________________________________
    > > > > From: Jeff Elsloo <jeff.els...@gmail.com>
    > > > > Sent: Thursday, April 18, 2019 2:59 PM
    > > > > To: dev@trafficcontrol.apache.org
    > > > > Subject: Re: [EXTERNAL] Re: Traffic Ops API versioning issues
    > > > >
    > > > > > Maybe I'm the only one, and everyone else can vote me out, but I
    > > don't
    > > > > see
    > > > > that as acceptable. It's our responsibility as developers to create 
a
    > > safe
    > > > > user experience, and unacceptable to declare real bugs to be the 
user's
    > > > > fault for not using it right. When our Production CDN goes down
    > > because an
    > > > > Ops person used an old client and didn't "just recompile," it's not
    > > that
    > > > > Ops person's fault, it's our fault as Developers, for designing a
    > > dangerous
    > > > > system. Our job is to prevent the CDN from going down, not to shift 
the
    > > > > blame when it does.
    > > > >
    > > > > I don't think this is about shifting blame, safety, or the potential
    > > > > to crash a CDN. This is about simplifying our code in the API and
    > > > > making it more maintainable. If we simplify the API, we can 
accelerate
    > > > > development and get more things done, and maybe even complete this
    > > > > Golang migration. Another plus is simplification of routes by
    > > > > eliminating versioning means less code and likely more stability and
    > > > > safety, easier testing, and less developer confusion, in the long 
run.
    > > > >
    > > > > I think it's unreasonable for us to shoulder the burden and cost to
    > > > > maintain various API versions because we're afraid we might break 
some
    > > > > client out in the wild that might or might not exist. If someone 
goes
    > > > > to the trouble to understand how our APIs work and develops their 
own
    > > > > client code, why is it so unreasonable to expect them to also
    > > > > understand how an update of Traffic Ops could impact their _custom_
    > > > > tooling?
    > > > >
    > > > > Obviously we have to hold up our end of the deal and have good API
    > > > > documentation and change logs. I think the cost of maintaining that 
is
    > > > > much less than API versioning given our experience, especially after
    > > > > we simplify the APIs. We're already doing much of that today.
    > > > > --
    > > > > Thanks,
    > > > > Jeff
    > > > >
    > > > > On Thu, Apr 18, 2019 at 11:37 AM Robert Butts <r...@apache.org> 
wrote:
    > > > > >
    > > > > > >Without minor versions, #3497 would not even an issue. It's only 
an
    > > > > issue
    > > > > > because of the attempt to support minor versioning.
    > > > > >
    > > > > > That's simply not true. It's exactly the same issue. Removing 
minor
    > > > > > versioning just hides the issue. You have declared:
    > > > > >
    > > > > > >only certain clients that don't handle new unknown fields would
    > > > > > potentially be broken
    > > > > >
    > > > > > >all the client has to do is recompile
    > > > > >
    > > > > > Something doesn't cease to be an issue, because you redefine it to
    > > be the
    > > > > > user's fault. It's exactly the same issue, removing minor versions
    > > just
    > > > > > makes it much more difficult to debug.
    > > > > >
    > > > > > You're proposing not only removing minor versions, but creating 
data
    > > loss
    > > > > > and version mismatch bugs, and declaring them to be the user's 
fault.
    > > > > >
    > > > > > Maybe I'm the only one, and everyone else can vote me out, but I
    > > don't
    > > > > see
    > > > > > that as acceptable. It's our responsibility as developers to 
create a
    > > > > safe
    > > > > > user experience, and unacceptable to declare real bugs to be the
    > > user's
    > > > > > fault for not using it right. When our Production CDN goes down
    > > because
    > > > > an
    > > > > > Ops person used an old client and didn't "just recompile," it's 
not
    > > that
    > > > > > Ops person's fault, it's our fault as Developers, for designing a
    > > > > dangerous
    > > > > > system. Our job is to prevent the CDN from going down, not to 
shift
    > > the
    > > > > > blame when it does.
    > > > > >
    > > > > > >Switching all the endpoints over to your "apiver" library would 
not
    > > be
    > > > > as
    > > > > > trivial to implement or remove as you make it sound.
    > > > > >
    > > > > > Maybe. I'm offering to do it. If you're sure, why don't you let me
    > > > > > demonstrate, and prove myself wrong?
    > > > > >
    > > > > > >It would require lots of added API test coverage
    > > > > >
    > > > > > Require? That would be ideal, but we have supported minor versions
    > > for
    > > > > the
    > > > > > history of Traffic Ops, and never had extensive version tests. I
    > > agree we
    > > > > > should, but you're adding additional requirements to further your
    > > > > position,
    > > > > > which doesn't seem fair. Notwithstanding, the tag library already
    > > has 90%
    > > > > > test coverage and 3x as many lines of test code as logic; and the 
API
    > > > > Tests
    > > > > > are actually pretty easy, I just added one in the 
old-version-update
    > > fix,
    > > > > > and it was much easier than I expected:
    > > > > >
    > > > >
    > > 
https://github.com/apache/trafficcontrol/pull/3500/commits/16f2c96f086836f1d655fd62e673ee0a5e95e785
    > > > > > .
    > > > > >
    > > > > > >Certain UPDATE queries might be easy to generate from a given
    > > struct if
    > > > > > the struct only uses a single table, but I don't think something 
like
    > > > > that
    > > > > > would work for a field like `cachegroup.LocalizationMethods` which
    > > > > doesn't
    > > > > > come from the cachegroups table
    > > > > >
    > > > > > I believe it is easy. The function to parse tags can use the tags 
in
    > > the
    > > > > > primary object (Cachegroup), and the sub-objects
    > > (LocalizationMethods)
    > > > > will
    > > > > > have their own version tags. I could be mistaken, I haven't 
actually
    > > > > > written the code yet, but I'm pretty sure sub-objects with
    > > sub-updates
    > > > > > won't be any more difficult or require much if any special logic.
    > > > > >
    > > > > >
    > > > > > On Thu, Apr 18, 2019 at 10:37 AM Gray, Jonathan <
    > > > > jonathan_g...@comcast.com>
    > > > > > wrote:
    > > > > >
    > > > > > > At the end of the day, what I want is a consistent API that I 
can
    > > code
    > > > > > > against in the head of master that's treated like a contract.  
As
    > > an
    > > > > API
    > > > > > > user outside of the ATC repo it's incredibly frustrating to 
have my
    > > > > stuff
    > > > > > > break all the time.  It basically encourages never developing
    > > using the
    > > > > > > latest API versions (regardless of how they're defined and even
    > > then
    > > > > things
    > > > > > > still break retroactively) or a non-official OSS release
    > > alltogether.
    > > > > It's
    > > > > > > a catch22 to be forced to either not vendor the go/python/bash
    > > > > libraries
    > > > > > > which leads to constant develop/recompile/deploys in lockstep 
with
    > > ATC
    > > > > or
    > > > > > > vendor and still have to do these things when stuff breaks 
anyway
    > > in
    > > > > the
    > > > > > > API.  Really debating the native client libraries at all is 
just a
    > > red
    > > > > > > herring because the root issue is the HTTP API itself which is 
the
    > > real
    > > > > > > thing to care about since not all integrations use one of the
    > > client
    > > > > > > libraries, nor can be forced to do so, and may require a rigid 
API
    > > > > > > definition.
    > > > > > >
    > > > > > > Jonathan G
    > > > > > >
    > > > > > >
    > > > > > > On 4/18/19, 10:12 AM, "Rawlin Peters" <rawlin.pet...@gmail.com>
    > > > > wrote:
    > > > > > >
    > > > > > >     > The UPDATE statements need modified to fix #3497 even if 
we
    > > get
    > > > > rid
    > > > > > > of
    > > > > > >     > versioning. Unless we decide to permanently break all 
clients
    > > > > older
    > > > > > > than
    > > > > > >     > the newest server field, with every new server upgrade. 
The
    > > only
    > > > > > > other
    > > > > > >     > option is to fix the updates. Unless you know of a way to 
fix
    > > > > missing
    > > > > > >     > fields without changing the update statements, that I'm 
not
    > > > > seeing?
    > > > > > >
    > > > > > >     By removing minor versioning, only certain clients that 
don't
    > > > > handle
    > > > > > >     new unknown fields would potentially be broken, and I 
believe
    > > only
    > > > > the
    > > > > > >     TO Go client has that problem in our repo. However, the TO 
Go
    > > > > client
    > > > > > >     happens to use the same Go structs as traffic_ops_golang, so
    > > > > whenever
    > > > > > >     new fields are added to the API, all the client has to do is
    > > > > recompile
    > > > > > >     with the up-to-date structs. Unless we made breaking changes
    > > to the
    > > > > > >     client, in most cases all that would be needed for those
    > > clients
    > > > > is a
    > > > > > >     recompile. Traffic Portal, the Python TO client, and I'm 
pretty
    > > > > sure
    > > > > > >     the Java TO client all handle unknown fields properly.
    > > > > > >
    > > > > > >     Without minor versions, #3497 would not even an issue. It's
    > > only an
    > > > > > >     issue because of the attempt to support minor versioning. 
If we
    > > > > just
    > > > > > >     support the major version, all client requests would be
    > > treated as
    > > > > v1,
    > > > > > >     and there would only ever be one SQL UPDATE statement per 
major
    > > > > > >     version. We wouldn't need to "upgrade" 1.2 requests into a 
1.4
    > > > > struct
    > > > > > >     (thus preventing the bug in #3497) by selecting and 
inserting
    > > all
    > > > > 1.4
    > > > > > >     values from the DB into the struct before handling the 
request
    > > or
    > > > > > >     dynamically generating the SQL UPDATE statement to use based
    > > on the
    > > > > > >     requested minor version.
    > > > > > >
    > > > > > >     > So, this solution actually gives us
    > > > > > >     > this bug fix almost for free. All that's required is 
another
    > > > > small
    > > > > > > function
    > > > > > >     > to iterate over the object fields to create the update 
query.
    > > > > It's
    > > > > > > by far
    > > > > > >     > the easiest and simplest fix for #3497; unless we also
    > > > > permanently
    > > > > > > break
    > > > > > >     > all older clients on every server upgrade along with the
    > > minor
    > > > > > > version
    > > > > > >     > removal.
    > > > > > >
    > > > > > >     Switching all the endpoints over to your "apiver" library 
would
    > > > > not be
    > > > > > >     as trivial to implement or remove as you make it sound. It
    > > would
    > > > > > >     require lots of added API test coverage and a non-trivial
    > > amount of
    > > > > > >     code modifications to all API endpoints. Certain UPDATE 
queries
    > > > > might
    > > > > > >     be easy to generate from a given struct if the struct only
    > > uses a
    > > > > > >     single table, but I don't think something like that would 
work
    > > for
    > > > > a
    > > > > > >     field like `cachegroup.LocalizationMethods` which doesn't 
come
    > > from
    > > > > > >     the cachegroups table and is updated separately from the 
rest
    > > of
    > > > > the
    > > > > > >     cachegroup fields.
    > > > > > >
    > > > > > >     - Rawlin
    > > > > > >
    > > > > > >
    > > > > > >
    > > > >
    > >
    

Reply via email to