I'm still digesting this for a better response.  If you're considering both 
what the next big thing for the API is and whether to link it to the ATC 
release, I'll just put this back on the radar: 
https://github.com/apache/trafficcontrol/issues/2630.  There was good 
discussion after the presentation I gave on it at the last TrafficControl 
Summit as a potential augment/replacement someday.  Now that the TO Golang 
plugin system is in place, it wouldn't be too bad to have a plugin that served 
as an authentication proxy to an instance hanging out there.  I still very much 
like the solution as a whole because it helps encourage better SQL data 
modeling practices (which, admittedly, we're probably not quite ready for today).

Jonathan G


On 2/14/19, 9:39 AM, "Moltzau, Matthew" <[email protected]> wrote:

    > The "CRUDER" is a great innovation, to be sure, as it saves us quite a
    > bit of time and prevents duplicating code, but the problem it solves is
    > an emergent property of API problem #1 above.
    > With fewer endpoints we'd have much more specific handling, and the need
    > for and advantages of the "CRUDER" will vanish.
    
    FYI: a restructuring of our CRUDer interfaces currently has a PR in review:
    
    
    https://github.com/apache/trafficcontrol/pull/2886
    
    It allows the user to implement a single C, R, U, or D operation for better 
flexibility.
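To illustrate the idea, here is a rough Go sketch of per-operation interfaces in the spirit of that PR (the interface and type names below are hypothetical, not the actual trafficcontrol code): a handler type opts into only the operations it supports, and the router registers only the matching methods.

```go
package main

import "fmt"

// Hypothetical per-operation interfaces: one per C, R, U, and D,
// instead of a single monolithic CRUD interface.
type Reader interface{ Read() ([]string, error) }
type Creator interface{ Create() error }
type Updater interface{ Update() error }
type Deleter interface{ Delete() error }

// region supports only Read and Update, so no POST or DELETE route
// would be registered for it.
type region struct{ name string }

func (r *region) Read() ([]string, error) { return []string{r.name}, nil }
func (r *region) Update() error           { return nil }

func main() {
	// The router inspects which interfaces the handler satisfies.
	var h interface{} = &region{name: "us-east"}
	if _, ok := h.(Reader); ok {
		fmt.Println("register GET handler")
	}
	if _, ok := h.(Deleter); !ok {
		fmt.Println("skip DELETE route")
	}
}
```

The type assertions at registration time replace a compile-time requirement to stub out operations a type doesn't support.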
    
    On 2/14/19, 8:21 AM, "Fieck, Brennan" <[email protected]> wrote:
    
        I'd take it further, though in my opinion it shouldn't be done until 
we're ready to do away with Perl entirely.
        I've been working on this draft for a while now, and I sort of wanted 
to wait until I made a wiki page for a proposed API v2 spec before sending it, 
but what the heck. The conversation's here now.
        
        I don't like the way our API is versioned. Let me be clear, it _should_ 
be versioned, and the
        versions _should_ be semantic, but it is my opinion that the API should 
be versioned alongside
        Traffic Ops rather than at its own pace. I've written essentially an 
article on why this should be
        the case, and it follows:
        
        ** TIMING **
        The first issue I should address is timing - this needn't happen 
immediately. For the time being, I
        think a radical change would be much more harmful than continuing along 
our current versioning
        scheme. This change should be implemented with the advent of what we 
currently refer to as "API v2".
        Presumably, at this time all of the Perl code has been removed and we 
are looking to re-tool the API
        to be more sensible and standards-conforming. It's a time when we'll 
already be making big, breaking
        changes. I personally would love for this to be ATC 4.0, but that may 
not be a realistic goal.
        
        ** REDUNDANCY **
        The easiest to see - albeit simultaneously most trivial - issue with 
the current API versioning
        scheme is how every request path must start with '/api/1.x/'. This is 
annoying, to be frank;
        especially when one considers that the current plan for Traffic Ops is 
to reduce it entirely to an
        API, so literally every endpoint will have '/api/' in common. We _know_ 
the endpoint we are trying
        to query is an API endpoint because _all_ endpoints are API endpoints. 
When we reach "API v2" the
        '/api/' part of request paths will have lost all value entirely.
        Even with that gone we are left with '/1.' (or '/2.' as the case may 
become) as a common prefix,
        again annoying although not totally useless in this case. However, the 
vast majority of API
        endpoints see no changes between minor versions, so really '/1.x' just 
becomes a static, constant
        prefix where 'x' is the latest version of the API.
        In any case, versioning the API alongside Traffic Ops solves this 
problem because our Traffic Ops
        server(s) emit HTTP headers that name the server. Once Perl is gone, 
we'll be free to use the HTTP
        `Server:` header to name the server e.g. `Server: Traffic Ops/3.2.1`. 
At this point, we could
        either implement an `OPTIONS` method request at the server's root that 
        would just return some headers - including `Server:` - or just let 
        people do a `GET` (or better yet `HEAD`) request to `/ping` and pull 
        the server version out of the headers. The client then 
has all the information it
        needs to communicate effectively with the server. The alternative to 
this within our current
        versioning schema is to implement an unversioned API endpoint such that 
we have a hundred `/api/1.x`
        endpoints and one that has some other - or possibly no - prefix, or 
have a versioned API version API
        endpoint, which is confusing even just to say.
        
        ** TRAFFIC OPS _IS_ THE API **
        As mentioned previously, the endgame of our transition from Perl to Go 
is that Traffic Portal is the
        only UI for ATC - and in fact that's already somewhat true. That leaves 
Traffic Ops as a database
        and a REST API for interacting with said database. In fact, the 
database is often referred to as its
        own service: "ToDb", "Traffic Ops DB" etc. That means that we have the 
single most important Traffic
        Control component split in two - half of its functionality is versioned 
sanely with Traffic Control
        while the other half is versioned separately from anything else in the 
world. That's crazy, because
        if you have a program that only does two things, then surely a breaking 
change to one of those
        things means increasing its major version? If that's the case, then why 
are we not versioning the
        API alongside Traffic Ops?
        It may be argued (incorrectly, in my opinion) that Traffic Ops does 
more than serve an API to
        interact with a database. It generates configuration files and system 
images, it combines data and
        does heavy processing on it. But really those things shouldn't be taken 
into account in a versioning
        scheme except insofar as they affect the experience of some user, 
administrator, or application
        interfacing with Traffic Ops. If the API responses don't change in form 
or content, and if the
process of setting up or maintaining the application hasn't changed, 
then any code changes you've
        made are a patch, not a version change. Traffic Ops does big things, 
but at the end of the day it
        all just boils down to API inputs and outputs as far as anything and 
anyone else is concerned.
        
        ** CONFUSION **
        We currently live in a world where I can run a script using the Traffic 
Ops API that works perfectly
fine against Traffic Ops version 3.2.1, but then if I test it against a 
later 3.x release it breaks because breaking changes were made in the 
API without any change to the major version. It's easy to say, "oh, 
        that just means that when we make breaking API changes we should 
increment the version
        appropriately," but realize that this _is versioning the API alongside 
Traffic Ops_. If we're not
        doing that, we're saying there is unpredictability with the behavior of 
our system within releases,
        and if we _are_ doing that then the only difference between the API 
version and the Traffic Ops
        version is that the API version is confusingly behind by about 2 major 
revisions. It should just be
        the same for simplicity's sake.
        
        ** THE API "PROMISE" **
        The single most common argument I hear in favor of our current API 
versioning scheme is "well we've
        said an API version 1.x behaves in this way, and so we must uphold that 
promise to the user base".
        Not only do we routinely break that promise already 
        (e.g. PR #3110 [https://github.com/apache/trafficcontrol/pull/3110]), 
        but I'm certainly not 
        suggesting that we don't uphold this promise. Yes, this does mean that 
making breaking API changes
        in TO would require a new major release and adding features/fields to 
the API would require a new
        minor release. I don't see that as a big deal, especially if 
        implemented at the time I'm suggesting, 
        when we'd be re-designing the API - and it does sorely need to be 
redesigned.
        
        * The API Needs to be Redesigned *
        I'm going to go down this rabbit hole for a second; if you're already 
        convinced the TO API needs a 
        re-design then feel free to skip this section. I'm not going to touch 
on any problems caused in the
        API as currently implemented by the use of a standalone API version - 
that's what the entire article
        is for.
        Our API currently has three huge problems:
        
        1. Rampant oversaturation of endpoints
                We have a systemic issue of re-implementing behaviour in 
multiple endpoints. This is due in part
                to a lack of good documentation - so developers aren't aware of 
the endpoints available to them
                - and partly because of the "Mojolicious Mindset" that plagues 
our oldest endpoints. The
                "Mojolicious Mindset" refers to the use of URL path fragments 
as request parameters, e.g.
                '/users/{{ID}}' instead of/in addition to '/users?id={{ID}}'. 
From the perspective of someone
                who is just writing the back-end for these endpoints, there's 
no clear advantage to one over the
                other except that the former seems to more clearly reflect the 
intent of requesting a specific
                object whereas the latter could be seen as more of a "filter" 
on a larger collection. That's not
                incorrect, necessarily, but the two are totally separate 
request paths, so having '/users' and
                '/users/{{ID}}' means documenting two endpoints instead of one, 
and it means two lines in the
                route definitions, and it means two separate handlers instead 
                of one (albeit a more complex one).
                Consider also that we have all of the following endpoints for 
manipulating Cache Groups:
        
                * /api/1.x/cachegroup/{{parameter ID}}/parameter
                * /api/1.x/cachegroup_fallbacks
                * /api/1.x/cachegroupparameters
                * /api/1.x/cachegroupparameters/{{ID}}/{{parameter ID}}
                * /api/1.x/cachegroups
                * /api/1.x/cachegroups/{{ID}}
                * /api/1.x/cachegroups/{{ID}}/deliveryservices
                * /api/1.x/cachegroups/{{ID}}/parameters
                * /api/1.x/cachegroups/{{ID}}/queue_updates
                * /api/1.x/cachegroups/{{ID}}/unassigned_parameters
                * /api/1.x/cachegroups/{{parameter ID}}/parameter_available
                * /api/1.x/cachegroups_trimmed
        
            These could all be collapsed neatly into one or two endpoints, but 
were instead implemented
            separately for whatever reason(s)
            (see Issue #2934 
[https://github.com/apache/trafficcontrol/issues/2934] for details).
        
        2. Improper/Non-existent standards conformity
                We have requests that should be read-only that make server-side 
state changes (Issue #3054
                [https://github.com/apache/trafficcontrol/issues/3054]), we 
have endpoints returning success
                responses on failure (Issue #3003 
[https://github.com/apache/trafficcontrol/issues/3003]) and
                our "CRUDER" has failed us. `PUT` should be used to _create_ 
objects (or possibly update them by
                creating an alternative representation with the same 
identifying information) but is instead 
                used as the primary "edit this" method. `POST` is for 
processing entities in a data payload, but
                is instead used for object creation. `PATCH` languishes, 
totally unimplemented. These problems
                are systemic and stem partially from the "Mojolicious Mindset" 
whereby new functionality is
                introduced into the API by first considering what request 
method is appropriate and then
                deciding on a request path that names the operation being done. 
Request methods are meant to be
                the different ways in which a client interacts with a resource 
on the server, and thus the
                resources themselves should be considered primary. The "CRUDER" 
                hampers this mindset, because it treats payloads and query 
                parameters generically and isn't receptive to the injection 
                of new behaviour.
                The "CRUDER" is a great innovation, to be sure, as it saves us 
quite a bit of time and prevents
                duplicating code, but the problem it solves is an emergent 
property of API problem #1 above.
                With fewer endpoints we'd have much more specific handling, and 
the need for and advantages of
                the "CRUDER" will vanish.
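        As an aside on what the unimplemented `PATCH` could look like - a minimal sketch with a hypothetical model, not the real TO code - a partial JSON document merged over the stored representation:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Server is a stand-in object; the real Traffic Ops models are richer.
type Server struct {
	HostName string `json:"hostName"`
	Profile  string `json:"profile"`
}

// applyPatch merges a partial JSON document into an existing object --
// the kind of handler a PATCH route would call. Unmarshalling over the
// existing value leaves fields absent from the patch untouched.
func applyPatch(current Server, patch []byte) (Server, error) {
	if err := json.Unmarshal(patch, &current); err != nil {
		return current, err
	}
	return current, nil
}

func main() {
	s := Server{HostName: "edge-01", Profile: "EDGE_1"}
	patched, err := applyPatch(s, []byte(`{"profile":"EDGE_2"}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(patched.HostName, patched.Profile) // prints "edge-01 EDGE_2"
}
```

        With something like this in place, `PUT` could be reserved for whole-representation create/replace and `POST` for processing payloads, per their defined semantics.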
        
        3. Needing multiple queries to obtain a single piece of information
                This issue is pretty deeply rooted, and is related to the way 
our database is structured. But
                that's part of the problem - an API needn't replicate the 
database, and is therefore free from
                some of the constraints that bind a database.
                The way things are now, in order to e.g. create a server 
definition I must make several
                preliminary requests to determine the integral, unique, 
non-deterministic identifiers of other
                objects. I think we all agree that ideally these integral IDs 
would have no part in the output
                of the API, and many people I've spoken to would support wiping 
them from the database entirely
                given enough time and manpower.
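        A sketch of what a fix for problem #3 might look like: a hypothetical "API v2" server-creation payload that references related objects by their natural keys, so the client needs no preliminary requests to translate names into integral IDs (the field names here are invented for illustration):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// serverRequest is a hypothetical creation payload: related objects are
// referenced by human-readable names instead of integral database IDs.
type serverRequest struct {
	HostName   string `json:"hostName"`
	CacheGroup string `json:"cacheGroup"` // name, not an integral cachegroup ID
	Profile    string `json:"profile"`    // name, not an integral profile ID
	CDN        string `json:"cdn"`        // name, not an integral CDN ID
}

// parseServerRequest decodes the payload; name-to-ID resolution would then
// happen in one place, server-side, instead of in every client.
func parseServerRequest(payload []byte) (serverRequest, error) {
	var req serverRequest
	err := json.Unmarshal(payload, &req)
	return req, err
}

func main() {
	payload := []byte(`{
		"hostName": "edge-01",
		"cacheGroup": "mid-east",
		"profile": "EDGE_1",
		"cdn": "cdn-a"
	}`)
	req, err := parseServerRequest(payload)
	if err != nil {
		panic(err)
	}
	fmt.Printf("create %s in cachegroup %q on CDN %q\n", req.HostName, req.CacheGroup, req.CDN)
}
```

        One request replaces the current create-a-server dance of several GETs just to collect foreign-key integers.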
        
    
    
