Re: API GW route configuration

Amir Yeshurun Thu, 11 May 2017 13:14:51 -0700

Hi Jeremy,
Note that attachments seem to be stripped on this list, so the image
is unavailable.


Your assumptions are correct. We need to figure out the easiest topology
for UI routes to bypass the GW. Please reattach the picture so we can get
more specific.
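
In the meantime, a minimal sketch of the URL-keyed split Jeremy describes below, assuming the decision is made per request by whatever fronts the GW. The function names and the two backend labels are hypothetical, not part of the current code.

```python
def uses_gateway(path: str, api_prefix: str = "/api") -> bool:
    """Return True when the request should get API gateway treatment."""
    # Match on a path-segment boundary so "/apiary" is not mistaken for "/api".
    return path == api_prefix or path.startswith(api_prefix + "/")


def route(path: str) -> str:
    # "gateway" and "traffic-ops-ui" are placeholder backend labels.
    return "gateway" if uses_gateway(path) else "traffic-ops-ui"
```

With this split the existing UI keeps using the Mojolicious cookie untouched; the open question is still the UI pages that call /api endpoints themselves.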

Thanks
/amiry



On Thu, May 11, 2017, 20:06 Jeremy Mitchell <mitchell...@gmail.com> wrote:

> What is of utmost importance to me is the ability to ease into this. We
> have a TO UI right now that needs to be unaffected by the API gateway in my
> opinion. Granted the old UI might go away at some point but until that time
> it needs to function as-is.
>
> To me, the simplest approach is to key off request URL. anything that
> starts with /api gets api gateway treatment, the rest passes on
> thru...Here's a fancy picture to communicate what I envision...
>
> [image: Inline image 1]
>
> I'm assuming all requests (endpoints) go thru the api gateway but maybe
> i'm wrong in that assumption. Anyhow, i guess my point is the UI should
> continue to work with the mojo cookie and "api" calls should use the jwt
> token...however, the UI also uses api endpoints so not sure how that would
> work...
>
> If it's too difficult for the api gateway to support UI and API routes, we
> could always wait until the new UI (which leverages the API) is complete...
>
> Jeremy
>
> On Thu, May 11, 2017 at 10:23 AM, Chris Lemmons <alfic...@gmail.com>
> wrote:
>
>> > invalidate ALL tokens by changing the token signing key
>>
>> Interesting idea. That does mean that the signing key has to be retrieved
>> every time from the authentication authority, or it'd be subject to the
>> exact same set of attacks. But a nearly-constant rarely changing key could
>> be communicated very efficiently, I suspect. And if the authentication
>> system is a web API, it can even use Modified-Since to 304 99% of the time
>> for maximum efficiency.
>>
>> It does have the downside that key-invalidation events are fairly
>> significant. You'd need to invalidate the keys whenever someone's access
>> was reduced or removed. As the number of accounts in the system increases,
>> that might not wind up being as infrequent as one might hope. It's easy to
>> implement, though.
>>
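To illustrate the key-rotation idea quoted above: a rough sketch using a simplified HMAC-signed token built from the Python standard library. This is not a full JWT implementation, and `sign` and `verify` are hypothetical helpers. It only shows that every token signed with the old key stops verifying once the key changes.

```python
import base64
import hashlib
import hmac
import json


def sign(claims: dict, key: bytes) -> str:
    """Build a simplified 'body.signature' token (not a complete JWT)."""
    body = base64.urlsafe_b64encode(json.dumps(claims, sort_keys=True).encode())
    sig = base64.urlsafe_b64encode(hmac.new(key, body, hashlib.sha256).digest())
    return (body + b"." + sig).decode()


def verify(token: str, key: bytes) -> bool:
    """Recompute the signature with the current key and compare in constant time."""
    body, _, sig = token.encode().partition(b".")
    expected = base64.urlsafe_b64encode(hmac.new(key, body, hashlib.sha256).digest())
    return hmac.compare_digest(sig, expected)
```

A token issued under the old key verifies with that key and fails with any new one, which is exactly why a rotation logs everybody out at once.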
>> On Thu, May 11, 2017 at 10:12 AM Jeremy Mitchell <mitchell...@gmail.com>
>> wrote:
>>
>> > Regarding the TTL on the JWT token. a 5 minute TTL seems silly. What's
>> the
>> > point? Unless we get into refresh tokens but that sounds like
>> oauth...blah.
>> >
>> > What about this and maybe i'm oversimplifying. the TTL on the jwt token
>> is
>> > 24 hours. If we become aware that a token has been compromised,
>> invalidate
>> > ALL tokens by changing the token signing key. maybe this is a good idea
>> or
>> > maybe this is a terrible idea. I have no idea. just a thought..
>> >
>> > jeremy
>> >
>> > On Wed, May 10, 2017 at 12:23 PM, Chris Lemmons <alfic...@gmail.com>
>> > wrote:
>> >
>> > > Responding to a few people:
>> > >
>> > > > Often times every auth action must be accompanied by DB writes for
>> > audit
>> > > logs or callback functions.
>> > >
>> > > True. But a) if logging is too expensive it should probably be made
>> > cheaper
>> > > and b) the answer to "audits are too expensive" probably isn't "let's
>> just
>> > > do less authentication". If the audit log is genuinely the
>> bottle-neck,
>> > it
>> > > would still be better to re-auth without the audit log.
>> > >
>> > > > The API gateway can poll for the latest list of tokens at a regular
>> > > interval
>> > >
>> > > Yeah, datastore replication for local performance is great. Though if
>> you
>> > > can reasonably query for a list of all valid tokens every second, it's
>> > > probably cheaper to just query for the token you need every time
>> you
>> > > need it. If there are massive batches of queries that are coming
>> through,
>> > > it's probably not unreasonable to choose not to re-validate a token
>> > that's
>> > > been validated in the last second.
>> > >
>> > > > Regarding maliciously delayed message or such - I don't fully
>> > understand
>> > > the
>> > > point; if an attacker has such capabilities she can simply
>> prevent/delay
>> > > devop users from updating the auth database itself thus enabling the
>> > > attack.
>> > >
>> > > In a typical attack, an attacker might gain control of a box on the
>> local
>> > > network, but not necessarily the Gateway, Traffic Ops, or Auth Server.
>> > > Those are probably better hardened. But lots of networks have a
>> squishy
>> > > test box that everyone forgot was there or something. The bad guy
>> wants
>> > to
>> > > use the CDN to DOS someone, or redirect traffic to somewhere
>> malicious,
>> > or
>> > > just cause mayhem. The longer he can keep control, the better for him.
>> > >
>> > > So this attacker uses the local box to sniff the token off the
>> network.
>> > If
>> > > the communication with the Gateway is encrypted, he might have to do
>> some
>> > > ARP poisoning or something else to trick a host into talking to the
>> local
>> > > box instead. (Properly implemented TLS also mitigates this angle.) He
>> knows
>> > > that as soon as he starts his nefarious deed, alarms are going to go
>> off,
>> > > so he also uses this local box to DOS the Auth Server. It's a lot
>> easier
>> > to
>> > > take a box down from the outside than to actually gain control.
>> > >
>> > > If the Gateway "fails open" when it can't contact the Auth server, the
>> > > attacker remains in control. If it "fails closed", the attacker has to
>> > > actually compromise the auth server (which is harder) to remain in
>> > control.
>> > >
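A small sketch of the "fails open" versus "fails closed" distinction quoted above, assuming the Gateway calls some auth client per request. `AuthUnavailable`, `allow_request`, and the `check_token` callable are hypothetical names used only for illustration.

```python
class AuthUnavailable(Exception):
    """Raised by the (hypothetical) auth client when the auth server is unreachable."""


def allow_request(token, check_token, fail_open=False):
    """Decide whether to let a request through.

    check_token is any callable that returns a truthy value for a valid token
    and raises AuthUnavailable when the auth server cannot be contacted.
    """
    try:
        return bool(check_token(token))
    except AuthUnavailable:
        # Failing closed (the default here) rejects the request during an outage.
        return fail_open
```

With `fail_open=False`, taking the auth server down buys the attacker nothing; with `fail_open=True`, a stolen token keeps working for as long as the outage lasts.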
>> > > > Do we block all API calls if the auth service is temporarily down
>> > (being
>> > > upgraded, container restarting, etc…)?
>> > >
>> > > Yes, I think we have to. Authentication is integral to reliable
>> > operation.
>> > >
>> > > We've been talking in some fairly wild hypotheticals, though. Is
>> there a
>> > > specific auth service you're envisioning?
>> > >
>> > > On Wed, May 10, 2017 at 12:50 AM Shmulik Asafi <shmul...@qwilt.com>
>> > wrote:
>> > >
>> > > > Regarding the communication issue Chris raised - there is more than
>> one
>> > > > possible pattern to this, e.g.:
>> > > >
>> > > >    - Blacklisted tokens can be communicated via a pub-sub mechanism
>> > > >    - The API gateway can poll for the latest list of tokens at a
>> > regular
>> > > >    interval (which can be very short ~1sec, much shorter than the
>> time
>> > it
>> > > >    takes devops to detect and react to malign tokens)
>> > > >
>> > > > Regarding hitting the blacklist datastore - this only sounds
>> similar to
>> > > > hitting the auth database; but the simplicity of a blacklist function
>> > > allows
>> > > > you to employ more efficient datastores, e.g. Redis or just a
>> hashmap
>> > in
>> > > > the API gateway process memory.
>> > > >
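For reference, a minimal sketch of the in-process hashmap variant mentioned above, assuming each revoked token carries an id and an expiry time. The class and method names are hypothetical, and how revocations reach the gateway (pub-sub or polling) is left out.

```python
import time


class TokenBlacklist:
    """In-memory map of revoked token ids to their expiry times (epoch seconds)."""

    def __init__(self):
        self._revoked = {}

    def revoke(self, token_id, expires_at):
        self._revoked[token_id] = expires_at

    def is_revoked(self, token_id, now=None):
        now = time.time() if now is None else now
        # Entries for tokens that have expired anyway are dropped,
        # so the map only holds revoked tokens that are still live.
        self._revoked = {t: exp for t, exp in self._revoked.items() if exp > now}
        return token_id in self._revoked
```

The lookup itself is cheap; the cost Chris raises is in getting the revocation into this map promptly and reliably.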
>> > > > Regarding maliciously delayed message or such - I don't fully
>> > understand
>> > > > the point; if an attacker has such capabilities she can simply
>> > > > prevent/delay devop users from updating the auth database itself
>> thus
>> > > > enabling the attack.
>> > > >
>> > > >
>> > > > On Wed, May 10, 2017 at 4:25 AM, Eric Friedrich (efriedri) <
>> > > > efrie...@cisco.com> wrote:
>> > > >
>> > > > > Our current management wrapper around Traffic Control (called OMD
>> > > > > Director, demo’d at last TC summit) uses a very similar approach
>> to
>> > > > > authentication.
>> > > > >
>> > > > > We have an auth service that issues a JWT. The JWT is then
>> provided
>> > > along
>> > > > > with all API calls. A few comments on our practical experience:
>> > > > >
>> > > > > - I am a supporter of validating tokens both in the API gateway
>> and
>> > in
>> > > > the
>> > > > > service. We have several examples of services- Grafana for
>> example,
>> > > that
>> > > > > require external authentication. Similarly, we have other services
>> > that
>> > > > > need finer grained authentication than API Gateway policy can
>> handle.
>> > > > > Specifically, a given user may have permissions to view/modify
>> some
>> > > > > delivery services but not others. The API gateway presumably would
>> > not
>> > > > > understand the semantics of payload so this decision would need
>> to be
>> > > > made
>> > > > > by auth within the service.
>> > > > >
>> > > > > - As brought up earlier, auth in the gateway is both a strength
>> and a
>> > > > > risk. Additional layer of security is also positive, but for my
>> case
>> > of
>> > > > > Grafana above it can present an opportunity to bypass
>> authentication.
>> > > > This
>> > > > > is a risk, but it can be mitigated by adding auth to the service
>> > where
>> > > > > needed.
>> > > > >
>> > > > > - Verifying tokens on every access may potentially be a little more
>> > > > > expensive than discussed. Often times every auth action must be
>> > > > accompanied
>> > > > > by DB writes for audit logs or callback functions. Not the straw
>> to
>> > > break
>> > > > > the camel’s back, but something to keep in mind.
>> > > > >
>> > > > > - There is also the problem of what to do if the underlying auth
>> > > service
>> > > > > is temporarily unavailable. Do we block all API calls if the auth
>> > > service
>> > > > > is temporarily down (being upgraded, container restarting, etc…)?
>> > > > >
>> > > > > - I’d like to see what we can do to use a pre-existing package as
>> an
>> > > API
>> > > > > Gateway. As we decompose TO into microservices, something like
>> nginx
>> > > can
>> > > > > provide additional benefits like TLS termination and load
>> balancing
>> > > > between
>> > > > > service endpoints. I’d hate to see us have to reimplement these
>> > > functions
>> > > > > later.
>> > > > >
>> > > > > - I’d also like to see us give some consideration to how an API
>> > gateway
>> > > > is
>> > > > > deployed. We raised the bar for new users by unbundling Traffic
>> Ops
>> > > from
>> > > > > the database and it could further complicate the installation if
>> we
>> > > don’t
>> > > > > provide enough guidance on how to deploy the API gateway in a lab
>> > > trial,
>> > > > if
>> > > > > not best practices for production deployment. Should we recommend
>> to
>> > > > deploy
>> > > > > as a new RPM/systemd service, an immutable container, or as part
>> of
>> > > the
>> > > > > existing TO RPM?
>> > > > >
>> > > > > —Eric
>> > > > >
>> > > > >
>> > > > > > On May 9, 2017, at 5:05 PM, Chris Lemmons <alfic...@gmail.com>
>> > > wrote:
>> > > > > >
>> > > > > > Blacklisting requires proactive communication between the
>> > > > authentication
>> > > > > > system and the gateway. Furthermore, the client can't be sure
>> that
>> > > > > > something hasn't been blacklisted recently (and the message
>> lost or
>> > > > > perhaps
>> > > > > > maliciously delayed) unless it checks whatever system it is that
>> > does
>> > > > the
>> > > > > > blacklisting. And if you're checking a datastore of some sort
>> for
>> > the
>> > > > > > validity of the token every time, you might as well just check
>> each
>> > > > time
>> > > > > > and skip the blacklisting step.
>> > > > > >
>> > > > > > On Tue, May 9, 2017 at 1:27 PM Shmulik Asafi <
>> shmul...@qwilt.com>
>> > > > wrote:
>> > > > > >
>> > > > > >> Hi,
>> > > > > >> Maybe a missing link here is another component in a jwt
>> stateless
>> > > > > >> architecture which is *blacklisting* malign tokens when
>> necessary.
>> > > > > >> This is obviously a sort of state which needs to be handled in
>> a
>> > > > > datastore;
>> > > > > >> but it's quite different and easy to scale and has less
>> > performance
>> > > > > impact
>> > > > > >> (I guess especially under DDOS) than doing full auth queries.
>> > > > > >> I believe this should be the approach on the API Gateway
>> roadmap
>> > > > > >> Thanks
>> > > > > >>
>> > > > > >> On 9 May 2017 21:14, "Chris Lemmons" <alfic...@gmail.com>
>> wrote:
>> > > > > >>
>> > > > > >>> I'll second the principle behind "start with security,
>> optimize
>> > > when
>> > > > > >>> there's a problem".
>> > > > > >>>
>> > > > > >>> It seems to me that in order to maintain security, basically
>> > > everyone
>> > > > > >> would
>> > > > > >>> need to dial the revalidate time so close to zero that it does
>> > very
>> > > > > >> little
>> > > > > >>> good as a cache on the credentials. Otherwise, as Rob has
>> pointed
>> > > out,
>> > > > > the
>> > > > > >>> TTL on your credential cache is effectively "how long am I ok
>> > with
>> > > > > >> hackers
>> > > > > >>> in control after I find them". Practically, it also means that
>> > much
>> > > > lag
>> > > > > >> on
>> > > > > >>> adding or removing permissions. That effectively means a
>> database
>> > > hit
>> > > > > for
>> > > > > >>> every query, or near enough to every query as not to matter.
>> > > > > >>>
>> > > > > >>> That said, you can get the best of multiple worlds, I think.
>> The
>> > > only
>> > > > > DB
>> > > > > >>> query that really has to be done is "give me the last update
>> time
>> > > for
>> > > > > >> this
>> > > > > >>> user". Compare that to the generation time in the token and
>> 99%
>> > of
>> > > > the
>> > > > > >>> time, it's the only query you need. With that check, you can
>> even
>> > > use
>> > > > > >>> fairly long-lived tokens. If anything about the user has
>> changed,
>> > > > > reject
>> > > > > >>> the token, generate a new one, send that to the user and use
>> it.
>> > > The
>> > > > > >>> regenerate step is somewhat expensive, but still well inside
>> > > > > reasonable,
>> > > > > >> I
>> > > > > >>> think.
>> > > > > >>>
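The "last update time" check quoted above could look roughly like this, assuming the token carries the standard JWT `iat` (issued at) and `sub` (subject) claims as epoch seconds and a user id. `last_updated_for` is a hypothetical stand-in for the single DB query.

```python
def token_is_current(token_issued_at, user_last_updated):
    """True when nothing about the user changed after the token was generated."""
    return user_last_updated <= token_issued_at


def authorize(claims, last_updated_for):
    """Accept the token only if the user record is older than the token.

    last_updated_for(user) stands in for the one query
    "give me the last update time for this user".
    """
    return token_is_current(claims["iat"], last_updated_for(claims["sub"]))
```

If the check fails, the caller would reject the token and issue a fresh one, which is the somewhat more expensive regenerate step mentioned above.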
>> > > > > >>> On Tue, May 9, 2017 at 11:31 AM Robert Butts <
>> > > > robert.o.bu...@gmail.com
>> > > > > >
>> > > > > >>> wrote:
>> > > > > >>>
>> > > > > >>>>> The TO service (and any other service that requires auth)
>> MUST
>> > > hit
>> > > > > >> the
>> > > > > >>>> database (or the auth service, which itself hits the
>> database)
>> > to
>> > > > > >> verify
>> > > > > >>>> valid tokens' users still have the permissions they did when
>> the
>> > > > token
>> > > > > >>> was
>> > > > > >>>> created. Otherwise, it's impossible to revoke tokens, e.g.
>> if an
>> > > > > >> employee
>> > > > > >>>> quits, or an attacker gains a token, or a user changes their
>> > > > password.
>> > > > > >>>>
>> > > > > >>>> I'm elaborating on this, and moving a discussion from a PR
>> > review
>> > > > > here.
>> > > > > >>>>
>> > > > > >>>> From the code submissions to the repo, it appears the current
>> > plan
>> > > > is
>> > > > > >> for
>> > > > > >>>> the API Gateway to create a JWT, and then for that JWT to be
>> > > > accepted
>> > > > > >> by
>> > > > > >>>> all Traffic Ops microservices, with no database
>> authentication.
>> > > > > >>>>
>> > > > > >>>> It's a common misconception that JWT allows you to authenticate
>> > > without
>> > > > > >>>> hitting the database. This is an exceedingly dangerous
>> > > > misconception.
>> > > > > >> If
>> > > > > >>>> you don't check the database when every authenticated route
>> is
>> > > > > >> requested,
>> > > > > >>>> it's impossible to revoke access. In practice, this means the
>> > JWT
>> > > > TTL
>> > > > > >>>> becomes the length of time _after you discover an attacker is
>> > > > > >>> manipulating
>> > > > > >>>> your production system_, before it's _possible_ to evict
>> them.
>> > > > > >>>>
>> > > > > >>>> How long do you feel is acceptable to have a hacker in and
>> > > > > manipulating
>> > > > > >>>> your system, after you discover them? A day? An hour? Five
>> > > minutes?
>> > > > > >>>> Whatever your TTL, that's the length of time you're willing
>> to
>> > > > allow a
>> > > > > >>>> hacker to steal and destroy you and your customers' data.
>> Worse,
>> > > > > >> because
>> > > > > >>>> this is a CDN, it's the length of time you're willing to
>> allow
>> > > your
>> > > > > CDN
>> > > > > >>> to
>> > > > > >>>> be used to DDOS a target.
>> > > > > >>>>
>> > > > > >>>> Are you going to explain in court that the DDOS your system
>> > > executed
>> > > > > >>> lasted
>> > > > > >>>> 24 hours, or 1 hour, or 10 minutes after you discovered it,
>> > > because
>> > > > > >>> that's
>> > > > > >>>> the TTL you hard-coded? Are you going to explain to a judge
>> and
>> > > > > >>> prosecuting
>> > > > > >>>> attorney exactly which sensitive data was stolen in the ten
>> > > minutes
>> > > > > >> after
>> > > > > >>>> you discovered the attacker in your system, before their JWT
>> > > > expired?
>> > > > > >>>>
>> > > > > >>>> If you're willing to accept the legal consequences, that's
>> your
>> > > > > >> business.
>> > > > > >>>> Apache Traffic Control should not require users to accept
>> those
>> > > > > >>>> consequences, and ideally shouldn't make it possible, as many
>> > > users
>> > > > > >> won't
>> > > > > >>>> understand the security risks.
>> > > > > >>>>
>> > > > > >>>> The argument has been made "authorization does not check the
>> > > > database
>> > > > > >> to
>> > > > > >>>> avoid congestion" -- Has anyone tested this in practice? The
>> > > > database
>> > > > > >>> query
>> > > > > >>>> itself is 50ms. Assuming your database and service are 2500km
>> > > apart,
>> > > > > >>> that's
>> > > > > >>>> another 50ms network latency. Traffic Ops has endpoints that
>> > take
>> > > > 10s
>> > > > > >> to
>> > > > > >>>> generate. Worst-case scenario, this will double the time of
>> tiny
>> > > > > >>> endpoints
>> > > > > >>>> to 200ms, and increase large endpoints inconsequentially.
>> It's
>> > > > highly
>> > > > > >>>> unlikely performance is an issue in practice.
>> > > > > >>>>
>> > > > > >>>> As Jan said, we can still have the services check the auth as
>> > well
>> > > > > >> after
>> > > > > >>>> the proxy auth. Moreover, the services don't even have to
>> know
>> > > about
>> > > > > >> the
>> > > > > >>>> auth service, they can hit a mapped route on the API Gateway,
>> > > which
>> > > > > >> gives
>> > > > > >>>> us better modularisation and separation of concerns.
>> > > > > >>>>
>> > > > > >>>> It's not difficult, it can be a trivial endpoint on the auth
>> > > > service,
>> > > > > >>>> remapped in the API Gateway, which takes the JWT token and
>> > returns
>> > > > > true
>> > > > > >>> if
>> > > > > >>>> it's still authorized in the database. To be clear, this is
>> not
>> > a
>> > > > > >> problem
>> > > > > >>>> today. Traffic Ops still uses the Mojolicious cookie today,
>> so
>> > > this
>> > > > > >> would
>> > > > > >>>> only need done if and when we remove that, or if we move
>> > > authorized
>> > > > > >>>> endpoints out of Traffic Ops into their own microservices.
>> > > > > >>>>
>> > > > > >>>> Considering the significant security and legal risks, we
>> should
>> > > > always
>> > > > > >>> hit
>> > > > > >>>> the database to validate requests of authorized endpoints,
>> and
>> > > > > >> reconsider
>> > > > > >>>> if and when someone observes performance issues in practice.
>> > > > > >>>>
>> > > > > >>>>
>> > > > > >>>> On Tue, May 9, 2017 at 6:56 AM, Dewayne Richardson <
>> > > > dewr...@gmail.com
>> > > > > >
>> > > > > >>>> wrote:
>> > > > > >>>>
>> > > > > >>>>> If only the API GW authenticates/authorizes we also have a
>> > single
>> > > > > >> point
>> > > > > >>>> of
>> > > > > >>>>> entry to test for security instead of having it sprinkled
>> > across
>> > > > > >>> services
>> > > > > >>>>> in different ways.  It also simplifies the code on the
>> service
>> > > side
>> > > > > >> and
>> > > > > >>>>> makes them easier to test with automation.
>> > > > > >>>>>
>> > > > > >>>>> -Dew
>> > > > > >>>>>
>> > > > > >>>>> On Mon, May 8, 2017 at 8:42 AM, Robert Butts <
>> > > > > >> robert.o.bu...@gmail.com
>> > > > > >>>>
>> > > > > >>>>> wrote:
>> > > > > >>>>>
>> > > > > >>>>>>> couldn't make nginx or http do what we need.
>> > > > > >>>>>>
>> > > > > >>>>>> I was suggesting a different architecture. Not making the
>> > proxy
>> > > do
>> > > > > >>>> auth,
>> > > > > >>>>>> only standard proxying.
>> > > > > >>>>>>
>> > > > > >>>>>>> We can still have the services check the auth as well
>> after
>> > the
>> > > > > >>> proxy
>> > > > > >>>>>> auth
>> > > > > >>>>>>
>> > > > > >>>>>> +1
>> > > > > >>>>>>
>> > > > > >>>>>>
>> > > > > >>>>>> On Mon, May 8, 2017 at 3:36 AM, Amir Yeshurun <
>> > am...@qwilt.com>
>> > > > > >>> wrote:
>> > > > > >>>>>>
>> > > > > >>>>>>> Hi,
>> > > > > >>>>>>>
>> > > > > >>>>>>> Let me elaborate some more on the purpose of the API GW. I
>> > will
>> > > > > >> put
>> > > > > >>>> up
>> > > > > >>>>> a
>> > > > > >>>>>>> wiki page following our discussions here.
>> > > > > >>>>>>>
>> > > > > >>>>>>> Main purpose is to allow innovation by creating new
>> services
>> > > that
>> > > > > >>>>> handle
>> > > > > >>>>>> TO
>> > > > > >>>>>>> functionality, not as a part of the monolithic Mojo app.
>> > > > > >>>>>>> The long term vision is to de-compose TO into multiple
>> > > > > >>> microservices,
>> > > > > >>>>>>> allowing new functionality to be added easily.
>> > > > > >>>>>>> Indeed, the goal is to eventually deprecate the current
>> AAA
>> > > > > >> model,
>> > > > > >>>> and
>> > > > > >>>>>>> replace it with the new AAA model currently under work
>> > > > > >> (user-roles,
>> > > > > >>>>>>> role-capabilities)
>> > > > > >>>>>>>
>> > > > > >>>>>>> I think that handling authorization in the API layer is a
>> > valid
>> > > > > >>>>> approach.
>> > > > > >>>>>>> Security wise, I don't see much difference between that,
>> and
>> > > > > >> having
>> > > > > >>>>> each
>> > > > > >>>>>>> module access the auth service, as long as the auth
>> service
>> > is
>> > > > > >>>> deployed
>> > > > > >>>>>> in
>> > > > > >>>>>>> the backend.
>> > > > > >>>>>>> Having another proxy (nginx?) fronting the world and
>> > forwarding
>> > > > > >> all
>> > > > > >>>>>>> requests to the backend GW mitigates the risk for
>> > compromising
>> > > > > >> the
>> > > > > >>>>>>> authorization service.
>> > > > > >>>>>>> However, as mentioned above, we can still have the
>> services
>> > > check
>> > > > > >>> the
>> > > > > >>>>>> auth
>> > > > > >>>>>>> as well after the proxy auth.
>> > > > > >>>>>>>
>> > > > > >>>>>>> It is a standalone process, completely optional at this
>> > point.
>> > > > > >> One
>> > > > > >>>> can
>> > > > > >>>>>>> choose to deploy it in order to allow integration with
>> > > additional
>> > > > > >>>>>>> services. Deployment
>> > > > > >>>>>>> and management are still T.B.D, and feedback on this is
>> most
>> > > > > >>> welcome.
>> > > > > >>>>>>>
>> > > > > >>>>>>> Regarding token validation and revocation:
>> > > > > >>>>>>> Tokens have expiration time. Expired tokens do not pass
>> token
>> > > > > >>>>> validation.
>> > > > > >>>>>>> In production, expiration should be set to relatively
>> short
>> > > time,
>> > > > > >>>> say 5
>> > > > > >>>>>>> minutes.
>> > > > > >>>>>>> This way revocation is automatic. Re-authentication is
>> > handled
>> > > > > >> via
>> > > > > >>>>>> refresh
>> > > > > >>>>>>> tokens (not implemented yet). Hitting the DB upon every
>> API
>> > > call
>> > > > > >>>> causes
>> > > > > >>>>>>> congestion on users DB.
>> > > > > >>>>>>> To avoid that, we chose to have all user information
>> > > > > >> self-contained
>> > > > > >>>>>> inside
>> > > > > >>>>>>> the JWT.
>> > > > > >>>>>>>
>> > > > > >>>>>>> Thanks
>> > > > > >>>>>>> /amiry
>> > > > > >>>>>>>
>> > > > > >>>>>>> On Mon, May 8, 2017 at 5:42 AM Jan van Doorn <
>> > j...@knutsel.com>
>> > > > > >>>> wrote:
>> > > > > >>>>>>>
>> > > > > >>>>>>>> It's the reverse proxy we've discussed for the "micro
>> > > services"
>> > > > > >>>>> version
>> > > > > >>>>>>> for
>> > > > > >>>>>>>> a while now (as in
>> > > > > >>>>>>>>
>> > > > > >>>>>>>> https://cwiki.apache.org/confluence/display/TC/Design+Overview+v3.0 ).
>> > > > > >>>>>>>>
>> > > > > >>>>>>>> On Sun, May 7, 2017 at 7:22 PM Eric Friedrich (efriedri)
>> <
>> > > > > >>>>>>>> efrie...@cisco.com>
>> > > > > >>>>>>>> wrote:
>> > > > > >>>>>>>>
>> > > > > >>>>>>>>> From a higher level- what is the purpose of the API Gateway?
>> > It
>> > > > > >>>> seems
>> > > > > >>>>>> like
>> > > > > >>>>>>>>> there may have been some previous discussions about API
>> > > > > >>> Gateway.
>> > > > > >>>>> Are
>> > > > > >>>>>>>> there
>> > > > > >>>>>>>>> any notes or description that I can catch up on?
>> > > > > >>>>>>>>>
>> > > > > >>>>>>>>> How will it be deployed? (Is it a standalone service or
>> > > > > >>> something
>> > > > > >>>>>> that
>> > > > > >>>>>>>>> runs inside the experimental Traffic Ops)?
>> > > > > >>>>>>>>>
>> > > > > >>>>>>>>> Is this new component required or optional?
>> > > > > >>>>>>>>>
>> > > > > >>>>>>>>> —Eric
>> > > > > >>>>>>>>>
>> > > > > >>>>>>>>>
>> > > > > >>>>>>>>>
>> > > > > >>>>>>>>>> On May 7, 2017, at 8:28 PM, Jan van Doorn <
>> > j...@knutsel.com
>> > > > > >>>
>> > > > > >>>>> wrote:
>> > > > > >>>>>>>>>>
>> > > > > >>>>>>>>>> I looked into this a year or so ago, and I couldn't
>> make
>> > > > > >>> nginx
>> > > > > >>>> or
>> > > > > >>>>>>> http
>> > > > > >>>>>>>> do
>> > > > > >>>>>>>>>> what we need.
>> > > > > >>>>>>>>>>
>> > > > > >>>>>>>>>> We can still have the services check the auth as well
>> > after
>> > > > > >>> the
>> > > > > >>>>>> proxy
>> > > > > >>>>>>>>> auth,
>> > > > > >>>>>>>>>> and make things better than today, where we have the
>> same
>> > > > > >>>> problem
>> > > > > >>>>>>> that
>> > > > > >>>>>>>> if
>> > > > > >>>>>>>>>> the TO mojo app is compromised, everything is
>> compromised.
>> > > > > >>>>>>>>>>
>> > > > > >>>>>>>>>> If we always route to TO, we don't untangle the mess of
>> > > > > >> being
>> > > > > >>>>>>> dependent
>> > > > > >>>>>>>>> on
>> > > > > >>>>>>>>>> the monolithic TO for everything. Many services today,
>> and
>> > > > > >>> more
>> > > > > >>>>> in
>> > > > > >>>>>>> the
>> > > > > >>>>>>>>>> future really just need a check to see if the user is
>> > > > > >>>> authorized,
>> > > > > >>>>>> and
>> > > > > >>>>>>>>>> nothing more.
>> > > > > >>>>>>>>>>
>> > > > > >>>>>>>>>> On Sun, May 7, 2017 at 11:55 AM Robert Butts <
>> > > > > >>>>>>> robert.o.bu...@gmail.com
>> > > > > >>>>>>>>>
>> > > > > >>>>>>>>>> wrote:
>> > > > > >>>>>>>>>>
>> > > > > >>>>>>>>>>> What are the advantages of these config files, over an
>> > > > > >>>> existing
>> > > > > >>>>>>>> reverse
>> > > > > >>>>>>>>>>> proxy, like Nginx or httpd? It's just as much work as
>> > > > > >>>>> configuring
>> > > > > >>>>>>> and
>> > > > > >>>>>>>>>>> deploying an existing product, but more code we have
>> to
>> > > > > >>> write
>> > > > > >>>>> and
>> > > > > >>>>>>>>> maintain.
>> > > > > >>>>>>>>>>> I'm having trouble seeing the advantage.
>> > > > > >>>>>>>>>>>
>> > > > > >>>>>>>>>>> -1 on auth rules as a part of the proxy. Making a
>> proxy
>> > > > > >> care
>> > > > > >>>>> about
>> > > > > >>>>>>>> auth
>> > > > > >>>>>>>>>>> violates the Single Responsibility Principle, and
>> > further,
>> > > > > >>> is
>> > > > > >>>> a
>> > > > > >>>>>>>> security
>> > > > > >>>>>>>>>>> risk. It creates unnecessary attack surface. If your
>> > proxy
>> > > > > >>> app
>> > > > > >>>>> or
>> > > > > >>>>>>>>> server is
>> > > > > >>>>>>>>>>> compromised, the entire framework is now compromised.
>> An
>> > > > > >>>>> attacker
>> > > > > >>>>>>>> could
>> > > > > >>>>>>>>>>> simply rewrite the proxy config to make all routes
>> > > > > >> no-auth.
>> > > > > >>>>>>>>>>>
>> > > > > >>>>>>>>>>> The simple alternative is for the proxy to always
>> route
>> > to
>> > > > > >>> TO,
>> > > > > >>>>> and
>> > > > > >>>>>>> TO
>> > > > > >>>>>>>>>>> checks the token against the auth service (which may
>> also
>> > > > > >> be
>> > > > > >>>>>>> proxied),
>> > > > > >>>>>>>>> and
>> > > > > >>>>>>>>>>> redirects unauthorized requests to a login endpoint
>> > (which
>> > > > > >>> may
>> > > > > >>>>>> also
>> > > > > >>>>>>> be
>> > > > > >>>>>>>>>>> proxied).
>> > > > > >>>>>>>>>>>
>> > > > > >>>>>>>>>>> The TO service (and any other service that requires
>> auth)
>> > > > > >>> MUST
>> > > > > >>>>> hit
>> > > > > >>>>>>> the
>> > > > > >>>>>>>>>>> database (or the auth service, which itself hits the
>> > > > > >>> database)
>> > > > > >>>>> to
>> > > > > >>>>>>>> verify
>> > > > > >>>>>>>>>>> valid tokens' users still have the permissions they
>> did
>> > > > > >> when
>> > > > > >>>> the
>> > > > > >>>>>>> token
>> > > > > >>>>>>>>> was
>> > > > > >>>>>>>>>>> created. Otherwise, it's impossible to revoke tokens,
>> > e.g.
>> > > > > >>> if
>> > > > > >>>> an
>> > > > > >>>>>>>>> employee
>> > > > > >>>>>>>>>>> quits, or an attacker gains a token, or a user changes
>> > > > > >> their
>> > > > > >>>>>>> password.
>> > > > > >>>>>>>>>>>
>> > > > > >>>>>>>>>>>
>> > > > > >>>>>>>>>>> On Sun, May 7, 2017 at 4:35 AM, Amir Yeshurun <
>> > > > > >>>> am...@qwilt.com>
>> > > > > >>>>>>>> wrote:
>> > > > > >>>>>>>>>>>
>> > > > > >>>>>>>>>>>> Seems that attachments are stripped on this list.
>> > > > > >> Examples
>> > > > > >>>>> pasted
>> > > > > >>>>>>>> below
>> > > > > >>>>>>>>>>>>
>> > > > > >>>>>>>>>>>> *rules.json*
>> > > > > >>>>>>>>>>>> [
>> > > > > >>>>>>>>>>>>   { "host": "localhost", "path": "/login", "forward": "localhost:9004", "scheme": "https", "auth": false },
>> > > > > >>>>>>>>>>>>   { "host": "localhost", "path": "/api/1.2/innovation/", "forward": "localhost:8004", "scheme": "http", "auth": true, "routes-file": "innovation.json" },
>> > > > > >>>>>>>>>>>>   { "host": "localhost", "path": "/api/1.2/", "forward": "localhost:3000", "scheme": "http", "auth": true, "routes-file": "traffic-ops-routes.json" },
>> > > > > >>>>>>>>>>>>   { "host": "localhost", "path": "/internal/api/1.2/", "forward": "localhost:3000", "scheme": "http", "auth": true, "routes-file": "internal-routes.json" }
>> > > > > >>>>>>>>>>>> ]
>> > > > > >>>>>>>>>>>>
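One possible reading of how a forwarding rule gets picked from rules.json, sketched under two assumptions that the example above does not confirm: rules are tried in file order, and a rule matches when the request path starts with its "path" value. `pick_rule` is a hypothetical helper.

```python
# Forwarding rules copied from the rules.json example ("host" and "routes-file" omitted).
RULES = [
    {"path": "/login", "forward": "localhost:9004", "scheme": "https", "auth": False},
    {"path": "/api/1.2/innovation/", "forward": "localhost:8004", "scheme": "http", "auth": True},
    {"path": "/api/1.2/", "forward": "localhost:3000", "scheme": "http", "auth": True},
    {"path": "/internal/api/1.2/", "forward": "localhost:3000", "scheme": "http", "auth": True},
]


def pick_rule(path, rules=RULES):
    """Return the first rule whose path prefix matches, or None if nothing matches."""
    for rule in rules:
        if path.startswith(rule["path"]):
            return rule
    return None
```

Under the first-match assumption, the order matters: "/api/1.2/innovation/" has to come before "/api/1.2/", as it does in the example.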
>> > > > > >>>>>>>>>>>> *traffic-ops-routes.json (partial)*
>> > > > > >>>>>>>>>>>> .
>> > > > > >>>>>>>>>>>> .
>> > > > > >>>>>>>>>>>> .
>> > > > > >>>>>>>>>>>>   { "match": "/cdns/health", "auth": { "GET": ["cdn-health-read"] }},
>> > > > > >>>>>>>>>>>>   { "match": "/cdns/capacity", "auth": { "GET": ["cdn-health-read"] }},
>> > > > > >>>>>>>>>>>>   { "match": "/cdns/usage/overview", "auth": { "GET": ["cdn-stats-read"] }},
>> > > > > >>>>>>>>>>>>   { "match": "/cdns/name/dnsseckeys/generate", "auth": { "GET": ["cdn-security-keys-read"] }},
>> > > > > >>>>>>>>>>>>   { "match": "/cdns/name/[^\/]+/?", "auth": { "GET": ["cdn-read"] }},
>> > > > > >>>>>>>>>>>>   { "match": "/cdns/name/[^\/]+/sslkeys", "auth": { "GET": ["cdn-security-keys-read"] }},
>> > > > > >>>>>>>>>>>>   { "match": "/cdns/name/[^\/]+/dnsseckeys", "auth": { "GET": ["cdn-security-keys-read"] }},
>> > > > > >>>>>>>>>>>>   { "match": "/cdns/name/[^\/]+/dnsseckeys/delete", "auth": { "GET": ["cdn-security-keys-write"] }},
>> > > > > >>>>>>>>>>>>   { "match": "/cdns/[^\/]+/queue_update", "auth": { "POST": ["queue-updates-write"] }},
>> > > > > >>>>>>>>>>>>   { "match": "/cdns/[^\/]+/snapshot", "auth": { "PUT": ["cdn-config-snapshot-write"] }},
>> > > > > >>>>>>>>>>>>   { "match": "/cdns/[^\/]+/health", "auth": { "GET": ["cdn-health-read"] }},
>> > > > > >>>>>>>>>>>>   { "match": "/cdns/[^\/]+/?", "auth": { "GET": ["cdn-read"], "PUT": ["cdn-write"], "PATCH": ["cdn-write"], "DELETE": ["cdn-write"] }},
>> > > > > >>>>>>>>>>>>   { "match": "/cdns", "auth": { "GET": ["cdn-read"], "POST": ["cdn-write"] }},
>> > > > > >>>>>>>>>>>>
>> > > > > >>>>>>>>>>>> .
>> > > > > >>>>>>>>>>>> .
>> > > > > >>>>>>>>>>>> .
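
To make the ordering of these entries concrete, here is a minimal Python
sketch of how such a rule list could be evaluated. It is not the gateway's
actual code. It assumes first-match-wins in file order, that a pattern must
match the whole request path, and that a matched route with no entry for
the HTTP method means "deny"; none of these is stated in the thread.
ROUTES and required_capabilities are hypothetical names, and only four of
the entries are copied (JSON's "\/" is written as a plain "/").

```python
import re

# A few entries copied from the traffic-ops-routes.json excerpt.
ROUTES = [
    {"match": r"/cdns/health", "auth": {"GET": ["cdn-health-read"]}},
    {"match": r"/cdns/[^/]+/snapshot",
     "auth": {"PUT": ["cdn-config-snapshot-write"]}},
    {"match": r"/cdns/[^/]+/?",
     "auth": {"GET": ["cdn-read"], "PUT": ["cdn-write"],
              "PATCH": ["cdn-write"], "DELETE": ["cdn-write"]}},
    {"match": r"/cdns", "auth": {"GET": ["cdn-read"], "POST": ["cdn-write"]}},
]


def required_capabilities(method, path, routes=ROUTES):
    """Return the capability list of the first route whose pattern matches
    the whole path, or None if no route or no method entry applies."""
    for route in routes:
        if re.fullmatch(route["match"], path):
            # Assumption: the first matching route decides, even when it
            # has no entry for this HTTP method.
            return route["auth"].get(method)
    return None


if __name__ == "__main__":
    print(required_capabilities("GET", "/cdns/health"))
    print(required_capabilities("DELETE", "/cdns/my-cdn"))
```

Under these assumptions the specific "/cdns/health" entry has to come
before the generic "/cdns/[^/]+/?" entry, since both patterns match that
path and only the first one is consulted.
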
>> > > > > >>>>>>>>>>>>
>> > > > > >>>>>>>>>>>>
>> > > > > >>>>>>>>>>>> On Sun, May 7, 2017 at 12:39 PM Amir Yeshurun
>> > > > > >>>>>>>>>>>> <am...@qwilt.com> wrote:
>> > > > > >>>>>>>>>>>>
>> > > > > >>>>>>>>>>>>> Attached please find examples of the forwarding
>> > > > > >>>>>>>>>>>>> rules file (rules.json) and the authorization
>> > > > > >>>>>>>>>>>>> rules file (traffic-ops-routes.json).
>> > > > > >>>>>>>>>>>>>
>> > > > > >>>>>>>>>>>>>
>> > > > > >>>>>>>>>>>>> On Sun, May 7, 2017 at 10:39 AM Amir Yeshurun
>> > > > > >>>>>>>>>>>>> <am...@qwilt.com> wrote:
>> > > > > >>>>>>>>>>>>>
>> > > > > >>>>>>>>>>>>>> Hi all,
>> > > > > >>>>>>>>>>>>>>
>> > > > > >>>>>>>>>>>>>> I am about to submit a PR with a first
>> > > > > >>>>>>>>>>>>>> operational version of the API GW, to the
>> > > > > >>>>>>>>>>>>>> "experimental" code base.
>> > > > > >>>>>>>>>>>>>>
>> > > > > >>>>>>>>>>>>>> The API GW forwarding logic is as follows:
>> > > > > >>>>>>>>>>>>>>
>> > > > > >>>>>>>>>>>>>>  1. Find host to forward the request: Prefix
>> > > > > >>>>>>>>>>>>>>  match on the request path against a list of
>> > > > > >>>>>>>>>>>>>>  forwarding rules. The matched forwarding rule
>> > > > > >>>>>>>>>>>>>>  defines the target's host, and the target's
>> > > > > >>>>>>>>>>>>>>  *authorization rules*.
>> > > > > >>>>>>>>>>>>>>  2. Authorization: Regex match on the request
>> > > > > >>>>>>>>>>>>>>  path against a list of *authorization rules*.
>> > > > > >>>>>>>>>>>>>>  The matched rule defines the required
>> > > > > >>>>>>>>>>>>>>  capabilities to perform the HTTP method on the
>> > > > > >>>>>>>>>>>>>>  route. These capabilities are compared against
>> > > > > >>>>>>>>>>>>>>  the user's capabilities in the user's JWT.
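
The first step and the final comparison described in the list can be
sketched in Python as follows. This is an illustration under stated
assumptions, not the code in the PR: "forward", "scheme" and "auth" are the
field names visible in the rules.json fragment, while the "match" prefix
field, the example paths, the JWT claim name "capabilities", and the
"all required capabilities must be present" policy are assumptions. JWT
signature verification is out of scope here; the claims are taken as
already verified.

```python
# Hypothetical forwarding rules, checked in order.
FORWARDING_RULES = [
    {"match": "/api/", "forward": "localhost:3000",
     "scheme": "http", "auth": True},
    {"match": "/", "forward": "localhost:3000",
     "scheme": "http", "auth": False},
]


def pick_forwarding_rule(path, rules=FORWARDING_RULES):
    """Step 1: return the first rule whose prefix matches the path."""
    for rule in rules:
        if path.startswith(rule["match"]):
            return rule
    return None


def is_authorized(required, jwt_claims):
    """End of step 2: compare the route's required capabilities with the
    capabilities carried in the (already verified) JWT claims."""
    if required is None:
        # No authorization rule matched: deny (assumption).
        return False
    granted = set(jwt_claims.get("capabilities", []))
    # Assumption: every required capability must be granted.
    return set(required) <= granted


if __name__ == "__main__":
    rule = pick_forwarding_rule("/api/1.2/cdns")
    claims = {"capabilities": ["cdn-read"]}
    print(rule["forward"], is_authorized(["cdn-read"], claims))
```

With a catch-all "/" rule that has "auth" set to false, non-API paths
would be forwarded without a capability check, while "/api/" paths would
go through the authorization step.
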
>> > > > > >>>>>>>>>>>>>>
>> > > > > >>>>>>>>>>>>>> At this moment, the 2 sets of rules are
>> > > > > >>>>>>>>>>>>>> hard-coded in JSON files. The files are provided
>> > > > > >>>>>>>>>>>>>> with the API GW distribution and contain
>> > > > > >>>>>>>>>>>>>> definitions for TC 2.0 API routes. I have tested
>> > > > > >>>>>>>>>>>>>> parts of the API; however, there might be
>> > > > > >>>>>>>>>>>>>> mistakes in some of the routes. Please be warned.
>> > > > > >>>>>>>>>>>>>>
>> > > > > >>>>>>>>>>>>>> Considering manageability and high availability,
>> > > > > >>>>>>>>>>>>>> I am aware that using local files for storing the
>> > > > > >>>>>>>>>>>>>> set of authorization rules is inferior to
>> > > > > >>>>>>>>>>>>>> centralized configuration.
>> > > > > >>>>>>>>>>>>>>
>> > > > > >>>>>>>>>>>>>> We are considering different approaches for
>> > > > > >>>>>>>>>>>>>> centralized configuration, having the following
>> > > > > >>>>>>>>>>>>>> points in mind:
>> > > > > >>>>>>>>>>>>>>
>> > > > > >>>>>>>>>>>>>>  - Microservice world: API GW will front
>> > > > > >>>>>>>>>>>>>>  multiple services, not only Mojo. It can also
>> > > > > >>>>>>>>>>>>>>  front other TC components like Traffic Stats
>> > > > > >>>>>>>>>>>>>>  and Traffic Monitor. Each service defines its
>> > > > > >>>>>>>>>>>>>>  own routes and capabilities. Here comes the
>> > > > > >>>>>>>>>>>>>>  question of what is the "source of truth" for
>> > > > > >>>>>>>>>>>>>>  the route definitions.
>> > > > > >>>>>>>>>>>>>>  - Handling private routes. API GW may front
>> > > > > >>>>>>>>>>>>>>  non-TC services.
>> > > > > >>>>>>>>>>>>>>  - User changes to the AAA scheme. The ability
>> > > > > >>>>>>>>>>>>>>  for an admin user to make changes in the
>> > > > > >>>>>>>>>>>>>>  required capabilities of a route, maybe even
>> > > > > >>>>>>>>>>>>>>  define new capability names, was raised in the
>> > > > > >>>>>>>>>>>>>>  past as a use case that should be supported.
>> > > > > >>>>>>>>>>>>>>  - Easy development and deployment of new services.
>> > > > > >>>>>>>>>>>>>>  - Using TO DB for expediency.
>> > > > > >>>>>>>>>>>>>>
>> > > > > >>>>>>>>>>>>>> I would appreciate any feedback, and your views
>> > > > > >>>>>>>>>>>>>> on how to manage route definitions.
>> > > > > >>>>>>>>>>>>>>
>> > > > > >>>>>>>>>>>>>> Thanks
>> > > > > >>>>>>>>>>>>>> /amiry
>> > > > > >>>>>>>>>>>>>>
>> > > > > >>>>>>>>>>>>>
>> > > > > >>>>>>>>>>>>
>> > > > > >>>>>>>>>>>
>> > > > > >>>>>>>>>
>> > > > > >>>>>>>>>
>> > > > > >>>>>>>>
>> > > > > >>>>>>>
>> > > > > >>>>>>
>> > > > > >>>>>
>> > > > > >>>>
>> > > > > >>>
>> > > > > >>
>> > > > >
>> > > > >
>> > > >
>> > > >
>> > > > --
>> > > > *Shmulik Asafi*
>> > > > Qwilt | Work: +972-72-2221692 | Mobile: +972-54-6581595 |
>> > > > shmul...@qwilt.com
>> > > >
>> > >
>> >
>>
>
>
