Re: API GW route configuration

Chris Lemmons Tue, 09 May 2017 14:06:41 -0700

Blacklisting requires proactive communication between the authentication
system and the gateway. Furthermore, the client can't be sure that
something hasn't been blacklisted recently (and the message lost or perhaps
maliciously delayed) unless it checks whatever system it is that does the
blacklisting. And if you're checking a datastore of some sort for the
validity of the token every time, you might as well just check each time
and skip the blacklisting step.
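
A minimal sketch of what such a per-request check could look like, using the "last update time" comparison proposed earlier in the thread (quoted below). The `Token` type, the `LAST_UPDATED` table and `token_is_current` are hypothetical names for illustration only, not anything from the Traffic Ops code base, and signature and expiry verification are assumed to have happened already:

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Token:
    """Hypothetical decoded token (signature already verified)."""
    user: str
    issued_at: datetime


# Hypothetical stand-in for the one DB lookup: user -> last time anything
# about that user (password, roles, disabled flag) changed.
LAST_UPDATED = {}


def token_is_current(token: Token) -> bool:
    """Accept the token only if the user is unchanged since it was issued."""
    last_updated = LAST_UPDATED.get(token.user)
    if last_updated is None:
        return False  # unknown or deleted user: reject
    return token.issued_at >= last_updated
```

Revoking access then amounts to bumping the user's last-update time; every token issued before that moment fails the comparison on its next use.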


On Tue, May 9, 2017 at 1:27 PM Shmulik Asafi <shmul...@qwilt.com> wrote:

> Hi,
> Maybe a missing link here is another component in a jwt stateless
> architecture which is *blacklisting* malign tokens when necessary.
> This is obviously a sort of state which needs to be handled in a datastore;
> but it's quite different and easy to scale and has less performance impact
> (I guess especially under DDOS) than doing full auth queries.
> I believe this should be the approach on the API Gateway roadmap
> Thanks
>
> On 9 May 2017 21:14, "Chris Lemmons" <alfic...@gmail.com> wrote:
>
> > I'll second the principle behind "start with security, optimize when
> > there's a problem".
> >
> > It seems to me that in order to maintain security, basically everyone
> > would need to dial the revalidate time so close to zero that it does
> > very little good as a cache on the credentials. Otherwise, as Rob has
> > pointed out, the TTL on your credential cache is effectively "how long
> > am I ok with hackers in control after I find them". Practically, it
> > also means that much lag on adding or removing permissions. That
> > effectively means a database hit for every query, or near enough to
> > every query as not to matter.
> >
> > That said, you can get the best of multiple worlds, I think. The only
> > DB query that really has to be done is "give me the last update time
> > for this user". Compare that to the generation time in the token and
> > 99% of the time, it's the only query you need. With that check, you
> > can even use fairly long-lived tokens. If anything about the user has
> > changed, reject the token, generate a new one, send that to the user
> > and use it. The regenerate step is somewhat expensive, but still well
> > inside reasonable, I think.
> >
> > On Tue, May 9, 2017 at 11:31 AM Robert Butts <robert.o.bu...@gmail.com>
> > wrote:
> >
> > > > The TO service (and any other service that requires auth) MUST hit
> > > > the database (or the auth service, which itself hits the database)
> > > > to verify valid tokens' users still have the permissions they did
> > > > when the token was created. Otherwise, it's impossible to revoke
> > > > tokens, e.g. if an employee quits, or an attacker gains a token, or
> > > > a user changes their password.
> > >
> > > I'm elaborating on this, and moving a discussion from a PR review here.
> > >
> > > From the code submissions to the repo, it appears the current plan is
> > > for the API Gateway to create a JWT, and then for that JWT to be
> > > accepted by all Traffic Ops microservices, with no database
> > > authentication.
> > >
> > > It's a common misconception that JWT allows you to authenticate
> > > without hitting the database. This is an exceedingly dangerous
> > > misconception. If you don't check the database when every
> > > authenticated route is requested, it's impossible to revoke access.
> > > In practice, this means the JWT TTL becomes the length of time _after
> > > you discover an attacker is manipulating your production system_,
> > > before it's _possible_ to evict them.
> > >
> > > How long do you feel is acceptable to have a hacker in and
> > > manipulating your system, after you discover them? A day? An hour?
> > > Five minutes? Whatever your TTL, that's the length of time you're
> > > willing to allow a hacker to steal and destroy your and your
> > > customers' data. Worse, because this is a CDN, it's the length of
> > > time you're willing to allow your CDN to be used to DDOS a target.
> > >
> > > Are you going to explain in court that the DDOS your system executed
> > > lasted 24 hours, or 1 hour, or 10 minutes after you discovered it,
> > > because that's the TTL you hard-coded? Are you going to explain to a
> > > judge and prosecuting attorney exactly which sensitive data was
> > > stolen in the ten minutes after you discovered the attacker in your
> > > system, before their JWT expired?
> > >
> > > If you're willing to accept the legal consequences, that's your
> > > business. Apache Traffic Control should not require users to accept
> > > those consequences, and ideally shouldn't make it possible, as many
> > > users won't understand the security risks.
> > >
> > > The argument has been made "authorization does not check the database
> > > to avoid congestion" -- Has anyone tested this in practice? The
> > > database query itself is 50ms. Assuming your database and service are
> > > 2500km apart, that's another 50ms network latency. Traffic Ops has
> > > endpoints that take 10s to generate. Worst-case scenario, this will
> > > double the time of tiny endpoints to 200ms, and increase large
> > > endpoints inconsequentially. It's highly unlikely performance is an
> > > issue in practice.
> > >
> > > As Jan said, we can still have the services check the auth as well
> > > after the proxy auth. Moreover, the services don't even have to know
> > > about the auth service; they can hit a mapped route on the API
> > > Gateway, which gives us better modularisation and separation of
> > > concerns.
> > >
> > > It's not difficult: it can be a trivial endpoint on the auth service,
> > > remapped in the API Gateway, which takes the JWT token and returns
> > > true if it's still authorized in the database. To be clear, this is
> > > not a problem today. Traffic Ops still uses the Mojolicious cookie
> > > today, so this would only need to be done if and when we remove that,
> > > or if we move authorized endpoints out of Traffic Ops into their own
> > > microservices.
> > >
> > > Considering the significant security and legal risks, we should
> > > always hit the database to validate requests of authorized endpoints,
> > > and reconsider if and when someone observes performance issues in
> > > practice.
> > >
> > >
> > > On Tue, May 9, 2017 at 6:56 AM, Dewayne Richardson <dewr...@gmail.com>
> > > wrote:
> > >
> > > > If only the API GW authenticates/authorizes we also have a single
> > > > point of entry to test for security instead of having it sprinkled
> > > > across services in different ways. It also simplifies the code on
> > > > the service side and makes them easier to test with automation.
> > > >
> > > > -Dew
> > > >
> > > > On Mon, May 8, 2017 at 8:42 AM, Robert Butts <robert.o.bu...@gmail.com>
> > > > wrote:
> > > >
> > > > > > couldn't make nginx or http do what we need.
> > > > >
> > > > > I was suggesting a different architecture. Not making the proxy do
> > > > > auth, only standard proxying.
> > > > >
> > > > > > We can still have the services check the auth as well after the
> > > > > > proxy auth
> > > > >
> > > > > +1
> > > > >
> > > > >
> > > > > On Mon, May 8, 2017 at 3:36 AM, Amir Yeshurun <am...@qwilt.com> wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > Let me elaborate some more on the purpose of the API GW. I will
> > > > > > put up a wiki page following our discussions here.
> > > > > >
> > > > > > Main purpose is to allow innovation by creating new services that
> > > > > > handle TO functionality, not as a part of the monolithic Mojo app.
> > > > > > The long term vision is to de-compose TO into multiple
> > > > > > microservices, allowing new functionality to be added easily.
> > > > > > Indeed, the goal is to eventually deprecate the current AAA model,
> > > > > > and replace it with the new AAA model currently under work
> > > > > > (user-roles, role-capabilities).
> > > > > >
> > > > > > I think that handling authorization in the API layer is a valid
> > > > > > approach. Security wise, I don't see much difference between that,
> > > > > > and having each module access the auth service, as long as the
> > > > > > auth service is deployed in the backend.
> > > > > > Having another proxy (nginx?) fronting the world and forwarding
> > > > > > all requests to the backend GW mitigates the risk of compromising
> > > > > > the authorization service.
> > > > > > However, as mentioned above, we can still have the services check
> > > > > > the auth as well after the proxy auth.
> > > > > >
> > > > > > It is a standalone process, completely optional at this point. One
> > > > > > can choose to deploy it in order to allow integration with
> > > > > > additional services. Deployment and management are still T.B.D,
> > > > > > and feedback on this is most welcome.
> > > > > >
> > > > > > Regarding token validation and revocation:
> > > > > > Tokens have an expiration time. Expired tokens do not pass token
> > > > > > validation. In production, expiration should be set to a
> > > > > > relatively short time, say 5 minutes. This way revocation is
> > > > > > automatic. Re-authentication is handled via refresh tokens (not
> > > > > > implemented yet). Hitting the DB upon every API call causes
> > > > > > congestion on the users DB. To avoid that, we chose to have all
> > > > > > user information self-contained inside the JWT.
> > > > > >
> > > > > > Thanks
> > > > > > /amiry
> > > > > >
> > > > > > On Mon, May 8, 2017 at 5:42 AM Jan van Doorn <j...@knutsel.com> wrote:
> > > > > >
> > > > > > > It's the reverse proxy we've discussed for the "micro services"
> > > > > > > version for a while now (as in
> > > > > > > https://cwiki.apache.org/confluence/display/TC/Design+Overview+v3.0).
> > > > > > >
> > > > > > > On Sun, May 7, 2017 at 7:22 PM Eric Friedrich (efriedri)
> > > > > > > <efrie...@cisco.com> wrote:
> > > > > > >
> > > > > > > > From a higher level, what is the purpose of the API Gateway?
> > > > > > > > It seems like there may have been some previous discussions
> > > > > > > > about API Gateway. Are there any notes or description that I
> > > > > > > > can catch up on?
> > > > > > > >
> > > > > > > > How will it be deployed? (Is it a standalone service or
> > > > > > > > something that runs inside the experimental Traffic Ops)?
> > > > > > > >
> > > > > > > > Is this new component required or optional?
> > > > > > > >
> > > > > > > > —Eric
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > > On May 7, 2017, at 8:28 PM, Jan van Doorn <j...@knutsel.com> wrote:
> > > > > > > > >
> > > > > > > > > I looked into this a year or so ago, and I couldn't make nginx
> > > > > > > > > or http do what we need.
> > > > > > > > >
> > > > > > > > > We can still have the services check the auth as well after the
> > > > > > > > > proxy auth, and make things better than today, where we have
> > > > > > > > > the same problem that if the TO mojo app is compromised,
> > > > > > > > > everything is compromised.
> > > > > > > > >
> > > > > > > > > If we always route to TO, we don't untangle the mess of being
> > > > > > > > > dependent on the monolithic TO for everything. Many services
> > > > > > > > > today, and more in the future really just need a check to see
> > > > > > > > > if the user is authorized, and nothing more.
> > > > > > > > >
> > > > > > > > > On Sun, May 7, 2017 at 11:55 AM Robert Butts
> > > > > > > > > <robert.o.bu...@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > >> What are the advantages of these config files, over an
> > > > > > > > >> existing reverse proxy, like Nginx or httpd? It's just as
> > > > > > > > >> much work as configuring and deploying an existing product,
> > > > > > > > >> but more code we have to write and maintain. I'm having
> > > > > > > > >> trouble seeing the advantage.
> > > > > > > > >>
> > > > > > > > >> -1 on auth rules as a part of the proxy. Making a proxy care
> > > > > > > > >> about auth violates the Single Responsibility Principle, and
> > > > > > > > >> further, is a security risk. It creates unnecessary attack
> > > > > > > > >> surface. If your proxy app or server is compromised, the
> > > > > > > > >> entire framework is now compromised. An attacker could simply
> > > > > > > > >> rewrite the proxy config to make all routes no-auth.
> > > > > > > > >>
> > > > > > > > >> The simple alternative is for the proxy to always route to
> > > > > > > > >> TO, and TO checks the token against the auth service (which
> > > > > > > > >> may also be proxied), and redirects unauthorized requests to
> > > > > > > > >> a login endpoint (which may also be proxied).
> > > > > > > > >>
> > > > > > > > >> The TO service (and any other service that requires auth)
> > > > > > > > >> MUST hit the database (or the auth service, which itself hits
> > > > > > > > >> the database) to verify valid tokens' users still have the
> > > > > > > > >> permissions they did when the token was created. Otherwise,
> > > > > > > > >> it's impossible to revoke tokens, e.g. if an employee quits,
> > > > > > > > >> or an attacker gains a token, or a user changes their
> > > > > > > > >> password.
> > > > > > > > >>
> > > > > > > > >>
> > > > > > > > >> On Sun, May 7, 2017 at 4:35 AM, Amir Yeshurun <am...@qwilt.com> wrote:
> > > > > > > > >>
> > > > > > > > >>> Seems that attachments are stripped on this list. Examples
> > > > > > > > >>> pasted below
> > > > > > > > >>>
> > > > > > > > >>> *rules.json*
> > > > > > > > >>> [
> > > > > > > > >>>    { "host": "localhost", "path": "/login",
> > > > > > > > >>>      "forward": "localhost:9004", "scheme": "https",
> > > > > > > > >>>      "auth": false },
> > > > > > > > >>>    { "host": "localhost", "path": "/api/1.2/innovation/",
> > > > > > > > >>>      "forward": "localhost:8004", "scheme": "http",
> > > > > > > > >>>      "auth": true, "routes-file": "innovation.json" },
> > > > > > > > >>>    { "host": "localhost", "path": "/api/1.2/",
> > > > > > > > >>>      "forward": "localhost:3000", "scheme": "http",
> > > > > > > > >>>      "auth": true, "routes-file": "traffic-ops-routes.json" },
> > > > > > > > >>>    { "host": "localhost", "path": "/internal/api/1.2/",
> > > > > > > > >>>      "forward": "localhost:3000", "scheme": "http",
> > > > > > > > >>>      "auth": true, "routes-file": "internal-routes.json" }
> > > > > > > > >>> ]
> > > > > > > > >>>
> > > > > > > > >>> *traffic-ops-routes.json (partial)*
> > > > > > > > >>> .
> > > > > > > > >>> .
> > > > > > > > >>> .
> > > > > > > > >>>    { "match": "/cdns/health",
> > > > > > > > >>>      "auth": { "GET": ["cdn-health-read"] }},
> > > > > > > > >>>    { "match": "/cdns/capacity",
> > > > > > > > >>>      "auth": { "GET": ["cdn-health-read"] }},
> > > > > > > > >>>    { "match": "/cdns/usage/overview",
> > > > > > > > >>>      "auth": { "GET": ["cdn-stats-read"] }},
> > > > > > > > >>>    { "match": "/cdns/name/dnsseckeys/generate",
> > > > > > > > >>>      "auth": { "GET": ["cdn-security-keys-read"] }},
> > > > > > > > >>>    { "match": "/cdns/name/[^\/]+/?",
> > > > > > > > >>>      "auth": { "GET": ["cdn-read"] }},
> > > > > > > > >>>    { "match": "/cdns/name/[^\/]+/sslkeys",
> > > > > > > > >>>      "auth": { "GET": ["cdn-security-keys-read"] }},
> > > > > > > > >>>    { "match": "/cdns/name/[^\/]+/dnsseckeys",
> > > > > > > > >>>      "auth": { "GET": ["cdn-security-keys-read"] }},
> > > > > > > > >>>    { "match": "/cdns/name/[^\/]+/dnsseckeys/delete",
> > > > > > > > >>>      "auth": { "GET": ["cdn-security-keys-write"] }},
> > > > > > > > >>>    { "match": "/cdns/[^\/]+/queue_update",
> > > > > > > > >>>      "auth": { "POST": ["queue-updates-write"] }},
> > > > > > > > >>>    { "match": "/cdns/[^\/]+/snapshot",
> > > > > > > > >>>      "auth": { "PUT": ["cdn-config-snapshot-write"] }},
> > > > > > > > >>>    { "match": "/cdns/[^\/]+/health",
> > > > > > > > >>>      "auth": { "GET": ["cdn-health-read"] }},
> > > > > > > > >>>    { "match": "/cdns/[^\/]+/?",
> > > > > > > > >>>      "auth": { "GET": ["cdn-read"], "PUT": ["cdn-write"],
> > > > > > > > >>>                "PATCH": ["cdn-write"], "DELETE": ["cdn-write"] }},
> > > > > > > > >>>    { "match": "/cdns",
> > > > > > > > >>>      "auth": { "GET": ["cdn-read"], "POST": ["cdn-write"] }},
> > > > > > > > >>>
> > > > > > > > >>> .
> > > > > > > > >>> .
> > > > > > > > >>> .
> > > > > > > > >>>
> > > > > > > > >>>
> > > > > > > > >>> On Sun, May 7, 2017 at 12:39 PM Amir Yeshurun <am...@qwilt.com> wrote:
> > > > > > > > >>>
> > > > > > > > >>>> Attached please find examples for forwarding rules file
> > > > > > > > >>>> (rules.json) and the authorization rules file
> > > > > > > > >>>> (traffic-ops-routes.json)
> > > > > > > > >>>>
> > > > > > > > >>>>
> > > > > > > > >>>> On Sun, May 7, 2017 at 10:39 AM Amir Yeshurun <am...@qwilt.com> wrote:
> > > > > > > > >>>>
> > > > > > > > >>>>> Hi all,
> > > > > > > > >>>>>
> > > > > > > > >>>>> I am about to submit a PR with a first operational version
> > > > > > > > >>>>> of the API GW, to the "experimental" code base.
> > > > > > > > >>>>>
> > > > > > > > >>>>> The API GW forwarding logic is as follows:
> > > > > > > > >>>>>
> > > > > > > > >>>>>   1. Find host to forward the request: Prefix match on the
> > > > > > > > >>>>>   request path against a list of forwarding rules. The
> > > > > > > > >>>>>   matched forwarding rule defines the target's host, and
> > > > > > > > >>>>>   the target's *authorization rules*.
> > > > > > > > >>>>>   2. Authorization: Regex match on the request path against
> > > > > > > > >>>>>   a list of *authorization rules*. The matched rule defines
> > > > > > > > >>>>>   the required capabilities to perform the HTTP method on
> > > > > > > > >>>>>   the route. These capabilities are compared against the
> > > > > > > > >>>>>   user's capabilities in the user's JWT
> > > > > > > > >>>>>
> > > > > > > > >>>>> At this moment, the 2 sets of rules are hard-coded in json
> > > > > > > > >>>>> files. The files are provided with the API GW distribution
> > > > > > > > >>>>> and contain definitions for TC 2.0 API routes. I have tested
> > > > > > > > >>>>> parts of the API, however, there might be mistakes in some
> > > > > > > > >>>>> of the routes. Please be warned.
> > > > > > > > >>>>>
> > > > > > > > >>>>> Considering manageability and high availability, I am aware
> > > > > > > > >>>>> that using local files for storing the set of authorization
> > > > > > > > >>>>> rules is inferior to centralized configuration.
> > > > > > > > >>>>>
> > > > > > > > >>>>> We are considering different approaches for centralized
> > > > > > > > >>>>> configuration, having the following points in mind:
> > > > > > > > >>>>>
> > > > > > > > >>>>>   - Microservice world: API GW will front multiple services,
> > > > > > > > >>>>>   not only Mojo. It can also front other TC components like
> > > > > > > > >>>>>   Traffic Stats and Traffic Monitor. Each service defines
> > > > > > > > >>>>>   its own routes and capabilities. Here comes the question
> > > > > > > > >>>>>   of what is the "source of truth" for the route
> > > > > > > > >>>>>   definitions.
> > > > > > > > >>>>>   - Handling private routes. API GW may front non-TC
> > > > > > > > >>>>>   services.
> > > > > > > > >>>>>   - User changes to the AAA scheme. The ability for an admin
> > > > > > > > >>>>>   user to make changes in the required capabilities of a
> > > > > > > > >>>>>   route, maybe even define new capability names, was raised
> > > > > > > > >>>>>   in the past as a use case that should be supported.
> > > > > > > > >>>>>   - Easy development and deployment of new services.
> > > > > > > > >>>>>   - Using TO DB for expediency.
> > > > > > > > >>>>>
> > > > > > > > >>>>> I would appreciate any feedback and views on your approach
> > > > > > > > >>>>> to manage route definitions.
> > > > > > > > >>>>>
> > > > > > > > >>>>> Thanks
> > > > > > > > >>>>> /amiry
> > > > > > > > >>>>>
> > > > > > > > >>>>
> > > > > > > > >>>
> > > > > > > > >>
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
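
For reference, the two-step forwarding logic from the first message in the thread (prefix match on the forwarding rules, then regex match on the authorization rules) could be sketched roughly as below. This is an illustration under stated assumptions, not the API GW implementation: the rule data is trimmed from the pasted examples, and the `route` function, the `routes` key, first-match-wins ordering, and matching the regex against the part of the path after the prefix are all hypothetical choices.

```python
import re

# Trimmed from the rules.json example quoted above; "routes" stands in for
# the "routes-file" key (a hypothetical simplification).
FORWARDING_RULES = [
    {"path": "/login", "forward": "localhost:9004", "auth": False},
    {"path": "/api/1.2/innovation/", "forward": "localhost:8004",
     "auth": True, "routes": "innovation.json"},
    {"path": "/api/1.2/", "forward": "localhost:3000",
     "auth": True, "routes": "traffic-ops-routes.json"},
]

# Trimmed from the traffic-ops-routes.json example quoted above.
AUTH_RULES = {
    "traffic-ops-routes.json": [
        {"match": r"/cdns/[^/]+/queue_update",
         "auth": {"POST": ["queue-updates-write"]}},
        {"match": r"/cdns/[^/]+/health",
         "auth": {"GET": ["cdn-health-read"]}},
        {"match": r"/cdns",
         "auth": {"GET": ["cdn-read"], "POST": ["cdn-write"]}},
    ],
}


def route(method, path, capabilities):
    """Return the forward target, or None if the request is refused."""
    # Step 1: prefix match on the request path (assumed: first match wins).
    rule = next((r for r in FORWARDING_RULES
                 if path.startswith(r["path"])), None)
    if rule is None:
        return None
    if not rule["auth"]:
        return rule["forward"]
    # Step 2: regex match on the rest of the path (assumed: full match
    # against the part after the prefix, first match wins).
    rest = "/" + path[len(rule["path"]):]
    for auth_rule in AUTH_RULES.get(rule["routes"], []):
        if re.fullmatch(auth_rule["match"], rest):
            required = auth_rule["auth"].get(method)
            if required is None:
                return None  # method not listed for this route
            # Compare against the capabilities carried in the user's JWT.
            if set(required) <= set(capabilities):
                return rule["forward"]
            return None
    return None  # no authorization rule matched: refuse
```

With these rules, `route("GET", "/api/1.2/cdns/foo/health", ["cdn-health-read"])` returns `localhost:3000`, while the same request with an empty capability list is refused. Note that the capabilities here come only from the token, which is exactly the point the rest of the thread argues about.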
