Re: API GW route configuration

Shmulik Asafi Tue, 09 May 2017 12:27:51 -0700

Hi,
Maybe a missing link here is another component in a JWT stateless
architecture, which is *blacklisting* malicious tokens when necessary.
This is obviously a sort of state which needs to be handled in a datastore,
but it's quite different, easy to scale, and has less performance impact
(I guess especially under DDOS) than doing full auth queries.
I believe this should be the approach on the API Gateway roadmap.
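
As a rough illustration of the blacklist idea, here is a minimal Python sketch.
It is not code from the API Gateway: the class and function names are
hypothetical, the dict stands in for a shared datastore, and the token is
assumed to be already signature-verified and to carry the standard JWT "jti"
(token ID) and "exp" (expiry) claims.

```python
import time


class TokenBlacklist:
    """In-memory stand-in (assumption) for a shared datastore of revoked token IDs."""

    def __init__(self):
        # Maps token ID (the JWT "jti" claim) to that token's expiry time.
        self._revoked = {}

    def revoke(self, token_id, expires_at):
        """Blacklist a token until its own expiry; after that it is invalid anyway."""
        self._revoked[token_id] = expires_at

    def is_revoked(self, token_id, now=None):
        now = time.time() if now is None else now
        # Drop entries whose tokens have expired, so the list stays small.
        self._revoked = {t: exp for t, exp in self._revoked.items() if exp > now}
        return token_id in self._revoked


def is_request_allowed(claims, blacklist, now=None):
    """Accept already-verified claims unless they are expired or blacklisted."""
    now = time.time() if now is None else now
    if claims["exp"] <= now:
        return False
    return not blacklist.is_revoked(claims["jti"], now)
```

The lookup is a single key check, and entries only need to live as long as the
token they revoke, which is why this state tends to stay small compared to a
full user/permission query.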
Thanks


On 9 May 2017 21:14, "Chris Lemmons" <alfic...@gmail.com> wrote:

> I'll second the principle behind "start with security, optimize when
> there's a problem".
>
> It seems to me that in order to maintain security, basically everyone would
> need to dial the revalidate time so close to zero that it does very little
> good as a cache on the credentials. Otherwise, as Rob has pointed out, the
> TTL on your credential cache is effectively "how long am I ok with hackers
> in control after I find them". Practically, it also means that much lag on
> adding or removing permissions. That effectively means a database hit for
> every query, or near enough to every query as not to matter.
>
> That said, you can get the best of multiple worlds, I think. The only DB
> query that really has to be done is "give me the last update time for this
> user". Compare that to the generation time in the token and 99% of the
> time, it's the only query you need. With that check, you can even use
> fairly long-lived tokens. If anything about the user has changed, reject
> the token, generate a new one, send that to the user and use it. The
> regenerate step is somewhat expensive, but still well inside reasonable, I
> think.
>
> On Tue, May 9, 2017 at 11:31 AM Robert Butts <robert.o.bu...@gmail.com>
> wrote:
>
> > > The TO service (and any other service that requires auth) MUST hit the
> > > database (or the auth service, which itself hits the database) to verify
> > > valid tokens' users still have the permissions they did when the token was
> > > created. Otherwise, it's impossible to revoke tokens, e.g. if an employee
> > > quits, or an attacker gains a token, or a user changes their password.
> >
> > I'm elaborating on this, and moving a discussion from a PR review here.
> >
> > From the code submissions to the repo, it appears the current plan is for
> > the API Gateway to create a JWT, and then for that JWT to be accepted by
> > all Traffic Ops microservices, with no database authentication.
> >
> > It's a common misconception that JWT allows you to authenticate without
> > hitting the database. This is an exceedingly dangerous misconception. If
> > you don't check the database when every authenticated route is requested,
> > it's impossible to revoke access. In practice, this means the JWT TTL
> > becomes the length of time _after you discover an attacker is manipulating
> > your production system_, before it's _possible_ to evict them.
> >
> > How long do you feel is acceptable to have a hacker in and manipulating
> > your system, after you discover them? A day? An hour? Five minutes?
> > Whatever your TTL, that's the length of time you're willing to allow a
> > hacker to steal and destroy your and your customers' data. Worse, because
> > this is a CDN, it's the length of time you're willing to allow your CDN to
> > be used to DDOS a target.
> >
> > Are you going to explain in court that the DDOS your system executed lasted
> > 24 hours, or 1 hour, or 10 minutes after you discovered it, because that's
> > the TTL you hard-coded? Are you going to explain to a judge and prosecuting
> > attorney exactly which sensitive data was stolen in the ten minutes after
> > you discovered the attacker in your system, before their JWT expired?
> >
> > If you're willing to accept the legal consequences, that's your business.
> > Apache Traffic Control should not require users to accept those
> > consequences, and ideally shouldn't make it possible, as many users won't
> > understand the security risks.
> >
> > The argument has been made "authorization does not check the database to
> > avoid congestion" -- Has anyone tested this in practice? The database query
> > itself is 50ms. Assuming your database and service are 2500km apart, that's
> > another 50ms network latency. Traffic Ops has endpoints that take 10s to
> > generate. Worst-case scenario, this will double the time of tiny endpoints
> > to 200ms, and increase large endpoints inconsequentially. It's highly
> > unlikely performance is an issue in practice.
> >
> > As Jan said, we can still have the services check the auth as well after
> > the proxy auth. Moreover, the services don't even have to know about the
> > auth service, they can hit a mapped route on the API Gateway, which gives
> > us better modularisation and separation of concerns.
> >
> > It's not difficult, it can be a trivial endpoint on the auth service,
> > remapped in the API Gateway, which takes the JWT token and returns true if
> > it's still authorized in the database. To be clear, this is not a problem
> > today. Traffic Ops still uses the Mojolicious cookie today, so this would
> > only need to be done if and when we remove that, or if we move authorized
> > endpoints out of Traffic Ops into their own microservices.
> >
> > Considering the significant security and legal risks, we should always hit
> > the database to validate requests of authorized endpoints, and reconsider
> > if and when someone observes performance issues in practice.
> >
> >
> > On Tue, May 9, 2017 at 6:56 AM, Dewayne Richardson <dewr...@gmail.com>
> > wrote:
> >
> > > If only the API GW authenticates/authorizes we also have a single point of
> > > entry to test for security instead of having it sprinkled across services
> > > in different ways.  It also simplifies the code on the service side and
> > > makes them easier to test with automation.
> > >
> > > -Dew
> > >
> > > On Mon, May 8, 2017 at 8:42 AM, Robert Butts <robert.o.bu...@gmail.com>
> > > wrote:
> > >
> > > > > couldn't make nginx or http do what we need.
> > > >
> > > > I was suggesting a different architecture. Not making the proxy do auth,
> > > > only standard proxying.
> > > >
> > > > > We can still have the services check the auth as well after the proxy
> > > > > auth
> > > >
> > > > +1
> > > >
> > > >
> > > > On Mon, May 8, 2017 at 3:36 AM, Amir Yeshurun <am...@qwilt.com> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > Let me elaborate some more on the purpose of the API GW. I will put up a
> > > > > wiki page following our discussions here.
> > > > >
> > > > > Main purpose is to allow innovation by creating new services that handle TO
> > > > > functionality, not as a part of the monolithic Mojo app.
> > > > > The long term vision is to de-compose TO into multiple microservices,
> > > > > allowing new functionality to be added easily.
> > > > > Indeed, the goal is to eventually deprecate the current AAA model, and
> > > > > replace it with the new AAA model currently under work (user-roles,
> > > > > role-capabilities)
> > > > >
> > > > > I think that handling authorization in the API layer is a valid approach.
> > > > > Security wise, I don't see much difference between that, and having each
> > > > > module access the auth service, as long as the auth service is deployed in
> > > > > the backend.
> > > > > Having another proxy (nginx?) fronting the world and forwarding all
> > > > > requests to the backend GW mitigates the risk for compromising the
> > > > > authorization service.
> > > > > However, as mentioned above, we can still have the services check the auth
> > > > > as well after the proxy auth.
> > > > >
> > > > > It is a standalone process, completely optional at this point. One can
> > > > > choose to deploy it in order to allow integration with additional
> > > > > services. Deployment and management are still T.B.D, and feedback on this
> > > > > is most welcome.
> > > > >
> > > > > Regarding token validation and revocation:
> > > > > Tokens have expiration time. Expired tokens do not pass token validation.
> > > > > In production, expiration should be set to a relatively short time, say 5
> > > > > minutes.
> > > > > This way revocation is automatic. Re-authentication is handled via refresh
> > > > > tokens (not implemented yet). Hitting the DB upon every API call causes
> > > > > congestion on the users DB.
> > > > > To avoid that, we chose to have all user information self-contained inside
> > > > > the JWT.
> > > > >
> > > > > Thanks
> > > > > /amiry
> > > > >
> > > > > On Mon, May 8, 2017 at 5:42 AM Jan van Doorn <j...@knutsel.com> wrote:
> > > > >
> > > > > > It's the reverse proxy we've discussed for the "micro services" version for
> > > > > > a while now (as in
> > > > > > https://cwiki.apache.org/confluence/display/TC/Design+Overview+v3.0 ).
> > > > > >
> > > > > > On Sun, May 7, 2017 at 7:22 PM Eric Friedrich (efriedri) <
> > > > > > efrie...@cisco.com>
> > > > > > wrote:
> > > > > >
> > > > > > > From a higher level, what is the purpose of the API Gateway?  It seems like
> > > > > > > there may have been some previous discussions about API Gateway. Are there
> > > > > > > any notes or description that I can catch up on?
> > > > > > >
> > > > > > > How will it be deployed? (Is it a standalone service or something that
> > > > > > > runs inside the experimental Traffic Ops)?
> > > > > > >
> > > > > > > Is this new component required or optional?
> > > > > > >
> > > > > > > —Eric
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > On May 7, 2017, at 8:28 PM, Jan van Doorn <j...@knutsel.com> wrote:
> > > > > > > >
> > > > > > > > I looked into this a year or so ago, and I couldn't make nginx or http do
> > > > > > > > what we need.
> > > > > > > >
> > > > > > > > We can still have the services check the auth as well after the proxy auth,
> > > > > > > > and make things better than today, where we have the same problem that if
> > > > > > > > the TO mojo app is compromised, everything is compromised.
> > > > > > > >
> > > > > > > > If we always route to TO, we don't untangle the mess of being dependent on
> > > > > > > > the monolithic TO for everything. Many services today, and more in the
> > > > > > > > future really just need a check to see if the user is authorized, and
> > > > > > > > nothing more.
> > > > > > > >
> > > > > > > > On Sun, May 7, 2017 at 11:55 AM Robert Butts <robert.o.bu...@gmail.com>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > >> What are the advantages of these config files, over an existing reverse
> > > > > > > >> proxy, like Nginx or httpd? It's just as much work as configuring and
> > > > > > > >> deploying an existing product, but more code we have to write and maintain.
> > > > > > > >> I'm having trouble seeing the advantage.
> > > > > > > >>
> > > > > > > >> -1 on auth rules as a part of the proxy. Making a proxy care about auth
> > > > > > > >> violates the Single Responsibility Principle, and further, is a security
> > > > > > > >> risk. It creates unnecessary attack surface. If your proxy app or server is
> > > > > > > >> compromised, the entire framework is now compromised. An attacker could
> > > > > > > >> simply rewrite the proxy config to make all routes no-auth.
> > > > > > > >>
> > > > > > > >> The simple alternative is for the proxy to always route to TO, and TO
> > > > > > > >> checks the token against the auth service (which may also be proxied), and
> > > > > > > >> redirects unauthorized requests to a login endpoint (which may also be
> > > > > > > >> proxied).
> > > > > > > >>
> > > > > > > >> The TO service (and any other service that requires auth) MUST hit the
> > > > > > > >> database (or the auth service, which itself hits the database) to verify
> > > > > > > >> valid tokens' users still have the permissions they did when the token was
> > > > > > > >> created. Otherwise, it's impossible to revoke tokens, e.g. if an employee
> > > > > > > >> quits, or an attacker gains a token, or a user changes their password.
> > > > > > > >>
> > > > > > > >>
> > > > > > > >> On Sun, May 7, 2017 at 4:35 AM, Amir Yeshurun <am...@qwilt.com> wrote:
> > > > > > > >>
> > > > > > > >>> Seems that attachments are stripped on this list. Examples pasted below
> > > > > > > >>>
> > > > > > > >>> *rules.json*
> > > > > > > >>> [
> > > > > > > >>>    { "host": "localhost", "path": "/login", "forward": "localhost:9004", "scheme": "https", "auth": false },
> > > > > > > >>>    { "host": "localhost", "path": "/api/1.2/innovation/", "forward": "localhost:8004", "scheme": "http", "auth": true, "routes-file": "innovation.json" },
> > > > > > > >>>    { "host": "localhost", "path": "/api/1.2/", "forward": "localhost:3000", "scheme": "http", "auth": true, "routes-file": "traffic-ops-routes.json" },
> > > > > > > >>>    { "host": "localhost", "path": "/internal/api/1.2/", "forward": "localhost:3000", "scheme": "http", "auth": true, "routes-file": "internal-routes.json" }
> > > > > > > >>> ]
> > > > > > > >>>
> > > > > > > >>> *traffic-ops-routes.json (partial)*
> > > > > > > >>> .
> > > > > > > >>> .
> > > > > > > >>> .
> > > > > > > >>>    { "match": "/cdns/health", "auth": { "GET": ["cdn-health-read"] }},
> > > > > > > >>>    { "match": "/cdns/capacity", "auth": { "GET": ["cdn-health-read"] }},
> > > > > > > >>>    { "match": "/cdns/usage/overview", "auth": { "GET": ["cdn-stats-read"] }},
> > > > > > > >>>    { "match": "/cdns/name/dnsseckeys/generate", "auth": { "GET": ["cdn-security-keys-read"] }},
> > > > > > > >>>    { "match": "/cdns/name/[^\/]+/?", "auth": { "GET": ["cdn-read"] }},
> > > > > > > >>>    { "match": "/cdns/name/[^\/]+/sslkeys", "auth": { "GET": ["cdn-security-keys-read"] }},
> > > > > > > >>>    { "match": "/cdns/name/[^\/]+/dnsseckeys", "auth": { "GET": ["cdn-security-keys-read"] }},
> > > > > > > >>>    { "match": "/cdns/name/[^\/]+/dnsseckeys/delete", "auth": { "GET": ["cdn-security-keys-write"] }},
> > > > > > > >>>    { "match": "/cdns/[^\/]+/queue_update", "auth": { "POST": ["queue-updates-write"] }},
> > > > > > > >>>    { "match": "/cdns/[^\/]+/snapshot", "auth": { "PUT": ["cdn-config-snapshot-write"] }},
> > > > > > > >>>    { "match": "/cdns/[^\/]+/health", "auth": { "GET": ["cdn-health-read"] }},
> > > > > > > >>>    { "match": "/cdns/[^\/]+/?", "auth": { "GET": ["cdn-read"], "PUT": ["cdn-write"], "PATCH": ["cdn-write"], "DELETE": ["cdn-write"] }},
> > > > > > > >>>    { "match": "/cdns", "auth": { "GET": ["cdn-read"], "POST": ["cdn-write"] }},
> > > > > > > >>>
> > > > > > > >>> .
> > > > > > > >>> .
> > > > > > > >>> .
> > > > > > > >>>
> > > > > > > >>>
> > > > > > > >>> On Sun, May 7, 2017 at 12:39 PM Amir Yeshurun <am...@qwilt.com> wrote:
> > > > > > > >>>
> > > > > > > >>>> Attached please find examples for forwarding rules file (rules.json) and
> > > > > > > >>>> the authorization rules file (traffic-ops-routes.json)
> > > > > > > >>>>
> > > > > > > >>>>
> > > > > > > >>>> On Sun, May 7, 2017 at 10:39 AM Amir Yeshurun <am...@qwilt.com> wrote:
> > > > > > > >>>>
> > > > > > > >>>>> Hi all,
> > > > > > > >>>>>
> > > > > > > >>>>> I am about to submit a PR with a first operational version of the API GW,
> > > > > > > >>>>> to the "experimental" code base.
> > > > > > > >>>>>
> > > > > > > >>>>> The API GW forwarding logic is as follows:
> > > > > > > >>>>>
> > > > > > > >>>>>   1. Find host to forward the request: Prefix match on the request path
> > > > > > > >>>>>   against a list of forwarding rules. The matched forwarding rule defines the
> > > > > > > >>>>>   target's host, and the target's *authorization rules*.
> > > > > > > >>>>>   2. Authorization: Regex match on the request path against a list of
> > > > > > > >>>>>   *authorization rules*. The matched rule defines the required capabilities to
> > > > > > > >>>>>   perform the HTTP method on the route. These capabilities are compared
> > > > > > > >>>>>   against the user's capabilities in the user's JWT
> > > > > > > >>>>>
> > > > > > > >>>>> At this moment, the 2 sets of rules are hard-coded in json files. The
> > > > > > > >>>>> files are provided with the API GW distribution and contain definitions for
> > > > > > > >>>>> TC 2.0 API routes. I have tested parts of the API, however, there might be
> > > > > > > >>>>> mistakes in some of the routes. Please be warned.
> > > > > > > >>>>>
> > > > > > > >>>>> Considering manageability and high availability, I am aware that using
> > > > > > > >>>>> local files for storing the set of authorization rules is inferior to
> > > > > > > >>>>> centralized configuration.
> > > > > > > >>>>>
> > > > > > > >>>>> We are considering different approaches for centralized configuration,
> > > > > > > >>>>> having the following points in mind
> > > > > > > >>>>>
> > > > > > > >>>>>   - Microservice world: API GW will front multiple services, not only
> > > > > > > >>>>>   Mojo. It can also front other TC components like Traffic Stats and Traffic
> > > > > > > >>>>>   Monitor. Each service defines its own routes and capabilities. Here comes
> > > > > > > >>>>>   the question of what is the "source of truth" for the route definitions.
> > > > > > > >>>>>   - Handling private routes. API GW may front non-TC services.
> > > > > > > >>>>>   - User changes to the AAA scheme. The ability for an admin user to make
> > > > > > > >>>>>   changes in the required capabilities of a route, maybe even define new
> > > > > > > >>>>>   capability names, was raised in the past as a use case that should be
> > > > > > > >>>>>   supported.
> > > > > > > >>>>>   - Easy development and deployment of new services.
> > > > > > > >>>>>   - Using TO DB for expediency.
> > > > > > > >>>>>
> > > > > > > >>>>> I would appreciate any feedback and views on your approach to manage
> > > > > > > >>>>> route definitions.
> > > > > > > >>>>>
> > > > > > > >>>>> Thanks
> > > > > > > >>>>> /amiry
> > > > > > > >>>>>
> > > > > > > >>>>
> > > > > > > >>>
> > > > > > > >>
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
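
For illustration, the two-step forwarding logic described in the first message
of this thread (prefix match against the forwarding rules, then regex match
against the authorization rules) could be sketched in Python roughly as below.
This is not the API GW implementation. The function name, the inlined "routes"
key (standing in for the "routes-file" contents), first-match ordering, full
regex matching on the path after the prefix, and "user must hold every listed
capability" are all assumptions made for the sketch.

```python
import re

# Forwarding rules, modelled on the rules.json example quoted above.
# "routes" inlines a few authorization rules instead of loading a routes file.
FORWARDING_RULES = [
    {"path": "/login", "forward": "localhost:9004", "auth": False},
    {"path": "/api/1.2/", "forward": "localhost:3000", "auth": True,
     "routes": [
         {"match": "/cdns/[^/]+/health", "auth": {"GET": ["cdn-health-read"]}},
         {"match": "/cdns/[^/]+/?", "auth": {"GET": ["cdn-read"], "PUT": ["cdn-write"]}},
         {"match": "/cdns", "auth": {"GET": ["cdn-read"], "POST": ["cdn-write"]}},
     ]},
]


def route(method, path, user_capabilities):
    """Return the forward target, or None if the request is rejected."""
    # Step 1: prefix match on the request path picks the forwarding rule.
    rule = next((r for r in FORWARDING_RULES if path.startswith(r["path"])), None)
    if rule is None:
        return None
    if not rule["auth"]:
        return rule["forward"]
    # Step 2: regex match on the rest of the path picks the authorization rule.
    rest = "/" + path[len(rule["path"]):]
    for auth_rule in rule["routes"]:
        if re.fullmatch(auth_rule["match"], rest):
            required = auth_rule["auth"].get(method)
            if required is None:
                return None  # method not listed for this route
            # Assumption: the user must hold every capability the route lists.
            if set(required) <= set(user_capabilities):
                return rule["forward"]
            return None
    return None
```

In this sketch `user_capabilities` would come from the capabilities carried in
the user's JWT; for example, `route("POST", "/api/1.2/cdns", ["cdn-read"])` is
rejected because "cdn-write" is missing.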
