Hi - thanks for the discussion! More inline...

On 8/22/18, 2:55 PM, "Markus Thömmes" <markusthoem...@apache.org> wrote:

    Hi Tyson,
    
    On Wed., Aug. 22, 2018 at 11:37 PM Tyson Norris
    <tnor...@adobe.com.invalid> wrote:
    
    > Hi -
    >     >
    >     > When exactly is it the case that a ContainerRouter should put a
    >     > blocking activation to a queue for stealing? Since a) it is not
    >     > spawning containers and b) it is not parsing request/response
    >     > bodies, can we say this would only happen when a ContainerRouter
    >     > maxes out its incoming request handling?
    >     >
    >
    >     That's exactly the idea! The work-stealing queue will only be used
    >     if the Router where the request landed cannot serve the demand
    >     right now. For example, if it maxed out the slots it has for a
    >     certain action (all containers are working to their full extent),
    >     it requests more resources and puts the request-token on the
    >     work-stealing queue.
    >
    > So to clarify, ContainerRouter "load" (which can trigger use of the
    > queue) is mostly (only?) based on:
    > * the number of Container references
    > * the number of outstanding inbound HTTP requests, e.g. when lots of
    > requests can be routed to the same container
    > * the number of outstanding outbound HTTP requests to remote action
    > containers (assume all are remote)
    > The order of magnitude considered for "maxed out slots" is unclear,
    > since container refs should be simple (like ip+port, action metadata,
    > activation count, and warm state), inbound connection handling is
    > basically an HTTP server, and outbound is a connection pool per action
    > container (let's presume connection reuse for the moment).
    > I think each of these separate stats will certainly need testing to
    > determine, and will need to be configurable in any case. Is there
    > anything else that affects the load for a ContainerRouter?
    >
    
    "Overload" is determined by the availability of free slots on any container
    being able to serve the current action invocation (or rather the absence
    thereof). An example:
    Say RouterA has 2 containers for action X. Each container has an allowed
    concurrency of 10. On each of those 2 there are 10 active invocations
    already running (the ContainerRouter knows this, these are open connections
    to the containers). If another request comes in for X, we know we don't
    have capacity for it. We request more resources and offer the work we got
    for stealing.
    
    I don't think there are tweaks needed here. The Router keeps an
    "activeInvocations" number per container and compares that to the allowed
    concurrency on that container. If activeInvocations == allowedConcurrency
    we're out of capacity and need more.
    
    We need a work-stealing queue here to dynamically rebalance between the
    Routers since the layer above the Routers has no idea about capacity and
    (at least that's my assumption) schedules randomly.
    
I think it is confusing to say that the ContainerRouter doesn't have capacity
for it - rather, the existing set of containers in the ContainerRouter doesn't
have capacity for it. I understand now, in any case.
So there are a couple of active paths in a ContainerRouter, still only
considering sync/blocking activations (a rough sketch follows this list):
* warm path - run immediately
* cold path - send to queue
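
For my own understanding, here is a minimal sketch of that decision,
assuming the per-container bookkeeping you described (illustrative Scala;
all names here are mine, not from the proposal):

sealed trait RoutingDecision
case class Warm(container: ContainerRef) extends RoutingDecision
case object Cold extends RoutingDecision

case class ContainerRef(host: String, port: Int, allowedConcurrency: Int)

trait WorkStealingQueue { def offer(action: String): Unit }

class ContainerRouter(stealingQueue: WorkStealingQueue) {
  // activeInvocations per container, tracked as open connections
  private val activeInvocations =
    scala.collection.mutable.Map.empty[ContainerRef, Int].withDefaultValue(0)

  def route(action: String, containers: Seq[ContainerRef]): RoutingDecision =
    containers.find(c => activeInvocations(c) < c.allowedConcurrency) match {
      case Some(c) =>
        // warm path: a free slot exists, run immediately
        activeInvocations(c) += 1
        Warm(c)
      case None =>
        // cold path: all slots busy, request more resources and offer
        // the work for stealing
        stealingQueue.offer(action)
        Cold
    }
}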

And each ContainerRouter has a queue consumer that presumably pulls from the 
queue constantly? Or is consumption based on something else? If all 
ContainerRouters are consuming at the same rate, then while this does 
distribute the load across ContainerRouters, it doesn't really guarantee any 
similar state (number of containers, active connections, etc) at each 
ContainerRouter, I think. Maybe I am missing something here?
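
FWIW, one consumption model I could imagine is pulling from the queue only
when local capacity exists - roughly (illustrative Scala again; names are
mine, not from the proposal):

import scala.concurrent.Future

trait Router {
  def hasFreeSlot: Boolean
  def execute(action: String): Future[Unit]
}

trait StealingQueue {
  def poll(): Option[String] // next stealable action, if any
}

// Pull from the stealing queue only when local capacity exists, so the
// consumption rate follows each Router's state instead of being uniform.
class QueueConsumer(router: Router, queue: StealingQueue) {
  def consumeOnce(): Future[Unit] =
    if (router.hasFreeSlot) {
      queue.poll() match {
        case Some(action) => router.execute(action)
        case None         => Future.unit // nothing stealable right now
      }
    } else Future.unit // no free capacity: skip this round
}

If consumption is instead at a uniform rate regardless of local capacity,
the state-divergence concern above seems to hold.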
    
    >
    >     That request-token will then be taken by any Router that has free
    >     capacity for that action (note: this is not simple with Kafka, but
    >     might be simpler with other MQ technologies). Since new resources
    >     have been requested, it is guaranteed that one Router will
    >     eventually become free.
    >
    > Is "requests resources" here requesting new action containers, which
    > it won't be able to process itself immediately, but should start up +
    > warm and be provided to "any ContainerRouter"? This makes sense, just
    > want to clarify that "resources == containers".
    >
    
    Yes, resources == containers.
    
    
    >
    >     >
    >     > If ContainerManager has enough awareness of ContainerRouters'
    >     > states, I'm not sure when a queue would be used (for redirecting
    >     > to other ContainerRouters) vs ContainerManager responding with a
    >     > ContainerRouter reference (instead of an action container
    >     > reference) - I'm not following the logic of the edge case in the
    >     > proposal - there is mention of "which controller the request
    >     > needs to go", but maybe this is a typo and should say
    >     > ContainerRouter?
    >     >
    >
    >     Indeed that's a typo, it should say ContainerRouter.
    >
    >     The ContainerManager only knows which Router has which Container.
    >     It does not know whether the respective Router has capacity on
    >     that container (the capacity metric is very hard to share since
    >     it's ever-changing).
    >
    >     Hence, in an edge-case where there are fewer Containers than
    >     Routers, the ContainerManager can hand out references to the
    >     Routers it gave Containers to, to the Routers that have none.
    >     (This is the edge-case described in the proposal.)
    >
    > I'm not sure why in this case the ContainerManager does not just create a
    > new container, instead of sending to another Router? If there is some
    > intended limit on "number of containers for a particular action", that
    > would be a reason, but given that the ContainerManager cannot know the
    > state of the existing containers, it seems like sending to another Router
    > which has the container, but may not be able to use it immediately, may
    > cause failures in some cases.
    >
    
    The edge-case here is for very slow load. It's minimizing the number of
    Containers needed. Another example:
    Say you have 3 Routers. A request for action X comes in, goes to Router1.
    It requests a container, puts the work on the queue, nobody steals it, as
    soon as the Container gets ready, the work is taken from the queue and
    executed. All nice and dandy.
    
    Important remark: The Router that requested more Containers is not
    necessarily the one that's getting the Containers. We need to make sure to
    evenly distribute Containers across the system.
    
    So back to our example: What happens if requests for action X are made
    one after the other? Well, the layer above the Routers (something needs
    to loadbalance them, be it DNS or another type of routing layer) isn't
    aware of the locality of the Container that we created to execute action
    X. As it schedules fairly randomly (round-robin in a multi-tenant system
    is essentially random), the action will hit each Router once very soon.
    As we're only generating one request after the other, arguably we only
    want to create one container.
    
    That's why in this example the 2 remaining Routers with no container get a
    reference to Router1.
    
    In the case you mentioned:
    > it seems like sending to another Router which has the container, but may
    not be able to use it immediately, may cause failures in some cases.
    
    I don't recall if it's in the document or in the discussion on the
    dev-list: The router would respond to the proxied request with a 503
    immediately. That tells the proxying router: Oh, apparently we need more
    resources. So it requests another container etc etc.
    
    Does that clarify that specific edge-case?

Yes, but I would not call this an edge-case - I think it is more of a ramp-up
to maximum container reuse, and it will probably be dramatically impacted by
containers that do NOT support concurrency (which will get a 503 when a
single activation is in flight, vs a high-concurrency container, which would
cause a 503 only once max concurrency is reached).
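
I.e., if I read it right, the proxied Router's reject condition is
effectively just the following (illustrative Scala, names are mine):

// A Router holding the container answers a proxied request with 503 when
// the container's slots are exhausted. For a non-concurrent container
// allowedConcurrency == 1, so any in-flight activation triggers the 503.
def rejectProxied(activeInvocations: Int, allowedConcurrency: Int): Boolean =
  activeInvocations >= allowedConcurrency
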
If each ContainerRouter is equally likely to receive the original request,
and each is also equally likely to receive the queued item from the stealing
queue, then there will be a lot of cross traffic during the ramp-up from 1
container to <Router count> containers. E.g.:

From client:
Request1 -> Router1 -> queue (no containers)
Request2 -> Router2 -> queue (no containers)
Request3 -> Router3 -> queue (no containers)
From queue:
Request1 -> Router1 -> create and use container
Request2 -> Router2 -> Router1 -> 503 -> create container
Request3 -> Router3 -> Router1 -> 503 -> Router2 -> 503 -> create container

In other words - the 503 may help when there is one existing container and
it is deemed busy, but what if there are 10 existing containers (on Routers
other than the one where the request was pulled from the stealing queue) -
do you make HTTP requests to all 10 Routers to see if they are busy before
creating a new container?
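
To make the question concrete, the proxying path as I understand it looks
roughly like this (illustrative Scala; the sequential try-each-owner loop is
exactly the part I'm asking about, and all names are mine):

import scala.concurrent.{ExecutionContext, Future}

case class RouterRef(host: String)
case class Response(status: Int, body: String)

class ProxyPath(proxy: (RouterRef, String) => Future[Response],
                provision: String => Future[Response])
               (implicit ec: ExecutionContext) {
  // Try each Router known to hold a container for the action; on 503,
  // move to the next. A new container is only created once every owner
  // has said it is busy - potentially <owner count> extra round trips.
  def run(action: String, owners: List[RouterRef]): Future[Response] =
    owners match {
      case owner :: rest =>
        proxy(owner, action).flatMap {
          case Response(503, _) => run(action, rest)
          case ok               => Future.successful(ok)
        }
      case Nil =>
        provision(action)
    }
}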

    
    Memo to self: I need to incorporate all the stuff from these discussions
    into the document.
    
    
    >
    >
    >     The work-stealing queue though is used to rebalance work in case
    >     one of the Routers gets overloaded.
    >
    > Got it.
    >
    > Thanks
    > Tyson
    >
    >
    >
    
