Hi Tyson,

On Thu, Aug 23, 2018 at 21:28, Tyson Norris <tnor...@adobe.com.invalid> wrote:
> > > And each ContainerRouter has a queue consumer that presumably pulls
> > > from the queue constantly? Or is consumption based on something else?
> > > If all ContainerRouters are consuming at the same rate, then while
> > > this does distribute the load across ContainerRouters, it doesn't
> > > really guarantee any similar state (number of containers, active
> > > connections, etc) at each ContainerRouter, I think. Maybe I am
> > > missing something here?
> >
> > The idea is that ContainerRouters do **not** pull from the queue
> > constantly. They pull work for actions that they have idle containers
> > for.
>
> Router is not pulling at queue for "specific actions", just for any
> action that might replace idle containers - right? This is complicated
> with concurrency though, since while a container is not idle (paused +
> removable), it may be usable, but only if the action received is the
> same as one of the existing warm containers, and that container has
> concurrency slots available for additional activations. It may be
> helpful to diagram some of this stealing-queue flow a bit more; I'm not
> seeing how it will work out other than by creating more containers than
> are absolutely required, which may be OK, not sure.

Yes, I will diagram things out soonish; I'm a little short on time
currently. The idea is that the Router does indeed pull for *specific*
actions. This is a problem when using Kafka, but might be solvable when we
don't require Kafka. I have to test this for feasibility though.

> > Similar state in terms of number of containers is achieved via the
> > ContainerManager. Active connections should roughly even out with the
> > queue being pulled on idle.
>
> Yeah, carefully defining "idle" may be tricky if we want to achieve the
> absolute minimum of containers in use for a specific action at any time.

> > > > The edge-case here is for very slow load. It's about minimizing
> > > > the amount of Containers needed. Another example:
> > > > Say you have 3 Routers.
> > > > A request for action X comes in, goes to Router1. It requests a
> > > > container, puts the work on the queue, nobody steals it, and as
> > > > soon as the Container gets ready, the work is taken from the queue
> > > > and executed. All nice and dandy.
> > > >
> > > > Important remark: The Router that requested more Containers is not
> > > > necessarily the one that gets the Containers. We need to make sure
> > > > to evenly distribute Containers across the system.
> > > >
> > > > So back to our example: What happens if requests for action X are
> > > > made one after the other? Well, the layer above the Routers
> > > > (something needs to load-balance them, be it DNS or another type
> > > > of routing layer) isn't aware of the locality of the Container
> > > > that we created to execute action X. As it schedules fairly
> > > > randomly (round-robin in a multi-tenant system is essentially
> > > > random), the action will hit each Router once very soon. As we're
> > > > only generating one request after the other, arguably we want to
> > > > create only one container.
> > > >
> > > > That's why in this example the 2 remaining Routers with no
> > > > container get a reference to Router1.
> > > >
> > > > In the case you mentioned:
> > > >
> > > > > it seems like sending to another Router which has the container,
> > > > > but may not be able to use it immediately, may cause failures in
> > > > > some cases.
> > > >
> > > > I don't recall if it's in the document or in the discussion on the
> > > > dev-list: The router would respond to the proxied request with a
> > > > 503 immediately. That tells the proxying router: Oh, apparently we
> > > > need more resources. So it requests another container, etc.
> > > >
> > > > Does that clarify that specific edge-case?
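To make the proxy-then-503 flow concrete, here is a rough sketch. All names (`Router`, `ContainerManager`, the return strings) are made up for illustration; this is a toy model of the behavior described above, not the actual implementation.

```python
# Toy model of the proxy-then-503 flow: a Router with no warm container
# proxies to the Router it was referred to; an immediate 503 from that
# Router triggers a request for a fresh container. Names are illustrative.

class ContainerManager:
    """Creates containers; tracks how many exist per action."""
    def __init__(self):
        self.containers = {}  # action -> number of containers created

    def create_container(self, action):
        self.containers[action] = self.containers.get(action, 0) + 1


class Router:
    def __init__(self, name, manager):
        self.name = name
        self.manager = manager
        self.warm = {}        # action -> list of idle warm containers
        self.reference = {}   # action -> Router known to own a container

    def handle(self, action):
        # 1. Use a local warm container if one is idle.
        if self.warm.get(action):
            self.warm[action].pop()
            return f"{self.name}: executed {action} locally"
        # 2. Otherwise, proxy to the Router we hold a reference to.
        ref = self.reference.get(action)
        if ref is not None and ref.accept_proxied(action) == 200:
            return f"{self.name}: proxied {action} to {ref.name}"
        # 3. No reference, or a 503: apparently more resources are
        #    needed, so request another container.
        self.manager.create_container(action)
        return f"{self.name}: requested new container for {action}"

    def accept_proxied(self, action):
        # Respond 503 immediately when no idle container is available.
        if self.warm.get(action):
            self.warm[action].pop()
            return 200
        return 503
```

The point of the immediate 503 is that the proxying Router never blocks waiting for capacity; the 503 itself is the "we need more resources" signal.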
> > > Yes, but I would not call this an edge-case - I think it is more of
> > > a ramp-up to maximum container reuse, and it will probably be
> > > dramatically impacted by containers that do NOT support concurrency
> > > (these return a 503 when a single activation is in flight, vs. a
> > > high-concurrency container, which would cause a 503 only once max
> > > concurrency is reached).
> > > If each ContainerRouter is as likely to receive the original
> > > request, and each is also as likely to receive the queued item from
> > > the stealing queue, then there will be a lot of cross traffic during
> > > the ramp-up from 1 container to <Router count> containers. E.g.
> > >
> > > From client:
> > > Request1 -> Router 1 -> queue (no containers)
> > > Request2 -> Router 2 -> queue (no containers)
> > > Request3 -> Router 3 -> queue (no containers)
> > > From queue:
> > > Request1 -> Router1 -> create and use container
> > > Request2 -> Router2 -> Router1 -> 503 -> create container
> > > Request3 -> Router3 -> Router1 -> 503 -> Router2 -> 503 -> create
> > > container
> > >
> > > In other words - the 503 may help when there is one existing
> > > container and it is deemed to be busy, but what if there are 10
> > > existing containers (on different Routers other than where the
> > > request was pulled from the stealing queue) - do you make HTTP
> > > requests to all 10 Routers to see if they are busy before creating a
> > > new container?

> > Good point, I haven't really thought about that, to be frank. Gut
> > feeling is that we should only have 1 direct reference per
> > Router/Action to another Router. If that yields a 503, just generate
> > new resources immediately. That might overshoot what we really need,
> > but might just be good enough? Maybe I'm overcomplicating here...
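To spell out that gut feeling as a dispatch policy: at most one proxy attempt per Router/Action, and any failure creates a container rather than probing the other N-1 Routers. A hypothetical sketch (function and parameter names are invented):

```python
# Sketch of the suggested policy: a Router holds at most ONE direct
# reference per action, and a 503 from that reference immediately
# triggers container creation instead of probing further Routers.
# All names here are hypothetical.

def dispatch(action, reference_status, create_container):
    """reference_status: HTTP status from the single referenced Router,
    or None when no reference is held for this action.
    create_container: callback to request a new container."""
    if reference_status == 200:
        return "proxied"
    # No reference, or the one reference answered 503: do not probe the
    # remaining Routers -- overshoot slightly and create a container.
    create_container(action)
    return "created"
```

This trades a possible overshoot in container count for a hard bound of one extra HTTP hop per request, which sidesteps the "ask all 10 Routers" problem.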
> > Alternative to the whole work-stealing algorithm (which could unify
> > this ramp-up phase and work-stealing itself nicely):
> > What if it were possible to detect that, when an event is pushed to
> > the work-stealing queue, there are no active consumers for that event
> > right now? If that is possible, we could use it to signal that more
> > Containers are needed, because apparently nobody has idling resources
> > for a specific container.
>
> I think this is only possible in the case where 0 containers (for that
> action) existed before; in other cases, without having Router state
> visibility, it will be impossible to detect Routers that have capacity
> to service that event.

Why? If Routers have a container that has capacity, they will be
listening for more work (per my answer above).

> > We would then also use this mechanism to generate Containers in
> > general. The ContainerRouters would **never** request new containers.
> > They would instead put their work on the stealing queue to let someone
> > else work on it. If the system detects that nobody is waiting, it
> > requests a new Container (maybe the ContainerManager could be used to
> > detect that signal?)
> >
> > For our "edge-case" that'd mean: No references to other Routers are
> > handed out at all. If a request arrives at a Router that has no
> > Container to serve it, it just puts it on the queue. If there's a
> > consumer for it, great, done. If not, we know we need more resources.
> >
> > This boils down to needing an efficient mechanism to signal free
> > capacities though. Something to think deeper about, thanks for
> > bringing it up!
>
> Yes, I think "capacity" is a more accurate way to think of it than
> "idle", and yes, I think a 503 should generate a container immediately,
> but I think we need to supply more data to intelligently limit the 503
> workflows.
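For illustration, the "no active consumer" signal could look like the sketch below: Routers register as consumers for an action only while they have capacity for it, and a publish that finds zero consumers triggers the "more containers needed" callback. The names are invented, and whether a real queue can expose consumer counts like this is exactly the open question (it is a problem with Kafka, per the discussion above).

```python
# Sketch of the "no active consumer" signal: enqueuing work for an
# action that no Router currently polls triggers a callback (e.g. to
# the ContainerManager). Illustrative names only; a production queue
# would need to expose consumer counts, which Kafka does not readily do.

from collections import defaultdict, deque

class StealingQueue:
    def __init__(self, on_no_consumer):
        self.queues = defaultdict(deque)     # action -> pending work
        self.consumers = defaultdict(int)    # action -> polling Routers
        self.on_no_consumer = on_no_consumer

    def register_consumer(self, action):
        # A Router with capacity for `action` starts polling for it.
        self.consumers[action] += 1

    def publish(self, action, work):
        self.queues[action].append(work)
        if self.consumers[action] == 0:
            # Nobody has capacity for this action right now: signal
            # that more containers are needed.
            self.on_no_consumer(action)
```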
> What about this:
> - ContainerRouters report capacity ("possible capacity", or "busy") to
> ContainerManager periodically (not accurate at a point in time, but
> close enough in some cases)
> - When no capacity exists in a ContainerRouter's existing warm
> containers, the ContainerRouter requests a container from
> ContainerManager
> - ContainerManager responds with either a ContainerRouter (if one
> exists with "possible capacity"), or a Container (if none exists with
> "possible capacity")

I think these 3 points are addressed by being able to pull for specific
actions and/or being able to detect that pull signal. In fact, it's the
same mechanism: The ContainerManager essentially is your work-stealing
backend.

> - In case a Container is returned, if C > 1, its capacity state should
> start as "possible capacity"; otherwise, "busy" (we know it will be
> immediately used)
> - In case a ContainerRouter is returned, attempt (once) to proxy to
> that ContainerRouter
> - On a proxied request to ContainerRouter CR2, either service the
> request, OR immediately create a container (in CR2), OR - since we may
> limit the number of same-action containers in a ContainerRouter -
> return 503, at which point a container is immediately created (in CR1)
>
> This would:
> - encourage multiple containers for the same action to be managed in a
> subset of routers (better chance of reusing containers)
> - not restrict the number of routers used for a specific action when
> under load (e.g. say that each router can handle active requests for n
> containers, meaning n connection pools)
> - allow designating a ContainerRouter capacity config for both a)
> activations on warm containers as configured per action and b) the
> overall number of containers (connection pools) as configured at the
> ContainerRouter (to smooth hot spots across ContainerRouters).
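The ContainerManager side of the proposal above could be sketched roughly like this. The report format and function names are assumptions made up for the example; the real protocol is still to be designed.

```python
# Sketch of the proposed ContainerManager response logic: given the
# periodically reported per-action capacity states, answer a container
# request with either a ContainerRouter to proxy to, or a fresh
# Container whose initial state depends on the action's concurrency
# limit C. Report format and names are assumptions.

def respond(capacity_reports, action, max_concurrency):
    """capacity_reports: {router: {action: "possible capacity" | "busy"}}
    Returns ("router", name) or ("container", initial_state)."""
    for name, report in capacity_reports.items():
        if report.get(action) == "possible capacity":
            # Some Router still has warm capacity: refer the caller
            # there instead of creating a container.
            return ("router", name)
    # No Router has spare capacity: create a container. With C > 1 it
    # still has free slots after its first activation, so it starts in
    # "possible capacity"; with C == 1 it is immediately busy.
    state = "possible capacity" if max_concurrency > 1 else "busy"
    return ("container", state)
```

Note the reports are only periodically refreshed, so this decision is made on possibly stale data; that staleness is exactly the "inaccurate data in CM" concern raised below.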
> I think the biggest problem would be cases where containers hover
> around the "busy" state (C requests), causing requests to the CM and
> inaccurate data in the CM (causing extra containers to be created), but
> it may be OK for most load patterns; nothing will work perfectly for
> all.
>
> Reporting capacity can either be metrics-based or a health ping similar
> to what exists today (but with more details of the pool states).
>
> I will try to diagram this, it's getting complicated...

Yes, I will as well.

> Thanks
> Tyson

> > Memo to self: I need to incorporate all the stuff from these
> > discussions into the document.

> > > > The work-stealing queue though is used to rebalance work in case
> > > > one of the Routers gets overloaded.
> > >
> > > Got it.
> > >
> > > Thanks
> > > Tyson