That was the idea I was proposing: to have 2 buckets, one for sync invoke
(request/response) overflow and one for async invokes.
Routers will steal/pull work from the buckets; it can be a single set of
Routers (same code), or they can be deployed in two groups for sharding
and traffic/risk isolation.
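
For illustration, a minimal Scala sketch of that layout (Invocation,
Buckets and Router are made-up names here, not actual code):

    import java.util.concurrent.LinkedBlockingQueue
    import java.util.concurrent.atomic.AtomicInteger

    // Hypothetical work item; in practice this would carry the activation.
    final case class Invocation(id: String)

    // Two buckets: sync (request/response) overflow and async invokes.
    object Buckets {
      val syncOverflow = new LinkedBlockingQueue[Invocation]()
      val async        = new LinkedBlockingQueue[Invocation]()
    }

    // A Router pulls from its bucket only while it has free capacity, so
    // the bucket buffers whatever the Routers can't take yet.
    class Router(bucket: LinkedBlockingQueue[Invocation], capacity: Int)
        extends Thread {
      val inFlight = new AtomicInteger(0)
      override def run(): Unit =
        while (!isInterrupted) {
          if (inFlight.get() < capacity) {
            val work = bucket.take() // blocks until work is available
            inFlight.incrementAndGet()
            // dispatch(work) to a backend container; decrement on completion
          } else Thread.sleep(10)    // at capacity: let the bucket buffer
        }
    }

The same Router class can then be started against either bucket, or run as
two separate deployments for the sharding/isolation mentioned above.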

The Routers dealing with async will pull from the bucket/bus as they have
capacity, so there is never an overflow for async invokes, just a delay. If
you want async invokes to be picked up faster, then add more Routers and
backend container worker resources (i.e. kube nodes, VMs, physical
machines). Bursts will just go to the queue to be eventually processed, but
never throttled or dropped.
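
Accepting an async invoke is then just an enqueue (same sketch as above):

    // The async bucket is unbounded, so a burst is absorbed (delayed)
    // rather than throttled or dropped.
    def acceptAsync(inv: Invocation): Unit =
      Buckets.async.put(inv)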

The Routers dealing with sync will push to the overflow bucket when they
can't handle the work at the moment, and another Router (or maybe the same
one) will pull from the overflow bucket.
In practice, overflow should be a rare case if the system is
over-provisioned; it might only happen during some spikes/bursts for sync.
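
Roughly, in the same sketch (dispatch and the response path elided):

    // Sync path: take the work if under capacity, otherwise park it in the
    // overflow bucket, where any Router (possibly this one) picks it up.
    def handleSync(inv: Invocation, inFlight: AtomicInteger,
                   capacity: Int): Unit =
      if (inFlight.incrementAndGet() <= capacity) {
        // dispatch(inv); decrement inFlight when the response is sent
      } else {
        inFlight.decrementAndGet()
        Buckets.syncOverflow.put(inv)
      }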

-- Carlos

On Tue, Aug 21, 2018 at 1:37 PM Tyson Norris <tnor...@adobe.com.invalid>
wrote:

>     > Tracking these metrics consistently will introduce the same problem
> as
>     > precisely tracking throttling numbers across multiple controllers, I
> think,
>     > where either there is delay introduced to use remote data, or
> eventual
>     > consistency will introduce inaccurate data.
>     >
>
>     If you're talking about limit enforcement, you're right! Regarding the
>     concurrency on each container though, we are able to accurately track
> that
>     and we need to be able to make sure that actual concurrency is always
> <= C.
>
>
>     >
>     > I’m interested to know if this accuracy is important as long as
> actual
>     > concurrency <= C?
>     >
>
>     I don't think it is as much, no. But how do you keep <= C if you don't
>     accurately track?
>
> Maybe I should say that while we cannot accurately track, we can still
> guarantee <= C; we just cannot guarantee maximizing concurrency up to C.
>
> Since the HTTP requests are done via futures in proxy, the messaging
> between pool and proxy doesn't have an accurate way to get exactly C
> requests in flight, but can prevent ever sending > C messages that cause
> the HTTP requests. The options for this are:
> - track in-flight requests in the pool; passing C will cause more
> containers to be used, but probably the container will always only have
> < C in flight.
> - track in-flight requests in the proxy; passing C will cause the message
> in proxy to be stashed/delayed until some HTTP requests are completed,
> and if the >C state remains, the pool will eventually learn this state
> and cause more containers to be used.
>
> (current impl in the PR does the latter)
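
(For illustration only: the stash/delay option could look roughly like the
Akka sketch below. ContainerProxySketch, Run and HttpDone are made-up
names, not the actual PR code.)

    import akka.actor.{Actor, Stash}
    import scala.concurrent.Future

    final case class Run(job: String) // message that triggers an HTTP call
    case object HttpDone              // completion signal from the future

    // Stash Run messages once C requests are in flight and unstash as HTTP
    // calls complete: in-flight stays <= C but may sit below C at times.
    class ContainerProxySketch(c: Int) extends Actor with Stash {
      import context.dispatcher
      private var inFlight = 0

      def receive: Receive = {
        case _: Run if inFlight < c =>
          inFlight += 1
          Future { /* HTTP request to the container */ }
            .onComplete(_ => self ! HttpDone)
        case _: Run => stash() // at capacity: delay until a request completes
        case HttpDone =>
          inFlight -= 1
          unstashAll() // stashed Run messages are re-examined
      }
    }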
>
> Thanks
> Tyson
>
>
>
