Hi -
>
> When exactly is the case that a ContainerRouter should put a blocking
> activation to a queue for stealing? Since a) it is not spawning containers
> and b) it is not parsing request/response bodies, can we say this would
> only happen when a ContainerRouter maxes out its incoming request
> handling?
>
That's exactly the idea! The work-stealing queue will only be used if the
Router where the request landed cannot serve the demand right now. For
example, if it maxed out the slots it has for a certain action (all
containers are working to their full extent) it requests more resources and
puts the request-token on the work-stealing queue.
So to clarify, ContainerRouter "load" (which can trigger use of queue) is
mostly (only?) based on:
* the number of Container references
* the number of outstanding inbound HTTP requests, e.g. when lots of requests
can be routed to the same container
* the number of outstanding outbound HTTP requests to remote action containers
(assume all are remote)
The order of magnitude considered for "maxed out slots" is unclear to me, since
container refs should be simple (like ip+port, action metadata, activation
count, and warm state), inbound connection handling is basically an HTTP server,
and outbound is a connection pool per action container (let's presume
connection reuse for the moment).
I think it will certainly need testing to determine these limits, and they
should be configurable in any case, separately for each of these stats. Is there
anything else that affects the load for ContainerRouter?
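
To make the three stats above concrete, here is a minimal sketch in Python.
The names (RouterLimits, RouterLoad, is_overloaded) and the default limit
values are my own placeholders, not part of the proposal; the real limits
would have to come out of testing.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RouterLimits:
    # Hypothetical, separately configurable ceilings (placeholder values).
    max_container_refs: int = 1000
    max_inbound_requests: int = 5000
    max_outbound_requests: int = 5000


@dataclass
class RouterLoad:
    container_refs: int = 0      # number of Container references held
    inbound_requests: int = 0    # outstanding inbound HTTP requests
    outbound_requests: int = 0   # outstanding outbound HTTP requests

    def exceeded(self, limits: RouterLimits) -> List[str]:
        """Return the names of the stats that are at or above their limit."""
        checks = {
            "container_refs": (self.container_refs, limits.max_container_refs),
            "inbound_requests": (self.inbound_requests, limits.max_inbound_requests),
            "outbound_requests": (self.outbound_requests, limits.max_outbound_requests),
        }
        return [name for name, (value, limit) in checks.items() if value >= limit]


def is_overloaded(load: RouterLoad, limits: RouterLimits) -> bool:
    # Assumption: hitting any single limit is enough to trigger use of the queue.
    return bool(load.exceeded(limits))
```

Keeping one limit per stat means each can be tuned independently once there
are measurements to base them on.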
That request-token will then be taken by any Router that has free capacity
for that action (note: this is not simple with kafka, but might be simpler
with other MQ technologies). Since new resources have been requested, it is
guaranteed that one Router will eventually become free.
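
As I read it, the flow described here could be sketched roughly as follows.
This is only an in-memory illustration under my own assumptions: "resources"
means new action containers, the shared dict of per-action deques stands in
for the real MQ (which, as noted, is not simple with kafka), and the names
Router, handle, steal and request_container are hypothetical.

```python
from collections import deque
from typing import Callable, Deque, Dict, Optional


class Router:
    def __init__(self, name: str,
                 steal_queue: Dict[str, Deque[str]],
                 request_container: Callable[[str], None]):
        self.name = name
        self.steal_queue = steal_queue              # shared by all Routers
        self.request_container = request_container  # hypothetical ContainerManager call
        self.free_slots: Dict[str, int] = {}        # action -> free local slots

    def handle(self, action: str, token: str) -> str:
        """Serve locally if a slot is free, else request resources and queue."""
        if self.free_slots.get(action, 0) > 0:
            self.free_slots[action] -= 1
            return "served"
        self.request_container(action)
        self.steal_queue.setdefault(action, deque()).append(token)
        return "queued"

    def steal(self, action: str) -> Optional[str]:
        """Take a queued request-token if this Router has free capacity."""
        queue = self.steal_queue.get(action)
        if queue and self.free_slots.get(action, 0) > 0:
            self.free_slots[action] -= 1
            return queue.popleft()
        return None
```

The point of the sketch is that the queue is only touched on the overflow
path; a Router with a free slot never enqueues anything.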
Does "requests more resources" here mean requesting new action containers,
which the Router won't be able to use itself immediately, but which should
start up, warm, and be provided to "any ContainerRouter"? This makes sense, I
just want to clarify that "resources == containers".
>
> If ContainerManager has enough awareness of ContainerRouters' states, I'm
> not sure where using a queue would be used (for redirecting to other
> ContainerRouters) vs ContainerManager responding with a ContainerRouters
> reference (instead of an action container reference) - I'm not following
> the logic of the edge case in the proposal - there is mention of "which
> controller the request needs to go", but maybe this is a typo and should
> say ContainerRouter?
>
Indeed that's a typo, it should say ContainerRouter.
The ContainerManager only knows which Router has which Container. It does
not know whether the respective Router has capacity on that container (the
capacity metric is very hard to share since it's ever changing).
Hence, in an edge-case where there are fewer Containers than Routers, the
ContainerManager can give the Routers that have no Container a reference to
one of the Routers that does have one. (This is the edge-case described in the
proposal).
I'm not sure why, in this case, the ContainerManager does not just create a new
container instead of sending the request to another Router. If there is some intended
limit on "number of containers for a particular action", that would be a
reason, but given that the ContainerManager cannot know the state of the
existing containers, it seems like sending to another Router which has the
container, but may not be able to use it immediately, may cause failures in
some cases.
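
To illustrate the two options I am comparing, here is a rough sketch. It is
not the proposal's design: the ContainerManager class below, its resolve
method, and the max_containers_per_action limit are all hypothetical, and the
manager only tracks which Router holds which container, not its capacity.

```python
from typing import Dict, List, Optional, Tuple


class ContainerManager:
    def __init__(self, max_containers_per_action: Optional[int] = None):
        # None means "no intended limit": always create a new container.
        self.max_containers_per_action = max_containers_per_action
        # action -> owning Router, one entry per container
        self.owners: Dict[str, List[str]] = {}

    def resolve(self, action: str, router: str) -> Tuple[str, str]:
        owners = self.owners.setdefault(action, [])
        limit = self.max_containers_per_action
        if limit is None or len(owners) < limit:
            owners.append(router)          # new container for the asking Router
            return ("container", router)
        for owner in owners:
            if owner != router:
                # The owner may or may not have capacity; the manager cannot tell.
                return ("redirect", owner)
        return ("wait", router)            # the asking Router already holds them all
```

Without a limit the redirect branch is never taken, which is why I am asking
whether such a limit is intended.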
The work-stealing queue though is used to rebalance work in case one of the
Routers gets overloaded.
Got it.
Thanks
Tyson