Hi,

On Fri, Aug 24, 2018 at 00:07, Tyson Norris <tnor...@adobe.com.invalid> wrote:
> > > Router is not pulling at queue for "specific actions", just for any
> > > action that might replace idle containers - right? This is complicated
> > > with concurrency though, since while a container is not idle (paused +
> > > removable) it may be useable, but only if the action received is the
> > > same as one existing warm container, and that container has concurrency
> > > slots available for additional activations. It may be helpful to
> > > diagram some of this stealing-queue flow a bit more; I'm not seeing how
> > > it will work out other than creating more containers than is absolutely
> > > required, which may be ok, not sure.
> >
> > Yes, I will diagram things out soonish, I'm a little bit narrow on time
> > currently.
> >
> > The idea is that indeed the Router pulls for *specific* actions. This is
> > a problem when using Kafka, but might be solvable when we don't require
> > Kafka. I have to test this for feasibility though.
>
> Hmm OK - it's not clear how a router that is empty (not servicing any
> activations) becomes a router that is pulling for that specific action,
> when other routers pulling for that action are at capacity (so new
> containers are needed).

Disclaimer: All of this is very much at the idea stage and not part of the original proposal.

That's where the second part of the "idea" I mentioned above comes in: if we can somehow detect that nobody is pulling (all Routers are at capacity or have no container), that is the moment where we need to create new containers. I proposed further above in the discussion that the Routers do *not* ask for more Containers, but rather that this signal from the MQ ("hey, apparently nobody has capacity for that action... so let's create capacity") could be used to create Containers. That's a very blunt assumption though; I don't know whether it's feasible at all.
There might be a performant way of implementing just that signal in a distributed fashion though (the signal being: we are out of capacity across the whole cluster, we need more). A second idea that comes to mind would be to implement a shared counter of free capacity (it can easily be eventually consistent; strong consistency is, I think, not a concern here). Once that counter drops to zero or below, we need more Containers.

This is, I think, where the prototyping thread comes in: personally, I don't feel comfortable stating that any of the approaches I outlined really work without having tried them out.
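To make the shared-counter idea a bit more concrete, here is a toy, single-process Python sketch. Everything in it is illustrative (the `CapacityCounter` name, the callback, the slot bookkeeping) and not OpenWhisk code; a real version would be a distributed, eventually consistent counter (e.g. a CRDT-style PN-counter replicated across Routers), where a stale read just means the scale-up signal fires slightly early or late, which is acceptable here:

```python
import threading


class CapacityCounter:
    """Toy model of a cluster-wide free-slot counter for ONE action.

    Each Router would decrement when it occupies a concurrency slot and
    increment when a slot frees up or a new warm container arrives. When
    the counter drops to zero or below, we signal that more Containers
    are needed. (Single-process stand-in for an eventually consistent
    distributed counter; all names here are hypothetical.)
    """

    def __init__(self, create_containers):
        self._lock = threading.Lock()
        self._free_slots = 0                  # free concurrency slots, cluster-wide
        self._create_containers = create_containers

    def container_added(self, concurrency):
        # A new warm container contributes `concurrency` slots.
        with self._lock:
            self._free_slots += concurrency

    def slot_taken(self):
        # A Router starts an activation and consumes one slot.
        with self._lock:
            self._free_slots -= 1
            if self._free_slots <= 0:
                # Out of capacity across the whole cluster: the signal.
                self._create_containers()

    def slot_freed(self):
        # An activation finished; the slot is available again.
        with self._lock:
            self._free_slots += 1


if __name__ == "__main__":
    scale_up_events = []
    counter = CapacityCounter(lambda: scale_up_events.append("scale-up"))

    # Two warm containers with a per-container concurrency of 2 -> 4 slots.
    counter.container_added(2)
    counter.container_added(2)

    # Consuming the last free slot triggers exactly one scale-up signal.
    for _ in range(4):
        counter.slot_taken()
    print(scale_up_events)
```

The interesting property is that the decrement path needs no coordination beyond the (eventually consistent) counter itself: no Router asks for Containers directly, the threshold crossing is the request.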