Right, I think the UI workflows are just an example of apps that are
latency-sensitive in general.
I had a discussion with Stephen Fink about whether we could detect
automatically that an action is latency sensitive, for instance via the
blocking parameter or, as mentioned, the user's configuration (web action vs.
non-web action). The conclusion was that we probably cannot reliably detect
latency sensitivity without asking the user. Having such an option has
implications for other aspects of the platform: why would one not choose that
option?
To Rodric's points, I think there are two topics to discuss:
1. The programming model: The current model encourages users to break their applications apart into
"functions" that take a payload and return a payload. Having a deployment model as outlined
could, as noted, encourage users to use OpenWhisk as a way to rapidly deploy/undeploy their usual
webserver-based applications. The current model is nice in that it solves a lot of problems for the
customer in terms of scalability and "crash safety".
2. Raw throughput of our deployment model: Setting those concerns aside, I think it is valid
to explore concurrent invocations of actions on the same container. This does not
necessarily mean that users will start to deploy monolithic apps as noted above, but it
certainly could. Keeping our JSON-in/JSON-out model, at least for now, could encourage
users to continue to think in functions. Having a toggle per action, disabled by
default, might be a good way to start here, since many users might need to change action
code to support that notion, and for some applications it might not be valid at all. I
think it was also already noted that this imposes some of the "old-fashioned"
problems on the user, like: how many concurrent requests will my action be able to
handle? That somewhat defeats the seamless-scalability point of serverless.
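To make point 1 concrete: an action in the current model is just a stateless function from a JSON object to a JSON object. A minimal sketch in Python (the `main` entry point and dict-in/dict-out convention are OpenWhisk's; the greeting logic itself is a made-up example):

```python
# Minimal OpenWhisk-style action: JSON object in, JSON object out.
# The `main` entry point and dict-in/dict-out shape follow the current
# programming model; the body is purely illustrative.

def main(params):
    # Read inputs from the JSON payload, with defaults for missing keys.
    name = params.get("name", "stranger")
    count = int(params.get("count", 1))
    # Return a JSON-serializable dict as the activation result.
    return {"greeting": " ".join(["Hello, %s!" % name] * count)}
```

Because each invocation receives its inputs explicitly and returns its result explicitly, the platform is free to scale and restart containers behind the user's back, which is where the scalability and crash-safety benefits come from.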
Cheers,
Markus
On July 2, 2017 at 10:42, Rodric Rabbah <rod...@gmail.com> wrote:
The thoughts I shared around how to realize better packing with intrinsic
actions are aligned with your goals: getting more compute density out of a
smaller number of machines. This is a very worthwhile goal.
I noted earlier that packing more activations into a single container warrants a different resource manager with its own container life-cycle management (e.g., it's almost at the level of: provision a container for me quickly and let me have it to run my monolithic code for as long as I want).
Some challenges were already mentioned with respect to sharing state, resource leaks, and possible data races. Perhaps defining the intra-container resource isolation model (processes, threads, "node vm", ...) would be helpful as you refine your proposal. That could also address how one might deal with intra-container noisy neighbors.
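A tiny sketch of the shared-state concern, using Python threads as a stand-in for concurrent activations in one container (the names and caching logic are illustrative, not from any proposal): module-level state outlives a single activation, so once activations run concurrently in the same process, an unguarded global is both a place where data leaks across invocations and a potential data race.

```python
import threading

# Module-level state lives for the container's lifetime, not the
# activation's. With one activation per container this is a handy
# warm-start cache; with concurrent activations it is shared mutable
# state that needs an isolation story.
cache = {}
lock = threading.Lock()

def activation(key, value):
    # Without the lock, two concurrent activations could interleave the
    # check-then-set below and clobber each other's entries.
    with lock:
        if key not in cache:
            cache[key] = value
    return cache[key]

# Two "activations" in the same container: the second observes state
# written by the first, i.e. data has leaked across invocations.
t1 = threading.Thread(target=activation, args=("user", "alice"))
t1.start()
t1.join()
result = activation("user", "bob")  # returns "alice", not "bob"
```

Whether the right fix is locks, per-activation processes, or sandboxes like "node vm" is exactly the isolation-model question above.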
Hence, in terms of resource management at the platform level, I think it would
be a mistake to treat intra-container concurrency the same way as ephemeral
activations that run and are done. Once the architecture and scheduler
support a heterogeneous mix of resources, treating some actions as
intrinsic operations becomes easier to realize; in other words, it is
complementary to the overall proposed direction if the architecture is done right.
To Alex's point, when you're optimizing for latency, you don't need to be
constrained to UI applications. Maybe this is more of a practical motivation
based on your workloads.
-r
On Jul 2, 2017, at 2:32 AM, Dascalita Dragos <ddrag...@gmail.com> wrote:
I think the opportunities for packing computation at finer granularity
will be there. In your approach you seem to be tending toward taking
monolithic code and overlapping its computation. I tend to think this
will work better with another approach.
+1 to making the serverless system smarter in managing and running code
at scale. I don't think the current state is there right now. There are
limitations which could be improved simply by allowing developers to
control which actions can be invoked concurrently. We could also consider
designing the system to "learn" this intent by observing how the action is
configured by the developer: whether it's an HTTP endpoint or an event handler.
As long as we can improve performance today by allowing concurrency in
actions, and by invoking them faster, why would we not benefit from this
now and update the implementation later, once the system improves? Or are
there better ways available now to match this performance that are not
captured in the proposal?