Right, I think the UI workflows are just an example of apps that are latency 
sensitive in general.

I had a discussion with Stephen Fink on the matter of detecting on our own 
that an action is latency sensitive, e.g. by using the blocking parameter or, 
as mentioned, the user's configuration in terms of web action vs. non-web 
action. The conclusion there was that we probably cannot reliably detect 
latency sensitivity without asking the user to declare it. Having such an 
option has implications for other aspects of the platform: why would one not 
choose that option?

To Rodric's points, I think there are two topics to discuss:

1. The programming model: The current model encourages users to break their applications apart into 
"functions" that take a payload and return a payload (see the first sketch below). Having a deployment 
model outlined could, as noted, encourage users to use OpenWhisk as a way to rapidly deploy/undeploy 
their usual webserver-based applications. The current model is nice in that it solves a lot of problems 
for the customer in terms of scalability and "crash safety".

2. Raw throughput of our deployment model: Setting those concerns aside, I think it is valid 
to explore concurrent invocations of actions on the same container. This does not 
necessarily mean that users will start to deploy monolithic apps as noted above, but they 
certainly could. Keeping our JSON-in/JSON-out model, at least for now, could encourage 
users to continue to think in functions. A per-action toggle which is disabled by 
default might be a good way to start here, since many users might need to change action 
code to support that notion (the second sketch below shows why), and for some applications 
it might not be valid at all. I think it was also already noted that this imposes some of 
the "old-fashioned" problems on the user, like: how many concurrent requests will my 
action be able to handle? That kinda defeats the seamless-scalability point of serverless.
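
For reference on point 1, here is a minimal sketch of the "functions" model, 
i.e. an action as a pure mapping from JSON payload to JSON payload (the code 
and names are illustrative only, not from the proposal):

    // A JSON-in/JSON-out action: a function from payload to payload.
    // Because it keeps no state between invocations, the platform can
    // scale it out and restart it freely.
    export function main(params: { name?: string }): { greeting: string } {
      const name = params.name ?? "world";
      return { greeting: `Hello, ${name}!` };
    }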
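
On point 2, a hypothetical sketch of why the concurrency toggle cannot simply 
be flipped on for existing actions (again, the code is invented for 
illustration):

    // Safe with one invocation per container, broken with concurrent
    // invocations: all in-flight invocations share this module-level
    // variable and interleave at every await.
    let requestCount = 0;

    export async function main(params: { user: string }): Promise<{ count: number }> {
      requestCount += 1;
      const snapshot = requestCount;
      await lookupProfile(params.user); // other invocations may run here
      // `snapshot` and `requestCount` can now disagree.
      return { count: snapshot };
    }

    // Placeholder for whatever I/O the action actually performs.
    async function lookupProfile(user: string): Promise<void> {}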

Cheers,
Markus

On July 2, 2017 at 10:42, Rodric Rabbah <rod...@gmail.com> wrote:

The thoughts I shared around how to realize better packing with intrinsic 
actions are aligned with your goals: getting more compute density with a 
smaller number of machines. This is a very worthwhile goal.

I noted earlier that packing more activations into a single container warrants 
a different resource manager with its own container life-cycle management 
(e.g., it's almost at the level of: provision a container for me quickly and 
let me have it to run my monolithic code for as long as I want). Some 
challenges were already mentioned with respect to sharing state, resource 
leaks and possible data races. Perhaps defining the intra-container resource 
isolation model - processes, threads, "node vm", ... - would be helpful as you 
refine your proposal. This can address how one might deal with intra-container 
noisy neighbors as well.

Hence, in terms of resource management at the platform level, I think it would 
be a mistake to treat intra-container concurrency the same way as ephemeral 
activations that are run and done. Once the architecture and scheduler 
support a heterogeneous mix of resources, treating some actions as 
intrinsic operations becomes easier to realize; in other words, it is 
complementary to the overall proposed direction if the architecture is done 
right.
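
To illustrate one of the isolation options above, here is a sketch of the 
"node vm" variant, assuming a runtime that evaluates each activation's code in 
a fresh V8 context (illustrative only, not how the platform works today):

    import * as vm from "vm";

    // Run each activation in its own V8 context so activations sharing
    // a container do not share globals. Note this isolates state only,
    // not CPU or memory: a noisy neighbor can still monopolize the
    // event loop.
    function runIsolated(actionCode: string, params: object): unknown {
      const sandbox = vm.createContext({ params, result: undefined });
      // By convention here, the action code assigns its output to `result`.
      vm.runInContext(actionCode, sandbox, { timeout: 1000 });
      return sandbox.result;
    }

    console.log(runIsolated("result = { doubled: params.x * 2 };", { x: 21 }));
    // -> { doubled: 42 }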

To Alex's point, when you're optimizing for latency, you don't need to be 
constrained to UI applications. Maybe this is more of a practical motivation 
based on your workloads.

-r

On Jul 2, 2017, at 2:32 AM, Dascalita Dragos <ddrag...@gmail.com> wrote:

I think the opportunities for packing computation at finer granularity
will be there. In your approach it seems you're tending toward taking
monolithic code and overlapping its computation. I tend to think this
will work better with another approach.

+1 to making the serverless system smarter in managing and running the code
at scale. I don't think the current implementation is there right now. There
are limitations which could be lifted by simply allowing developers to
control which actions can be invoked concurrently. We could also consider
designing the system to "learn" this intent by observing how the action is
configured by the developer: whether it's an HTTP endpoint or an event handler.
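
To make the "learn the intent" idea concrete, a hypothetical heuristic; the 
types and names are invented for illustration and not part of any existing 
API:

    // Infer the developer's intent from how the action is wired up.
    interface ActionConfig {
      webExported: boolean;   // exposed as an HTTP endpoint (web action)
      feedTriggered: boolean; // fired by an event source via a trigger
    }

    function isLikelyLatencySensitive(cfg: ActionConfig): boolean {
      // Someone waiting on an HTTP response cares about latency; an
      // event handler usually cares about throughput. The blocking
      // parameter could be folded in as a further signal.
      return cfg.webExported && !cfg.feedTriggered;
    }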

Given that we can improve performance today by allowing concurrency in
actions, and by invoking them faster, why would we not benefit from this
now and update the implementation later, once the system improves? Or are
there better ways available now to match this performance that are not
captured in the proposal?
