Re: Performance findings on OpenWhisk

Brendan McAdams Mon, 02 Apr 2018 14:30:13 -0700

This is great analysis on performance.

I've opened a PR, https://github.com/apache/incubator-openwhisk/pull/3515,
which *may* help some of this. At the moment, all of the akka code -
including the couch interfaces, actors, akka-http, and kafka interfaces are
sharing a single dispatcher. In heavy usage this can lead to contention for
the executor pool, especially if something blocking is holding onto a
thread. The PR adds separate dispatchers for couch & kafka which should
help alleviate having everyone piling onto the same dispatcher.


I haven't looked at the akka-http stuff, which you mentioned, but that may
be another area worth adding a separate dispatcher to tune against. I'd be
curious if you see any change in profile against that PR, versus your
original run.

-b

On Wed, Mar 28, 2018 at 7:59 PM, Dominic Kim <[email protected]> wrote:

> Dear all.
> 
> I have tested OpenWhisk in terms of performance for last few months.
> I want to share my experience on performance of OpenWhisk and discuss the
> right direction to go further.
> I hope this would be a good starting point to improve performance and helps
> someone facing similar issue with me.
>
> As per my observation, there are three big parts which have performance
> issue, akka-http, couchdb, dockerd.
> In this thread, I will only focus on dockerd.
>
> Since OpenWhisk can serve many different kinds of actions from different
> namespaces with different runtimes and container sizes, it would be more
> precise to assume many heterogenous requests come to OpenWhisk.
> In this sense, to use a single action to measure performance of OpenWhisk
> may not be realistic in the production environment.
> 
> With current code base, if we run benchmark with many different number of
> actions, performance is severely degraded.
> When I tested with 100 actions, I got about 30~40 TPS with 3 invoker
> machines with 40 cores, 128GB memory, 2TB ssd.
> (There were 3 invoker containers on each hosts, so total 9 invokers were
> running)
>
> 
>
> There could be some differences based on deployment, configuration, the
> number of components and so on.
> But I got about 20K TPS with same setup using 1 action, it seems obvious
> there is a huge performance degradation along different number of actions.
>
>
> 
> 
>
> After deep investigation, I found the main reason is reuse of containers.
> When we use only 1 action, logically all containers are reused.
> But if we use 100 actions, containers are not fully reused and deletion
> and creation of containers are occurred. (Contianers are reused only if
> namespace and action name are same.)
> Since performance of docker daemon is not good, it causes TPS dropping.
> 
> When one of my colleagues did benchmark against Docker daemon directly, we
> observed only 30~40 TPS with `pause/unpause`.
> (He also performed similar `pause/unpause` test with runc and only got
> about 300 TPS.)
> When we included `run/rm` as well, TPS dropped to 1~10 TPS.
>
> It means if containers are not reused well, it makes dockerd
> create/remote/unpause/pause containers and TPS is dropped due to poor
> performance of docker daemon.
> I think this issue exists in all serverless frameworks which are based on
> docker.
>
> So docker daemon is surely performance bottleneck in OpenWhisk, and it's
> not that simple to resolve it.
> Even though huge performance improvement is made on docker daemon, I doubt
> it can support more than 1K ~ 2K TPS.
> (Because docker is not designed to support this kind of traffic, creating
> about 1K containers in a second is not that plausible.)
> 
>
> Then, in OpenWhisk, main key factor of performance is reuse of containers,
> in other word, reducing loads to docker daemon.
> With 1 actions, there were not many docker traffics, invoker just called
> `/run` requests against running containers again and again.
> However, with 100 actions, many containers should be deleted and created,
> this caused huge performance degradation.
>
>
> *Reuse rate: 95%*
>
> In above graph, the red part in a circle is the ratio of reused(warmed)
> containers.
> Reuse rate was more than 95%, but I got only 6K TPS.
>
> *5 namespaces + 2 actions each with 100 threads*
>
>
> As reuse rate is decreased, TPS is getting worse.
>
> *Reuse rate: 60%*
>
>
>
> *10 namespaces + 10 actions each with 400 threads*
>
>
>
>
> 
> I changed `ContinerProxy` code to reuse container based on runtime and
> container memory size, and also changed `gracePause` time to 10 mins.
> So once container is created, it is not paused in 10 minutes. No matter
> requests come from any namespaces, if runtime and memory size are same,
> containers are reused.
> And I got about 20K TPS back with this changes.
>
> *100 actions with 740 threads*
>
> 
>
>
>
> There could be some side effects and hidden defects of this implementation.
> But anyway I got about 20K TPS with 100 actions.
> This is because all 100 actions are nodejs6 actions and require same
> container memory size, accordingly all containers are reused.
>
> In current code, TPS is decreased as the number of actions or number of
> namespaces are increasing.
> Since there is no limit on the number of namespaces and the number of
> actions in them, more and more namespace and actions are used, performance
> is getting worse.
>
> With this change, now container reuse rate only depends on runtime and
> container memory size.
> OpenWhisk has 6 runtimes(nodejs6, nodejs8, python, php, swift, java), it
> can vastly increase reuse rate.
> And even if more and more actions and namespaces are created, there is no
> performance degradation if same runtime is used.
>
>  One more good thing of this implementation is, it reduces the traffic
> against dockerd, I could run benchmark for long-term without any docker
> daemon problem under heavy loads.
>
> Since I am not that an expert on OpenWhisk, I want to discuss whether
> there could be any issues or side effects with this change.
> With my shallow understanding on OpenWhisk, one security issue could
> happen when user accesses files in a container.
> If user creates any files in the code, files could be exposed to other
> users as same container can be reused among many users.
> (But I am not sure OpenWhisk should also guarantee such kinds of stateful
> approach because serverless is intrinsically stateless.)
> Other than that, I have no idea currently.
>
>
> Apart from my suggestion, I think one more thing worth to try is this one:
> https://github.com/apache/incubator-openwhisk/pull/2795
>
>
> I want to listen any opinions and feedback.
>
>
> Thanks in advance
> Regards
> Dominic.
>
>
>
>

Re: Performance findings on OpenWhisk

Reply via email to