[jira] [Commented] (MESOS-6405) Benchmark call ingestion path on the Mesos master.
[ https://issues.apache.org/jira/browse/MESOS-6405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969499#comment-16969499 ]

Andrei Sekretenko commented on MESOS-6405:
------------------------------------------

I ran both benchmarks (the existing SchedulerReconcileTasks_BENCHMARK_Test and Anand's r53113) on the current master head. Both show noticeable added overhead for the V1 API (~10x lower throughput).

Note that both benchmarks run against an empty Mesos master, i.e. what they show is basically the overhead due to HTTP, (de)serialization, authentication, etc.

The issues exposed by these two benchmarks turn out to be totally different, which is not surprising: the first sends a single call and receives a multitude of events, whereas the second sends one call repeatedly but receives no API events.

The largest issue that shows up in SchedulerReconcileTasks_BENCHMARK_Test is inefficiency in the V1 C++ scheduler client library (spawning/terminating an AsyncExecutor process for each event, and so on). See [^SchedulerReconcileTasks_BENCHMARK_Test.SchedulerLibrary_flamegraph.svg]

The benchmark from r53113 (SchedulerCallIngestion_BENCHMARK_Test) shows a surprisingly large overhead from using process::Sequence (master only?) and HttpProxy (both sides?), and also from per-request authentication (on both sides). Compare: [^SchedulerCallIngestion_BENCHMARK_Test.SchedulerLibrary_flamegraph.svg] and [^SchedulerCallIngestion_BENCHMARK_Test.SchedulerDriver_flamegraph.svg]

> Benchmark call ingestion path on the Mesos master.
> --------------------------------------------------
>
> Key: MESOS-6405
> URL: https://issues.apache.org/jira/browse/MESOS-6405
> Project: Mesos
> Issue Type: Improvement
> Components: master, scheduler api
> Reporter: Anand Mazumdar
> Assignee: Anand Mazumdar
> Priority: Critical
> Labels: mesosphere
> Attachments:
>   SchedulerCallIngestion_BENCHMARK_Test.SchedulerDriver_flamegraph.svg,
>   SchedulerCallIngestion_BENCHMARK_Test.SchedulerDriver_stacks.gz,
>   SchedulerCallIngestion_BENCHMARK_Test.SchedulerLibrary_flamegraph.svg,
>   SchedulerCallIngestion_BENCHMARK_Test.SchedulerLibrary_stacks.gz,
>   SchedulerReconcileTasks_BENCHMARK_Test.SchedulerLibrary_flamegraph.svg,
>   SchedulerReconcileTasks_BENCHMARK_Test.SchedulerLibrary_stacks.gz
>
> [~drexin] reported on the user mailing
> [list|http://mail-archives.apache.org/mod_mbox/mesos-user/201610.mbox/%3C6B42E374-9AB7--A315-A6558753E08B%40apple.com%3E]
> that there seems to be a significant regression in performance on the call
> ingestion path on the Mesos master with respect to the scheduler driver (v0 API).
> We should create a benchmark to first get a sense of the numbers and then go
> about fixing the performance issues.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (MESOS-6405) Benchmark call ingestion path on the Mesos master.
[ https://issues.apache.org/jira/browse/MESOS-6405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16624836#comment-16624836 ]

Benjamin Mahler commented on MESOS-6405:
----------------------------------------

[~greggomann] we have the following:

https://github.com/apache/mesos/blob/1.7.0/src/tests/scheduler_tests.cpp#L2164-L2281
[jira] [Commented] (MESOS-6405) Benchmark call ingestion path on the Mesos master.
[ https://issues.apache.org/jira/browse/MESOS-6405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16619534#comment-16619534 ]

Greg Mann commented on MESOS-6405:
----------------------------------

[~bmahler] do you know if this is still an issue? Do we have any benchmarks at this point that can probe this?
[jira] [Commented] (MESOS-6405) Benchmark call ingestion path on the Mesos master.
[ https://issues.apache.org/jira/browse/MESOS-6405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15924548#comment-15924548 ]

Anand Mazumdar commented on MESOS-6405:
---------------------------------------

Reasons for the slowness observed when running the benchmark:

On the master:
- The root cause of most of the slowness observed when comparing v0 and v1 is the extra copies of the {{Request}} object created between the time the decoder parses the object and the time the actual HTTP handler is invoked. We are creating ~5 copies of the {{Request}} object.

On the client:
- The other major cause of slowness is the client (scheduler library). Most of the slowness there comes from the {{Call}} object being copied multiple times, and from a couple of {{Request}} object copies being created in the {{Connection}} abstraction.
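
The kind of copying described in this comment can be illustrated with a minimal C++ sketch. This is not Mesos or libprocess code: the {{Request}} struct below is a hypothetical stand-in with an instrumented copy constructor, and the stage names (decodeByCopy, routeByCopy, decodeByMove, routeByMove, handle, countCopies) are invented for illustration only. It assumes a two-hop path from decoder to handler, and shows that passing the object by value at each hop costs one copy per hop, whereas forwarding it with std::move costs none.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical stand-in for an HTTP request (NOT the libprocess type).
// The copy constructor is instrumented so that copies can be counted.
struct Request {
  std::string body;
  static int copies;

  explicit Request(std::string b) : body(std::move(b)) {}
  Request(const Request& other) : body(other.body) { ++copies; }
  Request(Request&& other) noexcept : body(std::move(other.body)) {}
  Request& operator=(const Request&) = delete;
  Request& operator=(Request&&) = delete;
};

int Request::copies = 0;

// Final handler: takes ownership of the request.
std::size_t handle(Request request) { return request.body.size(); }

// Copying pipeline: each hop passes an lvalue into a by-value parameter,
// so each hop invokes the copy constructor.
std::size_t routeByCopy(Request request) { return handle(request); }
std::size_t decodeByCopy(const Request& request) {
  return routeByCopy(request);
}

// Moving pipeline: the request is forwarded as an rvalue, so the only
// construction is a move into the parameter of handle().
std::size_t routeByMove(Request&& request) {
  return handle(std::move(request));
}
std::size_t decodeByMove(Request&& request) {
  return routeByMove(std::move(request));
}

// Runs one request through the chosen pipeline and returns how many
// times the copy constructor was invoked.
int countCopies(bool useMove) {
  Request::copies = 0;
  Request request{std::string(1024, 'x')};
  if (useMove) {
    decodeByMove(std::move(request));
  } else {
    decodeByCopy(request);
  }
  return Request::copies;
}
```

Calling countCopies(false) and countCopies(true) gives the number of copies made by each pipeline. With a larger body and more hops, the cost of the copying variant grows with both, which is consistent with the observation above that several copies per request dominate the measured overhead.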
[jira] [Commented] (MESOS-6405) Benchmark call ingestion path on the Mesos master.
[ https://issues.apache.org/jira/browse/MESOS-6405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15824789#comment-15824789 ]

Adam B commented on MESOS-6405:
-------------------------------

[~anandmazumdar], [~vinodkone], this ticket/patch hasn't been updated in months. Do you still think we can get it into Mesos 1.2?