[ 
https://issues.apache.org/jira/browse/MESOS-6405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969499#comment-16969499
 ] 

Andrei Sekretenko commented on MESOS-6405:
------------------------------------------

Tried to run both benchmarks (existing SchedulerReconcileTasks_BENCHMARK_Test 
and Anand's r53113) on the current master head.

Both show noticeable added overhead for the V1 API (~10x lower throughput). 
Note that both benchmarks run against an empty Mesos master, i.e. what they 
show is basically the overhead of HTTP, (de)serialization, authentication, 
etc.

It turns out that the issues exposed by these two benchmarks are totally 
different, which is not surprising: the first sends a single call and receives 
a multitude of events, whereas the second sends the same call repeatedly but 
receives no API events.

The largest issue that shows up in SchedulerReconcileTasks_BENCHMARK_Test is 
inefficiency in the V1 C++ scheduler client library (spawning/terminating an 
AsyncExecutor process for each event, and so on).
See [^SchedulerReconcileTasks_BENCHMARK_Test.SchedulerLibrary_flamegraph.svg]
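
To illustrate why per-event spawn/terminate is costly, here is a minimal 
standalone sketch. It is only an analogy: it uses plain std::thread rather 
than libprocess, and every name in it (handlePerEventWorker, ReusedWorker, 
handleReusedWorker) is hypothetical and not part of Mesos. The first function 
creates and joins a worker for each event; the second reuses one long-lived 
worker that drains a queue.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Strategy A: start and join a fresh worker for every event. This is only an
// analogy for the per-event spawn/terminate pattern; it is not libprocess code.
std::size_t handlePerEventWorker(std::size_t events)
{
  std::size_t handled = 0;
  for (std::size_t i = 0; i < events; ++i) {
    std::thread worker([&handled]() { ++handled; });
    worker.join(); // Joined before the next iteration, so there is no race.
  }
  return handled;
}

// Strategy B: one long-lived worker drains a queue of posted callbacks.
class ReusedWorker
{
public:
  ReusedWorker() : thread_([this]() { run(); }) {}

  ~ReusedWorker()
  {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      done_ = true;
    }
    ready_.notify_one();
    thread_.join(); // The worker exits only once the queue is empty.
  }

  void post(std::function<void()> f)
  {
    assert(f != nullptr);
    {
      std::lock_guard<std::mutex> lock(mutex_);
      queue_.push(std::move(f));
    }
    ready_.notify_one();
  }

private:
  void run()
  {
    std::unique_lock<std::mutex> lock(mutex_);
    while (true) {
      ready_.wait(lock, [this]() { return done_ || !queue_.empty(); });
      if (queue_.empty()) {
        return; // done_ is set and nothing is left to run.
      }
      std::function<void()> f = std::move(queue_.front());
      queue_.pop();
      lock.unlock(); // Run the callback without holding the lock.
      f();
      lock.lock();
    }
  }

  std::mutex mutex_;
  std::condition_variable ready_;
  std::queue<std::function<void()>> queue_;
  bool done_ = false;
  std::thread thread_; // Declared last so the other members exist first.
};

std::size_t handleReusedWorker(std::size_t events)
{
  std::size_t handled = 0;
  {
    ReusedWorker worker;
    for (std::size_t i = 0; i < events; ++i) {
      worker.post([&handled]() { ++handled; });
    }
  } // The destructor drains the queue and joins the worker.
  return handled;
}
```

Both functions handle the same number of events; timing them for a large 
event count (e.g. with std::chrono::steady_clock) should show the per-event 
setup/teardown cost, though the actual ratio depends on the platform and says 
nothing quantitative about libprocess itself.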

The benchmark from r53113 (SchedulerCallIngestion_BENCHMARK_Test) shows a 
surprisingly large overhead from process::Sequence (master only?) and 
HttpProxy (both sides?), and also from per-request authentication (on both 
sides). 
Compare: 
[^SchedulerCallIngestion_BENCHMARK_Test.SchedulerLibrary_flamegraph.svg] and 
[^SchedulerCallIngestion_BENCHMARK_Test.SchedulerDriver_flamegraph.svg]

> Benchmark call ingestion path on the Mesos master.
> --------------------------------------------------
>
>                 Key: MESOS-6405
>                 URL: https://issues.apache.org/jira/browse/MESOS-6405
>             Project: Mesos
>          Issue Type: Improvement
>          Components: master, scheduler api
>            Reporter: Anand Mazumdar
>            Assignee: Anand Mazumdar
>            Priority: Critical
>              Labels: mesosphere
>         Attachments: 
> SchedulerCallIngestion_BENCHMARK_Test.SchedulerDriver_flamegraph.svg, 
> SchedulerCallIngestion_BENCHMARK_Test.SchedulerDriver_stacks.gz, 
> SchedulerCallIngestion_BENCHMARK_Test.SchedulerLibrary_flamegraph.svg, 
> SchedulerCallIngestion_BENCHMARK_Test.SchedulerLibrary_stacks.gz, 
> SchedulerReconcileTasks_BENCHMARK_Test.SchedulerLibrary_flamegraph.svg, 
> SchedulerReconcileTasks_BENCHMARK_Test.SchedulerLibrary_stacks.gz
>
>
> [~drexin] reported on the user mailing 
> [list|http://mail-archives.apache.org/mod_mbox/mesos-user/201610.mbox/%3C6B42E374-9AB7-4444-A315-A6558753E08B%40apple.com%3E]
>  that there seems to be a significant regression in performance on the call 
> ingestion path on the Mesos master with respect to the scheduler driver 
> (v0 API). 
> We should create a benchmark to first get a sense of the numbers and then go 
> about fixing the performance issues. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
