Hi folks,

Here are some notes from the performance meeting today.

(1) First, I did a demo of FlameScope, which you can find here:
https://github.com/Netflix/flamescope

It's a very useful tool, and hopefully we can make it easier for users to
generate data that they can drop into FlameScope when reporting
performance issues. One open question is how
`perf record --call-graph dwarf` compares to `perf record -g` with Mesos
compiled with frame pointers. I haven't had time to check this yet.
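
For anyone who wants to try the two capture modes side by side, here is a
rough sketch of a helper that builds the perf command lines. The function
names, the 49 Hz sampling rate, the 60 second duration, and the file names
are all my assumptions for illustration, not something we agreed on in the
meeting; `--call-graph fp` is the frame-pointer mode that `-g` defaults to.

```python
import subprocess


def perf_record_cmd(call_graph="fp", seconds=60, freq=49, data_file="perf.data"):
    """Build a system-wide `perf record` command (hypothetical helper).

    call_graph: "fp" for frame pointers (what `-g` uses by default) or
    "dwarf" for DWARF-based unwinding.
    """
    if call_graph not in ("fp", "dwarf"):
        raise ValueError("call_graph must be 'fp' or 'dwarf'")
    return [
        "perf", "record",
        "-F", str(freq),              # sampling frequency in Hz (assumed value)
        "-a",                         # sample all CPUs
        "--call-graph", call_graph,   # stack unwinding method
        "-o", data_file,              # where to write the raw samples
        "--", "sleep", str(seconds),  # capture window
    ]


def perf_script_cmd(data_file="perf.data"):
    """Build the `perf script` command that dumps samples as text."""
    return ["perf", "script", "--header", "-i", data_file]


def capture(out_path, **kwargs):
    """Record a profile, then write the text dump to out_path.

    Usually needs root. The resulting file is what gets dropped into the
    tool's profile directory.
    """
    data_file = kwargs.get("data_file", "perf.data")
    subprocess.run(perf_record_cmd(**kwargs), check=True)
    with open(out_path, "w") as out:
        subprocess.run(perf_script_cmd(data_file), stdout=out, check=True)
```

Calling `capture("stacks.fp", call_graph="fp")` and then
`capture("stacks.dwarf", call_graph="dwarf")` on the same host would give
two profiles to compare, assuming the frame-pointer run uses a Mesos build
compiled with frame pointers.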

When playing with the tool, it was easy to find some hot spots in the
cluster I was looking at (which was not necessarily representative). For
the agent, Jie filed:

https://issues.apache.org/jira/browse/MESOS-8901

And for the master, I noticed that metrics, state JSON generation (no
surprise), and a particular spot in the allocator were very expensive.

We'd like to address metrics by migrating to push gauges (Zhitao has
offered to help with this effort):

https://issues.apache.org/jira/browse/MESOS-8914

We'd like to address state generation by streaming state into a
separate actor (and providing filtering as well). This will be further
investigated / prioritized very soon:

https://issues.apache.org/jira/browse/MESOS-8345

(2) Kapil discussed benchmarks for the long-standing "offer starvation"
issue:

https://issues.apache.org/jira/browse/MESOS-3202

I'll send out an email or document soon with some background on this issue
as well as our options to address it.

Let me know if you have any questions or feedback!

Ben
