[ 
https://issues.apache.org/jira/browse/MESOS-1862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14161297#comment-14161297
 ] 

Benjamin Mahler commented on MESOS-1862:
----------------------------------------

https://reviews.apache.org/r/26392/

> Performance regression in the Master's http metrics.
> ----------------------------------------------------
>
>                 Key: MESOS-1862
>                 URL: https://issues.apache.org/jira/browse/MESOS-1862
>             Project: Mesos
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.21.0
>            Reporter: Benjamin Mahler
>            Assignee: Benjamin Mahler
>            Priority: Blocker
>
> As part of the change to hold on to terminal unacknowledged tasks in the 
> master, we introduced a performance regression in the following patch:
> https://github.com/apache/mesos/commit/0760b007ad65bc91e8cea377339978c78d36d247
> {noformat}
> commit 0760b007ad65bc91e8cea377339978c78d36d247
> Author: Benjamin Mahler <bmah...@twitter.com>
> Date:   Thu Sep 11 10:48:20 2014 -0700
>     Minor cleanups to the Master code.
>     Review: https://reviews.apache.org/r/25566
> {noformat}
> Rather than keeping a running count of allocated resources, we now compute 
> resources on demand. This was done in order to ignore terminal tasks' 
> resources.
> As a result of this change, the /stats.json and /metrics/snapshot endpoints 
> on the master have slowed down substantially on large clusters.
> {noformat}
> $ time curl localhost:5050/health
> real  0m0.004s
> user  0m0.001s
> sys   0m0.002s
> $ time curl localhost:5050/stats.json > /dev/null
> real  0m15.402s
> user  0m0.001s
> sys   0m0.003s
> $ time curl localhost:5050/metrics/snapshot > /dev/null
> real  0m6.059s
> user  0m0.002s
> sys   0m0.002s
> {noformat}
> {{perf top}} shows some of the resource computation during a request to 
> /stats.json:
> {noformat: perf top}
> Events: 36K cycles
>  10.53%  libc-2.5.so             [.] _int_free
>   9.90%  libc-2.5.so             [.] malloc
>   8.56%  libmesos-0.21.0.so      [.] std::_Rb_tree<process::ProcessBase*, process::ProcessBase*, std::_Identity<process::ProcessBase*>, std::less<process::ProcessBase*>, std::allocator<process::ProcessBase*> >::
>   8.23%  libc-2.5.so             [.] _int_malloc
>   5.80%  libstdc++.so.6.0.8      [.] std::_Rb_tree_increment(std::_Rb_tree_node_base*)
>   5.33%  [kernel]                [k] _raw_spin_lock
>   3.13%  libstdc++.so.6.0.8      [.] std::string::assign(std::string const&)
>   2.95%  libmesos-0.21.0.so      [.] process::SocketManager::exited(process::ProcessBase*)
>   2.43%  libmesos-0.21.0.so      [.] mesos::Resource::MergeFrom(mesos::Resource const&)
>   1.88%  libmesos-0.21.0.so      [.] mesos::internal::master::Slave::used() const
>   1.48%  libstdc++.so.6.0.8      [.] __gnu_cxx::__atomic_add(int volatile*, int)
>   1.45%  [kernel]                [k] find_busiest_group
>   1.41%  libc-2.5.so             [.] free
>   1.38%  libmesos-0.21.0.so      [.] mesos::Value_Range::MergeFrom(mesos::Value_Range const&)
>   1.13%  libmesos-0.21.0.so      [.] mesos::Value_Scalar::MergeFrom(mesos::Value_Scalar const&)
>   1.12%  libmesos-0.21.0.so      [.] mesos::Resource::SharedDtor()
>   1.07%  libstdc++.so.6.0.8      [.] __gnu_cxx::__exchange_and_add(int volatile*, int)
>   0.94%  libmesos-0.21.0.so      [.] google::protobuf::UnknownFieldSet::MergeFrom(google::protobuf::UnknownFieldSet const&)
>   0.92%  libstdc++.so.6.0.8      [.] operator new(unsigned long)
>   0.88%  libmesos-0.21.0.so      [.] mesos::Value_Ranges::MergeFrom(mesos::Value_Ranges const&)
>   0.75%  libmesos-0.21.0.so      [.] mesos::matches(mesos::Resource const&, mesos::Resource const&)
> {noformat}
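
For illustration only, the trade-off described in the issue (a running count versus computing used resources on demand while ignoring terminal tasks) might be sketched as below. The Task and Slave types, the cpus field, and all method names are hypothetical simplifications, not the actual Mesos classes, and this is not necessarily the approach taken in the linked review.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-ins; NOT the real Mesos types.
struct Task {
  double cpus;
  bool terminal;  // Terminal, but possibly still unacknowledged.
};

class Slave {
public:
  // Adds a non-terminal task and returns its index.
  std::size_t addTask(double cpus) {
    tasks.push_back(Task{cpus, false});
    runningCpus += cpus;  // Keep the running count in sync.
    return tasks.size() - 1;
  }

  // Marks a task terminal. The task stays tracked (it may still be
  // unacknowledged), but it no longer counts toward used resources.
  void markTerminal(std::size_t index) {
    assert(index < tasks.size());
    if (!tasks[index].terminal) {
      tasks[index].terminal = true;
      runningCpus -= tasks[index].cpus;
    }
  }

  // On demand: walks every task on each call, so the cost grows with
  // the number of tasks each time a metrics endpoint asks for it.
  double usedOnDemand() const {
    double total = 0.0;
    for (const Task& task : tasks) {
      if (!task.terminal) {
        total += task.cpus;
      }
    }
    return total;
  }

  // Running count: constant time per call, at the cost of updating
  // the counter on every task state change.
  double usedCached() const { return runningCpus; }

private:
  std::vector<Task> tasks;
  double runningCpus = 0.0;
};
```

Both accessors return the same value as long as every state change goes through addTask() and markTerminal(); the difference is only in how much work each query performs.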



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
