----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/38074/#review106958 -----------------------------------------------------------
include/mesos/mesos.proto (line 907) <https://reviews.apache.org/r/38074/#comment165762> s/percentile/percentiles/ src/slave/containerizer/isolators/cgroups/perf_event.hpp (line 97) <https://reviews.apache.org/r/38074/#comment165768> linux/cgroups has an internal {{Result<string> cgroup(pid_t pid, const string& subsystem)}} that is wrapped for cpu, cpuacct, etc. Can you add an wrapper for the perf_event cgroup and use that? Yes, it may be less efficient than looking at the hashmap, but it'll reuse the code and keep consistency. src/slave/containerizer/isolators/cgroups/perf_event.hpp (line 140) <https://reviews.apache.org/r/38074/#comment165769> Format to 80 cols, please. src/slave/containerizer/isolators/cgroups/perf_event.cpp (lines 116 - 118) <https://reviews.apache.org/r/38074/#comment165773> What is this conditional on? It looks like any use of the perf event isolator will enable the scheduler latency functionality... src/slave/containerizer/isolators/cgroups/perf_event.cpp (line 130) <https://reviews.apache.org/r/38074/#comment165795> const string& src/slave/containerizer/isolators/cgroups/perf_event.cpp (lines 466 - 476) <https://reviews.apache.org/r/38074/#comment165774> remove this, see earlier comment. src/slave/containerizer/isolators/cgroups/perf_event.cpp (line 480) <https://reviews.apache.org/r/38074/#comment165794> Suggest that you use a hashmap<pid_t, string>, i.e., just directly store the cgroup for each pid. Slight tradeoff in size but then you can avoid findCgroupByPid() and simplify the code. src/slave/containerizer/isolators/cgroups/perf_event.cpp (line 482) <https://reviews.apache.org/r/38074/#comment165775> const string& src/slave/containerizer/isolators/cgroups/perf_event.cpp (lines 483 - 484) <https://reviews.apache.org/r/38074/#comment165776> Please clarify the language, is it "should not" or "does not" :-) src/slave/containerizer/isolators/cgroups/perf_event.cpp (line 485) <https://reviews.apache.org/r/38074/#comment165778> const Future<>& src/slave/containerizer/isolators/cgroups/perf_event.cpp (line 486) <https://reviews.apache.org/r/38074/#comment165779> Is each Future guaranteed to be ready? src/slave/containerizer/isolators/cgroups/perf_event.cpp (line 487) <https://reviews.apache.org/r/38074/#comment165780> Please clarify language, "should" or "is"? src/slave/containerizer/isolators/cgroups/perf_event.cpp (line 489) <https://reviews.apache.org/r/38074/#comment165781> Call this variable child? src/slave/containerizer/isolators/cgroups/perf_event.cpp (line 490) <https://reviews.apache.org/r/38074/#comment165782> We don't use CHECKs in code unless it truly is an unrecoverable error. Please fail the future or act appropriately. src/slave/containerizer/isolators/cgroups/perf_event.cpp (lines 491 - 492) <https://reviews.apache.org/r/38074/#comment165785> No snake_case please, here and everywhere. Use C++ cast, not C style. src/slave/containerizer/isolators/cgroups/perf_event.cpp (line 492) <https://reviews.apache.org/r/38074/#comment165786> ditch the intermediate variable and just insert the cast child.get()? and, using my suggestion above, processes[pid] = cgroup src/slave/containerizer/isolators/cgroups/perf_event.cpp (line 496) <https://reviews.apache.org/r/38074/#comment165797> You don't actually use this as a map? It would be much clearer to use a vector and explicitly sort with a comparision function on the timestamp. Suggest just calling it events. ``` events.push_back(event); } std::sort(begin(events), end(events), [](...){}); ``` src/slave/containerizer/isolators/cgroups/perf_event.cpp (line 506) <https://reviews.apache.org/r/38074/#comment165798> const TraceEvent& src/slave/containerizer/isolators/cgroups/perf_event.cpp (line 514) <https://reviews.apache.org/r/38074/#comment165814> It would be nice to have helpers which returned the correct type for event fields. src/slave/containerizer/isolators/cgroups/perf_event.cpp (line 515) <https://reviews.apache.org/r/38074/#comment165787> No CHECKs. src/slave/containerizer/isolators/cgroups/perf_event.cpp (line 516) <https://reviews.apache.org/r/38074/#comment165788> C++ cast. src/slave/containerizer/isolators/cgroups/perf_event.cpp (line 526) <https://reviews.apache.org/r/38074/#comment165789> ditto. src/slave/containerizer/isolators/cgroups/perf_event.cpp (line 527) <https://reviews.apache.org/r/38074/#comment165790> ditto. src/slave/containerizer/isolators/cgroups/perf_event.cpp (line 529) <https://reviews.apache.org/r/38074/#comment165791> ditto. src/slave/containerizer/isolators/cgroups/perf_event.cpp (line 530) <https://reviews.apache.org/r/38074/#comment165792> ditto. src/slave/containerizer/isolators/cgroups/perf_event.cpp (lines 536 - 537) <https://reviews.apache.org/r/38074/#comment165800> Just continue if pid == 0, rather than nesting this conditional block on != 0. src/slave/containerizer/isolators/cgroups/perf_event.cpp (line 539) <https://reviews.apache.org/r/38074/#comment165793> ditto. src/slave/containerizer/isolators/cgroups/perf_event.cpp (line 541) <https://reviews.apache.org/r/38074/#comment165801> Are states enumerated somewhere? src/slave/containerizer/isolators/cgroups/perf_event.cpp (line 544) <https://reviews.apache.org/r/38074/#comment165802> newline between blocks src/slave/containerizer/isolators/cgroups/perf_event.cpp (line 554) <https://reviews.apache.org/r/38074/#comment165804> sort the latencies in schedLatency here? src/slave/containerizer/isolators/cgroups/perf_event.cpp (line 556) <https://reviews.apache.org/r/38074/#comment165803> const string& grab the cgroup and vector together rather than indexing repeatedly... ``` foreachpair(const string& cgroup, const vector<uint64_t>& latencies, schedLatency) ``` src/slave/containerizer/isolators/cgroups/perf_event.cpp (line 606) <https://reviews.apache.org/r/38074/#comment165805> why not call the variable threads? src/slave/containerizer/isolators/cgroups/perf_event.cpp (line 608) <https://reviews.apache.org/r/38074/#comment165806> also log the error, {{process.error()}}. src/slave/containerizer/isolators/cgroups/perf_event.cpp (lines 608 - 609) <https://reviews.apache.org/r/38074/#comment165815> Could you just skip rather than failing for all cgroups? src/slave/containerizer/isolators/cgroups/perf_event.cpp (line 611) <https://reviews.apache.org/r/38074/#comment165810> as above, suggest stored pid -> cgroup mapping denormalized. src/slave/containerizer/isolators/cgroups/perf_event.cpp (line 615) <https://reviews.apache.org/r/38074/#comment165809> const string& The key is an event? If so, it should be named as such. src/slave/containerizer/isolators/cgroups/perf_event.cpp (lines 619 - 621) <https://reviews.apache.org/r/38074/#comment165807> How expensive is this...? src/slave/containerizer/isolators/cgroups/perf_event.cpp (lines 623 - 626) <https://reviews.apache.org/r/38074/#comment165808> Users have different kernels, code should determine the version at run time and act accordingly. src/slave/containerizer/isolators/cgroups/perf_event.cpp (line 633) <https://reviews.apache.org/r/38074/#comment165811> Include error message _events.error(). src/slave/containerizer/isolators/cgroups/perf_event.cpp (line 636) <https://reviews.apache.org/r/38074/#comment165812> const string&. I don't think there's another cgroup variable in scope so just use cgroup, not _cgroup. - Ian Downes On Sept. 29, 2015, 5:15 p.m., Cong Wang wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/38074/ > ----------------------------------------------------------- > > (Updated Sept. 29, 2015, 5:15 p.m.) > > > Review request for mesos, Chi Zhang, Ian Downes, Jie Yu, and Vinod Kone. > > > Bugs: MESOS-2769 > https://issues.apache.org/jira/browse/MESOS-2769 > > > Repository: mesos > > > Description > ------- > > Finally, calculate schedule latency with the sched trace events, and add it > to our statistics > > > Diffs > ----- > > include/mesos/mesos.proto 4a16be1f570769f3ce42a50a9da9f4fb1c227999 > src/slave/containerizer/isolators/cgroups/perf_event.hpp > 1f722ef3ef7ab7fce5542d4affae961d6cec2406 > src/slave/containerizer/isolators/cgroups/perf_event.cpp > 03035dfbfb84470ba39ed9ecfd1eb73890e3f784 > src/slave/flags.hpp f76f0f6a0c7a01c7f061a225d7f6ef52be0ee7b5 > src/slave/flags.cpp 029aa1eb00e1aa3e92c2155925022c17fd905862 > > Diff: https://reviews.apache.org/r/38074/diff/ > > > Testing > ------- > > manual tests > > > Thanks, > > Cong Wang > >