Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/15798 )
Change subject: IMPALA-9382: part 1: transposed profile prototype ...................................................................... Patch Set 17: (11 comments) http://gerrit.cloudera.org:8080/#/c/15798/17/be/src/util/runtime-profile-test.cc File be/src/util/runtime-profile-test.cc: http://gerrit.cloudera.org:8080/#/c/15798/17/be/src/util/runtime-profile-test.cc@626 PS17, Line 626: RuntimeProfile::AveragedCounter bytes_avg(TUnit::BYTES, NUM_COUNTERS); : for (int i = 0; i < NUM_COUNTERS; ++i) { : bytes_avg.UpdateCounter(counters[i], i); : } : RuntimeProfile::AveragedCounter::Stats<int64_t> stats = bytes_avg.GetStats<int64_t>(); : EXPECT_EQ(NUM_COUNTERS, stats.num_vals); : EXPECT_EQ(100, stats.min); : EXPECT_EQ(199, stats.max); : EXPECT_EQ(149, stats.mean); : EXPECT_EQ(149, stats.p50); : EXPECT_EQ(174, stats.p75); : EXPECT_EQ(189, stats.p90); : EXPECT_EQ(194, stats.p95); Something else we could test is to/from thrift. AveragedCounter -> TAggCounter -> AveragedCounter http://gerrit.cloudera.org:8080/#/c/15798/17/be/src/util/runtime-profile.cc File be/src/util/runtime-profile.cc: http://gerrit.cloudera.org:8080/#/c/15798/17/be/src/util/runtime-profile.cc@211 PS17, Line 211: thread Should be "thrift"? http://gerrit.cloudera.org:8080/#/c/15798/17/be/src/util/runtime-profile.cc@247 PS17, Line 247: RuntimeProfile::CreateFromThriftHelper I think this should be "RuntimeProfileBase::CreateFromThriftHelper" http://gerrit.cloudera.org:8080/#/c/15798/17/be/src/util/runtime-profile.cc@356 PS17, Line 356: auto& entry = summary_stats_map_[src_entry.first]; Nit: One thing that I found confusing is that the source entry coming from the RuntimeProfile has a different type than the destination entry, even though they are both from their corresponding summary_stats_map_ fields. It made sense once I looked at it again, but it would be good to make it clearer. One option is to tweak the variable names (don't use 'entry' for both or make the names more specific to the structure e.g. src_entry -> src_stats_counter or entry -> agg_entry). Another thought is that maybe this 'auto' shouldn't be an 'auto'. http://gerrit.cloudera.org:8080/#/c/15798/17/be/src/util/runtime-profile.cc@380 PS17, Line 380: Update()/Update() Redundant? (Or needs some other change?) http://gerrit.cloudera.org:8080/#/c/15798/17/be/src/util/runtime-profile.cc@526 PS17, Line 526: Update()/Update() Redundant? (Or needs some other change?) http://gerrit.cloudera.org:8080/#/c/15798/17/be/src/util/runtime-profile.cc@936 PS17, Line 936: "" Nit: Not your change, but it might be clearer to use ROOT_COUNTER here rather than empty string. http://gerrit.cloudera.org:8080/#/c/15798/17/be/src/util/runtime-profile.cc@1080 PS17, Line 1080: // Print a sorted vector of indices in compressed form with subsequent indices : // printed as ranges. E.g. [1, 2, 3, 4, 6] would result in "1-4,6". : static void PrettyPrintIndexRanges(ostream* s, const vector<int32_t>& indices) { : if (indices.empty()) return; : ostream& stream = *s; : int32_t start_idx = indices[0]; : int32_t prev_idx = indices[0]; : for (int i = 0; i < indices.size(); ++i) { : int32_t idx = indices[i]; : if (idx > prev_idx + 1) { : // Start of a new range. Print the previous range of values. : stream << start_idx; : if (start_idx < prev_idx) stream << "-" << prev_idx; : stream << ","; : start_idx = idx; : } : prev_idx = idx; : } : // Print the last range of values. : stream << start_idx; : if (start_idx < prev_idx) stream << "-" << prev_idx; : } Nit: Should this be closer to the code that calls it (down in AggregatedRuntimeProfile below)? http://gerrit.cloudera.org:8080/#/c/15798/17/be/src/util/runtime-profile.cc@1892 PS17, Line 1892: RuntimeProfile::SummaryStatsCounter::ToThrift Do we care about whether all these declarations use RuntimeProfile vs RuntimeProfileBase? It makes sense to me that we expose these types via RuntimeProfile so the rest of the code doesn't need to change. Technically, the underlying type is on RuntimeProfileBase. http://gerrit.cloudera.org:8080/#/c/15798/17/be/src/util/runtime-profile.cc@2066 PS17, Line 2066: "" Nit: Might consider using ROOT_COUNTER rather than "" http://gerrit.cloudera.org:8080/#/c/15798/17/be/src/util/runtime-profile.cc@2128 PS17, Line 2128: void AggregatedRuntimeProfile::PrettyPrintInfoStrings( : ostream* s, const string& prefix) const { : ostream& stream = *s; : if (FLAGS_gen_experimental_profile) { : lock_guard<SpinLock> l(input_profile_name_lock_); : if (!input_profile_names_.empty()) { : // TODO: IMPALA-9382: improve pretty-printing here : stream << prefix : << "Instances: " << boost::algorithm::join(input_profile_names_, ", ") : << endl; : } : } : : { : lock_guard<SpinLock> l(agg_info_strings_lock_); : for (const auto& entry : agg_info_strings_) { : map<string, vector<int32_t>> distinct_vals = GroupDistinctInfoStrings(entry.second); : for (const auto& distinct_entry : distinct_vals) { : stream << prefix << " " << entry.first; : stream << "["; : PrettyPrintIndexRanges(s, distinct_entry.second); : stream << "]: " << distinct_entry.first << endl; : } : } : } : } Nit: From what I understand, this doesn't print anything when it isn't the experimental profile. agg_info_strings_ should be empty. Maybe bail early if it isn't the aggregated profile like PrettyPrintSubclassCounters? -- To view, visit http://gerrit.cloudera.org:8080/15798 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0838c6a0872f57c696267ff4e92d29c08748eb7a Gerrit-Change-Number: 15798 Gerrit-PatchSet: 17 Gerrit-Owner: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com> Gerrit-Reviewer: Kurt Deschler <kdesc...@cloudera.com> Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Comment-Date: Tue, 08 Sep 2020 20:56:27 +0000 Gerrit-HasComments: Yes