Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15798 )

Change subject: IMPALA-9382: part 1: transposed profile prototype
......................................................................


Patch Set 17:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/15798/17/be/src/util/runtime-profile-test.cc
File be/src/util/runtime-profile-test.cc:

http://gerrit.cloudera.org:8080/#/c/15798/17/be/src/util/runtime-profile-test.cc@626
PS17, Line 626:   RuntimeProfile::AveragedCounter bytes_avg(TUnit::BYTES, 
NUM_COUNTERS);
              :   for (int i = 0; i < NUM_COUNTERS; ++i) {
              :     bytes_avg.UpdateCounter(counters[i], i);
              :   }
              :   RuntimeProfile::AveragedCounter::Stats<int64_t> stats = 
bytes_avg.GetStats<int64_t>();
              :   EXPECT_EQ(NUM_COUNTERS, stats.num_vals);
              :   EXPECT_EQ(100, stats.min);
              :   EXPECT_EQ(199, stats.max);
              :   EXPECT_EQ(149, stats.mean);
              :   EXPECT_EQ(149, stats.p50);
              :   EXPECT_EQ(174, stats.p75);
              :   EXPECT_EQ(189, stats.p90);
              :   EXPECT_EQ(194, stats.p95);
Something else we could test is to/from thrift. AveragedCounter -> TAggCounter 
-> AveragedCounter


http://gerrit.cloudera.org:8080/#/c/15798/17/be/src/util/runtime-profile.cc
File be/src/util/runtime-profile.cc:

http://gerrit.cloudera.org:8080/#/c/15798/17/be/src/util/runtime-profile.cc@211
PS17, Line 211: thread
Should be "thrift"?


http://gerrit.cloudera.org:8080/#/c/15798/17/be/src/util/runtime-profile.cc@247
PS17, Line 247: RuntimeProfile::CreateFromThriftHelper
I think this should be "RuntimeProfileBase::CreateFromThriftHelper"


http://gerrit.cloudera.org:8080/#/c/15798/17/be/src/util/runtime-profile.cc@356
PS17, Line 356:       auto& entry = summary_stats_map_[src_entry.first];
Nit: One thing that I found confusing is that the source entry coming from the 
RuntimeProfile has a different type than the destination entry, even though 
they are both from their corresponding summary_stats_map_ fields. It made sense 
once I looked at it again, but it would be good to make it clearer. One option 
is to tweak the variable names (don't use 'entry' for both or make the names 
more specific to the structure e.g. src_entry -> src_stats_counter or entry -> 
agg_entry). Another thought is that maybe this 'auto' shouldn't be an 'auto'.


http://gerrit.cloudera.org:8080/#/c/15798/17/be/src/util/runtime-profile.cc@380
PS17, Line 380: Update()/Update()
Redundant? (Or needs some other change?)


http://gerrit.cloudera.org:8080/#/c/15798/17/be/src/util/runtime-profile.cc@526
PS17, Line 526: Update()/Update()
Redundant? (Or needs some other change?)


http://gerrit.cloudera.org:8080/#/c/15798/17/be/src/util/runtime-profile.cc@936
PS17, Line 936: ""
Nit: Not your change, but it might be clearer to use ROOT_COUNTER here rather 
than empty string.


http://gerrit.cloudera.org:8080/#/c/15798/17/be/src/util/runtime-profile.cc@1080
PS17, Line 1080: // Print a sorted vector of indices in compressed form with 
subsequent indices
               : // printed as ranges. E.g. [1, 2, 3, 4, 6] would result in 
"1-4,6".
               : static void PrettyPrintIndexRanges(ostream* s, const 
vector<int32_t>& indices) {
               :   if (indices.empty()) return;
               :   ostream& stream = *s;
               :   int32_t start_idx = indices[0];
               :   int32_t prev_idx = indices[0];
               :   for (int i = 0; i < indices.size(); ++i) {
               :     int32_t idx = indices[i];
               :     if (idx > prev_idx + 1) {
               :       // Start of a new range. Print the previous range of 
values.
               :       stream << start_idx;
               :       if (start_idx < prev_idx) stream << "-" << prev_idx;
               :       stream << ",";
               :       start_idx = idx;
               :     }
               :     prev_idx = idx;
               :   }
               :   // Print the last range of values.
               :   stream << start_idx;
               :   if (start_idx < prev_idx) stream << "-" << prev_idx;
               : }
Nit: Should this be closer to the code that calls it (down in 
AggregatedRuntimeProfile below)?


http://gerrit.cloudera.org:8080/#/c/15798/17/be/src/util/runtime-profile.cc@1892
PS17, Line 1892: RuntimeProfile::SummaryStatsCounter::ToThrift
Do we care about whether all these declarations use RuntimeProfile vs 
RuntimeProfileBase?

It makes sense to me that we expose these types via RuntimeProfile so the rest 
of the code doesn't need to change. Technically, the underlying type is on 
RuntimeProfileBase.


http://gerrit.cloudera.org:8080/#/c/15798/17/be/src/util/runtime-profile.cc@2066
PS17, Line 2066: ""
Nit: Might consider using ROOT_COUNTER rather than ""


http://gerrit.cloudera.org:8080/#/c/15798/17/be/src/util/runtime-profile.cc@2128
PS17, Line 2128: void AggregatedRuntimeProfile::PrettyPrintInfoStrings(
               :     ostream* s, const string& prefix) const {
               :   ostream& stream = *s;
               :   if (FLAGS_gen_experimental_profile) {
               :     lock_guard<SpinLock> l(input_profile_name_lock_);
               :     if (!input_profile_names_.empty()) {
               :       // TODO: IMPALA-9382: improve pretty-printing here
               :       stream << prefix
               :              << "Instances: " << 
boost::algorithm::join(input_profile_names_, ", ")
               :              << endl;
               :     }
               :   }
               :
               :   {
               :     lock_guard<SpinLock> l(agg_info_strings_lock_);
               :     for (const auto& entry : agg_info_strings_) {
               :       map<string, vector<int32_t>> distinct_vals = 
GroupDistinctInfoStrings(entry.second);
               :       for (const auto& distinct_entry : distinct_vals) {
               :         stream << prefix << "  " << entry.first;
               :         stream << "[";
               :         PrettyPrintIndexRanges(s, distinct_entry.second);
               :         stream << "]: " << distinct_entry.first << endl;
               :       }
               :     }
               :   }
               : }
Nit: From what I understand, this doesn't print anything when it isn't the 
experimental profile. agg_info_strings_ should be empty. Maybe bail early if it 
isn't the aggregated profile like PrettyPrintSubclassCounters?



--
To view, visit http://gerrit.cloudera.org:8080/15798
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0838c6a0872f57c696267ff4e92d29c08748eb7a
Gerrit-Change-Number: 15798
Gerrit-PatchSet: 17
Gerrit-Owner: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kdesc...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Comment-Date: Tue, 08 Sep 2020 20:56:27 +0000
Gerrit-HasComments: Yes

Reply via email to