Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14776 )

Change subject: IMPALA-7550: Add documentation to profile counters
......................................................................


Patch Set 1:

(10 comments)

I like the direction this is going.

http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/exec/hdfs-scan-node-base.h
File be/src/exec/hdfs-scan-node-base.h:

http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/exec/hdfs-scan-node-base.h@525
PS1, Line 525:   /// Disk accessed bitmap
> I am not too sure about the following two counters. Wondering if we should
These are weird cause they're not actually included in the profile directly - 
they're used for other purposes. I'd leave them for now.


http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/exec/hdfs-scan-node-base.cc
File be/src/exec/hdfs-scan-node-base.cc:

http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/exec/hdfs-scan-node-base.cc@127
PS1, Line 127: 
PROFILE_DEFINE_SUMMARY_STATS_COUNTER(ParquetUncompressedBytesReadPerColumn, 
STABLE,
> This seems redundant, any advice to modify it will be appreciated.
I'm OK with the redundant text, but it might be reasonable to use a template 
string and substitute in "compressed/uncompressed".

I've seen attempts to avoid this by constructing the strings with Substitute(), 
e.g. num_io_threads_per_rotational_disk_help_msg in disk-io-mgr.cc. This is OK 
if you're just substituting things into a fixed template but gets confusing if 
you're piecing together more complex strings..


http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/exec/hdfs-scan-node-base.cc@139
PS1, Line 139: PROFILE_DEFINE_COUNTER(DataCacheHitCount, STABLE, HIGH, 
TUnit::UNIT,
> Did not find the explanation for the following data cache counters. Might n
Maybe Joe McDonnell can provide some text here?


http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/exec/scan-node.cc
File be/src/exec/scan-node.cc:

http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/exec/scan-node.cc@82
PS1, Line 82: PROFILE_DEFINE_TIMER(RowBatchQueueGetWaitTime, UNSTABLE, LOW, 
"Wall clock time that the "
> Hi, Tim
We could probably define an enum or something like that. I'd be ok with just 
including it in the description text, I think.


http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/exec/scan-node.cc@95
PS1, Line 95: // PROFILE_BYTES_READ_COUNTER. We can replace them with 
PROFILE_BytesRead.name() where
I think doing this makes sense - that would force all counters to be defined in 
this consistent way.


http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/runtime/coordinator-backend-state.cc@60
PS1, Line 60: // TODO: the counters are inside the impala namespace, so we need 
to declare them there,
This approach of converting the namespaces where needed seems fine to me. The 
code is inconsistent about whether to use "using namespace impala" or 
"namespace impala {" within .cc files. I prefer the second, personally, but 
there's no strong reason to do one or the other.


http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/util/runtime-profile-counters.h
File be/src/util/runtime-profile-counters.h:

http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/util/runtime-profile-counters.h@154
PS1, Line 154:         STABLE,
nit: formatting is weird, I'd expect the enum values to only be indented two 
spaces.


http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/util/runtime-profile-counters.h@164
PS1, Line 164:   enum class Significance {
> I add this Significance to measure if the counter is important to a user. M
I think we should probably combine them, because they're pretty correlated - 
UNSTABLE and DEBUG would only really make sense as low level counters.

*Maybe* we could split stable into stable_high_level and stable_low_level or 
something like that, in which case stable_high_level would be things that are 
interesting to users and stable_low_level would be mainly exposed for analysis 
by tools.

For me it's important that there are fairly clear criteria for developers to 
decide how to classify counters. I think with 3 x 5 options it gets too hard to 
decide how to classify each counter.


http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/util/runtime-profile-counters.h@189
PS1, Line 189:   // TODO
Are you planning to do this?


http://gerrit.cloudera.org:8080/#/c/14776/1/www/profile_docs.tmpl
File www/profile_docs.tmpl:

http://gerrit.cloudera.org:8080/#/c/14776/1/www/profile_docs.tmpl@1
PS1, Line 1: <!--
> Screenshot of this can be found here:
We should link to this from somewhere in the debug UI so it's discoverable.

Not sure if it merits a top-level link, but we could link from the 
/query_profile page.



--
To view, visit http://gerrit.cloudera.org:8080/14776
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7
Gerrit-Change-Number: 14776
Gerrit-PatchSet: 1
Gerrit-Owner: Jiawei Wang <jiawei.w...@cloudera.com>
Gerrit-Reviewer: Andrew Sherman <asher...@cloudera.com>
Gerrit-Reviewer: Balazs Jeszenszky <jes...@gmail.com>
Gerrit-Reviewer: David Rorke <dro...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Jiawei Wang <jiawei.w...@cloudera.com>
Gerrit-Reviewer: Lars Volker <l...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Comment-Date: Tue, 26 Nov 2019 21:28:30 +0000
Gerrit-HasComments: Yes

Reply via email to