Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14776 )
Change subject: IMPALA-7550: Add documentation to profile counters ...................................................................... Patch Set 1: (10 comments) I like the direction this is going. http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/exec/hdfs-scan-node-base.h File be/src/exec/hdfs-scan-node-base.h: http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/exec/hdfs-scan-node-base.h@525 PS1, Line 525: /// Disk accessed bitmap > I am not too sure about the following two counters. Wondering if we should These are weird cause they're not actually included in the profile directly - they're used for other purposes. I'd leave them for now. http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/exec/hdfs-scan-node-base.cc File be/src/exec/hdfs-scan-node-base.cc: http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/exec/hdfs-scan-node-base.cc@127 PS1, Line 127: PROFILE_DEFINE_SUMMARY_STATS_COUNTER(ParquetUncompressedBytesReadPerColumn, STABLE, > This seems redundant, any advice to modify it will be appreciated. I'm OK with the redundant text, but it might be reasonable to use a template string and substitute in "compressed/uncompressed". I've seen attempts to avoid this by constructing the strings with Substitute(), e.g. num_io_threads_per_rotational_disk_help_msg in disk-io-mgr.cc. This is OK if you're just substituting things into a fixed template but gets confusing if you're piecing together more complex strings.. http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/exec/hdfs-scan-node-base.cc@139 PS1, Line 139: PROFILE_DEFINE_COUNTER(DataCacheHitCount, STABLE, HIGH, TUnit::UNIT, > Did not find the explanation for the following data cache counters. Might n Maybe Joe McDonnell can provide some text here? http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/exec/scan-node.cc File be/src/exec/scan-node.cc: http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/exec/scan-node.cc@82 PS1, Line 82: PROFILE_DEFINE_TIMER(RowBatchQueueGetWaitTime, UNSTABLE, LOW, "Wall clock time that the " > Hi, Tim We could probably define an enum or something like that. I'd be ok with just including it in the description text, I think. http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/exec/scan-node.cc@95 PS1, Line 95: // PROFILE_BYTES_READ_COUNTER. We can replace them with PROFILE_BytesRead.name() where I think doing this makes sense - that would force all counters to be defined in this consistent way. http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/runtime/coordinator-backend-state.cc File be/src/runtime/coordinator-backend-state.cc: http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/runtime/coordinator-backend-state.cc@60 PS1, Line 60: // TODO: the counters are inside the impala namespace, so we need to declare them there, This approach of converting the namespaces where needed seems fine to me. The code is inconsistent about whether to use "using namespace impala" or "namespace impala {" within .cc files. I prefer the second, personally, but there's no strong reason to do one or the other. http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/util/runtime-profile-counters.h File be/src/util/runtime-profile-counters.h: http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/util/runtime-profile-counters.h@154 PS1, Line 154: STABLE, nit: formatting is weird, I'd expect the enum values to only be indented two spaces. http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/util/runtime-profile-counters.h@164 PS1, Line 164: enum class Significance { > I add this Significance to measure if the counter is important to a user. M I think we should probably combine them, because they're pretty correlated - UNSTABLE and DEBUG would only really make sense as low level counters. *Maybe* we could split stable into stable_high_level and stable_low_level or something like that, in which case stable_high_level would be things that are interesting to users and stable_low_level would be mainly exposed for analysis by tools. For me it's important that there are fairly clear criteria for developers to decide how to classify counters. I think with 3 x 5 options it gets too hard to decide how to classify each counter. http://gerrit.cloudera.org:8080/#/c/14776/1/be/src/util/runtime-profile-counters.h@189 PS1, Line 189: // TODO Are you planning to do this? http://gerrit.cloudera.org:8080/#/c/14776/1/www/profile_docs.tmpl File www/profile_docs.tmpl: http://gerrit.cloudera.org:8080/#/c/14776/1/www/profile_docs.tmpl@1 PS1, Line 1: <!-- > Screenshot of this can be found here: We should link to this from somewhere in the debug UI so it's discoverable. Not sure if it merits a top-level link, but we could link from the /query_profile page. -- To view, visit http://gerrit.cloudera.org:8080/14776 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7 Gerrit-Change-Number: 14776 Gerrit-PatchSet: 1 Gerrit-Owner: Jiawei Wang <jiawei.w...@cloudera.com> Gerrit-Reviewer: Andrew Sherman <asher...@cloudera.com> Gerrit-Reviewer: Balazs Jeszenszky <jes...@gmail.com> Gerrit-Reviewer: David Rorke <dro...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Jiawei Wang <jiawei.w...@cloudera.com> Gerrit-Reviewer: Lars Volker <l...@cloudera.com> Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Comment-Date: Tue, 26 Nov 2019 21:28:30 +0000 Gerrit-HasComments: Yes