Hello Matthew Jacobs,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/4371

to look at the new patch set (#6).

Change subject: IMPALA-3823: Add timer to measure Parquet footer reads
......................................................................

IMPALA-3823: Add timer to measure Parquet footer reads

It's been observed that Parquet footer reads perform poorly, especially
when reading from S3. This patch adds a timer, "FooterProcessingTimer",
which keeps track of the average time each split of each scan node
spends reading and processing the Parquet footer.

Added a new utility counter called SummaryStatsCounter, which keeps
track of the min, max and average of the values seen so far. This
counter is used to calculate the min, max and average time taken to
scan and process Parquet footers per query per node.

The RuntimeProfile has also been updated to keep track of, display and
serialize this new counter to thrift.

BE tests have been added to verify that this counter behaves as
expected.

Change-Id: Icf87bad90037dd0cea63b10c537382ec0f980cbf
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/util/runtime-profile-counters.h
M be/src/util/runtime-profile-test.cc
M be/src/util/runtime-profile.cc
M be/src/util/runtime-profile.h
M common/thrift/RuntimeProfile.thrift
M tests/query_test/test_scanners.py
8 files changed, 286 insertions(+), 16 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/4371/6
--
To view, visit http://gerrit.cloudera.org:8080/4371
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Icf87bad90037dd0cea63b10c537382ec0f980cbf
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil <sail...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhe...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <m...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sail...@cloudera.com>
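
For readers who want a concrete picture of the min/max/average counter described in the commit message, here is a minimal sketch of how such a counter could work. It is not the code from this patch set: the class name SummaryStatsSketch, its method names, the mutex-based locking and the integer average are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>
#include <mutex>

// Hypothetical sketch of a min/max/average counter. The names and the
// locking strategy are assumptions, not Impala's SummaryStatsCounter.
class SummaryStatsSketch {
 public:
  // Folds one observation (e.g. one footer-processing time in
  // nanoseconds) into the running summary.
  void UpdateCounter(int64_t new_value) {
    std::lock_guard<std::mutex> l(lock_);
    min_ = std::min(min_, new_value);
    max_ = std::max(max_, new_value);
    sum_ += new_value;
    ++count_;
  }

  // Smallest value seen so far; INT64_MAX if nothing was recorded.
  int64_t MinValue() const {
    std::lock_guard<std::mutex> l(lock_);
    return min_;
  }

  // Largest value seen so far; INT64_MIN if nothing was recorded.
  int64_t MaxValue() const {
    std::lock_guard<std::mutex> l(lock_);
    return max_;
  }

  // Integer average of all values seen so far; 0 if nothing was recorded.
  int64_t AverageValue() const {
    std::lock_guard<std::mutex> l(lock_);
    return count_ == 0 ? 0 : sum_ / count_;
  }

  // Number of observations recorded so far.
  int64_t TotalNumValues() const {
    std::lock_guard<std::mutex> l(lock_);
    return count_;
  }

 private:
  mutable std::mutex lock_;  // Guards all fields below.
  int64_t min_ = std::numeric_limits<int64_t>::max();
  int64_t max_ = std::numeric_limits<int64_t>::min();
  int64_t sum_ = 0;
  int64_t count_ = 0;
};
```

In this sketch, each scanner would call UpdateCounter() once per split with the measured footer-processing time, and the profile would read the min, max and average when it is displayed or serialized. Storing the sum and count rather than a running average keeps the update exact and lets the average be computed only when it is read.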