Quanlong Huang has uploaded this change for review. ( http://gerrit.cloudera.org:8080/22062 )
Change subject: IMPALA-13487: Add profile counters for memory allocation
......................................................................
IMPALA-13487: Add profile counters for memory allocation
This patch adds profile counters that help identify memory-bound
queries, i.e. queries whose duration is mostly spent in memory
operations.
The following counters are added:
Here is an example of a memory-bound query:
Fragment Instance
- RowBatchResetTime: 472.001ms
- TotalStorageWaitTime: 556.001ms
- TotalThreadsInvoluntaryContextSwitches: 5 (5)
- TotalThreadsMajorPageFaults: 0 (0)
- TotalThreadsMinorPageFaults: 578.62K (578620)
- TotalThreadsTotalWallClockTime: 8s809ms
- TotalThreadsSysTime: 1s435ms
- TotalThreadsUserTime: 1s761ms
- TotalThreadsVoluntaryContextSwitches: 4.77K (4769)
- TotalTime: 8s819ms
HDFS_SCAN_NODE (id=0):
- DecompressionTime: 1s038ms
- MaterializeCollectionGetMemTime: 4s153ms
- MaterializeTupleTime: 5s948ms
- ScannerIoWaitTime: 556.001ms
- ScratchBatchMemAllocDuration: 4s290ms
- ScratchBatchMemAllocTimes: 2.56K (2560)
- ScratchBatchMemFreeDuration: 0.000ns
- ScratchBatchMemFreeTimes: 0 (0)
- TotalRawHdfsOpenFileTime: 0.000ns
- TotalRawHdfsReadTime: 247.000ms
- TotalTime: 8s258ms
The fragment instance took 8s819ms to finish, of which 472ms was spent
resetting the final RowBatch. Most of the time was spent in the scan
node (8s258ms), mainly in DecompressionTime + MaterializeTupleTime +
ScannerIoWaitTime, with MaterializeTupleTime (5s948ms) being the largest
of the three. ScratchBatchMemAllocDuration shows that invoking
std::malloc() took 4s290ms. MaterializeCollectionGetMemTime shows that
allocating memory for collections, plus copying memory when doubling the
tuple buffer, took 4s153ms. So materializing the collections took most
of the time.
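The reading of the profile above can be checked with a little
arithmetic. The sketch below is only an illustration: the helper names
(SharePercent, AvgMsPerCall, PrintBreakdown) are hypothetical and not
part of Impala, and the inputs are the counter values from the example
profile converted to milliseconds.

```cpp
#include <cstdint>
#include <cstdio>

// Hypothetical helper: share of `part_ms` in `total_ms`, in percent.
// Returns 0 when the total is not positive to avoid dividing by zero.
double SharePercent(int64_t part_ms, int64_t total_ms) {
  if (total_ms <= 0) return 0.0;
  return 100.0 * static_cast<double>(part_ms) / static_cast<double>(total_ms);
}

// Hypothetical helper: average duration of one call, in milliseconds.
double AvgMsPerCall(int64_t total_ms, int64_t calls) {
  if (calls <= 0) return 0.0;
  return static_cast<double>(total_ms) / static_cast<double>(calls);
}

// Prints a breakdown using the values copied from the example profile.
void PrintBreakdown() {
  const int64_t fragment_total_ms = 8819;      // Fragment Instance TotalTime
  const int64_t scan_total_ms = 8258;          // HDFS_SCAN_NODE TotalTime
  const int64_t materialize_tuple_ms = 5948;   // MaterializeTupleTime
  const int64_t scratch_alloc_ms = 4290;       // ScratchBatchMemAllocDuration
  const int64_t scratch_alloc_calls = 2560;    // ScratchBatchMemAllocTimes

  std::printf("scan node / fragment:        %.1f%%\n",
              SharePercent(scan_total_ms, fragment_total_ms));
  std::printf("MaterializeTupleTime / scan: %.1f%%\n",
              SharePercent(materialize_tuple_ms, scan_total_ms));
  std::printf("scratch alloc / fragment:    %.1f%%\n",
              SharePercent(scratch_alloc_ms, fragment_total_ms));
  std::printf("avg ms per scratch alloc:    %.2f\n",
              AvgMsPerCall(scratch_alloc_ms, scratch_alloc_calls));
}
```

Dividing ScratchBatchMemAllocDuration by ScratchBatchMemAllocTimes is a
quick way to see whether individual allocations are slow, as opposed to
there simply being many of them.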
Implementation of MemPool counters
Add optional MemPoolCounters that are owned by MemPool callers (e.g. the
scanner) so they can outlive the MemPools. Some counters are updated in
the MemPool destructor, so they need the longer life cycle.
MemPoolCounters is currently an optional parameter of the MemPool
constructor. MemPool is widely used in the code base; callers that don't
need to track MemPool counters leave it as nullptr. Currently, only the
MemPool counters of the scratch batch in columnar scanners are tracked.
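The ownership pattern described above can be sketched roughly as
follows. This is not the code in mem-pool.h: PoolCounters and TimedPool
are hypothetical stand-ins, the field names are invented for
illustration, and plain integers replace the runtime profile counters.
It only shows why the counters must be owned by the caller: the pool
still writes to them in its destructor.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical counter bundle, owned by the caller so it outlives the pool.
struct PoolCounters {
  int64_t alloc_count = 0;
  int64_t alloc_ns = 0;
  int64_t free_count = 0;
  int64_t free_ns = 0;
};

// Hypothetical, simplified pool: one malloc() per Allocate() call.
class TimedPool {
 public:
  // `counters` is optional; callers that do not track anything pass nullptr.
  explicit TimedPool(PoolCounters* counters = nullptr) : counters_(counters) {}
  TimedPool(const TimedPool&) = delete;
  TimedPool& operator=(const TimedPool&) = delete;

  // Frees all chunks and records the free time, which is why the counters
  // must still be alive when the pool is destroyed.
  ~TimedPool() {
    const auto start = std::chrono::steady_clock::now();
    for (void* chunk : chunks_) std::free(chunk);
    if (counters_ != nullptr) {
      counters_->free_count += static_cast<int64_t>(chunks_.size());
      counters_->free_ns += ElapsedNs(start);
    }
  }

  // Allocates `size` bytes and records the time spent in malloc().
  void* Allocate(std::size_t size) {
    const auto start = std::chrono::steady_clock::now();
    void* chunk = std::malloc(size);
    if (chunk != nullptr) chunks_.push_back(chunk);
    if (counters_ != nullptr) {
      ++counters_->alloc_count;
      counters_->alloc_ns += ElapsedNs(start);
    }
    return chunk;
  }

 private:
  static int64_t ElapsedNs(std::chrono::steady_clock::time_point start) {
    const auto elapsed = std::chrono::steady_clock::now() - start;
    return static_cast<int64_t>(
        std::chrono::duration_cast<std::chrono::nanoseconds>(elapsed).count());
  }

  PoolCounters* counters_;      // Not owned; may be nullptr.
  std::vector<void*> chunks_;   // Chunks released in the destructor.
};
```

Defaulting the pointer to nullptr keeps existing call sites unchanged,
which matters for a class that is used as widely as MemPool.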
Tests
- Tested manually by reproducing the memory-bound queries
Change-Id: I982315d96e6de20a3616f3bd2a2b4866d1ff4710
---
M be/src/exec/hdfs-columnar-scanner.cc
M be/src/exec/hdfs-columnar-scanner.h
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-scanner.h
M be/src/exec/scratch-tuple-batch.h
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/mem-pool.cc
M be/src/runtime/mem-pool.h
M be/src/util/runtime-profile-counters.h
M be/src/util/runtime-profile.cc
10 files changed, 146 insertions(+), 37 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/22062/1
--
To view, visit http://gerrit.cloudera.org:8080/22062
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I982315d96e6de20a3616f3bd2a2b4866d1ff4710
Gerrit-Change-Number: 22062
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang <[email protected]>