Hello Quanlong Huang, Yongzhi Chen, Vihang Karajgaonkar, Impala Public Jenkins,
I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/14611

to look at the new patch set (#7).

Change subject: IMPALA-9110: Add table loading time break-down metrics for HdfsTable
......................................................................

IMPALA-9110: Add table loading time break-down metrics for HdfsTable

A. Problem:
Catalog table loading currently records only the total loading time.
We need a break-down, i.e. more detailed timing for each loading
function. Also, table schema loading is not counted in load-duration,
so we need additional metrics for that.

B. Solution:
- We added "hms-load-tbl-schema", "load-duration.all-column-stats",
"load-duration.all-partitions.total-time" and
"load-duration.all-partitions.file-metadata". We also log the
loadValidWriteIdList() time. This gives a more detailed break-down of
the table loading time.

The table loading time metrics for HDFS tables are in the following
hierarchy:
- Table Schema Loading
- Table Metadata Loading - total time
    - all column stats loading time
    - ValidWriteIds loading time
    - all partitions loading time - total time
        - file metadata loading time
    - storage-metadata-loading-time (standalone metric)

1. Table Schema Loading:
* Meaning: The time for HMS to fetch the table object plus the actual
  schema loading time. Normally, the code path is
  "msClient.getHiveClient().getTable(dbName, tblName)"
* Metric: hms-load-tbl-schema

2. Table Metadata Loading -- total time
* Meaning: The time to load all the table metadata.
  The code path is HdfsTable.load()
* Metric: load-duration.total-time

2.1 Table Metadata Loading -- all column stats
* Meaning: Load all column stats; this is part of table metadata
  loading.
  The code path is HdfsTable.loadAllColumnStats()
* Metric: load-duration.all-column-stats

2.2 Table Metadata Loading -- loadValidWriteIdList
* Meaning: Fetch ValidWriteIds from HMS.
  The code path is HdfsTable.loadValidWriteIdList()
* Metric: No metric is recorded for this one. Instead, a debug log is
  generated.

2.3 Table Metadata Loading -- storage metadata loading (standalone metric)
* Meaning: Time spent on storage-related file system operations during
  metadata loading, i.e. the amount of time spent loading metadata
  from the underlying storage layer.
* Metric: Renamed to load-duration.storage-metadata. This metric was
  introduced by IMPALA-7322.

2.4 Table Metadata Loading -- load all partitions
* Meaning: The time to load all partitions, including fetching all
  partitions from HMS and loading all partitions.
  The code path is MetaStoreUtil.fetchAllPartitions() and
  HdfsTable.loadAllPartitions()
* Metric: load-duration.all-partitions

2.4.1 Table Metadata Loading -- load all partitions -- load file metadata
* Meaning: The file metadata loading for all partitions. (This is
  part of 2.4.)
  Code path: loadFileMetadataForPartitions() inside
  loadAllPartitions()
* Metric: load-duration.all-partitions.file-metadata

C. Other changes in this commit:
1. Add PrintUtils.printTimeNs to pretty-print times in the frontend
2. Add an explanation for the table loading manager

D. Test:
1. Add unit tests for the PrintUtils.printTime() function
2. Manually ran describe table and verified that the table loading
   metrics are correct.
Change-Id: I5381f9316df588b2004876c6cd9fb7e674085b10
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java
M fe/src/main/java/org/apache/impala/common/PrintUtils.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/util/PrintUtilsTest.java
10 files changed, 186 insertions(+), 35 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/14611/7
--
To view, visit http://gerrit.cloudera.org:8080/14611
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5381f9316df588b2004876c6cd9fb7e674085b10
Gerrit-Change-Number: 14611
Gerrit-PatchSet: 7
Gerrit-Owner: Jiawei Wang <jiawei.w...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Jiawei Wang <jiawei.w...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vih...@cloudera.com>
Gerrit-Reviewer: Yongzhi Chen <yc...@cloudera.com>
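
Illustration of the nested timing described in the commit message:
the sketch below is not the code from this patch. It only shows how a
total time can enclose its sub-steps, so that each inner metric is
bounded by the outer one. The class LoadTimeBreakdown, its methods and
the pause() helper are hypothetical names invented for this example;
only the metric name strings are taken from the commit message. The
sleeps stand in for the real loading work.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical helper (not Impala code): accumulates elapsed nanoseconds per metric name. */
final class LoadTimeBreakdown {
  private final Map<String, Long> elapsedNs = new LinkedHashMap<>();

  /** Runs the step, adds its wall-clock time to the named metric, and returns the step's result. */
  public <T> T time(String metric, Supplier<T> step) {
    long start = System.nanoTime();
    try {
      return step.get();
    } finally {
      // Recorded after the step finishes, so nested steps are recorded before their parent.
      elapsedNs.merge(metric, System.nanoTime() - start, Long::sum);
    }
  }

  /** Returns the accumulated time for a metric, or 0 if it was never recorded. */
  public long getNs(String metric) {
    return elapsedNs.getOrDefault(metric, 0L);
  }

  /** Stand-in for real loading work; sleeps for the given number of milliseconds. */
  private static Void pause(long ms) {
    try {
      Thread.sleep(ms);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return null;
  }

  public static void main(String[] args) {
    LoadTimeBreakdown t = new LoadTimeBreakdown();

    // 1. Schema loading is timed on its own, outside the metadata total.
    t.time("hms-load-tbl-schema", () -> pause(5));

    // 2. The metadata total encloses column stats and partition loading.
    t.time("load-duration.total-time", () -> {
      t.time("load-duration.all-column-stats", () -> pause(2));
      t.time("load-duration.all-partitions", () -> {
        // 2.4.1: file metadata loading is part of partition loading.
        t.time("load-duration.all-partitions.file-metadata", () -> pause(3));
        return pause(1);
      });
      return null;
    });

    // Print each metric in milliseconds.
    for (Map.Entry<String, Long> e : t.elapsedNs.entrySet()) {
      System.out.println(String.format("%-45s %8.3f ms", e.getKey(), e.getValue() / 1e6));
    }
  }
}
```

Because each timer wraps the call it measures, the file-metadata time
can never exceed the all-partitions time, which in turn can never
exceed the total. The schema-loading time sits outside the total,
matching the hierarchy in the commit message.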