Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/20491 )
Change subject: IMPALA-12443: Add catalog timeline for all DDL profiles ...................................................................... Patch Set 12: (12 comments) Thanks for the feedbacks! http://gerrit.cloudera.org:8080/#/c/20491/12//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/20491/12//COMMIT_MSG@31 PS12, Line 31: unloaded > reloaded? The REFRESH command is executed when the table is in unloaded state (IncompleteTable). Updated the sentence. http://gerrit.cloudera.org:8080/#/c/20491/12/be/src/exec/catalog-op-executor.cc File be/src/exec/catalog-op-executor.cc: http://gerrit.cloudera.org:8080/#/c/20491/12/be/src/exec/catalog-op-executor.cc@125 PS12, Line 125: catalog_profile_ = make_unique<TRuntimeProfileNode>(exec_response_->profile); > Would it make sense to do a sanity check that we didn't get an awful lot of It's possible that we get a very long timeline for global INVALIDATE METADATA commands, e.g. Catalog Server Operation: 301.618ms - Got catalog version write lock: 9.908ms (9.908ms) - Got Metastore client: 9.922ms (14.013us) - Got database list: 11.396ms (1.473ms) - Loaded functions of default: 44.919ms (33.523ms) - Loaded TableMeta of 82 tables in database default: 47.524ms (2.604ms) - Loaded functions of functional: 50.846ms (3.321ms) - Loaded TableMeta of 101 tables in database functional: 52.580ms (1.734ms) - Loaded functions of functional_avro: 54.861ms (2.281ms) - Loaded TableMeta of 35 tables in database functional_avro: 55.789ms (928.120us) - Loaded functions of functional_avro_def: 58.193ms (2.403ms) - Loaded TableMeta of 34 tables in database functional_avro_def: 59.433ms (1.240ms) - Loaded functions of functional_avro_gzip: 61.639ms (2.206ms) - Loaded TableMeta of 42 tables in database functional_avro_gzip: 62.655ms (1.015ms) - Loaded functions of functional_avro_snap: 64.879ms (2.224ms) - Loaded TableMeta of 55 tables in database functional_avro_snap: 66.150ms (1.270ms) ... This patch adds events for loading table and function list for each db. So we can see which db have most of the tables and takes the longest time to load (admin can blacklist it). http://gerrit.cloudera.org:8080/#/c/20491/12/be/src/service/client-request-state.cc File be/src/service/client-request-state.cc: http://gerrit.cloudera.org:8080/#/c/20491/12/be/src/service/client-request-state.cc@1596 PS12, Line 1596: catalog_timeline.name > nit: could be 'timeline_name' Done http://gerrit.cloudera.org:8080/#/c/20491/12/be/src/service/client-request-state.cc@1773 PS12, Line 1773: if (catalog_op_executor_->catalog_profile() != nullptr) { : for (const TEventSequence& catalog_timeline : : catalog_op_executor_->catalog_profile()->event_sequences) { : summary_profile_->AddEventSequence(catalog_timeline.name, catalog_timeline); : } : } > nit: same as L711-L716. Could be moved to a member function. Done http://gerrit.cloudera.org:8080/#/c/20491/12/common/thrift/CatalogService.thrift File common/thrift/CatalogService.thrift: http://gerrit.cloudera.org:8080/#/c/20491/12/common/thrift/CatalogService.thrift@296 PS12, Line 296: DDL > Maybe DDL/DML? E.g. this will also contain information about Iceberg DML op Done http://gerrit.cloudera.org:8080/#/c/20491/12/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/20491/12/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2014 PS12, Line 2014: Loaded TableMeta > nit: I found the camelcase word a bit weird here, so maybe 'Loaded table me "table metadata" might be misunderstood as the HMS table object which contains the schema, properties, etc. What about "Loaded 100 table names of database db1" ? By default, TableMeta just contains the table name. When catalogd is started with --pull_table_types_and_comments=true, TableMeta will contain the table type and comment. But it's off by default. http://gerrit.cloudera.org:8080/#/c/20491/12/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2312 PS12, Line 2312: versionLock_.readLock().lock(); : catalogTimeline.markEvent(GOT_CATALOG_VERSION_READ_LOCK); > Would it make sense to create a method that encapsulates these two? Done http://gerrit.cloudera.org:8080/#/c/20491/12/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2347 PS12, Line 2347: tbl.readLock().lock(); : catalogTimeline.markEvent(GOT_TABLE_READ_LOCK); > Would it make sense to create a method that encapsulates these two? Done http://gerrit.cloudera.org:8080/#/c/20491/12/fe/src/main/java/org/apache/impala/catalog/HBaseTable.java File fe/src/main/java/org/apache/impala/catalog/HBaseTable.java: http://gerrit.cloudera.org:8080/#/c/20491/12/fe/src/main/java/org/apache/impala/catalog/HBaseTable.java@115 PS12, Line 115: hbase > nit: HBase? Done http://gerrit.cloudera.org:8080/#/c/20491/12/fe/src/main/java/org/apache/impala/catalog/KuduTable.java File fe/src/main/java/org/apache/impala/catalog/KuduTable.java: http://gerrit.cloudera.org:8080/#/c/20491/12/fe/src/main/java/org/apache/impala/catalog/KuduTable.java@324 PS12, Line 324: kudu > nit: Kudu Done http://gerrit.cloudera.org:8080/#/c/20491/12/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/20491/12/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1514 PS12, Line 1514: catalogTimeline.markEvent("Updated table in Iceberg"); > This might be misleading, as we don't update the table until we commit. So Maybe we can use a better name unless the above codes won't invoke any external RPCs. What about "Ready to update Iceberg table"? http://gerrit.cloudera.org:8080/#/c/20491/12/fe/src/main/java/org/apache/impala/util/EventSequence.java File fe/src/main/java/org/apache/impala/util/EventSequence.java: http://gerrit.cloudera.org:8080/#/c/20491/12/fe/src/main/java/org/apache/impala/util/EventSequence.java@50 PS12, Line 50: new EventSequence("unused"); > Not sure if it worth the effort, but we could inherit from EventSequence, e Good point! -- To view, visit http://gerrit.cloudera.org:8080/20491 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifbceefaeb24c66eb1a064c449d6f56077ea347c5 Gerrit-Change-Number: 20491 Gerrit-PatchSet: 12 Gerrit-Owner: Quanlong Huang <huangquanl...@gmail.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com> Gerrit-Reviewer: Zihao Ye <eyiz...@163.com> Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com> Gerrit-Comment-Date: Tue, 02 Jan 2024 14:50:34 +0000 Gerrit-HasComments: Yes