Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/20367 )
Change subject: IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs ...................................................................... Patch Set 7: (10 comments) http://gerrit.cloudera.org:8080/#/c/20367/7//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/20367/7//COMMIT_MSG@7 PS7, Line 7: IMPALA-10976: Sync db/table to latest HMS event : for all DDL/DMLs nit: Let's put the title in one line http://gerrit.cloudera.org:8080/#/c/20367/7//COMMIT_MSG@29 PS7, Line 29: invalidate_hms_cache_on_ddls How does this flag impact the feature added by this patch? I think it only matters if --start_hms_server=true but it defaults to false. http://gerrit.cloudera.org:8080/#/c/20367/7//COMMIT_MSG@31 PS7, Line 31: Note: We don't modify the cache using MetastoreEventsProcessor for : alter table rename operation as this is a complex operation regarding : cache modification. Please file a follow-up JIRA for this. http://gerrit.cloudera.org:8080/#/c/20367/7/fe/src/main/java/org/apache/impala/catalog/TableLoader.java File fe/src/main/java/org/apache/impala/catalog/TableLoader.java: http://gerrit.cloudera.org:8080/#/c/20367/7/fe/src/main/java/org/apache/impala/catalog/TableLoader.java@147 PS7, Line 147: atleast nit: please fix this typo BTW - "at least" http://gerrit.cloudera.org:8080/#/c/20367/6/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/20367/6/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1145 PS6, Line 1145: if (refreshedTable != null) { > Ack Can we add some e2e tests? This patch modifies almost all the DDL/DML code paths. I'm not confident that the new FE tests are enough. We have e2e tests on DDLs under tests/metadata, e.g. test_ddl.py. It'd be nice if we could copy some of them to be custom-cluster tests with the enableSyncToLatestEventOnDdls flag on. http://gerrit.cloudera.org:8080/#/c/20367/7/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/20367/7/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@6666 PS7, Line 6666: updatedThriftTable = getUpdatedThriftTable(tbl, resultType); This seems redundant. http://gerrit.cloudera.org:8080/#/c/20367/7/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@6667 PS7, Line 6667: Preconditions.checkArgument(BackendConfig.INSTANCE.enableReloadEvents(), : "Set enable_reload_events=true to refresh table with enable HMS " + : "sync on table option"); I'm not sure if we should enforce this. Intuitively we can reload the table and update lastSyncedEventId as lastRefreshEventId. One difference between REFRESH and other DDLs is that REFRESH won't update the schema in HMS. It just reloads the schema and the file metadata. I just realized we already do so in catalog_.reloadTable() which is invoked by the current method: https://github.com/apache/impala/blob/9af97895f891c465e3723be9140001ebd60d747b/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java#L2619-L2628 Probably we don't need this if-clause. http://gerrit.cloudera.org:8080/#/c/20367/7/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@6690 PS7, Line 6690: cmdString); When syncToLatestEventId is on, can we add a warning saying events on the table might not be synced for partition level REFRESH? http://gerrit.cloudera.org:8080/#/c/20367/7/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@7745 PS7, Line 7745: * @param tbl : * @return nit: remove such empty comments http://gerrit.cloudera.org:8080/#/c/20367/7/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/20367/7/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3495 PS7, Line 3495: * @param dbName : * @param tblName : * @return : * @throws ImpalaException nit: please remove these actually empty comments -- To view, visit http://gerrit.cloudera.org:8080/20367 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia250d0a943838086c187e5cb7c60035e5a564bbf Gerrit-Change-Number: 20367 Gerrit-PatchSet: 7 Gerrit-Owner: Sai Hemanth Gantasala <saihema...@cloudera.com> Gerrit-Reviewer: Anonymous Coward <k.venureddy2...@gmail.com> Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com> Gerrit-Reviewer: Sai Hemanth Gantasala <saihema...@cloudera.com> Gerrit-Comment-Date: Thu, 09 Nov 2023 10:03:29 +0000 Gerrit-HasComments: Yes