Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20367 )

Change subject: IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs
......................................................................


Patch Set 23:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/20367/21//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20367/21//COMMIT_MSG@27
PS21, Line 27: Set 'enable_sync_to_latest_event_on_ddls' to true.
> We just need to set the flag 'enable_sync_to_latest_event_on_ddls' right? I
I mean this paragraph is incomplete. This patch doesn't set this flag to true 
by default. Do you want to say "Set 'enable_sync_to_latest_event_on_ddls' to 
true to enable this feature" ?

We'd better mention the performance impact on this flag, i.e. DDL/DML might 
need more time to execute due to fetching and applying other events.


http://gerrit.cloudera.org:8080/#/c/20367/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/20367/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2676
PS21, Line 2676:         org.apache.hadoop.hive.metastore.api.Table msTbl = 
null;
> When the table is reloaded and it is not a fullReload, then we need to 
> setRefreshEventId to currentHMSEventId so that older events can be skipped.

What if the current reload just reloads the file metadata, and there is an 
older event that needs to reload the table schema? The older event shouldn't be 
skipped in this case.

I think the explanation should be: if 'eventId' is 'currentHmsEventId', all the 
older events on this table should have been processed so it's ok to update the 
lastRefreshEventId.


http://gerrit.cloudera.org:8080/#/c/20367/23/tests/custom_cluster/test_sync_to_latest_hms_events.py
File tests/custom_cluster/test_sync_to_latest_hms_events.py:

http://gerrit.cloudera.org:8080/#/c/20367/23/tests/custom_cluster/test_sync_to_latest_hms_events.py@60
PS23, Line 60:     class_instance.test_truncate_cleans_hdfs_files(self.client, 
self.filesystem_client,
Can we use TestDdlStatements.test_truncate_cleans_hdfs_files() directly?



--
To view, visit http://gerrit.cloudera.org:8080/20367
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia250d0a943838086c187e5cb7c60035e5a564bbf
Gerrit-Change-Number: 20367
Gerrit-PatchSet: 23
Gerrit-Owner: Sai Hemanth Gantasala <saihema...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <k.venureddy2...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Sai Hemanth Gantasala <saihema...@cloudera.com>
Gerrit-Comment-Date: Fri, 05 Jan 2024 12:34:58 +0000
Gerrit-HasComments: Yes

Reply via email to