Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20367 )

Change subject: IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs
......................................................................


Patch Set 12:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/20367/7//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20367/7//COMMIT_MSG@29
PS7, Line 29: et 'file_metadata_reload_pro
> 'invalidate_hms_cache_on_ddls' is used only if 'start_hms_server' is true.
Ack. There is no code change that directly depends on this patch. We need this
flag only if start_hms_server is enabled.
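
The flag dependency discussed above can be sketched as a small check. This is
only an illustration, not code from the patch: the helper names `parse_flags`
and `ineffective_flags` are hypothetical, and the only dependency encoded is
the one stated in this thread (invalidate_hms_cache_on_ddls is used only when
start_hms_server is true).

```python
def parse_flags(args):
    """Parse a string like '--a=true --b=5' into {'a': 'true', 'b': '5'}."""
    flags = {}
    for token in args.split():
        name, _, value = token.lstrip("-").partition("=")
        flags[name] = value
    return flags


def ineffective_flags(args):
    """Return flags that are set but unused because start_hms_server is not true."""
    flags = parse_flags(args)
    # Assumption taken from the review thread: this flag is consulted only
    # when start_hms_server is true. Other dependencies are not modeled.
    dependents = ["invalidate_hms_cache_on_ddls"]
    if flags.get("start_hms_server") == "true":
        return []
    return [name for name in dependents if name in flags]
```

With start_hms_server=true the helper reports nothing; without it, any
dependent flag that was passed is listed as having no effect.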


http://gerrit.cloudera.org:8080/#/c/20367/7//COMMIT_MSG@31
PS7, Line 31:
            : Note: We don't modify the cache using MetastoreEventsProcessor for
            : alter table rename
> Yeah, we can mention it.
Ack


http://gerrit.cloudera.org:8080/#/c/20367/12/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/20367/12/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2816
PS12, Line 2816:         accessLevel_ = getAvailableAccessLevel(getFullName(), tblLocation,
> Could you add a comment on why this is not updated in processing the events
I checked hdfsPartition; it is not required there. I'll add some comments on
why we need this here.


http://gerrit.cloudera.org:8080/#/c/20367/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/20367/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@766
PS12, Line 766:
> nit: duplicated whitespaces
Ack


http://gerrit.cloudera.org:8080/#/c/20367/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1530
PS12, Line 1530:       skipTableMetadataReload_ = canSkipTableMetadataReload(tableBefore_, tableAfter_);
> It's unclear to me why we can skip reloading the HMS table if StorageDescri
Some tests are failing because ALTER TABLE with SET ROW FORMAT, SET
FILEFORMAT/SERDE, or SET LOCATION doesn't require reloading the table schema
when the DDL is executed with enable_sync_to_latest_event_on_ddls set to false.
So the behavior should be consistent even when
enable_sync_to_latest_event_on_ddls is set to true. I'll add some comments on
why I'm doing this.
So I believe skipping the table schema reload should be turned on by default in
the event processor.
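
To make the kind of check under discussion concrete, here is a minimal sketch
in Python. It is not the Java code in MetastoreEvents.java, and the real
canSkipTableMetadataReload may use different criteria. It assumes a table
snapshot is a plain dict with hypothetical keys (`columns`, `row_format`,
`file_format`, `serde`, `location`) and treats a change as not needing a
schema reload when only the storage-related keys differ.

```python
# Hypothetical keys; the real HMS Table/StorageDescriptor objects differ.
STORAGE_ONLY_KEYS = frozenset({"row_format", "file_format", "serde", "location"})


def changed_keys(table_before, table_after):
    """Return the set of keys whose values differ between the two snapshots."""
    all_keys = set(table_before) | set(table_after)
    return {k for k in all_keys if table_before.get(k) != table_after.get(k)}


def can_skip_schema_reload(table_before, table_after):
    """True when every changed key is storage-related (columns untouched).

    An unchanged table also returns True, since nothing needs reloading.
    """
    return changed_keys(table_before, table_after) <= STORAGE_ONLY_KEYS
```

Under these assumptions, changing only `file_format` or `location` lets the
reload be skipped, while any change to `columns` does not.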


http://gerrit.cloudera.org:8080/#/c/20367/7/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/20367/7/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@6667
PS7, Line 6667:               //     ACID tables, there is a Jira to cover this: HIVE-22062.
              :               //   2: If no need for a full table reload then fetch partition level
              :               //     writeIds and reload only
> Sorry that I was wrong in that comment since I didn't consider the use case
Ack


http://gerrit.cloudera.org:8080/#/c/20367/7/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@6690
PS7, Line 6690:       if (updatedThriftTable == null) {
> The test just adds a new partition in HMS and then runs partition-level REF
Yeah, totally agree. I have seen these scenarios in the tests, and the tests
were failing when enable_sync_to_latest_event_on_ddls was enabled by default.
So in the latest patch I have reverted the changes, and the cache is updated in
place for refresh/invalidate statements (the tests were then green).


http://gerrit.cloudera.org:8080/#/c/20367/12/tests/custom_cluster/test_sync_to_latest_hms_events.py
File tests/custom_cluster/test_sync_to_latest_hms_events.py:

http://gerrit.cloudera.org:8080/#/c/20367/12/tests/custom_cluster/test_sync_to_latest_hms_events.py@35
PS12, Line 35: "--start_hms_server=true "\
             :                                      "--hms_port=5899 "\
             :                                      "--fallback_to_hms_on_errors=true "\
             :                                      "--invalidate_hms_cache_on_ddls=false "\
> I don't see we have HMS clients connecting to this port (5899) in this test
Ack


http://gerrit.cloudera.org:8080/#/c/20367/12/tests/custom_cluster/test_sync_to_latest_hms_events.py@57
PS12, Line 57:   @classmethod
             :   def get_workload(self):
             :     return 'functional-query'
> nit: move this after the below comment.
Ack


http://gerrit.cloudera.org:8080/#/c/20367/12/tests/metadata/test_recover_partitions.py
File tests/metadata/test_recover_partitions.py:

http://gerrit.cloudera.org:8080/#/c/20367/12/tests/metadata/test_recover_partitions.py@261
PS12, Line 261:   @SkipIfCatalogV2.impala_8489()
> We can remove this since IMPALA-8489 has been resolved.
Ack



--
To view, visit http://gerrit.cloudera.org:8080/20367
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia250d0a943838086c187e5cb7c60035e5a564bbf
Gerrit-Change-Number: 20367
Gerrit-PatchSet: 12
Gerrit-Owner: Sai Hemanth Gantasala <saihema...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <k.venureddy2...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Sai Hemanth Gantasala <saihema...@cloudera.com>
Gerrit-Comment-Date: Tue, 05 Dec 2023 21:25:30 +0000
Gerrit-HasComments: Yes
