Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20022 )

Change subject: IMPALA-11535: Skip older events in the event processor based on 
the latestRefreshEventID
......................................................................


Patch Set 4:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/20022/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/20022/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1285
PS3, Line 1285:                 refreshUpdatedPartitions, partitionToEventId, 
debugAction);
> This and the comment at L1298 confuse me since no codes after them are actu
Ack


http://gerrit.cloudera.org:8080/#/c/20022/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1923
PS3, Line 1923:     for (HdfsPartition.Builder p : partBuilders) {
> I think we should get this before loadFileMetadataForPartitions() as well.
Ack


http://gerrit.cloudera.org:8080/#/c/20022/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1930
PS3, Line 1930:         p.setLastRefreshEventId(latestEventId);
> We'd better check and only update it if 'latestEventId' is not -1 or lower
Ack


http://gerrit.cloudera.org:8080/#/c/20022/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1932
PS3, Line 1932:     }
> Let's add the table name in the log.
Ack


http://gerrit.cloudera.org:8080/#/c/20022/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2947
PS3, Line 2947:
> This seems a static method and can be moved to MetastoreEventsProcessor. Ac
Ack


http://gerrit.cloudera.org:8080/#/c/20022/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/20022/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1053
PS3, Line 1053:           String messageStr = partitionEventObj == null ? 
"Skipping the event since the" +
> We'd better add more info about why an event is skipped or not skipped. The
Ack


http://gerrit.cloudera.org:8080/#/c/20022/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/20022/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1529
PS3, Line 1529:       } else {
> This happens after we reload the table metadata which is unsafe if there ar
I'll fetch the current event id before l#1522 and set it after HDFS#load(). I 
fear that we might lose these values if we set these values before loading the 
metadata.

Why is it any different for non-hdfs tables? we treat all the tables as same 
whenever we are deciding, whether to skip the event or not right?


http://gerrit.cloudera.org:8080/#/c/20022/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1537
PS3, Line 1537:     }
> It seems too strict to fail the DDL just for failures in getting the latest
Ack


http://gerrit.cloudera.org:8080/#/c/20022/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1615
PS3, Line 1615:         tbl.load(true, msClient.getHiveClient(), msTbl, "ALTER 
VIEW");
> The same as above, I think this should run before the load() and exceptions
Ack


http://gerrit.cloudera.org:8080/#/c/20022/3/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/20022/3/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3219
PS3, Line 3219: setEnableSyncToLatestEventOnDdls(true)
> Is this required to turn on this optimization?
Unfortunately yes. I'm reusing currentEventId fetched from this flag in reload 
table logic here:
https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java#L2579
https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java#L2619


http://gerrit.cloudera.org:8080/#/c/20022/3/tests/custom_cluster/test_events_custom_configs.py
File tests/custom_cluster/test_events_custom_configs.py:

http://gerrit.cloudera.org:8080/#/c/20022/3/tests/custom_cluster/test_events_custom_configs.py@316
PS3, Line 316: older_events(self, unique_d
> Do we need this flag?
Actually no. I'll take out this flag.



--
To view, visit http://gerrit.cloudera.org:8080/20022
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic0dc5c7396d80616680d8a5805ce80db293b72e1
Gerrit-Change-Number: 20022
Gerrit-PatchSet: 4
Gerrit-Owner: Sai Hemanth Gantasala <saihema...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Sai Hemanth Gantasala <saihema...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Comment-Date: Wed, 02 Aug 2023 01:47:29 +0000
Gerrit-HasComments: Yes

Reply via email to