Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/20022 )
Change subject: IMPALA-11535: Skip older events in the event processor based on the latestRefreshEventID ...................................................................... Patch Set 4: (11 comments) http://gerrit.cloudera.org:8080/#/c/20022/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/20022/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1285 PS3, Line 1285: refreshUpdatedPartitions, partitionToEventId, debugAction); > This and the comment at L1298 confuse me since no codes after them are actu Ack http://gerrit.cloudera.org:8080/#/c/20022/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1923 PS3, Line 1923: for (HdfsPartition.Builder p : partBuilders) { > I think we should get this before loadFileMetadataForPartitions() as well. Ack http://gerrit.cloudera.org:8080/#/c/20022/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1930 PS3, Line 1930: p.setLastRefreshEventId(latestEventId); > We'd better check and only update it if 'latestEventId' is not -1 or lower Ack http://gerrit.cloudera.org:8080/#/c/20022/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1932 PS3, Line 1932: } > Let's add the table name in the log. Ack http://gerrit.cloudera.org:8080/#/c/20022/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2947 PS3, Line 2947: > This seems a static method and can be moved to MetastoreEventsProcessor. Ac Ack http://gerrit.cloudera.org:8080/#/c/20022/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/20022/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1053 PS3, Line 1053: String messageStr = partitionEventObj == null ? "Skipping the event since the" + > We'd better add more info about why an event is skipped or not skipped. The Ack http://gerrit.cloudera.org:8080/#/c/20022/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/20022/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1529 PS3, Line 1529: } else { > This happens after we reload the table metadata which is unsafe if there ar I'll fetch the current event id before l#1522 and set it after HDFS#load(). I fear that we might lose these values if we set these values before loading the metadata. Why is it any different for non-hdfs tables? we treat all the tables as same whenever we are deciding, whether to skip the event or not right? http://gerrit.cloudera.org:8080/#/c/20022/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1537 PS3, Line 1537: } > It seems too strict to fail the DDL just for failures in getting the latest Ack http://gerrit.cloudera.org:8080/#/c/20022/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1615 PS3, Line 1615: tbl.load(true, msClient.getHiveClient(), msTbl, "ALTER VIEW"); > The same as above, I think this should run before the load() and exceptions Ack http://gerrit.cloudera.org:8080/#/c/20022/3/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/20022/3/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3219 PS3, Line 3219: setEnableSyncToLatestEventOnDdls(true) > Is this required to turn on this optimization? Unfortunately yes. I'm reusing currentEventId fetched from this flag in reload table logic here: https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java#L2579 https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java#L2619 http://gerrit.cloudera.org:8080/#/c/20022/3/tests/custom_cluster/test_events_custom_configs.py File tests/custom_cluster/test_events_custom_configs.py: http://gerrit.cloudera.org:8080/#/c/20022/3/tests/custom_cluster/test_events_custom_configs.py@316 PS3, Line 316: older_events(self, unique_d > Do we need this flag? Actually no. I'll take out this flag. -- To view, visit http://gerrit.cloudera.org:8080/20022 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic0dc5c7396d80616680d8a5805ce80db293b72e1 Gerrit-Change-Number: 20022 Gerrit-PatchSet: 4 Gerrit-Owner: Sai Hemanth Gantasala <saihema...@cloudera.com> Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com> Gerrit-Reviewer: Sai Hemanth Gantasala <saihema...@cloudera.com> Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com> Gerrit-Comment-Date: Wed, 02 Aug 2023 01:47:29 +0000 Gerrit-HasComments: Yes