[ https://issues.apache.org/jira/browse/IMPALA-12543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17839700#comment-17839700 ]
ASF subversion and git services commented on IMPALA-12543:
----------------------------------------------------------

Commit e1bbdacc5133d36d997e0e19c52753df90376a1e in impala's branch refs/heads/branch-4.4.0 from Riza Suminto
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=e1bbdacc5 ]

IMPALA-12543: Detect self-events before finishing DDL

test_iceberg_self_events has been flaky because tbls_refreshed_before does not always equal tbls_refreshed_after in between query executions. Further investigation reveals a concurrency bug: the db/table level lock is not taken during the db/table self-events check (IMPALA-12461 part 1).

The order of an ALTER TABLE operation is as follows:
1. alter table starts in CatalogOpExecutor
2. table level lock is taken
3. HMS RPC starts (CatalogOpExecutor.applyAlterTable())
4. HMS generates the event
5. HMS RPC returns
6. table is reloaded
7. catalog version is added to inflight event list
8. table level lock is released

Meanwhile, the event processor thread fetches the new event after step 4 and before step 7. Because of IMPALA-12461 (part 1), it can also finish self-events checking before reaching step 7. Before IMPALA-12461, self-events checking would have needed to wait for step 8. Note that this issue is only relevant for table level events, as self-events checking for partition level events still takes the table lock.

This patch fixes the issue by adding newCatalogVersion to the table's inflight event list before updating HMS, using the helper class InProgressTableModification. If the HMS update does not complete (i.e., an exception is thrown), the newCatalogVersion that was added is removed.

This patch also fixes a few smaller issues, including:
- Avoid incrementing EVENTS_SKIPPED_METRIC if numFilteredEvents == 0 in MetastoreEventFactory.getFilteredEvents().
- Increment EVENTS_SKIPPED_METRIC in MetastoreTableEvent.reloadTableFromCatalog() if the table is already in the middle of reloading (revealed through flaky test_skipping_older_events).
- Rephrase misleading log message in MetastoreEventProcessor.getNextMetastoreEvents().

Testing:
- Add TestEventProcessingWithImpala, run it with debug_action and sync_ddl dimensions.
- Pass exhaustive tests.

Change-Id: I8365c934349ad21a4d9327fc11594d2fc3445f79
Reviewed-on: http://gerrit.cloudera.org:8080/21029
Reviewed-by: Riza Suminto <riza.sumi...@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>

> test_iceberg_self_events failed in JDK11 build
> ----------------------------------------------
>
>                 Key: IMPALA-12543
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12543
>             Project: IMPALA
>          Issue Type: Bug
>            Reporter: Riza Suminto
>            Assignee: Riza Suminto
>            Priority: Major
>              Labels: broken-build
>             Fix For: Impala 4.4.0
>
>         Attachments: catalogd.INFO, std_err.txt
>
>
> test_iceberg_self_events failed in JDK11 build with the following error.
>
> {code:java}
> Error Message
> assert 0 == 1
> Stacktrace
> custom_cluster/test_events_custom_configs.py:637: in test_iceberg_self_events
>     check_self_events("ALTER TABLE {0} ADD COLUMN j INT".format(tbl_name))
> custom_cluster/test_events_custom_configs.py:624: in check_self_events
>     assert tbls_refreshed_before == tbls_refreshed_after
> E   assert 0 == 1
> {code}
> This test still passed before IMPALA-11387 was merged.
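
For readers following the commit message, the ordering change it describes (register the new catalog version as in-flight before the HMS update, and remove it again if the update throws) can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from the patch: SketchTable, InProgressModificationSketch, SelfEventSketch and all of their methods are hypothetical names invented here, and the real InProgressTableModification helper may look quite different.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a catalog table; NOT Impala's actual Table class.
final class SketchTable {
  private final List<Long> inFlightVersions = new ArrayList<>();

  synchronized void addInFlightVersion(long version) {
    inFlightVersions.add(version);
  }

  synchronized void removeInFlightVersion(long version) {
    inFlightVersions.remove(Long.valueOf(version));
  }

  // What a self-event check would consult in this sketch.
  synchronized boolean hasInFlightVersion(long version) {
    return inFlightVersions.contains(version);
  }
}

// Hypothetical helper: registers the version on construction and rolls it
// back on close() unless the HMS update was marked as done.
final class InProgressModificationSketch implements AutoCloseable {
  private final SketchTable table;
  private final long newCatalogVersion;
  private boolean hmsUpdateDone = false;

  InProgressModificationSketch(SketchTable table, long newCatalogVersion) {
    this.table = table;
    this.newCatalogVersion = newCatalogVersion;
    // Registered BEFORE the HMS RPC, so an event fetched right after HMS
    // generates it can already be recognised as a self-event.
    table.addInFlightVersion(newCatalogVersion);
  }

  void markHmsUpdateDone() {
    hmsUpdateDone = true;
  }

  @Override
  public void close() {
    // The HMS update did not complete: no event will match this version.
    if (!hmsUpdateDone) table.removeInFlightVersion(newCatalogVersion);
  }
}

final class SelfEventSketch {
  // hmsRpc stands in for the HMS update; it may throw.
  static boolean alterTable(SketchTable table, long newVersion, Runnable hmsRpc) {
    try (InProgressModificationSketch mod =
             new InProgressModificationSketch(table, newVersion)) {
      hmsRpc.run();
      mod.markHmsUpdateDone();
      return true;
    } catch (RuntimeException e) {
      // close() has already run here, so the version is no longer in flight.
      return false;
    }
  }

  public static void main(String[] args) {
    SketchTable table = new SketchTable();
    boolean ok = alterTable(table, 42L, () -> { });
    System.out.println("update succeeded: " + ok
        + ", version still in flight: " + table.hasInFlightVersion(42L));
  }
}
```

The try-with-resources form is a design choice of this sketch: it guarantees the rollback runs on any exception path without a separate finally block.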