[ https://issues.apache.org/jira/browse/IMPALA-12712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Quanlong Huang updated IMPALA-12712:
------------------------------------
    Labels: catalog-2024  (was: )

> INVALIDATE METADATA <table> should set a better createEventId
> -------------------------------------------------------------
>
>                 Key: IMPALA-12712
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12712
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog
>            Reporter: Quanlong Huang
>            Assignee: Sai Hemanth Gantasala
>            Priority: Critical
>              Labels: catalog-2024
>
> "INVALIDATE METADATA <table>" can be used to bring up a table in Impala's
> catalog cache if the table exists in HMS. For instance, when HMS event
> processing is disabled, we can use it in Impala to bring up tables that are
> created outside Impala.
> The createEventId for such tables is always set to -1:
> [https://github.com/apache/impala/blob/6ddd69c605d4c594e33fdd39a2ca888538b4b8d7/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java#L2243-L2246]
> This is problematic when event processing is enabled. DropTable events and
> RenameTable events use the createEventId to decide whether to remove the
> table from the catalog cache, and -1 leads to always removing the table.
> Though it might be added back shortly by follow-up CreateTable events, in the
> period between them the table is missing in Impala, causing test failures
> like IMPALA-12266.
> A simpler way to reproduce the issue is to create a table in Hive and launch
> Impala with a long event polling interval to mimic the delay on events. Note
> that we start the Impala cluster after creating the table, so Impala doesn't
> need to process the CREATE_TABLE event.
> {noformat}
> hive> create table debug_tbl (i int);
> bin/start-impala-cluster.py --catalogd_args=--hms_event_polling_interval_s=60
> {noformat}
> Drop the table in Impala and recreate it in Hive, so it doesn't exist in the
> catalog cache but exists in HMS. Run "INVALIDATE METADATA <table>" in Impala
> to bring it up before the DROP_TABLE event arrives.
> {noformat}
> impala> drop table debug_tbl;
> hive> create table debug_tbl (i int, j int);
> impala> invalidate metadata debug_tbl;
> {noformat}
> The table will be dropped by the DROP_TABLE event and then added back by the
> CREATE_TABLE event. Shown in catalogd logs:
> {noformat}
> I0115 16:30:15.376713 3208 JniUtil.java:177] 02457b6d5f174d1f:3bdeee1400000000] Finished execDdl request: DROP_TABLE default.debug_tbl issued by quanlong. Time spent: 417ms
> I0115 16:30:23.390962 3208 CatalogServiceCatalog.java:2777] 1840bd101f78d611:22079a5a00000000] Invalidating table metadata: default.debug_tbl
> I0115 16:30:23.404150 3208 Table.java:234] 1840bd101f78d611:22079a5a00000000] createEventId_ for table: default.debug_tbl set to: -1
> I0115 16:30:23.405138 3208 JniUtil.java:177] 1840bd101f78d611:22079a5a00000000] Finished resetMetadata request: INVALIDATE TABLE default.debug_tbl issued by quanlong. Time spent: 17ms
> I0115 16:30:55.108006 32760 MetastoreEvents.java:637] EventId: 8668853 EventType: DROP_TABLE Successfully removed table default.debug_tbl
> I0115 16:30:55.108459 32760 MetastoreEvents.java:637] EventId: 8668855 EventType: CREATE_TABLE Successfully added table default.debug_tbl
> {noformat}
> CC [~VenuReddy], [~hemanth619]
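
The race described in the issue can be illustrated with a small model. The sketch below is not Impala's Java code: the class, the function names, and the exact comparison rule (remove the cached table when its createEventId is smaller than the id of the DROP_TABLE event) are assumptions made for illustration. Only the event ids 8668853 and 8668855 and the value -1 come from the catalogd logs in the issue.

```python
from dataclasses import dataclass, field
from typing import Dict

# Value that, per the issue, INVALIDATE METADATA <table> assigns as createEventId.
INVALIDATE_CREATE_EVENT_ID = -1


@dataclass
class CatalogCacheModel:
    """Hypothetical stand-in for the catalog cache: table name -> createEventId."""
    create_event_ids: Dict[str, int] = field(default_factory=dict)

    def add_table(self, name: str, create_event_id: int) -> None:
        self.create_event_ids[name] = create_event_id

    def on_drop_table_event(self, name: str, event_id: int) -> bool:
        """Apply a DROP_TABLE event; return True if the cached table was removed.

        Assumed rule: a table whose createEventId is smaller than the drop
        event id is treated as older than the drop, so it is removed.
        """
        created_at = self.create_event_ids.get(name)
        if created_at is None:
            return False  # nothing cached under that name
        if created_at < event_id:
            del self.create_event_ids[name]
            return True
        return False  # table was (re)created after this drop, so keep it


def table_survives_drop_event(create_event_id: int,
                              drop_event_id: int = 8668853) -> bool:
    """Replay the delayed DROP_TABLE event against a freshly invalidated table."""
    cache = CatalogCacheModel()
    cache.add_table("default.debug_tbl", create_event_id)
    cache.on_drop_table_event("default.debug_tbl", drop_event_id)
    return "default.debug_tbl" in cache.create_event_ids
```

Under this model, `table_survives_drop_event(INVALIDATE_CREATE_EVENT_ID)` is false, which matches the "Successfully removed table" log line, while any createEventId that is not smaller than the drop event id (for example 8668855, the id of the later CREATE_TABLE event) keeps the table in the cache. Which value INVALIDATE METADATA should actually record is the open question of this issue.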