[jira] [Updated] (IMPALA-12712) INVALIDATE METADATA should set a better createEventId

Mon, 22 Apr 2024 18:43:16 -0700


     [ 
https://issues.apache.org/jira/browse/IMPALA-12712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Quanlong Huang updated IMPALA-12712:
------------------------------------
    Labels: catalog-2024  (was: )

> INVALIDATE METADATA <table> should set a better createEventId
> -------------------------------------------------------------
>
>                 Key: IMPALA-12712
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12712
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog
>            Reporter: Quanlong Huang
>            Assignee: Sai Hemanth Gantasala
>            Priority: Critical
>              Labels: catalog-2024
>
> "INVALIDATE METADATA <table>" can be used to bring up a table in Impala's 
> catalog cache if the table exists in HMS. For instance, when HMS event 
> processing is disabled, we can use it in Impala to bring up tables that are 
> created outside Impala.
> The createEventId for such tables is always set to -1:
> [https://github.com/apache/impala/blob/6ddd69c605d4c594e33fdd39a2ca888538b4b8d7/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java#L2243-L2246]
> This is problematic when event processing is enabled. DropTable and 
> RenameTable events use the createEventId to decide whether to remove the 
> table from the catalog cache, and a value of -1 always leads to removal. 
> Though the table might be added back shortly by a follow-up CreateTable 
> event, it is missing in Impala in the period between the two events, causing 
> test failures like IMPALA-12266.
> A simpler way to reproduce the issue is to create a table in Hive and launch 
> Impala with a long event polling interval to mimic the delay on events. Note 
> that we start the Impala cluster after creating the table so Impala doesn't 
> need to process the CREATE_TABLE event.
> {noformat}
> hive> create table debug_tbl (i int);
> bin/start-impala-cluster.py --catalogd_args=--hms_event_polling_interval_s=60
> {noformat}
> Drop the table in Impala and recreate it in Hive, so it doesn't exist in the 
> catalog cache but exists in HMS. Run "INVALIDATE METADATA <table>" in Impala 
> to bring it up before the DROP_TABLE event arrives.
> {noformat}
> impala> drop table debug_tbl;
> hive> create table debug_tbl (i int, j int);
> impala> invalidate metadata debug_tbl;
> {noformat}
> The table will be dropped by the DROP_TABLE event and then added back by the 
> CREATE_TABLE event, as shown in the catalogd logs:
> {noformat}
> I0115 16:30:15.376713  3208 JniUtil.java:177] 
> 02457b6d5f174d1f:3bdeee1400000000] Finished execDdl request: DROP_TABLE 
> default.debug_tbl issued by quanlong. Time spent: 417ms
> I0115 16:30:23.390962  3208 CatalogServiceCatalog.java:2777] 
> 1840bd101f78d611:22079a5a00000000] Invalidating table metadata: 
> default.debug_tbl
> I0115 16:30:23.404150  3208 Table.java:234] 
> 1840bd101f78d611:22079a5a00000000] createEventId_ for table: 
> default.debug_tbl set to: -1
> I0115 16:30:23.405138  3208 JniUtil.java:177] 
> 1840bd101f78d611:22079a5a00000000] Finished resetMetadata request: INVALIDATE 
> TABLE default.debug_tbl issued by quanlong. Time spent: 17ms
> I0115 16:30:55.108006 32760 MetastoreEvents.java:637] EventId: 8668853 
> EventType: DROP_TABLE Successfully removed table default.debug_tbl
> I0115 16:30:55.108459 32760 MetastoreEvents.java:637] EventId: 8668855 
> EventType: CREATE_TABLE Successfully added table default.debug_tbl
> {noformat}
> CC [~VenuReddy], [~hemanth619]
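
The createEventId check described in the issue can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Impala's code: the class CreateEventIdSketch and its methods are hypothetical, and the rule "remove the table only if its createEventId is lower than the DROP_TABLE event id" is an assumption inferred from the description. The event ids are the ones in the catalogd log above. Using the CREATE_TABLE event id as the "better" createEventId is likewise just one possible choice, used here for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: class and method names are hypothetical, not Impala's code.
final class CreateEventIdSketch {
  // Cached tables, keyed by name, mapped to their createEventId.
  private final Map<String, Long> createEventIds = new HashMap<>();

  // Loads a table into the cache, e.g. via INVALIDATE METADATA <table>.
  void addTable(String name, long createEventId) {
    createEventIds.put(name, createEventId);
  }

  boolean contains(String name) {
    return createEventIds.containsKey(name);
  }

  // Applies a DROP_TABLE event and returns true if the table was removed.
  // Assumed rule: remove only if the cached table predates the event.
  boolean processDropTable(String name, long dropEventId) {
    Long createEventId = createEventIds.get(name);
    if (createEventId == null || createEventId >= dropEventId) {
      return false;  // not cached, or the cached copy is newer than the drop
    }
    createEventIds.remove(name);
    return true;
  }

  public static void main(String[] args) {
    String tbl = "default.debug_tbl";
    long dropEventId = 8668853L;    // DROP_TABLE event id from the log
    long createEventId = 8668855L;  // CREATE_TABLE event id from the log

    CreateEventIdSketch catalog = new CreateEventIdSketch();

    // Behavior described in the issue: createEventId is set to -1.
    catalog.addTable(tbl, -1L);
    System.out.println("createEventId=-1, removed by stale drop: "
        + catalog.processDropTable(tbl, dropEventId));

    // Hypothetical better value: an id not older than the pending drop.
    catalog.addTable(tbl, createEventId);
    System.out.println("createEventId=" + createEventId
        + ", removed by stale drop: "
        + catalog.processDropTable(tbl, dropEventId));
  }
}
```

In this model, a table loaded with createEventId -1 is removed by the stale DROP_TABLE event, while a table loaded with an id at or after the drop event survives it, which is the window of unavailability the issue describes.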



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
