[jira] [Created] (IMPALA-12712) INVALIDATE METADATA should set a better createEventId

Mon, 15 Jan 2024 00:35:22 -0800

Quanlong Huang created IMPALA-12712:
---------------------------------------

             Summary: INVALIDATE METADATA <table> should set a better 
createEventId
                 Key: IMPALA-12712
                 URL: https://issues.apache.org/jira/browse/IMPALA-12712
             Project: IMPALA
          Issue Type: Bug
          Components: Catalog
            Reporter: Quanlong Huang


"INVALIDATE METADATA <table>" can be used to bring up a table in Impala's 
catalog cache if the table exists in HMS. For instance, when HMS event 
processing is disabled, we can use it in Impala to bring up tables that are 
created outside Impala.

The createEventId for such tables is always set to -1:
[https://github.com/apache/impala/blob/6ddd69c605d4c594e33fdd39a2ca888538b4b8d7/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java#L2243-L2246]

This is problematic when event processing is enabled. DropTable and 
RenameTable events use the createEventId to decide whether to remove the table 
from the catalog cache. A createEventId of -1 means the table is always 
removed. Though it might be added back shortly by follow-up CreateTable events, 
the table is missing in Impala in the period between them, causing test 
failures like IMPALA-12266.
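
The effect of the -1 value can be illustrated with a minimal sketch. The class 
and method names below are hypothetical and are not Impala's actual code; the 
comparison rule is an assumption based on the description above (a cached table 
is removed when its createEventId is older than the drop event). The event ids 
are the ones from the catalogd logs in this report.

```java
// Hypothetical sketch; names are illustrative, not Impala's actual API.
final class CreateEventIdSketch {
  /**
   * Assumed rule: a DROP_TABLE event removes the cached table only if the
   * table was created before that event, i.e. its createEventId is smaller.
   */
  static boolean shouldRemoveOnDrop(long tableCreateEventId, long dropEventId) {
    return tableCreateEventId < dropEventId;
  }

  public static void main(String[] args) {
    long dropEventId = 8668853L;    // DROP_TABLE event id from the catalogd log
    long createEventId = 8668855L;  // CREATE_TABLE event id from the catalogd log

    // With createEventId = -1, every drop event looks newer than the table.
    System.out.println(shouldRemoveOnDrop(-1L, dropEventId));           // true
    // With a createEventId at or after the drop event, the stale drop is skipped.
    System.out.println(shouldRemoveOnDrop(createEventId, dropEventId)); // false
  }
}
```

Under this assumed rule, any createEventId not older than the pending 
DROP_TABLE event would keep the table in the cache. Which value INVALIDATE 
METADATA should actually use is the open question of this issue.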

A simpler way to reproduce the issue is to create a table in Hive and launch 
Impala with a long event polling interval to mimic the delay on events. Note 
that we start the Impala cluster after creating the table so Impala doesn't 
need to process the CREATE_TABLE event.
{noformat}
hive> create table debug_tbl (i int);

bin/start-impala-cluster.py --catalogd_args=--hms_event_polling_interval_s=60
{noformat}
Drop the table in Impala and recreate it in Hive, so it doesn't exist in the 
catalog cache but exists in HMS. Run "INVALIDATE METADATA <table>" in Impala to 
bring it up before the DROP_TABLE event arrives.
{noformat}
impala> drop table debug_tbl;
hive> create table debug_tbl (i int, j int);
impala> invalidate metadata debug_tbl;
{noformat}
The table will be dropped by the DROP_TABLE event and then added back by the 
CREATE_TABLE event, as shown in the catalogd logs:
{noformat}
I0115 16:30:15.376713  3208 JniUtil.java:177] 02457b6d5f174d1f:3bdeee1400000000] Finished execDdl request: DROP_TABLE default.debug_tbl issued by quanlong. Time spent: 417ms

I0115 16:30:23.390962  3208 CatalogServiceCatalog.java:2777] 1840bd101f78d611:22079a5a00000000] Invalidating table metadata: default.debug_tbl
I0115 16:30:23.404150  3208 Table.java:234] 1840bd101f78d611:22079a5a00000000] createEventId_ for table: default.debug_tbl set to: -1
I0115 16:30:23.405138  3208 JniUtil.java:177] 1840bd101f78d611:22079a5a00000000] Finished resetMetadata request: INVALIDATE TABLE default.debug_tbl issued by quanlong. Time spent: 17ms

I0115 16:30:55.108006 32760 MetastoreEvents.java:637] EventId: 8668853 EventType: DROP_TABLE Successfully removed table default.debug_tbl

I0115 16:30:55.108459 32760 MetastoreEvents.java:637] EventId: 8668855 EventType: CREATE_TABLE Successfully added table default.debug_tbl
{noformat}
CC [~VenuReddy], [~hemanth619]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
