[ 
https://issues.apache.org/jira/browse/IMPALA-13631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17946853#comment-17946853
 ] 

Michael Smith commented on IMPALA-13631:
----------------------------------------

I have some concerns about this patch in busy environments. I think the chances 
of running into
{quote}CatalogException: Table ... was modified while operation was in 
progress, aborting execution{quote}
are unchanged, but we started seeing
{quote}Table/view rename succeeded in the Hive Metastore, but failed in 
Impala's Catalog Server.{quote}
when {{INVALIDATE METADATA}} is run during an {{ALTER TABLE RENAME}}. This 
shows up in the new statements added to test_concurrent_ddls.py.

I can reproduce this error by adding a delay after the HMS {{alter_table}} RPC 
completes (and before we call {{getNextMetastoreEventsForTableIfEnabled}}) and 
running {{INVALIDATE METADATA}} from another session. I think that suggests the 
following scenario:
# The {{alter_table}} RPC completes.
# {{INVALIDATE METADATA}} executes in Impala and processes the {{alter_table}} 
event.
# {{alterTableOrViewRename}} runs {{catalog_.alterTable}}, but the old table has 
already been removed from the catalog, so it fails.

This may be relatively rare, but relatively rare events can become common in 
very busy environments. I think we can address this with better error handling 
in {{alterTableOrViewRename}}.
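
A minimal sketch of what such handling could look like, using a toy in-memory 
catalog rather than Impala's real classes. Every name here 
({{RenameRaceSketch}}, {{applyRenameEvent}}, {{renameStrict}}, 
{{renameTolerant}}, {{Outcome}}) is hypothetical, and the sketch assumes that 
the new table name is already present in the catalog once the {{alter_table}} 
event has been processed. It is not the actual fix.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the rename race; none of these names are Impala's real API. */
final class RenameRaceSketch {
  enum Outcome { RENAMED, ALREADY_APPLIED, FAILED }

  // Stand-in for the catalog cache: table name -> catalog version.
  private final Map<String, Long> tables = new HashMap<>();
  private long version = 0;

  synchronized void addTable(String name) {
    tables.put(name, ++version);
  }

  synchronized boolean contains(String name) {
    return tables.containsKey(name);
  }

  /** Simulates another session applying the alter_table event first. */
  synchronized void applyRenameEvent(String oldName, String newName) {
    tables.remove(oldName);
    tables.put(newName, ++version);
  }

  /** Strict behaviour: the rename fails if the old table is already gone. */
  synchronized Outcome renameStrict(String oldName, String newName) {
    if (tables.remove(oldName) == null) return Outcome.FAILED;
    tables.put(newName, ++version);
    return Outcome.RENAMED;
  }

  /**
   * Tolerant behaviour (an assumption, not the actual patch): if the old
   * table is gone but the new name is present, treat the rename as already
   * applied by a concurrent operation instead of reporting a failure.
   */
  synchronized Outcome renameTolerant(String oldName, String newName) {
    if (renameStrict(oldName, newName) == Outcome.RENAMED) {
      return Outcome.RENAMED;
    }
    return tables.containsKey(newName) ? Outcome.ALREADY_APPLIED : Outcome.FAILED;
  }
}
```

With the event applied first, {{renameStrict}} reports {{FAILED}} (the case 
behind the "succeeded in the Hive Metastore, but failed in Impala's Catalog 
Server" message), while {{renameTolerant}} reports {{ALREADY_APPLIED}}. If 
neither name is in the catalog, both still report {{FAILED}}.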

> alterTableOrViewRename shouldn't hold catalogVersionLock during external RPCs
> -----------------------------------------------------------------------------
>
>                 Key: IMPALA-13631
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13631
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog
>    Affects Versions: Impala 4.5.0
>            Reporter: Quanlong Huang
>            Assignee: Quanlong Huang
>            Priority: Critical
>             Fix For: Impala 5.0.0
>
>
> CatalogOpExecutor.alterTableOrViewRename() requires holding the 
> catalogVersion writeLock, as the code comment below notes:
> {code:java}
>         // RENAME is implemented as an ADD + DROP, so we need to execute it as we hold
>         // the catalog lock.
>         try {
>           alterTableOrViewRename(tbl,
>               TableName.fromThrift(params.getRename_params().getNew_table_name()),
>               modification, wantMinimalResult, response, catalogTimeline);
>           modification.validateInProgressModificationComplete();
>           return;
>         } finally {
>           // release the version taken in the tryLock call above
>           catalog_.getLock().writeLock().unlock();
>         } {code}
> https://github.com/apache/impala/blob/0bbd2b684ddc7dcf8b6c16f1f7c6fab15291f782/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java#L1221-L1232
> However, alterTableOrViewRename() triggers external RPCs, e.g. HMS 
> alter_table RPC, which could hang due to external issues. Holding the 
> catalogVersion writeLock blocks all other catalog operations, including all 
> the read requests like getPartialCatalogObject or collecting catalog updates. 
> This will impact the whole Impala cluster. Lots of queries will be blocked in 
> the CREATED state.


