[ 
https://issues.apache.org/jira/browse/IMPALA-13631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17946915#comment-17946915
 ] 

ASF subversion and git services commented on IMPALA-13631:
----------------------------------------------------------

Commit ef8f8ca27b52f7fd842a7a887d5c9a8db9831f79 in impala's branch 
refs/heads/master from Michael Smith
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=ef8f8ca27 ]

IMPALA-13631: (Addendum) Retry aborted concurrent DDLs

TestConcurrentDdls has several exceptions it considers acceptable for
testing; it would accept the query failure and continue with other
cases. That was fine for existing queries, but if an ALTER RENAME fails,
subsequent queries will also fail because the table does not have the
expected name.

With IMPALA-13631, there are three exception cases we need to handle:
1. "Table/view rename succeeded in the Hive Metastore, but failed in
   Impala's Catalog Server" happens when the HMS alter_table RPC
   succeeds but the local catalog has changed. INVALIDATE METADATA on
   the target table is sufficient to bring things in sync.
2. "CatalogException: Table ... was modified while operation was in
   progress, aborting execution" can safely be retried.
3. "Couldn't retrieve the catalog topic update for the SYNC_DDL
   operation" happens when SYNC_DDL=1 and the DDL runs on a stale table
   object that's removed from the cache by a global INVALIDATE.
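
The three cases above could be handled along the following lines. This is
only a minimal sketch, not the code from test_concurrent_ddls.py: the names
classify and run_with_recovery, the execute callable, and the attempt limit
are all hypothetical. Retrying the third case, and not re-running the rename
after INVALIDATE METADATA, are assumptions made for illustration.

```python
import re

# Substrings of the three error messages listed above.
RENAME_HMS_ONLY = ("rename succeeded in the Hive Metastore, but failed in "
                   "Impala's Catalog Server")
MODIFIED_IN_PROGRESS = re.compile(
    r"CatalogException: Table .* was modified while operation was in "
    r"progress, aborting execution")
SYNC_DDL_TOPIC = ("Couldn't retrieve the catalog topic update for the "
                  "SYNC_DDL operation")


def classify(error_msg):
    """Map an error message to a recovery action (hypothetical helper)."""
    if RENAME_HMS_ONLY in error_msg:
        return "invalidate"
    if MODIFIED_IN_PROGRESS.search(error_msg):
        return "retry"
    if SYNC_DDL_TOPIC in error_msg:
        # Assumption: treated as retryable in this sketch.
        return "retry"
    return "fail"


def run_with_recovery(execute, query, target_table, max_attempts=3):
    """Run query via execute(sql), recovering from the known DDL errors.

    execute is any callable that raises an exception whose text carries
    the error message; it stands in for a real client here.
    """
    for attempt in range(max_attempts):
        try:
            return execute(query)
        except Exception as e:
            action = classify(str(e))
            if action == "invalidate":
                # The rename already took effect in HMS, so resync the
                # catalog instead of running the rename again (assumption).
                execute("invalidate metadata " + target_table)
                return None
            if action == "retry" and attempt + 1 < max_attempts:
                continue
            raise
```

Unknown errors and retries that run out of attempts are re-raised, so a
caller still sees genuine failures.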

Adds --max_wait_time_for_sync_ddl_s=10 to catalogd_args so that the last
exception can occur. Otherwise the query would just time out.

Tested by running test_concurrent_ddls.py 15 times. The 1st exception
previously would show up within 3-4 runs, while the 2nd exception
happens pretty much every run.

Change-Id: I04d071b62e4f306466a69ebd9e134a37d4327b77
Reviewed-on: http://gerrit.cloudera.org:8080/22802
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Michael Smith <[email protected]>


> alterTableOrViewRename shouldn't hold catalogVersionLock during external RPCs
> -----------------------------------------------------------------------------
>
>                 Key: IMPALA-13631
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13631
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog
>    Affects Versions: Impala 4.5.0
>            Reporter: Quanlong Huang
>            Assignee: Quanlong Huang
>            Priority: Critical
>             Fix For: Impala 5.0.0
>
>
> CatalogOpExecutor.alterTableOrViewRename() requires holding the 
> catalogVersion writeLock, as the code comment notes:
> {code:java}
>         // RENAME is implemented as an ADD + DROP, so we need to execute it as we hold
>         // the catalog lock.
>         try {
>           alterTableOrViewRename(tbl,
>               TableName.fromThrift(params.getRename_params().getNew_table_name()),
>               modification, wantMinimalResult, response, catalogTimeline);
>           modification.validateInProgressModificationComplete();
>           return;
>         } finally {
>           // release the version taken in the tryLock call above
>           catalog_.getLock().writeLock().unlock();
>         }
> {code}
> https://github.com/apache/impala/blob/0bbd2b684ddc7dcf8b6c16f1f7c6fab15291f782/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java#L1221-L1232
> However, alterTableOrViewRename() triggers external RPCs, e.g. HMS 
> alter_table RPC, which could hang due to external issues. Holding the 
> catalogVersion writeLock blocks all other catalog operations, including all 
> the read requests like getPartialCatalogObject or collecting catalog updates. 
> This will impact the whole Impala cluster. Lots of queries will be blocked in 
> the CREATED state.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
