[ 
https://issues.apache.org/jira/browse/IMPALA-7961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17192953#comment-17192953
 ] 

ASF subversion and git services commented on IMPALA-7961:
---------------------------------------------------------

Commit 0c89a9d562c280507a6e842898bf3e41cadc3ff1 in impala's branch 
refs/heads/master from wzhou-code
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=0c89a9d ]

IMPALA-10140: Fix CatalogExeception for creating database with sync_ddl as true

IMPALA-7961 handle the cases for query "create table if not exists"
with sync_ddl as true. Customers reported similar issue which happened
for query "create database if not exists" with sync_ddl as true.
This patch adds the similar fixing as the fixing for IMPALA-7961 to
function CatalogOpExecutor.createDatabase() to fix the issue.

Testing:
 - Manual tests
   Since this is a racy bug, I could only reproduce it by forcing
   frequent topicUpdateLog GCs along with a specific sequence of
   actions, like: run some DDLs and REFRESHs to trigger a GC in
   topicUpdateLog, then run query "create database if not exists" with
   sync_ddl as true. Verified that the issue couldn't be reproduced
   after applying this patch.
 - Passed exhaustive test.

Change-Id: Id623118f8938f416414c45d93404fb70d036a9df
Reviewed-on: http://gerrit.cloudera.org:8080/16421
Reviewed-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>


> Concurrent catalog heavy workloads can cause queries with SYNC_DDL to fail 
> fast
> -------------------------------------------------------------------------------
>
>                 Key: IMPALA-7961
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7961
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog
>    Affects Versions: Impala 2.12.0, Impala 3.1.0
>            Reporter: Bharath Vissapragada
>            Assignee: Bharath Vissapragada
>            Priority: Critical
>             Fix For: Impala 3.2.0
>
>         Attachments: 0001-Repro-of-IMPALA-7961.patch
>
>
> When catalog server is under heavy load with concurrent updates to objects, 
> queries with SYNC_DDL can fail with the following message.
> *User facing error message:*
> {noformat}
> ERROR: CatalogException: Couldn't retrieve the catalog topic version for the 
> SYNC_DDL operation after 3 attempts.The operation has been successfully 
> executed but its effects may have not been broadcast to all the coordinators.
> {noformat}
> *Exception from the catalog server log:*
> {noformat}
> I1031 00:00:49.168761 1127039 CatalogServiceCatalog.java:1903] Operation 
> using SYNC_DDL is waiting for catalog topic version: 236535. Time to identify 
> topic version (msec): 1088
> I1031 00:00:49.168824 1125528 CatalogServiceCatalog.java:1903] Operation 
> using SYNC_DDL is waiting for catalog topic version: 236535. Time to identify 
> topic version (msec): 12625
> I1031 00:00:49.168851 1131986 jni-util.cc:230] 
> org.apache.impala.catalog.CatalogException: Couldn't retrieve the catalog 
> topic version for the SYNC_DDL operation after 3 attempts.The operation has 
> been successfully executed but its effects may have not been broadcast to all 
> the coordinators.
>         at 
> org.apache.impala.catalog.CatalogServiceCatalog.waitForSyncDdlVersion(CatalogServiceCatalog.java:1891)
>         at 
> org.apache.impala.service.CatalogOpExecutor.execDdlRequest(CatalogOpExecutor.java:336)
>         at org.apache.impala.service.JniCatalog.execDdl(JniCatalog.java:146)
> ::::
> {noformat}
> *What this means*
> The Catalog operation is actually successful (the change has been committed 
> to HMS and Catalog server cache) but the Catalog server noticed that it is 
> taking longer than expected time for it to broadcast the changes (for 
> whatever reason) and instead of hanging in there, it fails fast. The 
> coordinators are expected to eventually sync up in the background.
> *Problem*
>  - This violates the contract of the SYNC_DDL query option since the query 
> returns early.
>  - This is a behavioral regression from pre IMPALA-5058 state where the 
> queries would wait forever for SYNC_DDL based changes to propagate.
> *Notes*
>  - Introduced by IMPALA-5058
>  - Based on the occurrences of this issue, we narrowed it down to a specific 
> kind of DDLs (see Jira comments).
>  - My understanding is that this also applies to the Catalog V2 (or 
> LocalCatalog mode) since we still rely on the CatalogServer for DDL 
> orchestration and hence it takes this codepath.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org

Reply via email to