[ https://issues.apache.org/jira/browse/IMPALA-7961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17192953#comment-17192953 ]
ASF subversion and git services commented on IMPALA-7961:
---------------------------------------------------------

Commit 0c89a9d562c280507a6e842898bf3e41cadc3ff1 in impala's branch refs/heads/master from wzhou-code
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=0c89a9d ]

IMPALA-10140: Fix CatalogExeception for creating database with sync_ddl as true

IMPALA-7961 handle the cases for query "create table if not exists" with
sync_ddl as true. Customers reported similar issue which happened for
query "create database if not exists" with sync_ddl as true.
This patch adds the similar fixing as the fixing for IMPALA-7961 to
function CatalogOpExecutor.createDatabase() to fix the issue.

Testing:
 - Manual tests
   Since this is a racy bug, I could only reproduce it by forcing
   frequent topicUpdateLog GCs along with a specific sequence of
   actions, like: run some DDLs and REFRESHs to trigger a GC in
   topicUpdateLog, then run query "create database if not exists"
   with sync_ddl as true.
   Verified that the issue couldn't be reproduced after applying
   this patch.
 - Passed exhaustive test.

Change-Id: Id623118f8938f416414c45d93404fb70d036a9df
Reviewed-on: http://gerrit.cloudera.org:8080/16421
Reviewed-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>


> Concurrent catalog heavy workloads can cause queries with SYNC_DDL to fail fast
> -------------------------------------------------------------------------------
>
>                 Key: IMPALA-7961
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7961
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog
>    Affects Versions: Impala 2.12.0, Impala 3.1.0
>            Reporter: Bharath Vissapragada
>            Assignee: Bharath Vissapragada
>            Priority: Critical
>             Fix For: Impala 3.2.0
>
>         Attachments: 0001-Repro-of-IMPALA-7961.patch
>
>
> When catalog server is under heavy load with concurrent updates to objects,
> queries with SYNC_DDL can fail with the following message.
> *User facing error message:*
> {noformat}
> ERROR: CatalogException: Couldn't retrieve the catalog topic version for the SYNC_DDL operation after 3 attempts.The operation has been successfully executed but its effects may have not been broadcast to all the coordinators.
> {noformat}
> *Exception from the catalog server log:*
> {noformat}
> I1031 00:00:49.168761 1127039 CatalogServiceCatalog.java:1903] Operation using SYNC_DDL is waiting for catalog topic version: 236535. Time to identify topic version (msec): 1088
> I1031 00:00:49.168824 1125528 CatalogServiceCatalog.java:1903] Operation using SYNC_DDL is waiting for catalog topic version: 236535. Time to identify topic version (msec): 12625
> I1031 00:00:49.168851 1131986 jni-util.cc:230] org.apache.impala.catalog.CatalogException: Couldn't retrieve the catalog topic version for the SYNC_DDL operation after 3 attempts.The operation has been successfully executed but its effects may have not been broadcast to all the coordinators.
>     at org.apache.impala.catalog.CatalogServiceCatalog.waitForSyncDdlVersion(CatalogServiceCatalog.java:1891)
>     at org.apache.impala.service.CatalogOpExecutor.execDdlRequest(CatalogOpExecutor.java:336)
>     at org.apache.impala.service.JniCatalog.execDdl(JniCatalog.java:146)
>     ::::
> {noformat}
> *What this means*
> The Catalog operation is actually successful (the change has been committed
> to HMS and Catalog server cache) but the Catalog server noticed that it is
> taking longer than expected time for it to broadcast the changes (for
> whatever reason) and instead of hanging in there, it fails fast. The
> coordinators are expected to eventually sync up in the background.
> *Problem*
> - This violates the contract of the SYNC_DDL query option since the query
> returns early.
> - This is a behavioral regression from pre IMPALA-5058 state where the
> queries would wait forever for SYNC_DDL based changes to propagate.
> *Notes*
> - Introduced by IMPALA-5058
> - Based on the occurrences of this issue, we narrowed it down to a specific
> kind of DDLs (see Jira comments).
> - My understanding is that this also applies to the Catalog V2 (or
> LocalCatalog mode) since we still rely on the CatalogServer for DDL
> orchestration and hence it takes this codepath.
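
The fail-fast behavior described under "What this means" can be illustrated with a minimal Python sketch. It is not Impala's code: `find_topic_version`, `lookup`, and `CatalogError` are hypothetical names, and the real `waitForSyncDdlVersion` in CatalogServiceCatalog.java is not reproduced here. The sketch only assumes what the error message states, namely that the lookup gives up after a fixed number of attempts (3 in the log above) rather than blocking indefinitely.

```python
class CatalogError(Exception):
    """Hypothetical stand-in for Impala's CatalogException."""


def find_topic_version(lookup, max_attempts=3):
    """Poll `lookup` up to `max_attempts` times for a topic version.

    `lookup` is a hypothetical callable that returns the catalog topic
    version covering the DDL's changes, or None if it is not known yet.
    """
    for _ in range(max_attempts):
        version = lookup()
        if version is not None:
            return version
    # Fail fast instead of blocking: the DDL itself has already been
    # applied, only the confirmation of its broadcast is missing.
    raise CatalogError(
        "Couldn't retrieve the catalog topic version for the SYNC_DDL "
        f"operation after {max_attempts} attempts."
    )
```

With a lookup that never yields a version, the caller gets an error even though the underlying change was committed, which is the contract violation listed under "Problem". The pre IMPALA-5058 behavior would correspond to looping without an attempt limit.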