[
https://issues.apache.org/jira/browse/IMPALA-14695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18068939#comment-18068939
]
Quanlong Huang commented on IMPALA-14695:
-----------------------------------------
I can reproduce the issue by adding some sleeps (on commit 772ebd227):
{code:java}
diff --git a/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java b/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
index c48191c25..929e7ac49 100644
--- a/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
+++ b/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
@@ -479,9 +479,15 @@ public class CatalogdMetaProvider implements MetaProvider {
byte[] ret = null;
Stopwatch sw = Stopwatch.createStarted();
try {
+ if (req.object_desc.isSetCatalog_version()) {
+ LOG.info("Sleep before request " + req);
+ Thread.sleep(1000);
+ }
ret = FeSupport.GetPartialCatalogObject(new TSerializer().serialize(req));
} catch (InternalException e) {
throw new TException(e);
+ } catch (InterruptedException e) {
+ throw new RuntimeException(e);
} finally {
sw.stop();
FrontendProfile profile = FrontendProfile.getCurrentOrNull();
diff --git a/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java b/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
index 60434dc09..10414d144 100644
--- a/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
+++ b/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
@@ -1250,6 +1250,11 @@ public class CatalogOpExecutor {
BLACKLISTED_TABLES_INCONSISTENT_ERR_STR));
}
tryWriteLock(tbl, catalogTimeline);
+ try {
+ Thread.sleep(1000);
+ } catch (InterruptedException e) {
+ throw new RuntimeException(e);
+ }
// get table's catalogVersion before altering it
long oldCatalogVersion = tbl.getCatalogVersion();
// Get a new catalog version, wrap it in InProgressTableModification, and assign new{code}
h3. *Repro steps*
First, create a partitioned table with ten partitions.
{code:sql}
create table part_10 (i int) partitioned by (p int);
alter table part_10 add partition(p=0);
alter table part_10 add partition(p=1);
alter table part_10 add partition(p=2);
alter table part_10 add partition(p=3);
alter table part_10 add partition(p=4);
alter table part_10 add partition(p=5);
alter table part_10 add partition(p=6);
alter table part_10 add partition(p=7);
alter table part_10 add partition(p=8);
alter table part_10 add partition(p=9);{code}
Then run the following script, which submits ALTER TABLE ... PARTITION statements concurrently: ten background loops (one per partition), each issuing five statements.
{code:bash}
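#!/bin/bash
TBL=part_10
# One background loop per partition (p=0..9); each loop issues 5 ALTERs in sequence.
for i in {0..9}; do
  for j in $(seq 5); do
    impala-shell.sh -B -q "alter table $TBL partition(p=$i) set tblproperties('numRows'='0')"
  done &
done
# Block until all background loops have finished.
wait{code}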
The coordinator then logs planning retries like the following:
{noformat}
W20260327 17:06:05.697455 564860 Frontend.java:2604] d24a713a82017555:8485ae2100000000] Retrying plan of query alter table part_10 partition(p=1) set tblproperties('numRows'='0'): Catalog object TCatalogObject(type:TABLE, catalog_version:2385, table:TTable(db_name:default, tbl_name:part_10)) changed version between accesses. (retry #7 of 40){noformat}
Uploaded [^profile_retry_example.txt] as an example.
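To illustrate why the sleeps make the retries easy to hit, here is a toy Java model. It is not Impala code: {{ToyCatalog}}, {{fetchPinned}} and {{planWithRetries}} are hypothetical names. It assumes that planning pins the table's catalog version on its first fetch and replans whenever a later fetch sees a different version, with the budget of 40 retries taken from the log above. In this model the sleeps only widen the gap between fetches, so a concurrent ALTER is more likely to land inside it.

```java
/** Toy model of the version race; not Impala code, all names are hypothetical. */
public class VersionRaceSketch {
  /** Stand-in for the exception that makes the frontend replan. */
  static class InconsistentFetch extends RuntimeException {
    InconsistentFetch(String msg) { super(msg); }
  }

  /** Stand-in for catalogd: one table with an increasing catalog version. */
  static final class ToyCatalog {
    private long version = 1;

    long currentVersion() { return version; }

    /** Any ALTER on the table bumps its catalog version. */
    void alterTable() { version++; }

    /** A fetch pinned to expectedVersion fails if the table has moved on. */
    void fetchPinned(long expectedVersion) {
      if (version != expectedVersion) {
        throw new InconsistentFetch(
            "changed version from " + expectedVersion + " to " + version);
      }
    }
  }

  /**
   * Plans one query with two fetches per attempt. interferingAlters is how many
   * attempts have a concurrent ALTER land between the two fetches.
   * Returns the number of retries used, or -1 if the retry budget ran out.
   */
  static int planWithRetries(ToyCatalog catalog, int interferingAlters, int maxRetries) {
    int remaining = interferingAlters;
    for (int attempt = 0; attempt <= maxRetries; attempt++) {
      long pinned = catalog.currentVersion();  // first fetch pins the version
      if (remaining > 0) {                     // a writer gets in during the gap
        catalog.alterTable();
        remaining--;
      }
      try {
        catalog.fetchPinned(pinned);           // second fetch must see the same version
        return attempt;
      } catch (InconsistentFetch e) {
        // Version changed between accesses: replan from scratch.
      }
    }
    return -1;
  }

  public static void main(String[] args) {
    System.out.println(planWithRetries(new ToyCatalog(), 7, 40));   // prints 7
    System.out.println(planWithRetries(new ToyCatalog(), 41, 40));  // prints -1
  }
}
```

With seven interfering ALTERs the model replans seven times, which has the same shape as "retry #7 of 40" in the log; with more than 40 it exhausts the budget.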
> Fast path for simple partition queries to fetch missing metadata from
> catalogd in batch
> ---------------------------------------------------------------------------------------
>
> Key: IMPALA-14695
> URL: https://issues.apache.org/jira/browse/IMPALA-14695
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog, Frontend
> Reporter: Quanlong Huang
> Assignee: Quanlong Huang
> Priority: Critical
> Attachments: profile_retry_example.txt
>
>
> It's a regression in local catalog mode (i.e. catalog-v2) that query planning
> needs to be retried when metadata changes on the catalogd side. Example
> coordinator logs:
> {code:java}
> W0121 11:23:13.707223 317257 CatalogdMetaProvider.java:469]
> 0f4d14a4e9abc477:7cfdb09400000000] Catalog object TCatalogObject(type:TABLE,
> catalog_version:19032, table:TTable(db_name:mydb, tbl_name:mytbl)) changed
> version from 19032 to 19041 while fetching metadata
> W0121 11:23:20.107473 317257 Frontend.java:2127]
> 0f4d14a4e9abc477:7cfdb09400000000] Retrying plan of query alter table
> mydb.mytbl partition (p='2268357') set tblproperties('numRows'='1325',
> 'STATS_GENERATED_VIA_STATS_TASK'='true'): Catalog object
> TCatalogObject(type:TABLE, catalog_version:19032, table:TTable(db_name:mydb,
> tbl_name:mytbl)) changed version between accesses. (retry #32 of 40)
> {code}
> For simple queries like REFRESH PARTITION, COMPUTE INCREMENTAL STATS on a
> single partition, or ALTER TABLE on a single partition, if some metadata is
> missing from the coordinator's local cache, we can consider sending a single
> batch request to catalogd to fetch all the metadata the query needs, thereby
> avoiding InconsistentMetadataFetchException, which requires retries.
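
A minimal sketch of the batch idea in the description, under stated assumptions: this is a toy model rather than the proposed Impala change, and {{ToyTable}}, {{Snapshot}} and {{fetchBatch}} are hypothetical names. It only shows that when everything a single-partition statement needs is read in one request at one version, a later ALTER cannot make the pieces disagree with each other.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of a single batched metadata fetch; not Impala code. */
public class BatchFetchSketch {
  /** Stand-in for catalogd's state of one table. */
  static final class ToyTable {
    long version = 1;
    String schema = "i int";
    final List<String> partitions = new ArrayList<>(Arrays.asList("p=0", "p=1"));

    /** Any ALTER bumps the table's catalog version. */
    synchronized void alter() { version++; }
  }

  /** Everything a single-partition statement needs, read at one version. */
  static final class Snapshot {
    final long version;
    final String schema;
    final String partition;

    Snapshot(long version, String schema, String partition) {
      this.version = version;
      this.schema = schema;
      this.partition = partition;
    }
  }

  /** One round trip: all pieces are copied under the same lock, so they agree. */
  static Snapshot fetchBatch(ToyTable tbl, String partitionName) {
    synchronized (tbl) {
      if (!tbl.partitions.contains(partitionName)) {
        throw new IllegalArgumentException("unknown partition " + partitionName);
      }
      return new Snapshot(tbl.version, tbl.schema, partitionName);
    }
  }

  public static void main(String[] args) {
    ToyTable tbl = new ToyTable();
    Snapshot snap = fetchBatch(tbl, "p=1");
    tbl.alter();  // a concurrent ALTER after the batch no longer forces a replan
    System.out.println(snap.version + " vs " + tbl.version);  // prints 1 vs 2
  }
}
```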