BsoBird opened a new issue, #8624:
URL: https://github.com/apache/iceberg/issues/8624
### Apache Iceberg version
1.3.1 (latest release)
### Query engine
Spark
### Please describe the bug 🐞
I have found that when I configure multiple catalogs in the same SparkSession, Iceberg does not work in some cases.

Example configuration:
```
# iceberg catalog
spark.sql.catalog.datacenter org.apache.iceberg.spark.SparkCatalog
spark.sql.catalog.datacenter.type hadoop
spark.sql.catalog.datacenter.warehouse /iceberg-catalog/warehouse

# paimon catalog
spark.sql.catalog.paimon org.apache.paimon.spark.SparkCatalog
spark.sql.catalog.paimon.warehouse hdfs:///paimon/warehouse
spark.jars /data/kyuubi/spark_aux_lib/paimon-spark-3.3-0.6-20230922.002014-17.jar
```
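
For reproduction outside Kyuubi, the same catalog setup can be expressed directly on a `SparkSession` builder. This is only a minimal sketch: the `spark.sql.extensions` line is not in the configuration above, but MERGE INTO on Iceberg tables requires the Iceberg SQL extensions, and the plan in the error below indicates they are enabled.

```scala
import org.apache.spark.sql.SparkSession

// Minimal sketch of the same two-catalog setup built programmatically.
// Warehouse paths match the configuration above. The Paimon Spark jar listed
// in spark.jars above also has to be available to the driver and executors
// (e.g. via --jars / spark.jars at submit time).
val spark = SparkSession.builder()
  .appName("multi-catalog-merge-into")
  .config("spark.sql.extensions",
    "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
  .config("spark.sql.catalog.datacenter", "org.apache.iceberg.spark.SparkCatalog")
  .config("spark.sql.catalog.datacenter.type", "hadoop")
  .config("spark.sql.catalog.datacenter.warehouse", "/iceberg-catalog/warehouse")
  .config("spark.sql.catalog.paimon", "org.apache.paimon.spark.SparkCatalog")
  .config("spark.sql.catalog.paimon.warehouse", "hdfs:///paimon/warehouse")
  .getOrCreate()
```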
When I execute a MERGE INTO statement in the Iceberg catalog, it fails during planning.
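For reference, the statement is roughly of the shape sketched below. This is not the exact statement; it is reconstructed from the failed logical plan in the stack trace (the target and source tables, the join keys, and the rank-based deduplication are taken from the plan, but the exact column list and clauses may differ):

```scala
// Sketch only: reconstructed from the logical plan in the error below.
spark.sql("""
  MERGE INTO datacenter.dwd.b_std_category t
  USING (
    SELECT data_from, partner, plat_code,
           shop_id    AS uni_shop_id,
           cid        AS category_id,
           parent_cid AS parent_category_id,
           name       AS category_name,
           root_cid, is_leaf, tenant,
           modified   AS last_sync
    FROM (
      SELECT *,
             row_number() OVER (PARTITION BY shop_id, cid ORDER BY modified DESC) AS rank
      FROM dw_base_temp.category_analyse_result
    ) deduped
    WHERE rank = 1
  ) s
  ON t.uni_shop_id = s.uni_shop_id AND t.category_id = s.category_id
  WHEN MATCHED THEN UPDATE SET *
  WHEN NOT MATCHED THEN INSERT *
""")
```

The error message is as follows: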
```
org.apache.spark.SparkException: The Spark SQL phase planning failed with an internal error. Please, fill a bug report in, and provide the full stack trace.
    at org.apache.spark.sql.execution.QueryExecution$.toInternalError(QueryExecution.scala:500)
    at org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:512)
    at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:185)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
    at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:184)
    at org.apache.spark.sql.execution.QueryExecution.sparkPlan$lzycompute(QueryExecution.scala:145)
    at org.apache.spark.sql.execution.QueryExecution.sparkPlan(QueryExecution.scala:138)
    at org.apache.spark.sql.execution.QueryExecution.$anonfun$executedPlan$1(QueryExecution.scala:158)
    at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:111)
    at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$2(QueryExecution.scala:185)
    at org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:510)
    at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:185)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
    at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:184)
    at org.apache.spark.sql.execution.QueryExecution.executedPlan$lzycompute(QueryExecution.scala:158)
    at org.apache.spark.sql.execution.QueryExecution.executedPlan(QueryExecution.scala:151)
    at org.apache.spark.sql.execution.QueryExecution.simpleString(QueryExecution.scala:204)
    at org.apache.spark.sql.execution.QueryExecution.org$apache$spark$sql$execution$QueryExecution$$explainString(QueryExecution.scala:249)
    at org.apache.spark.sql.execution.QueryExecution.explainString(QueryExecution.scala:218)
    at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:103)
    at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:169)
    at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:95)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
    at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:98)
    at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:94)
    at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:584)
    at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:176)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:584)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:30)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:267)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:263)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:560)
    at org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:94)
    at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:81)
    at org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:79)
    at org.apache.spark.sql.Dataset.<init>(Dataset.scala:220)
    at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:100)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
    at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:97)
    at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:622)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
    at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:617)
    at org.apache.kyuubi.engine.spark.operation.ExecuteStatement.$anonfun$executeStatement$1(ExecuteStatement.scala:83)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
    at org.apache.kyuubi.engine.spark.operation.SparkOperation.$anonfun$withLocalProperties$1(SparkOperation.scala:155)
    at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:169)
    at org.apache.kyuubi.engine.spark.operation.SparkOperation.withLocalProperties(SparkOperation.scala:139)
    at org.apache.kyuubi.engine.spark.operation.ExecuteStatement.executeStatement(ExecuteStatement.scala:78)
    at org.apache.kyuubi.engine.spark.operation.ExecuteStatement$$anon$1.run(ExecuteStatement.scala:100)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
Caused by: java.lang.AssertionError: assertion failed: No plan for ReplaceIcebergData RelationV2[data_from#13985, partner#13986, plat_code#13987, uni_shop_id#13988, category_id#13989, parent_category_id#13990, category_name#13991, root_cid#13992, is_leaf#13993, tenant#13994, last_sync#13995] datacenter.dwd.b_std_category, IcebergWrite(table=datacenter.dwd.b_std_category, format=ORC)
+- Sort [icebergbuckettransform(64, uni_shop_id#14019) ASC NULLS FIRST], false
   +- RepartitionByExpression [icebergbuckettransform(64, uni_shop_id#14019)], 12288
      +- Project [data_from#14016, partner#14017, plat_code#14018, uni_shop_id#14019, category_id#14020, parent_category_id#14021, category_name#14022, root_cid#14023, is_leaf#14024, tenant#14025, last_sync#14026]
         +- MergeRows[data_from#14016, partner#14017, plat_code#14018, uni_shop_id#14019, category_id#14020, parent_category_id#14021, category_name#14022, root_cid#14023, is_leaf#14024, tenant#14025, last_sync#14026, _file#14027]
            +- Join FullOuter, ((uni_shop_id#13979 = uni_shop_id#13988) AND (category_id#13980 = category_id#13989)), leftHint=(strategy=no_broadcast_hash)
               :- NoStatsUnaryNode
               :  +- Project [data_from#13985, partner#13986, plat_code#13987, uni_shop_id#13988, category_id#13989, parent_category_id#13990, category_name#13991, root_cid#13992, is_leaf#13993, tenant#13994, last_sync#13995, _file#14010, true AS __row_from_target#14013, monotonically_increasing_id() AS __row_id#14014L]
               :     +- Filter dynamicpruning#14075 [_file#14010]
               :        :  +- Project [_file#14074]
               :        :     +- Join LeftSemi, ((uni_shop_id#13979 = uni_shop_id#14066) AND (category_id#13980 = category_id#14067))
               :        :        :- Filter isnotnull(category_id#14067)
               :        :        :  +- RelationV2[uni_shop_id#14066, category_id#14067, _file#14074] datacenter.dwd.b_std_category
               :        :        +- Project [uni_shop_id#13979, category_id#13980]
               :        :           +- Filter ((rank#13984 = 1) AND (isnotnull(uni_shop_id#13979) AND isnotnull(category_id#13980)))
               :        :              +- Window [row_number() windowspecdefinition(shop_id#14000, cid#14001, modified#14004 DESC NULLS LAST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS rank#13984], [shop_id#14000, cid#14001], [modified#14004 DESC NULLS LAST]
               :        :                 +- Project [shop_id#14000 AS uni_shop_id#13979, cid#14001 AS category_id#13980, shop_id#14000, cid#14001, modified#14004]
               :        :                    +- HiveTableRelation [`dw_base_temp`.`category_analyse_result`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, Data Cols: [data_from#13997, partner#13998, plat_code#13999, shop_id#14000, cid#14001, parent_cid#14002, nam..., Partition Cols: []]
               :        +- RelationV2[data_from#13985, partner#13986, plat_code#13987, uni_shop_id#13988, category_id#13989, parent_category_id#13990, category_name#13991, root_cid#13992, is_leaf#13993, tenant#13994, last_sync#13995, _file#14010] datacenter.dwd.b_std_category
               +- Project [data_from#14028, partner#14029, plat_code#14030, uni_shop_id#13979, category_id#13980, parent_category_id#13981, category_name#13982, root_cid#14036, is_leaf#14037, tenant#14038, last_sync#13983, true AS __row_from_source#14015]
                  +- Filter (rank#13984 = 1)
                     +- Window [row_number() windowspecdefinition(shop_id#14031, cid#14032, modified#14035 DESC NULLS LAST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS rank#13984], [shop_id#14031, cid#14032], [modified#14035 DESC NULLS LAST]
                        +- Project [data_from#14028, partner#14029, plat_code#14030, shop_id#14031 AS uni_shop_id#13979, cid#14032 AS category_id#13980, parent_cid#14033 AS parent_category_id#13981, name#14034 AS category_name#13982, root_cid#14036, is_leaf#14037, tenant#14038, modified#14035 AS last_sync#13983, shop_id#14031, cid#14032, modified#14035]
                           +- HiveTableRelation [`dw_base_temp`.`category_analyse_result`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, Data Cols: [data_from#14028, partner#14029, plat_code#14030, shop_id#14031, cid#14032, parent_cid#14033, nam..., Partition Cols: []]
    at scala.Predef$.assert(Predef.scala:223)
    at org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:93)
    at org.apache.spark.sql.execution.SparkStrategies.plan(SparkStrategies.scala:69)
    at org.apache.spark.sql.catalyst.planning.QueryPlanner.$anonfun$plan$3(QueryPlanner.scala:78)
    at scala.collection.TraversableOnce$folder$1.apply(TraversableOnce.scala:196)
    at scala.collection.TraversableOnce$folder$1.apply(TraversableOnce.scala:194)
    at scala.collection.Iterator.foreach(Iterator.scala:943)
    at scala.collection.Iterator.foreach$(Iterator.scala:943)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
    at scala.collection.TraversableOnce.foldLeft(TraversableOnce.scala:199)
    at scala.collection.TraversableOnce.foldLeft$(TraversableOnce.scala:192)
    at scala.collection.AbstractIterator.foldLeft(Iterator.scala:1431)
    at org.apache.spark.sql.catalyst.planning.QueryPlanner.$anonfun$plan$2(QueryPlanner.scala:75)
    at scala.collection.Iterator$$anon$11.nextCur(Iterator.scala:486)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:492)
    at org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:93)
    at org.apache.spark.sql.execution.SparkStrategies.plan(SparkStrategies.scala:69)
    at org.apache.spark.sql.execution.QueryExecution$.createSparkPlan(QueryExecution.scala:459)
    at org.apache.spark.sql.execution.QueryExecution.$anonfun$sparkPlan$1(QueryExecution.scala:145)
    at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:111)
    at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$2(QueryExecution.scala:185)
    at org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:510)
    ... 55 more
```