coderfender commented on PR #3542:
URL:
https://github.com/apache/datafusion-comet/pull/3542#issuecomment-3917679967
@0lai0, @andygrove, we might want to hold off on merging this PR. There is a test failure, and I am not sure we have covered all possible `Literal` conditions in our case statement. Steps to reproduce the SQL failure:
```scala
test("concat_ws test - no constant folding") {
  withSQLConf(
    "spark.sql.optimizer.excludedRules" ->
      "org.apache.spark.sql.catalyst.optimizer.ConstantFolding") {
    withParquetTable(Seq(1, 2).map(Tuple1(_)), "t") {
      val df = sql("SELECT concat_ws(',', NULL, 'b', 'c'), concat_ws(NULL, 'a', 'b') FROM t")
      df.explain(true)
      checkSparkAnswerAndOperator(df)
    }
  }
}
```
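For context, here is a minimal, Spark-free sketch of the kind of match gap that can produce "Expected string literal, got None": a pattern that only accepts non-null string literals silently yields `None` for a `NULL` separator. The `Literal` model and function names below are illustrative assumptions, not Comet's actual serde code.

```scala
// Illustrative stand-in for Catalyst's Literal; not the real class.
case class Literal(value: Any, dataType: String)

object ConcatWsSketch {
  // A match covering only non-null string literals: a NULL separator falls
  // through to None, mirroring the "Expected string literal, got None" failure.
  def separatorNarrow(expr: Literal): Option[String] = expr match {
    case Literal(s: String, "string") => Some(s) // type pattern never matches null
    case _                            => None
  }

  // A match that also handles the null case explicitly, so a NULL separator
  // can be handled (or fall back to Spark) instead of failing natively.
  def separatorWithNull(expr: Literal): Either[String, Option[String]] = expr match {
    case Literal(null, "string")      => Right(None)    // NULL separator
    case Literal(s: String, "string") => Right(Some(s)) // constant separator
    case other                        => Left(s"unsupported literal: $other")
  }

  def main(args: Array[String]): Unit = {
    println(separatorNarrow(Literal(null, "string")))   // None -> the failure mode
    println(separatorWithNull(Literal(null, "string"))) // Right(None)
    println(separatorWithNull(Literal(",", "string")))  // Right(Some(,))
  }
}
```

The key detail is that a Scala type pattern like `s: String` never matches `null`, so the null literal needs its own case.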
Error (with plan):
```
== Parsed Logical Plan ==
'Project [unresolvedalias('concat_ws(,, null, b, c), None), unresolvedalias('concat_ws(null, a, b), None)]
+- 'UnresolvedRelation [t], [], false

== Analyzed Logical Plan ==
concat_ws(,, NULL, b, c): string, concat_ws(NULL, a, b): string
Project [concat_ws(,, cast(null as array<string>), b, c) AS concat_ws(,, NULL, b, c)#5, concat_ws(cast(null as string), a, b) AS concat_ws(NULL, a, b)#6]
+- SubqueryAlias t
   +- View (`t`, [_1#3])
      +- Relation [_1#3] parquet

== Optimized Logical Plan ==
Project [concat_ws(,, null, b, c) AS concat_ws(,, NULL, b, c)#5, concat_ws(null, a, b) AS concat_ws(NULL, a, b)#6]
+- Relation [_1#3] parquet

== Physical Plan ==
*(1) CometColumnarToRow
+- CometProject [concat_ws(,, NULL, b, c)#5, concat_ws(NULL, a, b)#6], [concat_ws(,, null, b, c) AS concat_ws(,, NULL, b, c)#5, concat_ws(null, a, b) AS concat_ws(NULL, a, b)#6]
   +- CometScan [native_iceberg_compat] parquet [] Batched: true, DataFilters: [], Format: CometParquet, Location: InMemoryFileIndex(1 paths)[file:/private/var/folders/k0/t16s7rgj6gl2x008c266k4vm0000gn/T/spark-53..., PartitionFilters: [], PushedFilters: [], ReadSchema: struct<>

Job aborted due to stage failure: Task 0 in stage 3.0 failed 1 times, most recent failure: Lost task 0.0 in stage 3.0 (TID 5) (172.16.2.87 executor driver): org.apache.comet.CometNativeException: Expected string literal, got None.
This issue was likely caused by a bug in DataFusion's code. Please help us to resolve this by filing a bug report in our issue tracker: https://github.com/apache/datafusion/issues
	at org.apache.comet.Native.executePlan(Native Method)
	at org.apache.comet.CometExecIterator.$anonfun$getNextBatch$2(CometExecIterator.scala:150)
	at org.apache.comet.CometExecIterator.$anonfun$getNextBatch$2$adapted(CometExecIterator.scala:149)
	at org.apache.comet.vector.NativeUtil.getNextBatch(NativeUtil.scala:232)
	at org.apache.comet.CometExecIterator.$anonfun$getNextBatch$1(CometExecIterator.scala:149)
	at org.apache.comet.Tracing$.withTrace(Tracing.scala:31)
	at org.apache.comet.CometExecIterator.getNextBatch(CometExecIterator.scala:147)
	at org.apache.comet.CometExecIterator.hasNext(CometExecIterator.scala:203)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.cometcolumnartorow_nextBatch_0$(Unknown Source)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
	at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
	at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:760)
	at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
	at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
	at org.apache.spark.util.Iterators$.size(Iterators.scala:29)
	at org.apache.spark.util.Utils$.getIteratorSize(Utils.scala:1953)
	at org.apache.spark.rdd.RDD.$anonfun$count$1(RDD.scala:1269)
	at org.apache.spark.rdd.RDD.$anonfun$count$1$adapted(RDD.scala:1269)
	at org.apache.spark.SparkContext.$anonfun$runJob$5(SparkContext.scala:2303)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:92)
	at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:161)
	at org.apache.spark.scheduler.Task.run(Task.scala:139)
	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:554)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1529)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:557)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.base/java.lang.Thread.run(Thread.java:840)
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]