viirya commented on code in PR #613:
URL: https://github.com/apache/datafusion-comet/pull/613#discussion_r1664503124


##########
spark/src/test/scala/org/apache/spark/sql/CometTPCDSQuerySuite.scala:
##########
@@ -158,6 +158,11 @@ class CometTPCDSQuerySuite
     conf.set(CometConf.COMET_EXEC_ALL_OPERATOR_ENABLED.key, "true")
     conf.set(CometConf.COMET_EXEC_SHUFFLE_ENABLED.key, "true")
     conf.set(CometConf.COMET_MEMORY_OVERHEAD.key, "20g")
+    conf.set(CometConf.COMET_SHUFFLE_ENFORCE_MODE_ENABLED.key, "true")
+    conf.set("spark.sql.adaptive.coalescePartitions.enabled", "true")
+    // Disable `CometTakeOrderedAndProjectExec` because it doesn't produce same output order
+    // as Spark.
+    conf.set("spark.comet.exec.takeOrderedAndProjectExec.disabled", "true")

Review Comment:
   I think these tests should be deterministic (that's why we can compare them 
with golden files). I'm not sure why `CometTakeOrderedAndProjectExec` returns 
results in a different order.
   
   The results are the same, but their order differs from Spark's. I suspect 
it is related to the sorting in `CometTakeOrderedAndProjectExec`. As the 
sorting is delegated to DataFusion's sort/top-k operators, I need to 
investigate the failing queries specifically (e.g., q6).
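   
   To check that two outputs differ only in row order, they can be compared 
as multisets. Below is a minimal sketch in plain Scala, assuming each row is 
modeled as a `Seq[Any]`. `ResultOrderCheck`, `sameRows`, and 
`differsOnlyInOrder` are hypothetical names for illustration; they are not 
part of Comet, Spark, or this test suite, and the sample rows are made up.

```scala
// Hypothetical helper (not part of Comet or Spark): compares two result
// sets as multisets so that row order is ignored but duplicates still count.
object ResultOrderCheck {
  type Row = Seq[Any]

  // Count how many times each distinct row occurs.
  private def counts(rows: Seq[Row]): Map[Row, Int] =
    rows.groupBy(identity).map { case (row, group) => row -> group.size }

  /** True when both sides contain the same rows, regardless of order. */
  def sameRows(expected: Seq[Row], actual: Seq[Row]): Boolean =
    counts(expected) == counts(actual)

  /** True when the rows match as a multiset but the sequences differ in order. */
  def differsOnlyInOrder(expected: Seq[Row], actual: Seq[Row]): Boolean =
    expected != actual && sameRows(expected, actual)

  def main(args: Array[String]): Unit = {
    // Made-up rows; the first two tie on the second column.
    val sparkRows: Seq[Row] = Seq(Seq("CA", 10), Seq("NY", 10), Seq("TX", 12))
    val cometRows: Seq[Row] = Seq(Seq("NY", 10), Seq("CA", 10), Seq("TX", 12))
    println(differsOnlyInOrder(sparkRows, cometRows))
  }
}
```

   A check like this only confirms the symptom. It does not explain why the 
order differs, which still needs the investigation described above.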
   
   It is not related to the change here, though, so I will investigate it 
separately in follow-up PRs.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

