andygrove commented on issue #182:
URL: https://github.com/apache/datafusion-comet/issues/182#issuecomment-2113077753
My repro:
Using the latest commit from main (`1a04805be5e0f3a634521a821b24c0e0efb43d31`), I
ran `make release`.
Started Spark shell with:
```shell
$SPARK_HOME/bin/spark-shell \
--jars spark/target/comet-spark-spark3.4_2.12-0.1.0-SNAPSHOT.jar \
--conf spark.sql.extensions=org.apache.comet.CometSparkSessionExtensions \
--conf spark.comet.enabled=true \
--conf spark.comet.exec.enabled=true \
--conf spark.comet.exec.all.enabled=true \
--conf spark.comet.explainFallback.enabled=true
```
Ran this code:
```scala
val tables = Seq("customer", "lineitem", "nation", "orders", "part",
  "partsupp", "region", "supplier")
tables.foreach(t =>
  spark.read.parquet(s"/Users/andy/Data/sf100-parquet/${t}.parquet").createOrReplaceTempView(t))
val sql = scala.io.Source.fromFile("/Users/andy/git/datafusion-contrib/sqlbench-h/queries/sf=100/q2.sql").mkString
spark.time(spark.sql(sql).collect)
```
The Parquet files were produced by DataFusion. I have been benchmarking Comet against
these same files on my Linux desktop with no issues.
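For what it's worth, a quick way to compare what the two environments see is to dump the schema Spark infers for each table (this is just a sketch reusing the paths from the repro above, so adjust them for your setup):

```scala
// Hypothetical sanity check: print the schema Spark infers for each table,
// to compare what the macOS and Linux environments read from the same files.
val tables = Seq("customer", "lineitem", "nation", "orders", "part",
  "partsupp", "region", "supplier")
tables.foreach { t =>
  val df = spark.read.parquet(s"/Users/andy/Data/sf100-parquet/${t}.parquet")
  println(s"=== $t ===")
  df.printSchema()
}
```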