[PR] test: remove "Comet (Scan)" cases from microbenchmarks [datafusion-comet]

via GitHub Thu, 07 May 2026 07:31:17 -0700


andygrove opened a new pull request, #4258:
URL: https://github.com/apache/datafusion-comet/pull/4258


   ## Which issue does this PR close?
   
   Closes #.
   
   ## Rationale for this change
   
   The microbenchmarks added a "Comet (Scan)" case (`COMET_ENABLED=true`, 
`COMET_EXEC_ENABLED=false`) intending to isolate scan performance from operator 
performance. With `spark.comet.scan.impl=auto` (the default), 
`CometScanRule.nativeDataFusionScan` refuses to install when exec is disabled, 
so the case actually measures `native_iceberg_compat` scan + Spark 
`ColumnarToRow`. Comparing it against the other Comet case (which uses 
`native_datafusion` + `CometNativeColumnarToRow`) makes the result a proxy for 
scan-impl choice rather than the intended scan-vs-scan+exec isolation. The 
resulting numbers mislead more than they inform. The rlike microbenchmark is a 
clear example: the Project falls back to Spark in both Comet cases, yet "Comet 
(Scan + Exec)" still shows ~3x over "Comet (Scan)" purely because of the 
upstream scan and ColumnarToRow (c2r) difference.
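
   As a rough illustration of the mismatch described above, the two Comet cases can be modeled as follows. This is a minimal sketch, not Comet code: `CometScanCases` and `effectivePipeline` are hypothetical names, and the mapping simply restates the outcome reported in this description, assuming `COMET_ENABLED=true` and `spark.comet.scan.impl=auto`.

```scala
// Hypothetical model of the two benchmark cases; not part of the Comet codebase.
object CometScanCases {
  // Outcome as reported in the rationale, assuming COMET_ENABLED=true and
  // spark.comet.scan.impl=auto.
  // Returns (scan implementation, columnar-to-row operator).
  def effectivePipeline(execEnabled: Boolean): (String, String) =
    if (execEnabled) ("native_datafusion", "CometNativeColumnarToRow")
    else ("native_iceberg_compat", "ColumnarToRow")

  def main(args: Array[String]): Unit = {
    // Case name -> value of COMET_EXEC_ENABLED used by that case.
    val cases = Seq("Comet (Scan)" -> false, "Comet (Scan + Exec)" -> true)
    for ((name, exec) <- cases) {
      val (scan, c2r) = effectivePipeline(exec)
      println(s"$name: scan=$scan, c2r=$c2r")
    }
  }
}
```

   The two cases differ in both the scan implementation and the columnar-to-row operator, so the gap between them cannot be attributed to exec alone.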
   
   ## What changes are included in this PR?
   
   - `CometBenchmarkBase.runExpressionBenchmark`: drop the `Comet (Scan)` case, 
rename `Comet (Scan + Exec)` to `Comet`, update the scaladoc.
   - `CometExecBenchmark`: drop five `SQL Parquet - Comet (Scan)` cases, rename 
`SQL Parquet - Comet (Scan, Exec)` to `SQL Parquet - Comet` (including the 
BloomFilterAgg variant). The `SQL Parquet - Spark (Scan), Comet (Exec)` case in 
the Project+Filter benchmark stays: it forces a different scan source and is a 
meaningfully different config.
   - Doc-comment fixes in `CometStringExpressionBenchmark`, 
`CometCsvExpressionBenchmark`, `CometJsonExpressionBenchmark` ("scan+exec case" 
-> "Comet case").
   
   The plan-not-fully-Comet warning logic in `runExpressionBenchmark` is 
retained: it still surfaces fallbacks (rlike, regexp_replace, etc.) and is more 
useful with the simpler case list.
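
   For readers unfamiliar with this kind of check, a plan-not-fully-Comet warning could look roughly like the sketch below. It is an illustration only and does not reproduce the logic in `runExpressionBenchmark`: `PlanNode`, `FallbackCheck`, and the operator names in the usage example are hypothetical, and identifying Comet operators by a `Comet` name prefix is an assumption.

```scala
// Toy plan tree for illustration; the real benchmark inspects Spark's executed plan.
final case class PlanNode(name: String, children: Seq[PlanNode] = Nil)

object FallbackCheck {
  // Collect operators that are not Comet operators.
  // Assumption: Comet operators can be recognized by a "Comet" name prefix.
  def nonCometOperators(plan: PlanNode): Seq[String] = {
    val here = if (plan.name.startsWith("Comet")) Seq.empty[String] else Seq(plan.name)
    here ++ plan.children.flatMap(nonCometOperators)
  }

  // Build a warning message when any operator fell back, otherwise None.
  def warnIfNotFullyComet(caseName: String, plan: PlanNode): Option[String] = {
    val fallbacks = nonCometOperators(plan)
    if (fallbacks.isEmpty) None
    else Some(s"WARNING: '$caseName' plan is not fully Comet: ${fallbacks.mkString(", ")}")
  }
}
```

   For example, a plan such as `PlanNode("Project", Seq(PlanNode("CometNativeColumnarToRow", Seq(PlanNode("CometScan")))))` would report `Project` as the only fallback, which matches the rlike situation described in the rationale.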
   
   ## How are these changes tested?
   
   These are benchmarks; the change is renaming/removing benchmark cases. 
Verified with `./mvnw test-compile` and `./mvnw scalastyle:check`. No behavior 
change to production code.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


