Re: [PR] chore: add TPC queries to be run by fuzzer correctness checker [datafusion-comet]

via GitHub Thu, 23 Oct 2025 08:34:58 -0700


andygrove commented on PR #2632:
URL: https://github.com/apache/datafusion-comet/pull/2632#issuecomment-3437685842


   > > I don't think that we should have a combined fuzz-testing-and-tpc-benchmark tool. They serve quite different purposes. I think it would be better to move the DataFrame comparison logic into a shared class somewhere and then update our benchmarking tool to be able to use it.
   > > This probably means that we need to convert our benchmark script from Python to Scala.
   > 
   > Another option would be to update the existing Python benchmark script to save query results to Parquet, and then implement a command-line tool for comparing the Parquet files produced from the Spark and Comet runs.
   
   I created https://github.com/apache/datafusion-comet/pull/2640 to add a new option to the benchmark script, to write query results to Parquet.
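
For illustration only, the command-line comparison tool suggested in the quoted comment could look roughly like the sketch below. It is not code from either PR. The names `load_rows`, `compare_rows`, and `main`, the order-insensitive comparison, and the float tolerance of `1e-6` are all assumptions made for this example, and reading Parquet assumes `pyarrow` is installed.

```python
import math
import sys


def load_rows(path):
    """Read a Parquet file or directory into a list of row tuples."""
    # Assumption: pyarrow is available. It is imported lazily so that the
    # comparison logic below can be used without it.
    import pyarrow.parquet as pq

    table = pq.read_table(path)
    return [tuple(row.values()) for row in table.to_pylist()]


def _sort_key(row):
    # Sort by (is-not-null, type name, string form) so that None values and
    # mixed-type columns cannot raise a TypeError during sorting.
    return tuple((v is not None, type(v).__name__, str(v)) for v in row)


def _values_match(a, b, rel_tol):
    # Floats are compared with a relative tolerance; NaN matches NaN.
    if isinstance(a, float) and isinstance(b, float):
        if math.isnan(a) and math.isnan(b):
            return True
        return math.isclose(a, b, rel_tol=rel_tol)
    return a == b


def compare_rows(expected, actual, rel_tol=1e-6):
    """Return a list of differences; an empty list means the results match."""
    if len(expected) != len(actual):
        return [f"row count differs: {len(expected)} vs {len(actual)}"]
    diffs = []
    # Row order is ignored: both sides are sorted before pairwise comparison.
    pairs = zip(sorted(expected, key=_sort_key), sorted(actual, key=_sort_key))
    for i, (e, a) in enumerate(pairs):
        same_width = len(e) == len(a)
        if not same_width or not all(
            _values_match(x, y, rel_tol) for x, y in zip(e, a)
        ):
            diffs.append(f"row {i} differs: {e} vs {a}")
    return diffs


def main(argv):
    # Hypothetical CLI: compare.py <spark_output> <comet_output>
    spark_path, comet_path = argv
    problems = compare_rows(load_rows(spark_path), load_rows(comet_path))
    for problem in problems:
        print(problem)
    return 1 if problems else 0


if __name__ == "__main__" and len(sys.argv) == 3:
    sys.exit(main(sys.argv[1:]))
```

One caveat with this sketch: because rows are sorted before being compared, two rows that differ only by a tiny floating-point amount can sort into different positions on each side and be reported as mismatches. Queries with an `ORDER BY` could skip the sort and compare rows positionally instead.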


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


