alamb opened a new issue, #18411: URL: https://github.com/apache/datafusion/issues/18411
### Describe the bug

Breaking out from https://github.com/apache/datafusion/issues/18341#issuecomment-3466350575 from @2010YOUY01

There is a discord discussion for slow tpch q1 https://discord.com/channels/885562378132000778/1290751484807352412/1432863136612089959 from @camuel

```sql
select l_returnflag, l_linestatus, count(*) as count_order
from lineitem
group by l_returnflag, l_linestatus;
```

DuckDB is run by creating views over parquet with this command and also threads are set to 12 (which matches actual cores on my AMD64 machine):

```sql
CREATE VIEW lineitem AS SELECT * FROM read_parquet('...repos/datafusion/sf1000/lineitem/*.parquet');
```

DataFusion is run by running this command:

```
cargo run --profile release --bin tpch -- benchmark datafusion --path ./sf1000 --partitions 12 --format parquet --query 1 --iterations=1 --memory-limit 1G --debug --batch-size 8192 --prefer_hash_join true --mem-pool-type fair
```

gist with logs: https://gist.github.com/camuel/67b4424205b81f06d657ea093ddbfe3c

### To Reproduce

_No response_

### Expected behavior

_No response_

### Additional context

_No response_
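
For readers who want to check what the query above returns before comparing engines, here is a minimal sketch that runs it against a tiny in-memory table. It is only an illustration under stated assumptions: SQLite (Python's standard `sqlite3` module) stands in for DuckDB and DataFusion, the sample rows and the `run_query` helper are hypothetical, and an `order by` is added so the output is deterministic. It says nothing about the performance gap reported here.

```python
import sqlite3

# Hypothetical sample rows (l_returnflag, l_linestatus); not taken from the sf1000 data.
SAMPLE_ROWS = [("A", "F"), ("N", "O"), ("R", "F"), ("A", "F"), ("N", "O"), ("N", "F")]

# The query from the issue, plus an ORDER BY (an addition) for stable output.
QUERY = """
select l_returnflag, l_linestatus, count(*) as count_order
from lineitem
group by l_returnflag, l_linestatus
order by l_returnflag, l_linestatus
"""


def run_query(rows):
    """Load rows into an in-memory table named lineitem and run the grouped count."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("create table lineitem (l_returnflag text, l_linestatus text)")
        conn.executemany("insert into lineitem values (?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in run_query(SAMPLE_ROWS):
        print(row)
```

Each result row is one distinct `(l_returnflag, l_linestatus)` pair with its row count, so the result set stays very small regardless of how large `lineitem` is.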
