alamb opened a new issue, #18411:
URL: https://github.com/apache/datafusion/issues/18411

   ### Describe the bug
   
   Breaking out from https://github.com/apache/datafusion/issues/18341#issuecomment-3466350575 from @2010YOUY01
   
   There is a Discord discussion from @camuel about slow TPC-H Q1 performance: https://discord.com/channels/885562378132000778/1290751484807352412/1432863136612089959
   
   ```sql
   select
       l_returnflag,
       l_linestatus,
       count(*) as count_order
   from
       lineitem
   group by
       l_returnflag,
       l_linestatus;
   ```
   
   DuckDB is run by creating a view over the Parquet files with the command below, with threads set to 12 (which matches the actual core count on my AMD64 machine):
   ```sql
   CREATE VIEW lineitem AS
   SELECT * FROM read_parquet('...repos/datafusion/sf1000/lineitem/*.parquet');
   ```
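   
   The thread setting itself is not shown in the report. A minimal sketch of how it is typically done in DuckDB, assuming the setting was applied in the same session before running the query (this is an assumption, not taken from the Discord discussion):
   
   ```sql
   -- Assumption: cap DuckDB's worker threads at 12 to match the machine's cores
   SET threads TO 12;
   
   -- Confirm the value that is actually in effect for this session
   SELECT current_setting('threads');
   ```
   
   Pinning DuckDB to 12 threads keeps the comparison aligned with the `--partitions 12` value used in the DataFusion run.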
   
   DataFusion is run with this command:
   ```
   cargo run --profile release --bin tpch -- benchmark datafusion \
       --path ./sf1000 --partitions 12 --format parquet --query 1 --iterations=1 \
       --memory-limit 1G --debug --batch-size 8192 --prefer_hash_join true \
       --mem-pool-type fair
   ```
   
   Gist with logs: https://gist.github.com/camuel/67b4424205b81f06d657ea093ddbfe3c
   
   
   
   ### To Reproduce
   
   _No response_
   
   ### Expected behavior
   
   _No response_
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

