logan-keede commented on issue #17261:
URL: https://github.com/apache/datafusion/issues/17261#issuecomment-3505310653

   After some testing to narrow down why the benchmark and datafusion-cli 
differ, I found that the cause is that `UInt64` somehow does not get 
optimised by the `eliminate_nested_union` optimiser (which is used in the 
benchmarks); after changing the type to `Int64` it seems to work fine.
   The change reduces the time by about half for the 50-column and 100-column 
scenarios.
   
   ```
   Int64
   Benchmarking physical_sorted_union_order_by_50: Collecting 100 samples in estimated 11.081 s (physical_sorted_union_order_by_50
                           time:   [105.39 ms 105.57 ms 105.78 ms]
                           change: [−1.4254% −0.8674% −0.3407%] (p = 0.00 < 0.05)
                           Change within noise threshold.
   Found 5 outliers among 100 measurements (5.00%)
     3 (3.00%) high mild
     2 (2.00%) high severe
   
   UInt64
   Benchmarking physical_sorted_union_order_by_50: Warming up for 3.0000 s
   Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 25.7s, or reduce sample count to 10.
   Benchmarking physical_sorted_union_order_by_50: Collecting 100 samples in estimated 25.740 s (physical_sorted_union_order_by_50
                           time:   [252.76 ms 254.23 ms 255.73 ms]
                           change: [+139.23% +140.81% +142.27%] (p = 0.00 < 0.05)
                           Performance has regressed.
   Found 2 outliers among 100 measurements (2.00%)
     1 (1.00%) high mild
     1 (1.00%) high severe
   ```
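
   As a rough sanity check on the "about half" figure, the two median times 
from the output above (105.57 ms for `Int64`, 254.23 ms for `UInt64`) can be 
compared directly. This is only arithmetic on those two numbers, not a 
DataFusion API; the function names `slowdown` and `time_saved_pct` are made up 
for illustration.

   ```rust
   /// Ratio of the slower median to the faster one (hypothetical helper).
   fn slowdown(slow_ms: f64, fast_ms: f64) -> f64 {
       slow_ms / fast_ms
   }

   /// Share of the slower run's time that the faster run saves, in percent
   /// (hypothetical helper).
   fn time_saved_pct(slow_ms: f64, fast_ms: f64) -> f64 {
       (1.0 - fast_ms / slow_ms) * 100.0
   }

   fn main() {
       // Median times copied from the criterion output above.
       let int64_ms = 105.57;
       let uint64_ms = 254.23;

       println!("UInt64 vs Int64 slowdown: {:.2}x", slowdown(uint64_ms, int64_ms));
       println!("time saved with Int64: {:.1}%", time_saved_pct(uint64_ms, int64_ms));
   }
   ```

   The ratio comes out at roughly 2.4x, which lines up with the +140.81% 
change criterion reports for the `UInt64` run, so the saving is a bit more 
than half.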


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

