Hi,

On 2026-01-28 07:56:46 +0000, Pierre Ducroquet wrote:
> Here is a rebased version of the patch with a rewrite of the comment.  Thank
> you again for your previous review.  FYI, I've tried adding other passes but
> none had a similar benefit-to-cost ratio. The benefits would more likely come
> from changing from O3 to an extensive list of passes.

I agree that we should have a better list of passes. I'm a bit worried that
an explicit list of passes that we manage ourselves is going to be somewhat
of a pain to maintain across LLVM versions, but ...

WRT passes that might be worth having even with -O0: running duplicate
function merging early on could be quite useful, particularly because we won't
inline the deform routines anyway.


> > I did some benchmarks on some TPCH queries (1 and 4) and I got these
> > results. Note that for these tests I set jit_optimize_above_cost=1000000
> > so that it is forced to use the default<O0> pass with simplifycfg.

FYI, you can use -1 to just disable it, instead of having to rely on a specific
cost.
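
A minimal sketch of such a session setup, assuming the goal is to always JIT
compile but never run the expensive optimization passes. The query and the
lineitem table are placeholders, not what was actually benchmarked:

```sql
-- Sketch only: the query below is a hypothetical stand-in.
SET jit = on;
SET jit_above_cost = 0;            -- JIT compile regardless of plan cost
SET jit_optimize_above_cost = -1;  -- -1 disables the expensive optimizations

EXPLAIN (ANALYZE) SELECT count(*) FROM lineitem;
```

The "Timing:" line in the JIT section of the EXPLAIN output then reflects the
cheap pass pipeline only.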

> > 
> > Master Q1:
> > Timing: Generation 1.553 ms (Deform 0.573 ms), Inlining 0.052 ms, 
> > Optimization 95.571 ms, Emission 58.941 ms, Total 156.116 ms
> > Execution Time: 38221.318 ms
> > 
> > Patch Q1:
> > Timing: Generation 1.477 ms (Deform 0.534 ms), Inlining 0.040 ms, 
> > Optimization 95.364 ms, Emission 58.046 ms, Total 154.927 ms
> > Execution Time: 38257.797 ms
> > 
> > Master Q4:
> > Timing: Generation 0.836 ms (Deform 0.309 ms), Inlining 0.086 ms, 
> > Optimization 5.098 ms, Emission 6.963 ms, Total 12.983 ms
> > Execution Time: 19512.134 ms
> > 
> > Patch Q4:
> > Timing: Generation 0.802 ms (Deform 0.294 ms), Inlining 0.090 ms, 
> > Optimization 5.234 ms, Emission 6.521 ms, Total 12.648 ms
> > Execution Time: 16051.483 ms
> > 
> > 
> > For Q4 I see a small increase in the Optimization phase but we have a good
> > performance improvement in execution time. For Q1 the results are almost
> > the same.
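
To put the quoted numbers in relative terms, here is a quick sketch of the
arithmetic. It uses only the timings quoted above, one measurement each, so it
says nothing about run-to-run noise:

```sql
-- Relative change, patch vs. master, from the timings quoted above.
SELECT q, metric,
       round(100.0 * (patch_ms - master_ms) / master_ms, 2) AS pct_change
FROM (VALUES
        ('Q1', 'execution',    38221.318, 38257.797),
        ('Q4', 'optimization',     5.098,     5.234),
        ('Q4', 'execution',    19512.134, 16051.483)
     ) AS t(q, metric, master_ms, patch_ms);
```

That works out to roughly +0.1% for Q1 execution, +2.7% for Q4 optimization
and -17.7% for Q4 execution.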

These queries are all simple enough that I'm not sure this is a particularly
good benchmark for optimization speed. In particular, the deform routines
don't have to deal with a lot of columns and there aren't a lot of functions
(although I guess that shouldn't really matter WRT simplifycfg).


Greetings,

Andres Freund

