[Bug rtl-optimization/113495] RISC-V: Time and memory awful consumption of SPEC2017 wrf benchmark

rguenth at gcc dot gnu.org via Gcc-bugs Mon, 22 Jan 2024 04:01:03 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113495


--- Comment #29 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to rguent...@suse.de from comment #26)
> On Fri, 19 Jan 2024, juzhe.zhong at rivai dot ai wrote:
> 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113495
> > 
> > --- Comment #22 from JuzheZhong <juzhe.zhong at rivai dot ai> ---
> > (In reply to Richard Biener from comment #21)
> > > I once tried to avoid df_reorganize_refs and/or optimize this with the
> > > blocks involved but failed.
> > 
> > I am considering whether we should disable LICM for RISC-V by default
> > if vector is enabled?
> > A 10x explosion in compile time is really horrible.
> 
> I think that's a bad idea.  It only explodes for some degenerate cases.
> The best would be to fix invariant motion to keep DF up-to-date so
> it can stop using df_analyze_loop and instead analyze the whole function.
> Or maybe change it to use the rtl-ssa framework instead.
> 
> There's already param_loop_invariant_max_bbs_in_loop:
> 
>   /* Process the loops, innermost first.  */
>   for (auto loop : loops_list (cfun, LI_FROM_INNERMOST))
>     {
>       curr_loop = loop;
>       /* move_single_loop_invariants for very large loops is time
>          consuming and might need a lot of memory.  For -O1 only do
>          loop invariant motion for very small loops.  */
>       unsigned max_bbs = param_loop_invariant_max_bbs_in_loop;
>       if (optimize < 2)
>         max_bbs /= 10;
>       if (loop->num_nodes <= max_bbs)
>         move_single_loop_invariants (loop);
>     }
> 
> it might be possible to restrict invariant motion to innermost loops
> when the overall number of loops is too large (with a new param
> for that).  And when the number of innermost loops also exceeds
> the limit avoid even that?  The above also misses an
> optimize_loop_for_speed_p (loop) check (probably doesn't make
> a difference, but you could try).
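
A rough sketch of the restriction suggested in the quoted text, written
as a standalone model rather than a patch against GCC.  LoopInfo, Limits
and select_loops are hypothetical names; max_bbs stands in for
param_loop_invariant_max_bbs_in_loop, and max_loops for the new param
that does not exist yet.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a loop; this is not GCC's loop structure.
struct LoopInfo {
  unsigned num_nodes;  // number of basic blocks in the loop
  bool innermost;      // true if the loop contains no other loop
};

// Hypothetical limits.  max_bbs models the existing per-loop size limit,
// max_loops models the proposed (not yet existing) loop-count limit.
struct Limits {
  unsigned max_bbs;
  std::size_t max_loops;
};

// Return the indices of the loops that would still be processed.
std::vector<std::size_t> select_loops(const std::vector<LoopInfo>& loops,
                                      const Limits& lim) {
  assert(lim.max_loops > 0 && "a zero limit would disable the pass");

  std::size_t n_innermost = 0;
  for (const LoopInfo& l : loops)
    if (l.innermost)
      ++n_innermost;

  std::vector<std::size_t> picked;
  // Even the innermost loops alone exceed the limit: do nothing.
  if (n_innermost > lim.max_loops)
    return picked;

  // Too many loops overall: restrict to innermost loops only.
  const bool innermost_only = loops.size() > lim.max_loops;

  for (std::size_t i = 0; i < loops.size(); ++i) {
    if (innermost_only && !loops[i].innermost)
      continue;
    if (loops[i].num_nodes <= lim.max_bbs)  // existing per-loop size check
      picked.push_back(i);
  }
  return picked;
}
```

The model only decides which loops are visited; whether this actually
helps the wrf case would have to be measured.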

Ah, sorry - I was confusing LICM with invariant motion above.  Still,
invariant motion is the biggest offender (which might be due to DF
checking, if you enabled that).

As for sbitmap vs. bitmap, it's a difficult call.  When there are big
profile hits on individual bit operations (bitmap_bit_p, bitmap_set_bit)
it might pay off to use bitmap but with tree view.  There's also
sparseset, but that requires even more memory.
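
To illustrate the memory trade-off, here is a minimal sketch of the
sparse-set idea as a standalone class.  It is not GCC's sparseset.h, and
the SparseSet name and its methods are made up for this example.  It
keeps two arrays of universe size, so with 32-bit unsigned it needs 64
bits per possible element where a plain bit vector needs one.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sparse set over the universe [0, n).
class SparseSet {
  std::vector<unsigned> dense_;   // current members, in insertion order
  std::vector<unsigned> sparse_;  // sparse_[v] = position of v in dense_

public:
  explicit SparseSet(std::size_t n) : sparse_(n, 0) { dense_.reserve(n); }

  // Constant-time membership test: v is a member only if its recorded
  // position is in range and really holds v.
  bool contains(unsigned v) const {
    assert(v < sparse_.size());
    unsigned i = sparse_[v];
    return i < dense_.size() && dense_[i] == v;
  }

  void insert(unsigned v) {
    if (contains(v))
      return;
    sparse_[v] = static_cast<unsigned>(dense_.size());
    dense_.push_back(v);
  }

  // Remove by moving the last member into the freed slot.
  void remove(unsigned v) {
    if (!contains(v))
      return;
    unsigned i = sparse_[v];
    unsigned last = dense_.back();
    dense_[i] = last;
    sparse_[last] = i;
    dense_.pop_back();
  }

  // Clearing does not touch sparse_; stale entries fail the check in
  // contains().
  void clear() { dense_.clear(); }

  std::size_t size() const { return dense_.size(); }
};
```

Membership, insertion, removal and clearing are all cheap, and iterating
the members only walks dense_, which is what the extra memory pays for.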
