https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120941

--- Comment #37 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to Richard Biener from comment #35)
> (In reply to H.J. Lu from comment #33)
> > Created attachment 61995 [details]
> > An updated patch
> > 
> > Please try this.
> 
> Looking at the patch I do wonder about
> 
> static void
> ix86_place_single_vector_set (rtx dest, rtx src, bitmap bbs,
>                               rtx inner_scalar = nullptr)
> {
>   basic_block bb = nearest_common_dominator_for_set (CDI_DOMINATORS, bbs);
>   while (bb->loop_father->latch
>          != EXIT_BLOCK_PTR_FOR_FN (cfun))
>     bb = get_immediate_dominator (CDI_DOMINATORS,
>                                   bb->loop_father->header);
> 
> when the nearest common dominator is a BB in a loop nest like
> 
>  loop {
>    loop {
>    }
> 
>    loop {
>       BB;
>    }
>    BB';
>  }
> 
> this will skip an arbitrary number of earlier sibling loops.  I think
> if we want to do such additional hoisting at all - for a splat of a
> non-constant we have to ensure the set of the source we splat is still
> dominating the insertion point (where's that done?) - it IMO only
> makes sense (without extra costing) to hoist the set out of a perfect
> nest, thus never across earlier sibling loops.  Even for BB' this is
> likely problematic.

Since my patch works, I'd like to keep it as is.  Will it work for you?
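
For reference, the walk quoted above can be modelled outside GCC. This is only a minimal sketch, not GCC code: Block, Loop, Cfg, example_nest and hoist_path are hypothetical stand-ins for basic_block, loop_father and get_immediate_dominator, and the dominator edges assume that each inner loop is exited from its own header. The root loop (parent == -1) plays the role of the function-level loop whose latch is the exit block.

```cpp
#include <cassert>
#include <vector>

// Toy stand-ins (assumptions, not GCC types).
struct Block { int idom; int loop; };      // immediate dominator, innermost loop
struct Loop  { int header; int parent; };  // parent == -1 marks the root loop
struct Cfg   { std::vector<Block> blocks; std::vector<Loop> loops; };

// Build the nest from comment #35:
//   0 entry, 1 outer header, 2 first inner header,
//   3 second inner header, 4 BB, 5 BB'
Cfg example_nest() {
  Cfg g;
  g.blocks = {
    {-1, 0},  // 0: entry, root loop
    { 0, 1},  // 1: outer loop header
    { 1, 2},  // 2: first inner loop header
    { 2, 3},  // 3: second inner loop header
    { 3, 3},  // 4: BB, inside second inner loop
    { 3, 1},  // 5: BB', in outer loop after second inner loop
  };
  g.loops = {
    {0, -1},  // 0: root (whole function)
    {1,  0},  // 1: outer loop
    {2,  1},  // 2: first inner loop
    {3,  1},  // 3: second inner loop (sibling of loop 2)
  };
  return g;
}

// Mirror of the quoted while loop: as long as bb is inside a real loop,
// jump to the immediate dominator of that loop's header.
// Returns every block visited, starting with bb itself.
std::vector<int> hoist_path(const Cfg& g, int bb) {
  std::vector<int> path{bb};
  while (g.loops[g.blocks[bb].loop].parent != -1) {
    int header = g.loops[g.blocks[bb].loop].header;
    bb = g.blocks[header].idom;
    assert(bb >= 0);  // a real loop header always has a dominator
    path.push_back(bb);
  }
  return path;
}
```

In this model, hoist_path(example_nest(), 4) starts at BB, steps to block 2 (the header of the earlier sibling loop), then to the outer header, and ends at the entry block, so the insertion point does move past the first inner loop. Starting from BB' (block 5) goes straight from the outer loop to the entry block.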
