https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97832

--- Comment #20 from Michael_S <already5chosen at yahoo dot com> ---
(In reply to Richard Biener from comment #17)
> (In reply to Michael_S from comment #16)
> > On unrelated note, why loop overhead uses so many instructions?
> > Assuming that I am as misguided as gcc about load-op combining, I would
> > write it as:
> >   sub %rax, %rdx
> > .L3:
> >   vmovupd   (%rdx,%rax), %ymm1
> >   vmovupd 32(%rdx,%rax), %ymm0
> >   vfmadd213pd    32(%rax), %ymm3, %ymm1
> >   vfnmadd213pd     (%rax), %ymm2, %ymm0
> >   vfnmadd231pd   32(%rdx,%rax), %ymm3, %ymm0
> >   vfnmadd231pd     (%rdx,%rax), %ymm2, %ymm1
> >   vmovupd %ymm0,   (%rax)
> >   vmovupd %ymm1, 32(%rax)
> >   addq    $64, %rax
> >   decl    %esi
> >   jnz     .L3
> >   
> > The loop overhead in my variant is 3 x86 instructions==2 macro-ops,
> > vs 5 x86 instructions==4 macro-ops in gcc variant.
> > Also, in gcc variant all memory accesses have displacement that makes them
> > 1 byte longer. In my variant only half of accesses have displacement.
> > 
> > I think, in the past I had seen cases where gcc generates optimal or
> > near-optimal
> > code sequences for loop overhead. I wonder why it can not do it here.
> 
> I don't think we currently consider IVs based on the difference of two
> addresses.  

It seems to me that I have seen you do it.
But maybe I am confusing gcc with clang.
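
To make the idea concrete, here is a minimal C sketch of an IV based on the
difference of two addresses, combined with a decrement-to-zero counter. It is
a hand-written source-level analogue of the assembly quoted above, not what
gcc or clang actually emit. The function names are made up for illustration,
the kernel is simplified to y[i] -= a * x[i], and the uintptr_t arithmetic
assumes a flat address space where integer/pointer conversions round-trip.

```c
#include <stddef.h>
#include <stdint.h>

/* Baseline: both arrays are indexed by a separate counter. */
void axpy_two_ptrs(double *y, const double *x, double a, size_t n)
{
    for (size_t i = 0; i < n; i++)
        y[i] -= a * x[i];
}

/* Difference-of-addresses form: y is the only moving pointer, and x is
   reached through a fixed byte offset, mirroring (%rdx,%rax) vs (%rax).
   The offset is computed on uintptr_t (an assumption: flat address space)
   to avoid subtracting pointers into different arrays. */
void axpy_diff_iv(double *y, const double *x, double a, size_t n)
{
    uintptr_t delta = (uintptr_t)x - (uintptr_t)y;   /* like: sub %rax, %rdx */

    for (; n != 0; n--) {                            /* decrement-to-zero counter */
        const double *xp = (const double *)((uintptr_t)y + delta);
        *y -= a * *xp;
        y++;                                         /* the single address IV */
    }
}

/* Hypothetical helper: returns 1 if both forms give the same result.
   The inputs are small integers, so every intermediate value is exact. */
int axpy_forms_agree(void)
{
    double x[8], y1[8], y2[8];

    for (int i = 0; i < 8; i++) {
        x[i] = i;
        y1[i] = y2[i] = 10 * i;
    }
    axpy_two_ptrs(y1, x, 2.0, 8);
    axpy_diff_iv(y2, x, 2.0, 8);

    for (int i = 0; i < 8; i++)
        if (y1[i] != y2[i] || y1[i] != 8.0 * i)
            return 0;
    return 1;
}
```

Whether the compiler keeps this shape after IVOPTs is a separate question. The
sketch only shows that the loop needs one pointer increment and one counter
decrement per iteration.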

> The cost benefit of no displacement is only size, 

Size is pretty important in high-IPC SIMD loops, especially on Intel and when
the number of iterations is small, because Intel has a 16-byte fetch out of the
L1I cache. SIMD instructions tend to be long, and not many of them fit within
16 bytes even when memory accesses have no offsets. An offset adds insult to
injury.

> otherwise
> I have no idea why we have biased the %rax accesses by -32.  Why we
> fail to consider decrement-to-zero for the counter IV is probably because
> IVCANON would add such IV but the vectorizer replaces that and IVOPTs
> doesn't consider re-adding that.

Sorry, I have no idea what IVCANON means.
