[Bug tree-optimization/110381] [11/12/13 Regression] double counting for sum of structs of floating point types

clyon at gcc dot gnu.org via Gcc-bugs Fri, 30 Jun 2023 07:05:26 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110381


--- Comment #15 from Christophe Lyon <clyon at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #14)
> (In reply to Christophe Lyon from comment #12)
> > The new testcase (gcc.dg/vect/pr110381.c) fails:
> > FAIL: gcc.dg/vect/pr110381.c -flto -ffat-lto-objects execution test
> > FAIL: gcc.dg/vect/pr110381.c execution test
> > 
> > on arm-linux-gnueabihf configured with --with-float=hard
> > --with-fpu=neon-fp-armv8 --with-mode=thumb --with-arch=armv8-a
> 
> Can you check if it works now?  I've added a missing check_vect () call in
> case the harness passes in command-line options that your HW doesn't
> support.  Otherwise I'd appreciate command-line options to reproduce.

It still fails (check_vect() passes on my config, so there's no change).

Here is what sum_8_foos looks like:
sum_8_foos:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        vmov.i64        d0, #0  @ float
        add     r3, r0, #192
.L10:
        vldr.64 d16, [r0, #8]
        adds    r0, r0, #24
        vldr.64 d18, [r0, #-24]
        vldr.64 d17, [r0, #-8]
        cmp     r3, r0
        vadd.f64        d16, d16, d18
        vadd.f64        d16, d16, d17
        vadd.f64        d0, d0, d16
        bne     .L10
        bx      lr

so we load:
d16 = 5
d17 = -__DBL_MAX__
d18 = __DBL_MAX__
The first addition (d16 + d18) makes d16 = __DBL_MAX__,
and the second one (d16 + d17) makes d16 = 0.

> I cannot get anything to vectorize with a cc1 cross using
> 
> > ./cc1 -quiet t.c -O2 -ftree-vectorize -fno-vect-cost-model -fopt-info-vec 
> > -I include tri
> 
> but I have a cross configured with --with-float=hard --with-cpu=cortex-a9
> --with-fpu=neon-fp16

Not sure what is happening. I tried my native compiler with the above flags,
and I get the same code.
I also tried building my native compiler with the same configure flags, and
it generates the same code too.

> 
> I hope the FPU is compliant enough to compute __DBL_MAX__ + -__DBL_MAX__ +
> 5. to 5.

[Bug tree-optimization/110381] [11/12/13 Regression] double counting for sum of structs of floating point types
