[Bug target/66112] __builtin_mul_overflow for int16_t emits poor code

2015-11-25 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66112

--- Comment #13 from Jakub Jelinek  ---
(In reply to ktkachov from comment #12)
> r223115 caused a code quality regression that I'm seeing on aarch64 with PR
> 68381.
> 
> We have
> int
> foo (unsigned short x, unsigned short y)
> {
>   int r;
>   if (__builtin_mul_overflow (x, y, ))
> __builtin_abort ();
>   return r;
> }
> 
> and before the commit we used to generate:
> foo:
>   uxthx0, w0
>   uxthx1, w1
>   mul x0, x0, x1
>   cmp x0, x0, sxtw
>   bne .L9
>   ret
> .L9:
>   stp x29, x30, [sp, -16]!
>   add x29, sp, 0
>   bl  abort
> 
> but after r223115 at -O3 on an aarch64 toolchain we generate:
>  foo:
>   uxthw0, w0
>   uxthw1, w1
>   stp x29, x30, [sp, -16]!
>   umull   x0, w0, w1
>   add x29, sp, 0
>   tbnzw0, #31, .L6
>   mov w2, 0
>   cbnzw2, .L6
>   ldp x29, x30, [sp], 16
>   ret
> .L6:
>   bl  abort

You really should consider adding {u,}{add,sub,mul}v4 and/or negv3
expanders if you have some better sequences you can generate over the generic
ones on some target.

[Bug target/66112] __builtin_mul_overflow for int16_t emits poor code

2015-11-25 Thread ktkachov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66112

ktkachov at gcc dot gnu.org changed:

   What|Removed |Added

 CC||ktkachov at gcc dot gnu.org

--- Comment #12 from ktkachov at gcc dot gnu.org ---
r223115 caused a code quality regression that I'm seeing on aarch64 with PR
68381.

We have
int
foo (unsigned short x, unsigned short y)
{
  int r;
  if (__builtin_mul_overflow (x, y, ))
__builtin_abort ();
  return r;
}

and before the commit we used to generate:
foo:
uxthx0, w0
uxthx1, w1
mul x0, x0, x1
cmp x0, x0, sxtw
bne .L9
ret
.L9:
stp x29, x30, [sp, -16]!
add x29, sp, 0
bl  abort

but after r223115 at -O3 on an aarch64 toolchain we generate:
 foo:
uxthw0, w0
uxthw1, w1
stp x29, x30, [sp, -16]!
umull   x0, w0, w1
add x29, sp, 0
tbnzw0, #31, .L6
mov w2, 0
cbnzw2, .L6
ldp x29, x30, [sp], 16
ret
.L6:
bl  abort

[Bug target/66112] __builtin_mul_overflow for int16_t emits poor code

2015-05-13 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66112

--- Comment #9 from Jakub Jelinek jakub at gcc dot gnu.org ---
Author: jakub
Date: Wed May 13 08:07:58 2015
New Revision: 223115

URL: https://gcc.gnu.org/viewcvs?rev=223115root=gccview=rev
Log:
PR target/66112
* internal-fn.c (get_min_precision): Use UNSIGNED instead of
SIGNED to get precision of non-negative value.

* gcc.target/i386/pr66112-1.c: New test.

Added:
trunk/gcc/testsuite/gcc.target/i386/pr66112-1.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/internal-fn.c
trunk/gcc/testsuite/ChangeLog


[Bug target/66112] __builtin_mul_overflow for int16_t emits poor code

2015-05-13 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66112

--- Comment #10 from Jakub Jelinek jakub at gcc dot gnu.org ---
Author: jakub
Date: Wed May 13 08:09:01 2015
New Revision: 223116

URL: https://gcc.gnu.org/viewcvs?rev=223116root=gccview=rev
Log:
PR target/66112
* config/i386/i386.md (mulvmode4, umulvmode4, *umulvmode4):
Use SWI248 iterator instead of SWI.
(*mulvmode4_1): Use SWI48 instead of SWI.  Simplify output template.
Use eq_attr alternative 0 instead of match_test in
length_immediate attribute computation.
(*mulvhi4, *mulvhi4_1): New define_insns.

* gcc.target/i386/pr66112-2.c: New test.

Added:
trunk/gcc/testsuite/gcc.target/i386/pr66112-2.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/i386.md
trunk/gcc/testsuite/ChangeLog


[Bug target/66112] __builtin_mul_overflow for int16_t emits poor code

2015-05-13 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66112

Jakub Jelinek jakub at gcc dot gnu.org changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #11 from Jakub Jelinek jakub at gcc dot gnu.org ---
Fixed on the trunk.  Not a regression, so probably not suitable for
backporting.


[Bug target/66112] __builtin_mul_overflow for int16_t emits poor code

2015-05-12 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66112

Jakub Jelinek jakub at gcc dot gnu.org changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek jakub at gcc dot gnu.org ---
Created attachment 35522
  -- https://gcc.gnu.org/bugzilla/attachment.cgi?id=35522action=edit
gcc5-pr66112.patch

Supposedly just using SWI248 instead of SWI48 iterator should fix this, though
not sure about all the AMD scheduling stuff.


[Bug target/66112] __builtin_mul_overflow for int16_t emits poor code

2015-05-12 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66112

--- Comment #2 from Uroš Bizjak ubizjak at gmail dot com ---
(In reply to Jakub Jelinek from comment #1)
 Created attachment 35522 [details]
 gcc5-pr66112.patch
 
 Supposedly just using SWI248 instead of SWI48 iterator should fix this,
 though not sure about all the AMD scheduling stuff.

Just copy HImode patterns, as is already case with QI, SI/DImode (and we
tolerated it). We can macroize these patterns latter, if needed at all.

I wonder, why HImode patterns were left out in the first place ...

[Bug target/66112] __builtin_mul_overflow for int16_t emits poor code

2015-05-12 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66112

Jakub Jelinek jakub at gcc dot gnu.org changed:

   What|Removed |Added

  Attachment #35522|0   |1
is obsolete||
 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2015-05-12
   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #4 from Jakub Jelinek jakub at gcc dot gnu.org ---
Created attachment 35523
  -- https://gcc.gnu.org/bugzilla/attachment.cgi?id=35523action=edit
gcc6-pr66112-1.patch

Generic fix, to improve minimum precision computation for unsigned values.


[Bug target/66112] __builtin_mul_overflow for int16_t emits poor code

2015-05-12 Thread gcc.hall at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66112

--- Comment #7 from Jeremy gcc.hall at gmail dot com ---
Comment on attachment 35522
  -- https://gcc.gnu.org/bugzilla/attachment.cgi?id=35522
gcc5-pr66112.patch

Done, PR66120


[Bug target/66112] __builtin_mul_overflow for int16_t emits poor code

2015-05-12 Thread gcc.hall at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66112

--- Comment #3 from Jeremy gcc.hall at gmail dot com ---
Related FYI,
Few instructions on ARM set the overflow flag.  Two that do are 32-bit add and
subtract.  For these two, GCC could just emit adds followed by bvs 
Instead it produces:-

bl  atoi@
add r1, r4, r0  @ tmp121, a, b
cmp r0, #0  @ b,
blt .L4 @,
cmp r1, r4  @ tmp121, a
bge .L5 @,
b   .L3 @
.L4:
cmp r1, r4  @ tmp121, a
ble .L5 @,


[Bug target/66112] __builtin_mul_overflow for int16_t emits poor code

2015-05-12 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66112

--- Comment #8 from Uroš Bizjak ubizjak at gmail dot com ---
(In reply to Jakub Jelinek from comment #5)
 Created attachment 35524 [details]
 gcc6-pr66112-2.patch
 
 And i386 mulvhi4 and umulvhi4 support.  For umulvhi4, I haven't found
 corresponding i386.md instruction that would emit mul{w}, so just changed
 SWI48 to SWI248 iterator in that case.

No, mulw outputs to %ax:%dx pair of HImode registers! Please see [1]

[1] http://x86.renejeschke.de/html/file_module_x86_id_210.html

[Bug target/66112] __builtin_mul_overflow for int16_t emits poor code

2015-05-12 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66112

--- Comment #5 from Jakub Jelinek jakub at gcc dot gnu.org ---
Created attachment 35524
  -- https://gcc.gnu.org/bugzilla/attachment.cgi?id=35524action=edit
gcc6-pr66112-2.patch

And i386 mulvhi4 and umulvhi4 support.  For umulvhi4, I haven't found
corresponding i386.md instruction that would emit mul{w}, so just changed SWI48
to SWI248 iterator in that case.


[Bug target/66112] __builtin_mul_overflow for int16_t emits poor code

2015-05-12 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66112

--- Comment #6 from Jakub Jelinek jakub at gcc dot gnu.org ---
As for ARM, please file a separate enhancement request.  The generic code has
the means for backends to provide better patterns, but arm doesn't use them
(only i386.md uses them right now), so arm maintainers should add those if
needed.