[Bug target/110792] [13/14 Regression] GCC 13 x86_32 miscompilation of Whirlpool hash function

2023-10-30 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110792

Sam James  changed:

   What|Removed |Added

   Keywords|needs-bisection |

--- Comment #17 from Sam James  ---
fwiw I backported this downstream a while ago and it's been in production
since, no complaints

[Bug target/110792] [13/14 Regression] GCC 13 x86_32 miscompilation of Whirlpool hash function

2023-09-15 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110792

Marek Polacek  changed:

   What|Removed |Added

 CC||mpolacek at gcc dot gnu.org

--- Comment #16 from Marek Polacek  ---
(In reply to Sam James from comment #15)
> Roger, is this one ready to backport to releases/gcc-13 so we can pick it up
> easily downstream via the branch snapshots, or do you want to let it bake on
> trunk a bit longer? Cheers.

I have the same question.

[Bug target/110792] [13/14 Regression] GCC 13 x86_32 miscompilation of Whirlpool hash function

2023-08-10 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110792

--- Comment #15 from Sam James  ---
Roger, is this one ready to backport to releases/gcc-13 so we can pick it up
easily downstream via the branch snapshots, or do you want to let it bake on
trunk a bit longer? Cheers.

[Bug target/110792] [13/14 Regression] GCC 13 x86_32 miscompilation of Whirlpool hash function

2023-08-06 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110792

--- Comment #14 from CVS Commits  ---
The master branch has been updated by Roger Sayle :

https://gcc.gnu.org/g:529909f9e92dd3b0ed0383f45a44d2b5f8a58958

commit r14-3012-g529909f9e92dd3b0ed0383f45a44d2b5f8a58958
Author: Roger Sayle 
Date:   Sun Aug 6 23:19:10 2023 +0100

[Committed] Avoid FAIL of gcc.target/i386/pr110792.c

My apologies (again), I managed to mess up the 64-bit version of the
test case for PR 110792.  Unlike the 32-bit version, the 64-bit case
contains exactly the same load instructions, just in a different order
making the correct and incorrect behaviours impossible to distinguish
with a scan-assembler-not.  Somewhere between checking that this test
failed in a clean tree without the patch, and getting the escaping
correct, I'd failed to notice that this also FAILs in the patched tree.
Doh!  Instead of removing the test completely, I've left it as a
compilation test.

The original fix is tested by the 32-bit test case.

Committed to mainline as obvious.  Sorry for the incovenience.

2023-08-06  Roger Sayle  

gcc/testsuite/ChangeLog
PR target/110792
* gcc.target/i386/pr110792.c: Remove dg-final scan-assembler-not.

[Bug target/110792] [13/14 Regression] GCC 13 x86_32 miscompilation of Whirlpool hash function

2023-08-02 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110792

--- Comment #13 from CVS Commits  ---
The master branch has been updated by Roger Sayle :

https://gcc.gnu.org/g:790c1f60a5662b16eb19eb4b81922995863c7571

commit r14-2939-g790c1f60a5662b16eb19eb4b81922995863c7571
Author: Roger Sayle 
Date:   Thu Aug 3 07:12:04 2023 +0100

PR target/110792: Early clobber issues with rot32di2_doubleword on i386.

This patch is a conservative fix for PR target/110792, a wrong-code
regression affecting doubleword rotations by BITS_PER_WORD, which
effectively swaps the highpart and lowpart words, when the source to be
rotated resides in memory. The issue is that if the register used to
hold the lowpart of the destination is mentioned in the address of
the memory operand, the current define_insn_and_split unintentionally
clobbers it before reading the highpart.

Hence, for the testcase, the incorrectly generated code looks like:

salq$4, %rdi// calculate address
movqWHIRL_S+8(%rdi), %rdi   // accidentally clobber addr
movqWHIRL_S(%rdi), %rbp // load (wrong) lowpart

Traditionally, the textbook way to fix this would be to add an
explicit early clobber to the instruction's constraints.

 (define_insn_and_split "32di2_doubleword"
- [(set (match_operand:DI 0 "register_operand" "=r,r,r")
+ [(set (match_operand:DI 0 "register_operand" "=r,r,&r")
(any_rotate:DI (match_operand:DI 1 "nonimmediate_operand" "0,r,o")
   (const_int 32)))]

but unfortunately this currently generates significantly worse code,
due to a strange choice of reloads (effectively memcpy), which ends up
looking like:

salq$4, %rdi// calculate address
movdqa  WHIRL_S(%rdi), %xmm0// load the double word in SSE reg.
movaps  %xmm0, -16(%rsp)// store the SSE reg back to the
stack
movq-8(%rsp), %rdi  // load highpart
movq-16(%rsp), %rbp // load lowpart

Note that reload's "&" doesn't distinguish between the memory being
early clobbered, vs the registers used in an addressing mode being
early clobbered.

The fix proposed in this patch is to remove the third alternative, that
allowed offsetable memory as an operand, forcing reload to place the
operand into a register before the rotation.  This results in:

salq$4, %rdi
movqWHIRL_S(%rdi), %rax
movqWHIRL_S+8(%rdi), %rdi
movq%rax, %rbp

I believe there's a more advanced solution, by swapping the order of
the loads (if first destination register is mentioned in the address),
or inserting a lea insn (if both destination registers are mentioned
in the address), but this fix is a minimal "safe" solution, that
should hopefully be suitable for backporting.

2023-08-03  Roger Sayle  

gcc/ChangeLog
PR target/110792
* config/i386/i386.md (ti3): For rotations by 64 bits
place operand in a register before gen_64ti2_doubleword.
(di3): Likewise, for rotations by 32 bits, place
operand in a register before gen_32di2_doubleword.
(32di2_doubleword): Constrain operand to be in
register.
(64ti2_doubleword): Likewise.

gcc/testsuite/ChangeLog
PR target/110792
* g++.target/i386/pr110792.C: New 32-bit C++ test case.
* gcc.target/i386/pr110792.c: New 64-bit C test case.

[Bug target/110792] [13/14 Regression] GCC 13 x86_32 miscompilation of Whirlpool hash function

2023-07-27 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110792

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|13.2|13.3

--- Comment #12 from Richard Biener  ---
GCC 13.2 is being released, retargeting bugs to GCC 13.3.

[Bug target/110792] [13/14 Regression] GCC 13 x86_32 miscompilation of Whirlpool hash function

2023-07-25 Thread roger at nextmovesoftware dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110792

Roger Sayle  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |roger at 
nextmovesoftware dot com

--- Comment #11 from Roger Sayle  ---
Mine.  Alas the obvious fix of adding an early clobber to the rotate doubleword
from memory alternative generates some truly terrible code (spills via memory
to SSE registers!?), but I've come up with a better solution, forcing reload to
read the operand from memory before the rotate, which appears to fix the
problem without the adverse performance impact.

[Bug target/110792] [13/14 Regression] GCC 13 x86_32 miscompilation of Whirlpool hash function

2023-07-24 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110792

--- Comment #10 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #9)
> Created attachment 55623 [details]
> x86_64 testcase
> 
> Here is a x86_64 testcase which shows the issue is not just with i686 but
> also with int128 and x86_64 too.

Note this started with r13-1907-g525a1a73a5a563 (rather than
r13-1362-g00193676a5a3e7 ) But both patterns have the same issue and should be
both fixed rather than just one or the other.

[Bug target/110792] [13/14 Regression] GCC 13 x86_32 miscompilation of Whirlpool hash function

2023-07-24 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110792

--- Comment #9 from Andrew Pinski  ---
Created attachment 55623
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55623&action=edit
x86_64 testcase

Here is a x86_64 testcase which shows the issue is not just with i686 but also
with int128 and x86_64 too.

[Bug target/110792] [13/14 Regression] GCC 13 x86_32 miscompilation of Whirlpool hash function

2023-07-24 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110792

--- Comment #8 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #7)
> Created attachment 55622 [details]
> Reduced C testcase
> 
> This reduces the previous testcase and forces the overlap ...

Oh and changed it to be able to compile with the C front-end and not just the
C++ front-end.

[Bug target/110792] [13/14 Regression] GCC 13 x86_32 miscompilation of Whirlpool hash function

2023-07-24 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110792

Andrew Pinski  changed:

   What|Removed |Added

  Attachment #55620|0   |1
is obsolete||

--- Comment #7 from Andrew Pinski  ---
Created attachment 55622
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55622&action=edit
Reduced C testcase

This reduces the previous testcase and forces the overlap ...

[Bug target/110792] [13/14 Regression] GCC 13 x86_32 miscompilation of Whirlpool hash function

2023-07-24 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110792

Andrew Pinski  changed:

   What|Removed |Added

 CC||roger at nextmovesoftware dot 
com
   See Also|https://gcc.gnu.org/bugzill |
   |a/show_bug.cgi?id=109109|

--- Comment #6 from Andrew Pinski  ---
I am almost positive it started with r13-1362-g00193676a5a3e7 which forgot to
check to see the memory operand overlaps (refers to) with the destination
register.

[Bug target/110792] [13/14 Regression] GCC 13 x86_32 miscompilation of Whirlpool hash function

2023-07-24 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110792

--- Comment #5 from Andrew Pinski  ---
When this is split:
(insn 12 11 18 2 (set (reg:DI 0 ax [87])
(rotate:DI (mem:DI (plus:SI (mult:SI (reg:SI 0 ax [orig:89 x0 ] [89])
(const_int 8 [0x8]))
(symbol_ref:SI ("WHIRL_S") [flags 0x2]  )) [1 WHIRL_S[_1]+0 S8 A64])
(const_int 32 [0x20]))) "/app/example.cpp":4:75 971
{rotl32di2_doubleword}
 (nil))

It gets split incorrectly into:
(insn 21 11 22 2 (set (reg:SI 0 ax [87])
(mem:SI (plus:SI (mult:SI (reg:SI 0 ax [orig:89 x0 ] [89])
(const_int 8 [0x8]))
(const:SI (plus:SI (symbol_ref:SI ("WHIRL_S") [flags 0x2] 
)
(const_int 4 [0x4] [1 WHIRL_S[_1]+4 S4 A32]))
"/app/example.cpp":4:75 91 {*movsi_internal}
 (nil))
(insn 22 21 18 2 (set (reg:SI 1 dx [+4 ])
(mem:SI (plus:SI (mult:SI (reg:SI 0 ax [orig:89 x0 ] [89])
(const_int 8 [0x8]))
(symbol_ref:SI ("WHIRL_S") [flags 0x2]  )) [1 WHIRL_S[_1]+0 S4 A64])) "/app/example.cpp":4:75 91
{*movsi_internal}
 (nil))

Basically ax set before its use in the memory ...

[Bug target/110792] [13/14 Regression] GCC 13 x86_32 miscompilation of Whirlpool hash function

2023-07-24 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110792

--- Comment #4 from Andrew Pinski  ---
Created attachment 55621
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55621&action=edit
almost reduced all the way

[Bug target/110792] [13/14 Regression] GCC 13 x86_32 miscompilation of Whirlpool hash function

2023-07-24 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110792

--- Comment #3 from Andrew Pinski  ---
Created attachment 55620
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55620&action=edit
semi-reduced testcase

Here is a reduced testcase which also fails on the trunk.