https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79487

Bug ID: 79487
Summary: Invalid _Decimal32 comparison on s390x
Product: gcc
Version: 7.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: vogt at linux dot vnet.ibm.com
CC: jakub at gcc dot gnu.org, krebbel at gcc dot gnu.org
Target Milestone: ---
Host: s390x
Target: s390x

This is a finding from an Asan test case failure reported here:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

It may be a bug in the middle end or the backend. The failing program is:

--
extern _Decimal32 bar(_Decimal32 x);

void foo(void)
{
  int i;
#if 0 /*!!!*/
  volatile
#endif
  _Decimal32 min = (-0x7fffffffffffffffL - 1L);
  volatile _Decimal32 tem = min;

  for (i = 0; i < 999999; i++)
    {
      tem -= (_Decimal32)1.0;
      if (min != tem)
        {
          bar(tem);
          break;
        }
    }
}
--

When compiled with -O3 -march=zEC12 and with the "volatile" disabled, the
comparison "min != tem" is true in the very first pass of the loop, but bar()
is then called with a value that is identical to "(_Decimal32)min".

One cause of this is that although s390x supports 32-bit decimal floating
point in hardware, there are no instructions to do arithmetic or comparisons
on such values. The 32-bit values need to be converted to 64-bit format for
comparisons.

GCC pre-calculates the constant (-0x7fffffffffffffffL - 1L) and puts it into
the literal pool as a 64-bit quantity. At the same time, "tem" is kept in
memory as a 32-bit quantity, loaded into a register, extended to 64 bits and
then compared to the value from the literal pool. Since the latter was not
rounded to _Decimal32 precision, the comparison is always true. Making "min"
volatile circumvents the problem.
This is a (slightly shortened) diff of the assembly code with the "volatile"
enabled (-, good) and disabled (+, bad):

--
 foo:
 	ldgr	%f4,%r15
 	larl	%r5,.L8
 	lay	%r15,-168(%r15)
 	le	%f0,.L9-.L8(%r5)
-	ste	%f0,160(%r15)
+	ste	%f0,164(%r15)
 	iilf	%r1,999999
-	l	%r2,160(%r15)
-	st	%r2,164(%r15)
 .L3:
 	le	%f0,164(%r15)
 	ldetr	%f0,%f0,0		<-- extend tem to 64 bits
 	ld	%f2,.L10-.L8(%r5)
 	sdtr	%f0,%f0,%f2		<-- subtract 1 from tem
+	ld	%f2,.L11-.L8(%r5)	<-- min: 64-bit from literal pool
 	ledtr	%f0,0,%f0,0		<-- round tem to 32 bits
 	ste	%f0,164(%r15)		<-- store a copy to stack
-	le	%f2,160(%r15)		<-- min: 32-bit from stack
 	le	%f0,164(%r15)		<-- load tem from stack (32 bits)
-	ldetr	%f2,%f2,0		<-- min: extend to 64 bits
 	ldetr	%f0,%f0,0		<-- tem: extend to 64 bits
-	cdtr	%f2,%f0		<-- compare min and tem (64 bits)
+	cdtr	%f0,%f2
 	jne	.L7
 	brct	%r1,.L3
 	lgdr	%r15,%f4
 	br	%r14
 .L7:
 	le	%f0,164(%r15)
 	lgdr	%r15,%f4
 	jg	bar
 ...
 .L8:
+.L11:
+	.long	-297458820	<--- (-0x7fffffffffffffffL - 1L) as 64-bit value
+	.long	-2090241034	<-/
 .L10:
 	.long	573833216
 	.long	16
 .L9:
 	.long	-283865614	<--- (-0x7fffffffffffffffL - 1L) as 32-bit value
--

Somewhere, GCC is using 64-bit precision for _Decimal32 where it should not.
Note that this does not happen on Power, which has similar DFP instructions:
there, GCC does not store the constant with 64-bit precision in the literal
pool.