https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102785

            Bug ID: 102785
           Summary: [12 Regression] {smul,umul}_highpart changes break
                    bfin-elf
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: law at gcc dot gnu.org
  Target Milestone: ---

This change:
commit 555fa3545efe23393ff21fe0928aa3942e1b90ed (refs/bisect/bad)
Author: Roger Sayle <ro...@nextmovesoftware.com>
Date:   Thu Oct 7 15:42:09 2021 +0100

    Introduce smul_highpart and umul_highpart RTX for high-part multiplications

    This patch introduces new RTX codes to allow the RTL passes and
    backends to consistently represent high-part multiplications.
    Currently, the RTL used by different backends for expanding
    smul<mode>3_highpart and umul<mode>3_highpart varies greatly,
    with many but not all choosing to express this something like:
[ ... ]

is causing execution failures for bfin-elf for this code (taken from the
bfin/builtins tests):

extern void abort (void);

typedef short  __v2hi __attribute ((vector_size(4)));
typedef __v2hi fract2x16;
typedef short  fract16;

int main ()
{
  fract2x16 a, b, t;
  fract16 t1, t2;

  a = __builtin_bfin_compose_2x16 (0x5000, 0x3000);
  b = __builtin_bfin_compose_2x16 (0x4000, 0x2000);

  t = __builtin_bfin_dspaddsubsat (a, b);
  t1 = __builtin_bfin_extract_hi (t);
  t2 = __builtin_bfin_extract_lo (t);
  if (t1 != 0x7fff || t2 != 0x1000)
    abort ();
  return 0;
}
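
As a sanity check on the expected values, here is a minimal portable sketch
of the arithmetic the test relies on.  It assumes, based on the
"R0 = R0 +|- R1 (S)" instruction in the assembly and on the values the test
checks, that the builtin adds the high halves and subtracts the low halves
with signed 16-bit saturation.  The names sat16, model_addsub_sat and
check_test_inputs are hypothetical and are not part of GCC or the Blackfin
port.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp a 32-bit intermediate result to the signed 16-bit range. */
static int16_t sat16(int32_t v)
{
    if (v > INT16_MAX)
        return INT16_MAX;
    if (v < INT16_MIN)
        return INT16_MIN;
    return (int16_t)v;
}

/* Hypothetical model of the saturating add/subtract on a 2x16 vector:
   high halves are added, low halves are subtracted (an assumption
   inferred from the "+|- (S)" instruction, not taken from GCC).  */
static void model_addsub_sat(int16_t a_hi, int16_t a_lo,
                             int16_t b_hi, int16_t b_lo,
                             int16_t *hi, int16_t *lo)
{
    *hi = sat16((int32_t)a_hi + b_hi);
    *lo = sat16((int32_t)a_lo - b_lo);
}

/* Replays the inputs from the test case above.  */
static void check_test_inputs(void)
{
    int16_t hi, lo;

    model_addsub_sat(0x5000, 0x3000, 0x4000, 0x2000, &hi, &lo);
    assert(hi == 0x7fff);   /* 0x5000 + 0x4000 = 0x9000 saturates */
    assert(lo == 0x1000);   /* 0x3000 - 0x2000 fits without saturating */
}
```

Under that assumption the comparison in the test is false for a correctly
compiled program, so main should return 0 rather than reach abort.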


Before the change the source compiled down to something like this at -O2:
_main:
        R0.H = 20480;
        R1.H = 16384;
        R1.L = 8192;
        R0.L = 12288;
        R0 = R0 +|- R1 (S);
        R1.L = R0.H << 0;
        R1 = R1.L (X);
        R2 = 32767 (X);
        [--SP] = RETS;
        cc =R1==R2;
        SP += -12;
        if !cc jump .L2;
        R0 = R0.L (X);
        R1 = 4096 (X);
        cc =R0==R1;
        if !cc jump .L2;
        SP += 12;
        R0 = 0 (X);
        RETS = [SP++];
        rts;
.L2:
        call _abort;

What's important here is that there's real code that tests the two values
and conditionally returns with zero status or aborts.

After the [su]mul_highpart changes we get:
_main:
        [--SP] = RETS;  // 36   [c=4 l=2]  *pushsi_insn
        SP += -12;      // 37   [c=4 l=2]  addsi3/0
        call _abort;            // 23   [c=0 l=4]  *call_symbol


i.e., we unconditionally call abort.

Things start to differ in CSE1, but I haven't really tried to debug it.
Thankfully it should be possible to debug with just a cross compiler.
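
For anyone reading the CSE1 dumps, a reference model of what a high-part
multiplication computes may be handy.  This is only a sketch in plain C for
32-bit operands, assuming the usual definition (the upper half of the
double-width product); the function names are hypothetical, and it does not
reproduce the bug or say anything about how the bfin port uses the new codes.

```c
#include <stdint.h>

/* Hypothetical model of a signed high-part multiply on 32-bit operands:
   the upper 32 bits of the 64-bit signed product.  Right-shifting a
   negative value is implementation-defined in ISO C; this assumes an
   arithmetic shift, which is what GCC documents.  */
static int32_t model_smul_highpart(int32_t a, int32_t b)
{
    return (int32_t)(((int64_t)a * (int64_t)b) >> 32);
}

/* Hypothetical model of the unsigned counterpart: the upper 32 bits of
   the 64-bit unsigned product.  */
static uint32_t model_umul_highpart(uint32_t a, uint32_t b)
{
    return (uint32_t)(((uint64_t)a * (uint64_t)b) >> 32);
}
```

Note that the two differ for the same bit pattern: with all bits set in both
operands the signed model yields 0 (since -1 * -1 = 1), while the unsigned
model yields 0xfffffffe.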
