On 02/12/2016 09:28 AM, Bernd Schmidt wrote:
PR69714 is an issue where the bswap pass makes an incorrect
transformation on big-endian targets. The source has a 32-bit bswap, but
PA doesn't have a pattern for that. Still, we recognize that there is a
16-bit bswap involved, and generate code for that - loading the halfword
at offset 2 from the original memory, as per the proper big-endian
correction.

The problem is that we recognized the rotation of the _high_ part, which
is at offset 0 on big-endian. The symbolic number is 0x0304, rather than
0x0102 as it should be. Only the latter form should ever be matched. The
problem is caused by the patch for PR67781, which was intended to solve
a different big-endian problem. Unfortunately, I think it is based on an
incorrect analysis.

The real issue with the PR67781 testcase is in fact the masking loop,
identified by Thomas in comment #7 for 67781.

for (tmpn = n->n, rsize = 0; tmpn; tmpn >>= BITS_PER_MARKER, rsize++)
   ;
n->range = rsize;

If we have a value of 0x01020304, but a range of 5, it means that
there's an "invisible" high-order byte that we don't care about. On
little-endian, we can just ignore it. On big-endian, this implies that
the data we're interested in is located at an offset. The code that does
the replacements does not use the offset or bytepos fields, it assumes
that the bytepos always matches that of the load instruction. The only
offset we can introduce is the big-endian correction, but that assumes
we're always dealing with lowparts.

So, I think the correct/conservative fix for both bugs is to revert the
earlier change for PR67781, and then apply the following on top:

--- revert.tree-ssa-math-opts.c    2016-02-12 15:22:57.098895058 +0100
+++ tree-ssa-math-opts.c    2016-02-12 15:23:08.482228474 +0100
@@ -2473,10 +2473,14 @@ find_bswap_or_nop (gimple *stmt, struct
    /* Find real size of result (highest non-zero byte).  */
    if (n->base_addr)
      {
-      int rsize;
+      unsigned HOST_WIDE_INT rsize;
        uint64_t tmpn;

        for (tmpn = n->n, rsize = 0; tmpn; tmpn >>= BITS_PER_MARKER,
rsize++);
+      if (BYTES_BIG_ENDIAN && n->range != rsize)
+    /* This implies an offset, which is currently not handled by
+       bswap_replace.  */
+    return NULL;
        n->range = rsize;
      }

I'm attaching a full patch. John David Anglin is currently running tests
on PA which may take a while. He's confirmed it fixes his testcase, and
the earlier 67781 testcase seems to be cured as well (only looked at
generated assembly, don't have a full cross environment - Thomas, can
you verify?)
I'm also doing an x86_64 test run. Ok everywhere if it passes? I'm also
attaching a version of the testcase (LGPL, from ffmpeg) which I'd also
apply to gcc.dg/torture if David (or Thomas) can confirm it works on a
big-endian target after the fix.
Yes.  Seems like the safe thing to do here.

Jeff

Reply via email to