https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110108

            Bug ID: 110108
           Summary: [14 Regression] Wrong code from combining
                    VPABSB/VPBLENDVB since 1ede03e2d0437ea9c2f7
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: benjsith at gmail dot com
  Target Milestone: ---

Created attachment 55249
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55249&action=edit
A compressed preprocessed minimal repro of the issue

The following code is a minimal repro showing the issue:

#include <immintrin.h>
__m128i do_stuff_128(__m128i X0, __m128i X1) {
        __m128i AbsX0 = _mm_abs_epi8(X0);
        __m128i Result = _mm_blendv_epi8(AbsX0, X1, AbsX0);
        return Result;
}

A preprocessed version of the minimal repro is attached as well.

On GCC 13.1, when compiled with `gcc -O1 -mavx2`, it produces this assembly:
        vpabsb  xmm0, xmm0
        vpblendvb       xmm0, xmm0, xmm1, xmm0
        ret

However, on trunk it compiles to:
        vpabsb  xmm0, xmm0
        ret

Godbolt link showing more details, and the difference in execution:
https://godbolt.org/z/eWszWPva4

What appears to be happening is that the blend gets removed: VPBLENDVB selects
on the high bit of each mask byte, and the compiler assumes that bit is always
zero after the abs. However, from reading the spec, VPABSB reads its input as
signed bytes but produces unsigned bytes. If an input byte is 0x80 (-128), the
absolute value is 128, which as an unsigned byte is also 0x80. In that case the
high bit is set, and the blend may take some bytes from the second operand.
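
To make the 0x80 case concrete, here is a minimal scalar sketch of a single
byte lane of each instruction. It only models the per-byte behavior described
above; it is not how GCC or the hardware implements anything, and the helper
names pabsb_lane, pblendvb_lane and do_stuff_lane are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical scalar model of one byte lane of VPABSB: two's-complement
   absolute value, with the result kept as a raw 8-bit pattern. */
static uint8_t pabsb_lane(uint8_t x) {
    /* If the sign bit is set, negate modulo 256; 0x80 maps back to 0x80. */
    return (x & 0x80) ? (uint8_t)(0u - x) : x;
}

/* Hypothetical scalar model of one byte lane of VPBLENDVB: take b when the
   high bit of the mask byte is set, otherwise take a. */
static uint8_t pblendvb_lane(uint8_t a, uint8_t b, uint8_t mask) {
    return (mask & 0x80) ? b : a;
}

/* One lane of do_stuff_128 from the repro: the abs result is used both as
   the first blend operand and as the blend mask. */
static uint8_t do_stuff_lane(uint8_t x0, uint8_t x1) {
    uint8_t abs_x0 = pabsb_lane(x0);
    return pblendvb_lane(abs_x0, x1, abs_x0);
}
```

Under this model, do_stuff_lane(0x80, 0x42) returns 0x42 (the byte from X1),
while every other X0 byte returns its absolute value. Dropping the blend is
therefore only wrong for lanes where X0 is 0x80.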

This appears to have been introduced recently, in the last couple of weeks. A
bisect shows that this started happening with commit
1ede03e2d0437ea9c2f7453fcbe263505b4e0def. However, that commit seems like it
might just be hooking up existing functionality, so there may be another root
cause.

I have confirmed this still repros on the latest trunk,
dec7aaabe9651cb075ace60721b6e36864cc5140.

For triage/priority purposes: this issue was not from code I manually wrote,
but was found by a fuzzer meant to test SIMD codegen.
