https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110206

--- Comment #12 from Richard Biener <rguenth at gcc dot gnu.org> ---
I can see cprop1 adds the REG_EQUAL note:

(insn 22 21 23 4 (set (reg:V8HI 100)
        (zero_extend:V8HI (vec_select:V8QI (subreg:V16QI (reg:V4QI 98) 0)
                (parallel [
                        (const_int 0 [0])
                        (const_int 1 [0x1])
                        (const_int 2 [0x2])
                        (const_int 3 [0x3])
                        (const_int 4 [0x4])
                        (const_int 5 [0x5])
                        (const_int 6 [0x6])
                        (const_int 7 [0x7])
                    ])))) "t.c":12:42 7557 {sse4_1_zero_extendv8qiv8hi2}
-     (expr_list:REG_DEAD (reg:V4QI 98)
-        (nil)))
+     (expr_list:REG_EQUAL (const_vector:V8HI [
+                (const_int 204 [0xcc]) repeated x8
+            ])
+        (expr_list:REG_DEAD (reg:V4QI 98)
+            (nil))))

but I don't yet see which wrong transform is actually based on this REG_EQUAL
note.

It looks like we CSE the above with

-   46: r122:V8QI=[`*.LC3']
-      REG_EQUAL const_vector
-   48: r125:V8HI=zero_extend(vec_select(r122:V8QI#0,parallel))
-      REG_EQUAL const_vector
-      REG_DEAD r122:V8QI
-   49: r126:V8HI=r124:V8HI*r125:V8HI
-      REG_DEAD r125:V8HI
+   49: r126:V8HI=r124:V8HI*r100:V8HI

but otherwise do nothing.  So the issue is that we rely on the "undefined"
values having a specific value (from the earlier REG_EQUAL note), but actual
code generation doesn't ensure this (it doesn't need to).  That said,
the issue isn't the constant folding per se; it is that we do not actually
constant fold but instead register an equality that doesn't hold.
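
To make the point concrete, here is a minimal C model of insn 22. It is only
a sketch, not GCC code: the helper names (paradoxical_subreg,
zero_extend_low8, lanes_matching_note) and the "junk" parameter are
hypothetical.  It assumes that only the low four bytes of the V16QI view of
r98 are defined and that the remaining bytes hold whatever the hardware
register happens to contain.

```c
#include <stdint.h>
#include <string.h>

/* Model of (subreg:V16QI (reg:V4QI 98) 0): only bytes 0..3 come from the
   V4QI value; bytes 4..15 are undefined.  `junk` stands in for whatever
   the hardware register happens to hold there (an assumption of this
   sketch, not something GCC specifies).  */
static void paradoxical_subreg(const uint8_t v4[4], uint8_t junk,
                               uint8_t reg[16])
{
    memset(reg, junk, 16);
    memcpy(reg, v4, 4);
}

/* Model of zero_extend:V8HI (vec_select:V8QI ... [0..7]): widen the low
   eight bytes to eight 16-bit lanes.  */
static void zero_extend_low8(const uint8_t reg[16], uint16_t out[8])
{
    for (int i = 0; i < 8; i++)
        out[i] = reg[i];
}

/* Count how many of the eight result lanes agree with the REG_EQUAL note,
   which claims that every lane is 204 (0xcc).  */
static int lanes_matching_note(uint8_t junk)
{
    const uint8_t v4[4] = { 204, 204, 204, 204 };
    uint8_t reg[16];
    uint16_t out[8];
    int matches = 0;

    paradoxical_subreg(v4, junk, reg);
    zero_extend_low8(reg, out);
    for (int i = 0; i < 8; i++)
        matches += (out[i] == 204);
    return matches;
}
```

In this model lanes_matching_note(0) counts only the four defined lanes as
matching; all eight match only if the undefined bytes happen to hold 0xcc.
Replacing r125, whose eight lanes are all genuinely defined, with r100 is
therefore only valid by accident.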
