https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122569

--- Comment #5 from ptomsich at gcc dot gnu.org ---
The root cause is in the CLZ table validator in simplify_count_zeroes
(gcc/tree-ssa-forwprop.cc): it invokes undefined behaviour when validating
64-bit DeBruijn CLZ tables.

The checkfn computes (1 << (data + 1)) - 1 to simulate the value produced by
the OR-cascade b |= b >> 1; ... b |= b >> 32 — i.e. the value with all bits
from position 0 up to the original MSB set. When the input type is 64-bit and
the table entry is 63 (the MSB position, which is the bit position reached when
the OR-cascade input has bit 63 set), data + 1 equals HOST_BITS_PER_WIDE_INT,
so the shift HOST_WIDE_INT_1U << 64 is UB.
In practice the whole expression evaluates to 0 on the host, turning the check
(0 * magic) >> 58 == 63 into false. The bad entry drops the match count from
bits + 1 to exactly bits, and check_table_array returns false because it
requires strictly more than bits matches.
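
For illustration only (this is not the actual patch): one way to build the
"all bits from 0 up to the MSB" mask without ever shifting by the full type
width is to shift an all-ones value right instead of shifting 1 left. The
helper name low_bits_mask is made up for this sketch, and it assumes a 64-bit
unsigned type and 1 <= nbits <= 64.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not GCC code: return a value with the low
   NBITS bits set, for 1 <= NBITS <= 64.  */
static uint64_t
low_bits_mask (unsigned nbits)
{
  assert (nbits >= 1 && nbits <= 64);
  /* Shift all-ones right rather than 1 left: the shift count is
     64 - NBITS, which always stays in the range 0..63.  */
  return UINT64_MAX >> (64 - nbits);
}
```

For a table entry data, the mask would be low_bits_mask (data + 1); with
data == 63 this yields all-ones rather than relying on a shift by 64.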

Every valid 64-bit DeBruijn CLZ table has an entry mapping the all-ones value
to bit 63, so this UB rejects every well-formed 64-bit DeBruijn CLZ table —
including the magic 0x03f79d71b4cb0a89 found in Stockfish's msb(), zstd's
lib/common/bits.h, and CPython's Include/internal/pycore_bitutils.h.
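
A small standalone sketch of the slot computation for the magic quoted above
may help to see why a mask of 0 cannot validate the entry 63. It is not the
GCC validator; debruijn_slot and count_distinct_slots are names invented here,
and the shift by 58 assumes a 64-entry table indexed by the top 6 bits of the
product.

```c
#include <assert.h>
#include <stdint.h>

/* Magic constant quoted above.  */
#define DEBRUIJN64_MAGIC UINT64_C (0x03f79d71b4cb0a89)

/* Table slot selected for a value that has already been through the
   OR-cascade (all bits from 0 up to the MSB set).  The multiply wraps
   modulo 2^64, which is well defined for unsigned types.  */
static unsigned
debruijn_slot (uint64_t folded)
{
  unsigned slot = (unsigned) ((folded * DEBRUIJN64_MAGIC) >> 58);
  assert (slot < 64);	/* A 6-bit result always fits the table.  */
  return slot;
}

/* Hypothetical check: count how many distinct slots the 64 possible
   MSB positions select.  */
static unsigned
count_distinct_slots (void)
{
  uint64_t seen = 0;
  unsigned count = 0;
  for (unsigned msb = 0; msb < 64; msb++)
    {
      /* Bits 0..msb set; the shift count 63 - msb stays in 0..63.  */
      uint64_t folded = UINT64_MAX >> (63 - msb);
      uint64_t bit = UINT64_C (1) << debruijn_slot (folded);
      if (!(seen & bit))
	{
	  seen |= bit;
	  count++;
	}
    }
  return count;
}
```

With a correctly computed mask, debruijn_slot (UINT64_MAX) selects slot 63,
whereas a mask of 0 selects slot 0, so the comparison against 63 fails.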

The 32-bit CLZ path and the CTZ path are not affected.
