On 12/7/23 03:39, Jakub Jelinek wrote:
On Thu, Dec 07, 2023 at 09:36:22AM +0100, Jakub Jelinek wrote:
So, one way to fix the LRA issue would be just to use
   lra_insn_recog_data_len = index * 3U / 2;
   if (lra_insn_recog_data_len <= index)
     lra_insn_recog_data_len = index + 1;
basically do what vec.cc does.  I thought we can do better for
both vec.cc and LRA on 64-bit hosts even without growing the allocated
counters, but now that I look at it again, perhaps we can't.
The above overflows already with original alloc or lra_insn_recog_data_len
0x55555556, where 0x5555555 * 3U / 2 is still 0x7fffffff
and so representable in the 32-bit, but 0x55555556 * 3U / 2 is
1.  I thought (and the patch implements it) that we could use
alloc * (size_t) 3 / 2 so that on 64-bit hosts it wouldn't overflow
that quickly, but 0x55555556 * (size_t) 3 / 2 there is 0x80000001
which is still ok in unsigned, but given that vec.h then stores the
counter into unsigned m_alloc:31; bit-field, it is too much.

The patch below is what I've actually bootstrapped/regtested on
x86_64-linux and i686-linux, but given the above I think I should drop
the vec.cc hunk and change (size_t) 3 in the LRA hunk to 3U.
Here is so far untested adjusted patch, which does the computation
just in unsigned int rather than size_t, because doing it in size_t
wouldn't improve things.

2023-12-07  Jakub Jelinek  <ja...@redhat.com>

        PR middle-end/112411
        * params.opt (-param=min-nondebug-insn-uid=): Add
        IntegerRange(0, 1073741824).
        * lra.cc (check_and_expand_insn_recog_data): Use 3U rather than 3
        in * 3 / 2 computation and if the result is smaller or equal to
        index, use index + 1.

        * gcc.dg/params/blocksort-part.c: Add dg-skip-if for
        --param min-nondebug-insn-uid=1073741824.

Jakub, if you are still waiting for an approval,  LRA change is ok for me with given max param.

Thank you for fixing this.



Reply via email to