On 12/7/23 03:39, Jakub Jelinek wrote:
On Thu, Dec 07, 2023 at 09:36:22AM +0100, Jakub Jelinek wrote:
So, one way to fix the LRA issue would be just to use
lra_insn_recog_data_len = index * 3U / 2;
if (lra_insn_recog_data_len <= index)
lra_insn_recog_data_len = index + 1;
basically do what vec.cc does. I thought we can do better for
both vec.cc and LRA on 64-bit hosts even without growing the allocated
counters, but now that I look at it again, perhaps we can't.
The above overflows already with original alloc or lra_insn_recog_data_len
0x55555556, where 0x5555555 * 3U / 2 is still 0x7fffffff
and so representable in the 32-bit, but 0x55555556 * 3U / 2 is
1. I thought (and the patch implements it) that we could use
alloc * (size_t) 3 / 2 so that on 64-bit hosts it wouldn't overflow
that quickly, but 0x55555556 * (size_t) 3 / 2 there is 0x80000001
which is still ok in unsigned, but given that vec.h then stores the
counter into unsigned m_alloc:31; bit-field, it is too much.
The patch below is what I've actually bootstrapped/regtested on
x86_64-linux and i686-linux, but given the above I think I should drop
the vec.cc hunk and change (size_t) 3 in the LRA hunk to 3U.
Here is so far untested adjusted patch, which does the computation
just in unsigned int rather than size_t, because doing it in size_t
wouldn't improve things.
2023-12-07 Jakub Jelinek <ja...@redhat.com>
PR middle-end/112411
* params.opt (-param=min-nondebug-insn-uid=): Add
IntegerRange(0, 1073741824).
* lra.cc (check_and_expand_insn_recog_data): Use 3U rather than 3
in * 3 / 2 computation and if the result is smaller or equal to
index, use index + 1.
* gcc.dg/params/blocksort-part.c: Add dg-skip-if for
--param min-nondebug-insn-uid=1073741824.
Jakub, if you are still waiting for an approval, LRA change is ok for
me with given max param.
Thank you for fixing this.