This patch fixes an ICE in smallest_mode_for_size on the attached testcase. The smallest_mode_for_size call comes from this part of the reload inheritance code in choose_reload_regs:
if (byte == 0) need_mode = mode; else need_mode = smallest_mode_for_size (GET_MODE_BITSIZE (mode) + byte * BITS_PER_UNIT, GET_MODE_CLASS (mode) == MODE_PARTIAL_INT ? MODE_INT : GET_MODE_CLASS (mode)); if ((GET_MODE_SIZE (GET_MODE (last_reg)) >= GET_MODE_SIZE (need_mode)) Here we have found that the pseudo register we need was last reloaded into LAST_REG. The mode size check is making sure LAST_REG defines every byte of the value we need (which is at byte offset BYTE and has mode MODE). In the attached testcase, LAST_REG is XImode (a 256-bit integer), and the value we need is the last vector quarter of it. BYTE is 24 and MODE is V4SF. The problem is that we then look for a 256-bit vector: smallest_mode_for_size (64 + 24 * 8, MODE_VECTOR_FLOAT) but no such mode exists. Note that this is the only use of need_mode. I don't believe the mode that is being calculated here is fundamental in any way, or that it's used later in the reload process. We have already checked that the mode change is allowed: #ifdef CANNOT_CHANGE_MODE_CLASS /* Verify that the register it's in can be used in mode MODE. */ && !REG_CANNOT_CHANGE_MODE_P (REGNO (reg_last_reload_reg[regno]), GET_MODE (reg_last_reload_reg[regno]), mode) #endif and have already calculated which hard register we would need to use after the mode change: i = REGNO (last_reg); i += subreg_regno_offset (i, GET_MODE (last_reg), byte, mode); So once we have verified that the register is suitable, we can (and do) simply use register I in mode MODE. I think the current mode is a historical left-over. Back in 2000 this code was a simple check that the old register entirely encompassed the new one: i = REGNO (last_reg) + word; last_class = REGNO_REG_CLASS (i); if ((GET_MODE_SIZE (GET_MODE (last_reg)) >= GET_MODE_SIZE (mode) + word * UNITS_PER_WORD) The register we were interested in was (reg:MODE I), and this check made sure that the old reload register defined every byte of (reg:MODE I). When CLASS_CANNOT_CHANGE_SIZE was introduced, the code became: i = REGNO (last_reg) + word; last_class = REGNO_REG_CLASS (i); if ( #ifdef CLASS_CANNOT_CHANGE_SIZE (TEST_HARD_REG_BIT (reg_class_contents[CLASS_CANNOT_CHANGE_SIZE], i) ? (GET_MODE_SIZE (GET_MODE (last_reg)) == GET_MODE_SIZE (mode) + word * UNITS_PER_WORD) : (GET_MODE_SIZE (GET_MODE (last_reg)) >= GET_MODE_SIZE (mode) + word * UNITS_PER_WORD)) #else (GET_MODE_SIZE (GET_MODE (last_reg)) >= GET_MODE_SIZE (mode) + word * UNITS_PER_WORD) #endif But I think this was bogus. The new size of the register was: GET_MODE_SIZE (mode) rather than: GET_MODE_SIZE (mode) + word * UNITS_PER_WORD Maybe something like: word == 0 && GET_MODE_SIZE (mode) == GET_MODE_SIZE (GET_MODE (last_reg)) would have been more accurate. Anyway, CLASS_CANNOT_CHANGE_SIZE proved to be too limited, so it was replaced with CLASS_CANNOT_CHANGE_MODE. The code above then became: need_mode = smallest_mode_for_size ((word+1) * UNITS_PER_WORD, GET_MODE_CLASS (mode)); if ( #ifdef CLASS_CANNOT_CHANGE_MODE (TEST_HARD_REG_BIT (reg_class_contents[(int) CLASS_CANNOT_CHANGE_MODE], i) ? ! CLASS_CANNOT_CHANGE_MODE_P (GET_MODE (last_reg), need_mode) : (GET_MODE_SIZE (GET_MODE (last_reg)) >= GET_MODE_SIZE (need_mode))) #else (GET_MODE_SIZE (GET_MODE (last_reg)) >= GET_MODE_SIZE (need_mode)) #endif with need_mode providing a mode of the same size as the then-preexisting size check. I think this mode is bogus for the same reason, and in 2005 I changed the final mode argument from "need_mode" to "mode": http://gcc.gnu.org/ml/gcc-patches/2005-02/msg01665.html That patch also fixed the smallest_mode_for_size argument so that it was a bit count rather than a byte count. Unfortunately, it seems I failed to realise that need_mode was in fact completely meaningless, and should have just been removed instead. Indeed, the bit->byte fix exposed a bug very like this one on s390: http://gcc.gnu.org/ml/gcc-patches/2005-04/msg01226.html Ulrich wisely said: As real fix, I think the computation of a "needed mode" may be completely superfluous in the first place; instead, the first clause of the 'if' could just be replaced by if ((GET_MODE_SIZE (GET_MODE (last_reg)) >= GET_MODE_SIZE (mode) + byte) to which I whined: I remember wondering about this too. Unfortunately, there are no comments at all explaining what the check is actually supposed to do, or what "need_mode" is suppsoed to be, so I thought at the time it was best to leave things be. But we were in release-paranoia mode, and that real fix never happened. I agree that removing need_mode is the right fix for that s390 PR, and for this NEON one. For completeness, Vlad hit a similar problem during the IRA work, which is whether the MODE_PARTIAL_INT check came from: http://gcc.gnu.org/ml/gcc-patches/2008-12/msg00265.html Vlad too thought that this code was probably bogus. Tested on x86_64-linux-gnu and arm-linux-gnueabi. OK to install? Richard gcc/ * reload1.c (choose_reload_regs): Use mode sizes to check whether an old relaod register completely defines the required value. gcc/testsuite/ * gcc.target/arm/neon-modes-3.c: New test. Index: gcc/reload1.c =================================================================== --- gcc/reload1.c 2011-07-05 13:15:27.000000000 +0100 +++ gcc/reload1.c 2011-07-05 13:49:02.000000000 +0100 @@ -6485,6 +6485,8 @@ choose_reload_regs (struct insn_chain *c if (regno >= 0 && reg_last_reload_reg[regno] != 0 + && (GET_MODE_SIZE (GET_MODE (reg_last_reload_reg[regno])) + >= GET_MODE_SIZE (mode) + byte) #ifdef CANNOT_CHANGE_MODE_CLASS /* Verify that the register it's in can be used in mode MODE. */ @@ -6496,24 +6498,12 @@ choose_reload_regs (struct insn_chain *c { enum reg_class rclass = rld[r].rclass, last_class; rtx last_reg = reg_last_reload_reg[regno]; - enum machine_mode need_mode; i = REGNO (last_reg); i += subreg_regno_offset (i, GET_MODE (last_reg), byte, mode); last_class = REGNO_REG_CLASS (i); - if (byte == 0) - need_mode = mode; - else - need_mode - = smallest_mode_for_size - (GET_MODE_BITSIZE (mode) + byte * BITS_PER_UNIT, - GET_MODE_CLASS (mode) == MODE_PARTIAL_INT - ? MODE_INT : GET_MODE_CLASS (mode)); - - if ((GET_MODE_SIZE (GET_MODE (last_reg)) - >= GET_MODE_SIZE (need_mode)) - && reg_reloaded_contents[i] == regno + if (reg_reloaded_contents[i] == regno && TEST_HARD_REG_BIT (reg_reloaded_valid, i) && HARD_REGNO_MODE_OK (i, rld[r].mode) && (TEST_HARD_REG_BIT (reg_class_contents[(int) rclass], i) Index: gcc/testsuite/gcc.target/arm/neon-modes-3.c =================================================================== --- /dev/null 2011-07-05 08:57:30.268785575 +0100 +++ gcc/testsuite/gcc.target/arm/neon-modes-3.c 2011-07-05 13:48:54.000000000 +0100 @@ -0,0 +1,61 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target arm_neon_ok } */ +/* { dg-options "-O" } */ +/* { dg-add-options arm_neon } */ + +#include <arm_neon.h> + +void f1 (volatile float32x4_t *dest, volatile float32x4x4_t *src, int n) +{ + float32x4x4_t a5, a6, a7, a8, a9; + int i; + + a5 = *src; + a6 = *src; + a7 = *src; + a8 = *src; + a9 = *src; + while (n--) + { + for (i = 0; i < 8; i++) + { + float32x4x4_t a0, a1, a2, a3, a4; + + a0 = *src; + a1 = *src; + a2 = *src; + a3 = *src; + a4 = *src; + *src = a0; + *dest = a0.val[0]; + *dest = a0.val[3]; + *src = a1; + *dest = a1.val[0]; + *dest = a1.val[3]; + *src = a2; + *dest = a2.val[0]; + *dest = a2.val[3]; + *src = a3; + *dest = a3.val[0]; + *dest = a3.val[3]; + *src = a4; + *dest = a4.val[0]; + *dest = a4.val[3]; + } + *src = a5; + *dest = a5.val[0]; + *dest = a5.val[3]; + *src = a6; + *dest = a6.val[0]; + *dest = a6.val[3]; + *src = a7; + *dest = a7.val[0]; + *dest = a7.val[3]; + *src = a8; + *dest = a8.val[0]; + *dest = a8.val[3]; + *src = a9; + *dest = a9.val[0]; + *dest = a9.val[3]; + } +}