In the patches: https://gcc.gnu.org/ml/gcc-patches/2019-12/msg01530.html https://gcc.gnu.org/ml/gcc-patches/2019-12/msg01533.html
Segher said the whole code was too complex. This patch is my attempt to make it somewhat easier to understand. One part that is an issue was there was a section of code to tried to prevent doing an ADDI if the register was GPR 0 (where the machine uses '0' instead of the value in GPR 0). I realized that if I changed the order of the adds, I wouldn't have to worry about adding GPR 0. For example consider: #include <altivec.h> double indexed_get1 (vector double *vp, unsigned long m) { return vec_extract (vp[m], 1); } Right now it generates: sldi 4,4,4 addi 9,3,8 lfdx 1,4,9 I.e. add the offset to the base register and then form a X-FORM load with the base and index registers. With this patch, it now generates: sldi 4,4,4 add 9,4,3 lfd 1,8(9) I.e. add the base and index registers to the temporary, and a D-FORM load (assuming the element number is constant) instead of a X-FORM load with the offset as the index. The second part of cleaning up the code was to eliminate the special purpose code that checks the addr_masks for the register type along with the code that assumed all 8-byte values needed a DS-FORM instruction. Instead I now call address_to_insn_form, which is the general address classification function added recently. That function peers into the addr_masks, etc. but it means this function at a higher abstraction layer doens't have to worry about the details. This patch does eliminate the hard_reg_and_mode_to_addr_mask function that I added recently in anticipation of using to optimize PC-relative addresses as well. When I started looking at it, I figured it simplified things if I could push all of the details to address_to_insn_form (which already knew about these things). As with the other patches, I have built and boostrapped a compiler on a little endian power8 system, and there were no regressions in the tests. Can I check this patch into the trunk? 2020-01-09 Michael Meissner <meiss...@linux.ibm.com> * config/rs6000/rs6000.c (reg_to_non_prefixed): Add forward reference. (hard_reg_and_mode_to_addr_mask): Delete, no longer used. (rs6000_adjust_vec_address): If the original vector address was REG+REG or REG+OFFSET and the element is not zero, do the add of the elements in the original address before adding the offset for the vector element. Use address_to_insn_form to validate the address using the register being loaded, rather than guessing whether the address is a DS-FORM or DQ-FORM address. Index: gcc/config/rs6000/rs6000.c =================================================================== --- gcc/config/rs6000/rs6000.c (revision 280072) +++ gcc/config/rs6000/rs6000.c (working copy) @@ -1172,6 +1172,7 @@ static bool rs6000_secondary_reload_move machine_mode, secondary_reload_info *, bool); +static enum non_prefixed_form reg_to_non_prefixed (rtx reg, machine_mode mode); rtl_opt_pass *make_pass_analyze_swaps (gcc::context*); /* Hash table stuff for keeping track of TOC entries. */ @@ -6729,30 +6730,6 @@ rs6000_expand_vector_extract (rtx target } } -/* Helper function to return an address mask based on a physical register. */ - -static addr_mask_type -hard_reg_and_mode_to_addr_mask (rtx reg, machine_mode mode) -{ - unsigned int r = reg_or_subregno (reg); - addr_mask_type addr_mask; - - gcc_assert (HARD_REGISTER_NUM_P (r)); - if (INT_REGNO_P (r)) - addr_mask = reg_addr[mode].addr_mask[RELOAD_REG_GPR]; - - else if (FP_REGNO_P (r)) - addr_mask = reg_addr[mode].addr_mask[RELOAD_REG_FPR]; - - else if (ALTIVEC_REGNO_P (r)) - addr_mask = reg_addr[mode].addr_mask[RELOAD_REG_VMX]; - - else - gcc_unreachable (); - - return addr_mask; -} - /* Return the offset within a memory object (MEM) of a vector type to a given element within the vector (ELEMENT) with an element size (SCALAR_SIZE). If the element is constant, we return a constant integer. Otherwise, we use a @@ -6805,7 +6782,6 @@ rs6000_adjust_vec_address (rtx scalar_re unsigned scalar_size = GET_MODE_SIZE (scalar_mode); rtx addr = XEXP (mem, 0); rtx new_addr; - bool valid_addr_p; gcc_assert (!reg_mentioned_p (base_tmp, addr)); gcc_assert (!reg_mentioned_p (base_tmp, element)); @@ -6833,68 +6809,30 @@ rs6000_adjust_vec_address (rtx scalar_re { rtx op0 = XEXP (addr, 0); rtx op1 = XEXP (addr, 1); - rtx insn; gcc_assert (REG_P (op0) || SUBREG_P (op0)); if (CONST_INT_P (op1) && CONST_INT_P (element_offset)) { + /* D-FORM address with constant element number. */ HOST_WIDE_INT offset = INTVAL (op1) + INTVAL (element_offset); rtx offset_rtx = GEN_INT (offset); - - /* 16-bit offset. */ - if (SIGNED_INTEGER_16BIT_P (offset) - && (scalar_size < 8 || (offset & 0x3) == 0)) - new_addr = gen_rtx_PLUS (Pmode, op0, offset_rtx); - - /* 34-bit offset if we have prefixed addresses. */ - else if (TARGET_PREFIXED_ADDR && SIGNED_INTEGER_34BIT_P (offset)) - new_addr = gen_rtx_PLUS (Pmode, op0, offset_rtx); - - else - { - /* Offset overflowed, move offset to the temporary (which will - likely be split), and do X-FORM addressing. */ - emit_move_insn (base_tmp, offset_rtx); - new_addr = gen_rtx_PLUS (Pmode, op0, base_tmp); - } + new_addr = gen_rtx_PLUS (Pmode, op0, offset_rtx); } else { - bool op1_reg_p = (REG_P (op1) || SUBREG_P (op1)); - bool ele_reg_p = (REG_P (element_offset) || SUBREG_P (element_offset)); + /* If we don't have a D-FORM address with a constant element number, + add the two elements in the current address. Then add the offset. - /* Note, ADDI requires the register being added to be a base - register. If the register was R0, load it up into the temporary - and do the add. */ - if (op1_reg_p - && (ele_reg_p || reg_or_subregno (op1) != FIRST_GPR_REGNO)) - { - insn = gen_add3_insn (base_tmp, op1, element_offset); - gcc_assert (insn != NULL_RTX); - emit_insn (insn); - } - - else if (ele_reg_p - && reg_or_subregno (element_offset) != FIRST_GPR_REGNO) - { - insn = gen_add3_insn (base_tmp, element_offset, op1); - gcc_assert (insn != NULL_RTX); - emit_insn (insn); - } - - /* Make sure we don't overwrite the temporary if the element being - extracted is variable, and we've put the offset into base_tmp - previously. */ - else if (reg_mentioned_p (base_tmp, element_offset)) - emit_insn (gen_add2_insn (base_tmp, op1)); - - else - { - emit_move_insn (base_tmp, op1); - emit_insn (gen_add2_insn (base_tmp, element_offset)); - } - - new_addr = gen_rtx_PLUS (Pmode, op0, base_tmp); + Previously, we tried to add the offset to OP1 and change the + address to an X-FORM format adding OP0 and BASE_TMP, but it became + complicated because we had to verify that op1 was not GPR0 and we + had a constant element offset (due to the way ADDI is defined). + By doing the add of OP0 and OP1 first, and then adding in the + offset, it has the benefit that if D-FORM instructions are + allowed, the offset is part of the memory access to the vector + element. */ + emit_insn (gen_rtx_SET (base_tmp, gen_rtx_PLUS (Pmode, op0, op1))); + new_addr = gen_rtx_PLUS (Pmode, base_tmp, element_offset); } } @@ -6904,27 +6842,19 @@ rs6000_adjust_vec_address (rtx scalar_re new_addr = gen_rtx_PLUS (Pmode, base_tmp, element_offset); } - /* If we have a PLUS, we need to see whether the particular register class - allows for D-FORM or X-FORM addressing. */ - if (GET_CODE (new_addr) == PLUS) - { - rtx op1 = XEXP (new_addr, 1); - addr_mask_type addr_mask - = hard_reg_and_mode_to_addr_mask (scalar_reg, scalar_mode); - - if (REG_P (op1) || SUBREG_P (op1)) - valid_addr_p = (addr_mask & RELOAD_REG_INDEXED) != 0; - else - valid_addr_p = (addr_mask & RELOAD_REG_OFFSET) != 0; - } - - else if (REG_P (new_addr) || SUBREG_P (new_addr)) - valid_addr_p = true; + /* If the address isn't valid, move the address into the temporary base + register. Some reasons it could not be valid include: - else - valid_addr_p = false; + The address offset overflowed the 16 or 34 bit offset size; + We need to use a DS-FORM load, and the bottom 2 bits are non-zero; + We need to use a DQ-FORM load, and the bottom 2 bits are non-zero; + Only X_FORM loads can be done, and the address is D_FORM. */ + + enum insn_form iform + = address_to_insn_form (new_addr, scalar_mode, + reg_to_non_prefixed (scalar_reg, scalar_mode)); - if (!valid_addr_p) + if (iform == INSN_FORM_BAD) { emit_move_insn (base_tmp, new_addr); new_addr = base_tmp; -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797