Richard Biener <richard.guent...@gmail.com> writes: > On Fri, Nov 3, 2017 at 5:32 PM, Richard Sandiford > <richard.sandif...@linaro.org> wrote: >> A general TARGET_MEM_REF is: >> >> BASE + STEP * INDEX + INDEX2 + OFFSET >> >> After classifying the address in this way, the code that builds >> TARGET_MEM_REFs tries to simplify the address until it's valid >> for the current target and for the mode of memory being addressed. >> It does this in a fixed order: >> >> (1) add SYMBOL to BASE >> (2) add INDEX * STEP to the base, if STEP != 1 >> (3) add OFFSET to INDEX or BASE (reverted if unsuccessful) >> (4) add INDEX to BASE >> (5) add OFFSET to BASE >> >> So suppose we had an address: >> >> &symbol + offset + index * 8 (e.g. "a[i + 1]" for a global "a") >> >> on a target that only allows an index or an offset, not both. Following >> the steps above, we'd first create: >> >> tmp = symbol >> tmp2 = tmp + index * 8 >> >> Then if the given offset value was valid for the mode being addressed, >> we'd create: >> >> MEM[base:tmp2, offset:offset] >> >> while if it was invalid we'd create: >> >> tmp3 = tmp2 + offset >> MEM[base:tmp3, offset:0] >> >> The problem is that this could happen if ivopts had decided to use >> a scaled index for an address that happens to have a constant base. >> The old procedure failed to give an indexed TARGET_MEM_REF in that case, >> and adding the offset last prevented later passes from being able to >> fold the index back in. >> >> The patch avoids this by skipping (2) if BASE + INDEX * STEP >> is a legitimate address and if OFFSET is stopping the address >> being valid. >> >> Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64-linux-gnu. >> OK to install? >> >> Richard >> >> >> 2017-10-31 Richard Sandiford <richard.sandif...@linaro.org> >> Alan Hayward <alan.hayw...@arm.com> >> David Sherwood <david.sherw...@arm.com> >> >> gcc/ >> * tree-ssa-address.c (keep_index_p): New function. >> (create_mem_ref): Use it. Only split out the INDEX * STEP >> component if that is invalid even with the symbol and offset >> removed. >> >> Index: gcc/tree-ssa-address.c >> =================================================================== >> --- gcc/tree-ssa-address.c 2017-11-03 12:15:44.097060121 +0000 >> +++ gcc/tree-ssa-address.c 2017-11-03 12:21:18.060359821 +0000 >> @@ -746,6 +746,20 @@ gimplify_mem_ref_parts (gimple_stmt_iter >> true, GSI_SAME_STMT); >> } >> >> +/* Return true if the STEP in PARTS gives a valid BASE + INDEX * STEP >> + address for type TYPE and if the offset is making it appear invalid. */ >> + >> +static bool >> +keep_index_p (tree type, mem_address parts) > > mem_ref_valid_without_offset_p (...) > > ?
OK. >> +{ >> + if (!parts.base) >> + return false; >> + >> + gcc_assert (!parts.symbol); >> + parts.offset = NULL_TREE; >> + return valid_mem_ref_p (TYPE_MODE (type), TYPE_ADDR_SPACE (type), &parts); >> +} >> + >> /* Creates and returns a TARGET_MEM_REF for address ADDR. If necessary >> computations are emitted in front of GSI. TYPE is the mode >> of created memory reference. IV_CAND is the selected iv candidate in >> ADDR, >> @@ -809,7 +823,8 @@ create_mem_ref (gimple_stmt_iterator *gs > > Which means all of the following would be more naturally written as > >> into: >> index' = index << step; >> [... + index' + ,,,]. */ >> - if (parts.step && !integer_onep (parts.step)) >> + bool scaled_p = (parts.step && !integer_onep (parts.step)); >> + if (scaled_p && !keep_index_p (type, parts)) >> { > > if (mem_ref_valid_without_offset_p (...)) > { > ... > return create_mem_ref_raw (...); > } Is this inside the test for a scale: if (parts.step && !integer_onep (parts.step)) { if (mem_ref_valid_without_offset_p (...)) { tree tmp = parts.offset; if (parts.base) { tmp = fold_build_pointer_plus (parts.base, tmp); tmp = force_gimple_operand_gsi_1 (gsi, tmp, is_gimple_mem_ref_addr, NULL_TREE, true, GSI_SAME_STMT); } parts.base = tmp; parts.offset = NULL_TREE; mem_ref = create_mem_ref_raw (type, alias_ptr_type, &parts, true); gcc_assert (mem_ref); return mem_ref; } ...current code... } ? I can do that if you don't mind the cut-&-paste. Or I could split the code that adds the offset out into a subroutine. If we do it outside the check for a scaled index, we'd need to duplicate the logic about whether to add the offset to the index or the base. > that said, the function should ideally be re-written to try a few more > options non-destructively. Doesn't IVOPTs itself already verify > the TARGET_MEM_REFs it generates (and costs!) are valid? I think it only evaluates the cost of the address and leaves this routine to make them valid. (And the costs are usually right for SVE after the ivopts patch I posted.) We'd probably have to duplicate the same logic if ivopts legitimised the addresses itself. Thanks, Richard > > Thanks, > Richard. > >> gcc_assert (parts.index); >> parts.index = force_gimple_operand_gsi (gsi, >> @@ -821,6 +836,7 @@ create_mem_ref (gimple_stmt_iterator *gs >> mem_ref = create_mem_ref_raw (type, alias_ptr_type, &parts, true); >> if (mem_ref) >> return mem_ref; >> + scaled_p = false; >> } >> >> /* Add offset to invariant part by transforming address expression: >> @@ -832,7 +848,9 @@ create_mem_ref (gimple_stmt_iterator *gs >> index' = index + offset; >> [base + index'] >> depending on which one is invariant. */ >> - if (parts.offset && !integer_zerop (parts.offset)) >> + if (parts.offset >> + && !integer_zerop (parts.offset) >> + && (!var_in_base || !scaled_p)) >> { >> tree old_base = unshare_expr (parts.base); >> tree old_index = unshare_expr (parts.index); >> @@ -882,7 +900,7 @@ create_mem_ref (gimple_stmt_iterator *gs >> /* Transform [base + index + ...] into: >> base' = base + index; >> [base' + ...]. */ >> - if (parts.index) >> + if (parts.index && !scaled_p) >> { >> tmp = parts.index; >> parts.index = NULL_TREE;