Re: [Patch/rtl-expand] Take tree range info into account to improve LSHIFT_EXP expanding

Jeff Law Fri, 14 Aug 2015 13:25:05 -0700

On 08/14/2015 11:40 AM, Jiong Wang wrote:


   * Figuring out whether the shift source is coming from sign extension
     by checking SSA_NAME_DEF_STMT instead of deducing from tree range
     info. I feel checking the gimple statement is more reliable and
     straightforward.

I suspect it'll also apply more often if you're looking for the nop-conversion rather than looking at range information.

I keep thinking there ought to be a generalization here so that we're not so restrictive on the modes, but it's probably not worth doing.

In a perfect world we'd also integrate the other strategies for double-word shifts that exist in the various ports as special cases in expansion and remove the double-word shift patterns for ports that don't actually have them in hardware. But that's probably a bit much to ask -- however, it probably couldn't hurt to see if there are any that are easily handled.


   * For the pseudo register overlapping issue, given:

       RTX target = TREE source << TREE amount

     I was trying to make sure those two SSA_NAME won't get the same
     pseudo register by comparing their assigned partition using
     var_to_partition, but I can't get the associated tree representation
     for "target" which is RTX.

     Then I just expand "source" and compare the register number with
     "target".

Right. Which is what I would have suggested. Given two pseudos, you can just compare them for equality. If they're unequal, then it's safe to assume they do not overlap.


     But I found the simplest way may be to just reorder

       low_part (target) = low_part (source) << amount
       high_part (target) = low_part (source) >> amount1

     to:

       high_part (target) = low_part (source) >> amount1
       low_part (target) = low_part (source) << amount

     then, even if target and source share the same pseudo register, we can
     still get what we want, as we are operating on their subregs.

Yes.  This would work too and avoids the need to find source's pseudo.
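
To illustrate why the ordering matters, here is a minimal standalone sketch in plain C, not GCC code. The `dword` type, the 32-bit word size, and both function names are assumptions made for illustration, and unsigned words are used so that every shift is well defined. When target and source are the same object, writing the high part first still reads the original low word, while writing the low part first clobbers it.

```c
#include <stdint.h>

/* Hypothetical model of a double-word value held as two 32-bit words. */
typedef struct { uint32_t low; uint32_t high; } dword;

/* High part first: safe even when target and source are the same object,
   because source->low is read before target->low is written.
   Requires 0 < amount < 32.  */
static void shift_high_first(dword *target, const dword *source,
                             unsigned amount)
{
    target->high = source->low >> (32 - amount);
    target->low = source->low << amount;
}

/* Low part first: wrong when target aliases source, because the second
   statement reads the already-shifted low word.  Requires 0 < amount < 32.  */
static void shift_low_first(dword *target, const dword *source,
                            unsigned amount)
{
    target->low = source->low << amount;
    target->high = source->low >> (32 - amount);
}
```

Calling either function with distinct objects gives the same answer; only the aliased call (same pointer for both arguments) distinguishes them.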


   * Added checking on the result value of expand_variable_shift, if it's
     not the same as "target" then call emit_move_insn to do that.

Excellent.


   How does the re-worked patch look to you?

   x86 bootstrap OK, regression OK. aarch64 bootstrap OK.

2015-08-14  Jiong.Wang<jiong.w...@arm.com>

gcc/
   * expr.c (expand_expr_real_2): Check tree format to optimize
   LSHIFT_EXPR expand.

gcc/testsuite
   * gcc.dg/wide_shift_64_1.c: New testcase.
   * gcc.dg/wide_shift_128_1.c: Likewise.
   * gcc.target/aarch64/ashlti3_1.c: Likewise.
   * gcc.target/arm/ashldisi_1.c: Likewise.

+       /* Left shfit optimization:

s/shfit/shift/

+
+          If mode == GET_MODE_WIDER_MODE (word_mode), then normally there isn't
+          native instruction to support this wide mode left shift.  Given below
+          example:
+
+           Type A = (Type) B  << C
+
+            |<           T      >|
+            |   high     |   low     |
+
+                         |<- size  ->|
+
+          By checking tree shape, if operand B is coming from signed extension,
+          then the left shift operand can be simplified into:
+
+             1. high = low >> (size - C);
+             2. low = low << C;

You'll want to reorder those to match the code you generate.

Doesn't this require that C be less than the bitsize of a word?

If C is larger than the bitsize of a word, then you need some adjustments, something like:



              1. high = low << (C - size)
              2. low = 0
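
Both cases can be sketched together in standalone C. This is only a model of the arithmetic, not the expander code: the 32-bit `WORD_BITS`, the `dword` type, and the helper names are assumptions for illustration, and the arithmetic right shift is emulated on unsigned words so the example does not depend on implementation-defined behaviour.

```c
#include <stdint.h>

#define WORD_BITS 32   /* hypothetical word size for this model */

/* Hypothetical model of a double-word result as two 32-bit words. */
typedef struct { uint32_t low; uint32_t high; } dword;

/* Arithmetic shift right of a 32-bit word, for 0 < n < 32, written with
   unsigned operations only.  */
static uint32_t asr32(uint32_t x, unsigned n)
{
    uint32_t fill = (x & 0x80000000u) ? (uint32_t) (~(uint32_t) 0 << (32 - n))
                                      : 0;
    return (x >> n) | fill;
}

/* Compute (sign-extended b) << c as a double word, for 0 < c < 64.  */
static dword wide_shift_sext(int32_t b, unsigned c)
{
    uint32_t low = (uint32_t) b;
    dword r;

    if (c < WORD_BITS) {
        /* Shift count below the word size: high first, then low.  */
        r.high = asr32(low, WORD_BITS - c);
        r.low = low << c;
    } else {
        /* Shift count of a word or more: everything moves to the high word.  */
        r.high = low << (c - WORD_BITS);
        r.low = 0;
    }
    return r;
}
```

For example, `wide_shift_sext(-3, 4)` yields the two words of the 64-bit value -48, and `wide_shift_sext(5, 40)` yields a zero low word with the value shifted into the high word.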

Does this transformation require that the wider mode be exactly 2X the narrower mode? If so, that needs to be verified.

+               if (GET_MODE_SIZE (rmode) < GET_MODE_SIZE (mode)

So we're assured we have a widening conversion.

+                   && ((TREE_INT_CST_LOW (treeop1) + GET_MODE_BITSIZE (rmode))
+                       >= GET_MODE_BITSIZE (word_mode)))

This test seems wrong. I'm not sure what you're really trying to test here. It seems you want to look at the shift count relative to the bitsize of word_mode. If it's less, then generate the code you mentioned above in the comment. If it's more, then generate the sequence I referenced? Right?

I do think you need to be verifying that rmode == word_mode here. If I understand everything correctly, the final value is "mode" which must be 2X the size of rmode/word_mode here, right?




The other question is are we endianness-safe in these transformations?
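
One way to see the concern: subreg byte offsets follow memory layout, so byte offset 0 names the low word only when words are stored little-endian. The sketch below is a standalone model, not GCC code; both function names and their parameters are hypothetical, standing in for what target macros such as WORDS_BIG_ENDIAN and UNITS_PER_WORD would supply. GCC itself provides subreg_lowpart_offset and subreg_highpart_offset for this purpose.

```c
/* Hypothetical model: byte offset of the low word inside a double-word
   value, given the word order and the word size in bytes.  */
static unsigned lowpart_byte_offset(int words_big_endian,
                                    unsigned units_per_word)
{
    /* With big-endian word order the low word comes second in memory.  */
    return words_big_endian ? units_per_word : 0;
}

/* Hypothetical model: byte offset of the high word, the mirror image.  */
static unsigned highpart_byte_offset(int words_big_endian,
                                     unsigned units_per_word)
{
    return words_big_endian ? 0 : units_per_word;
}
```

Under this model, hard-coding offset 0 for the low part and UNITS_PER_WORD for the high part matches only the little-endian word order.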

+                 {
+                   rtx low = simplify_gen_subreg (word_mode, op0, mode, 0);
+                   rtx tlow = simplify_gen_subreg (word_mode, target, mode, 0);
+                   rtx thigh = simplify_gen_subreg (word_mode, target, mode,
+                                                    UNITS_PER_WORD);
+                   HOST_WIDE_INT ramount = (BITS_PER_WORD
+                                            - TREE_INT_CST_LOW (treeop1));
+                   tree rshift = build_int_cst (TREE_TYPE (treeop1), ramount);
+
+                   temp = expand_variable_shift (code, word_mode, low, treeop1,
+                                                 tlow, unsignedp);

Why use "code" here rather than just using LSHIFT_EXPR?  As noted earlier,

Jeff
