On 21/05/14 17:05, Jakub Jelinek wrote:
> On Wed, May 21, 2014 at 12:53:47PM +1000, Kugan wrote:
>> On 20/05/14 16:52, Jakub Jelinek wrote:
>>> On Tue, May 20, 2014 at 12:27:31PM +1000, Kugan wrote:
>>>> 1. Handling NOP_EXPR or CONVERT_EXPR that are in the IL because they
>>>> are required for type correctness. We have two cases here:
>>>>
>>>> A) Mode is smaller than word_mode. This is usually from where the
>>>> zero/sign extensions are showing up in final assembly.
>>>> For example :
>>>> int = (int) short
>>>> which usually expands to
>>>> (set (reg:SI )
>>>> (sext:SI (subreg:HI (reg:SI ))))
>>>> We can expand this
>>>> (set (reg:SI ) (((reg:SI ))))
>>>>
>>>> If following is true:
>>>> 1. Value stored in RHS and LHS are of the same signedness
>>>> 2. Type can hold the value. i.e., In cases like char = (char) short, we
>>>> check that the value in short is representable char type. (i.e. look at
>>>> the value range in RHS SSA_NAME and see if that can be represented in
>>>> types of LHS without overflowing)
>>>>
>>>> Subreg here is not a paradoxical subreg. We are removing the subreg and
>>>> zero/sign extend here.
>>>>
>>>> I am assuming here that QI/HI registers are represented in SImode
>>>> (basically word_mode) with zero/sign extend is used as in
>>>> (zero_extend:SI (subreg:HI (reg:SI 117)).
>>>
>>> Wouldn't it be better to just set proper flags on the SUBREG based on value
>>> range info (SUBREG_PROMOTED_VAR_P and SUBREG_PROMOTED_UNSIGNED_P)?
>>> Then not only the optimizers could eliminate in zext/sext when possible, but
>>> all other optimizations could benefit from that.
>>
>> Thanks for the comments. Here is an attempt (attached) that sets
>> SUBREG_PROMOTED_VAR_P based on value range into. Is this the good place
>> to do this ?
>
> But you aren't setting it in your patch in any way, you are just resetting
> it instead. The thing is, start with a testcase where you get that
> (subreg:HI (reg:SI)) as the RTL of some SSA_NAME (is that the case on ARM?,
> I believe on e.g. i?86/x86_64 you'd just get (reg:HI) instead and thus you
> can't take advantage of that), and at that point where it is created check
> the range info and if it is properly sign or zero extended, set
> SUBREG_PROMOTED_VAR_P and SUBREG_PROMOTED_UNSIGNED_SET.
Here is another attempt (a quick hack patch is attached). Is this
a reasonable direction? I think I will have to look for other places
where SUBREG_PROMOTED_UNSIGNED_P are used for possible optimisations.
Before that I want to make sure I am on the right track.
> Note that right now we use 2 bits for the latter, which encode values
> -1 (weirdo pointer extension), 0 (sign extension), 1 (zero extension).
> Perhaps it would be nice to allow encoding value 2 (zero and sign extension)
> for cases where the range info tells you that the value is both zero and
> sign extended (i.e. minimum of range is >= 0 and maximum is <= signed type
> maximum).
Do you suggest changing rtx_def to achieve this like the following to be
able to store 2 in SUBREG_PROMOTED_UNSIGNED_SET? probably not.
- unsigned int unchanging : 1;
+ unsigned int unchanging : 2;
Thanks,
Kugan
diff --git a/gcc/expr.c b/gcc/expr.c
index 2868d9d..15183fa 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -328,7 +328,8 @@ convert_move (rtx to, rtx from, int unsignedp)
if (GET_CODE (from) == SUBREG && SUBREG_PROMOTED_VAR_P (from)
&& (GET_MODE_PRECISION (GET_MODE (SUBREG_REG (from)))
>= GET_MODE_PRECISION (to_mode))
- && SUBREG_PROMOTED_UNSIGNED_P (from) == unsignedp)
+ && (SUBREG_PROMOTED_UNSIGNED_P (from) == 2
+ || SUBREG_PROMOTED_UNSIGNED_P (from) == unsignedp))
from = gen_lowpart (to_mode, from), from_mode = to_mode;
gcc_assert (GET_CODE (to) != SUBREG || !SUBREG_PROMOTED_VAR_P (to));
@@ -9195,6 +9196,51 @@ expand_expr_real_2 (sepops ops, rtx target, enum
machine_mode tmode,
}
#undef REDUCE_BIT_FIELD
+static bool
+is_value_extended (tree lhs, enum machine_mode rhs_mode, bool rhs_uns)
+{
+ wide_int type_min, type_max;
+ wide_int min, max;
+ unsigned int prec;
+ tree lhs_type;
+ bool lhs_uns;
+
+ if (TREE_CODE (lhs) != SSA_NAME)
+ return false;
+
+ lhs_type = lang_hooks.types.type_for_mode (rhs_mode, rhs_uns);
+ lhs_uns = TYPE_UNSIGNED (TREE_TYPE (lhs));
+
+ /* We remove extension for integrals. */
+ if (!INTEGRAL_TYPE_P (TREE_TYPE (lhs)))
+ return false;
+
+ /* Get the value range. */
+ if (POINTER_TYPE_P (TREE_TYPE (lhs))
+ || get_range_info (lhs, &min, &max) != VR_RANGE)
+ return false;
+
+ prec = min.get_precision ();
+ type_min = wide_int::from (TYPE_MIN_VALUE (lhs_type), prec, TYPE_SIGN
(lhs_type));
+ type_max = wide_int::from (TYPE_MAX_VALUE (lhs_type), prec, TYPE_SIGN
(lhs_type));
+
+ /* Signedness of LHS and RHS should match. */
+ if ((rhs_uns != lhs_uns)
+ && ((rhs_uns && !wi::neg_p (min, TYPE_SIGN (lhs_type)))
+ || (!rhs_uns && wi::neg_p (max, TYPE_SIGN (lhs_type)))))
+ lhs_uns = !lhs_uns;
+ if (rhs_uns != lhs_uns)
+ return false;
+
+ /* If rhs value range fits lhs type, extension is redundant. */
+ if (wi::cmp (max, type_max, TYPE_SIGN (lhs_type)) != 1
+ && (wi::cmp (type_min, min, TYPE_SIGN (lhs_type))) != 1)
+ return true;
+
+ return false;
+}
+
+
/* Return TRUE if expression STMT is suitable for replacement.
Never consider memory loads as replaceable, because those don't ever lead
@@ -9498,7 +9544,11 @@ expand_expr_real_1 (tree exp, rtx target, enum
machine_mode tmode,
temp = gen_lowpart_SUBREG (mode, decl_rtl);
SUBREG_PROMOTED_VAR_P (temp) = 1;
- SUBREG_PROMOTED_UNSIGNED_SET (temp, unsignedp);
+ if (is_value_extended (ssa_name, mode, !unsignedp))
+ SUBREG_PROMOTED_UNSIGNED_SET (temp, 2);
+ else
+ SUBREG_PROMOTED_UNSIGNED_SET (temp, unsignedp);
+
return temp;
}
diff --git a/gcc/rtl.h b/gcc/rtl.h
index 10ae1e9..f34a7b5 100644
--- a/gcc/rtl.h
+++ b/gcc/rtl.h
@@ -299,7 +299,7 @@ struct GTY((chain_next ("RTX_NEXT (&%h)"),
1 in a CONCAT is VAL_EXPR_IS_CLOBBERED in var-tracking.c.
1 in a preserved VALUE is PRESERVED_VALUE_P in cselib.c.
1 in a clobber temporarily created for LRA. */
- unsigned int unchanging : 1;
+ unsigned int unchanging : 2;
/* 1 in a MEM or ASM_OPERANDS expression if the memory reference is volatile.
1 in an INSN, CALL_INSN, JUMP_INSN, CODE_LABEL, BARRIER, or NOTE
if it has been deleted.