Hi!

On Wed, 25 Apr 2012 13:51:16 +0200, Richard Guenther <[email protected]> wrote:
> On Wed, Apr 25, 2012 at 1:27 PM, Thomas Schwinge
> <[email protected]> wrote:
> > On Thu, 19 Apr 2012 19:46:17 +0200, I wrote:
> >> Here is my current test case, reduced from gcc.dg/tree-ssa/20030922-1.c:
> >>
> >> extern void abort (void);
> >>
> >> enum tree_code
> >> {
> >>   BIND_EXPR,
> >> };
> >> struct tree_common
> >> {
> >>   enum tree_code code:8;
> >> };
> >> void
> >> voidify_wrapper_expr (struct tree_common *common)
> >> {
> >>   switch (common->code)
> >>     {
> >>     case BIND_EXPR:
> >>       if (common->code != BIND_EXPR)
> >>         abort ();
> >>     }
> >> }
> >>
> >> The diff for -fdump-tree-all between compiling with
> >> -fno-strict-volatile-bitfields and -fstrict-volatile-bitfields begins
> >> with the following, right in 20030922-1.c.003t.original:
> >>
> >> diff -ru fnsvb/20030922-1.c.003t.original fsvb/20030922-1.c.003t.original
> >> --- fnsvb/20030922-1.c.003t.original	2012-04-19 16:51:18.322150866 +0200
> >> +++ fsvb/20030922-1.c.003t.original	2012-04-19 16:49:18.132088498 +0200
> >> @@ -7,7 +7,7 @@
> >>    switch ((int) common->code)
> >>      {
> >>        case 0:;
> >> -        if (common->code != 0)
> >> +        if ((BIT_FIELD_REF <*common, 32, 0> & 255) != 0)
> >>          {
> >>            abort ();
> >>          }
> >>
> >> That is, for -fno-strict-volatile-bitfields the second instance of
> >> »common->code« is a component_ref, whereas for
> >> -fstrict-volatile-bitfields it is a bit_field_ref.  The former will be
> >> optimized as intended, the latter will not.  So, what should be emitted
> >> in the -fstrict-volatile-bitfields case?  A component_ref -- but this is
> >> what Bernd's commit r182545 (for PR51200) changed, I think?  Or should
> >> later optimization stages learn to better deal with a bit_field_ref, and
> >> where would this have to happen?
> >
> > I figured out why this generic code is behaving differently for SH
> > targets.  I compared to ARM as well as x86, for which indeed the
> > optimization possibility is not lost even when compiling with
> > -fstrict-volatile-bitfields.
> >
> > The component_ref inside the if statement (but not the one in the switch
> > statement) is fed into optimize_bit_field_compare, where it is optimized
> > to »BIT_FIELD_REF <*common, 32, 0> & 255« only for SH, because
> > get_best_mode for SH returns SImode (ARM, x86: QImode), because SH has
> > »#define SLOW_BYTE_ACCESS 1«, and thus the original QImode is widened to
> > SImode, hoping for better code generation »since it may eliminate
> > subsequent memory access if subsequent accesses occur to other fields in
> > the same word of the structure, but to different bytes«.  (Well, the
> > converse happens here...)  After that, in optimize_bit_field_compare, for
> > ARM and x86, »nbitsize == lbitsize«, and thus the optimization is
> > aborted, whereas for SH it will be performed, that is, the component_ref
> > is replaced with »BIT_FIELD_REF <*common, 32, 0> & 255«.
> >
> > If I manually force SH's SImode back to QImode (that is, abort
> > optimize_bit_field_compare), the later optimization passes work as they
> > do for ARM and x86.
> >
> > Before Bernd's r182545, optimize_bit_field_compare returned early (that
> > is, aborted this optimization attempt), because »lbitsize ==
> > GET_MODE_BITSIZE (lmode)« (8 bits).  (With the current sources, lmode is
> > VOIDmode.)
> >
> > Is emitting »BIT_FIELD_REF <*common, 32, 0> & 255« wrong in this case,
> > or should a later optimization pass be able to figure out that
> > »BIT_FIELD_REF <*common, 32, 0> & 255« is in fact the same as
> > common->code, and then be able to conflate these?  Any suggestions
> > where/how to tackle this?
>
> The BIT_FIELD_REF is somewhat of a red-herring.  It is created by fold-const.c
> in optimize_bit_field_compare, code that I think should be removed completely.
> Or it needs to be made aware of strict-volatile bitfield and C++ memory model
> details.
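To make the transformation concrete, here is a rough stand-alone C
equivalent of the folded form (my illustration only, not part of the
patch; it assumes a little-endian target, reuses struct tree_common from
the test case above, and the function name is made up):

    #include <string.h>

    /* Roughly what »BIT_FIELD_REF <*common, 32, 0> & 255« does: load the
       containing 32-bit word (the SImode that get_best_mode picks due to
       SLOW_BYTE_ACCESS) and mask out the low byte, instead of the plain
       byte access that the component_ref would produce.  memcpy is used
       here only to keep the illustration aliasing-safe; the BIT_FIELD_REF
       implies the struct occupies at least 32 bits.  */
    static unsigned int
    read_code_widened (const struct tree_common *common)
    {
      unsigned int word;
      memcpy (&word, common, sizeof word);
      return word & 255;
    }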
As discussed with richi on IRC, I have now done testing (just the GCC
testsuite) with optimize_bit_field_compare completely removed.  For SH,
this cures my test case posted above as well as the following FAILs,
all as expected:
    -FAIL: gcc.dg/tree-ssa/20030922-1.c scan-tree-dump-times dom2 "if " 0
    +PASS: gcc.dg/tree-ssa/20030922-1.c scan-tree-dump-times dom2 "if " 0
    -FAIL: gcc.dg/tree-ssa/foldconst-3.c scan-tree-dump-not optimized "tree_code_type"
    +PASS: gcc.dg/tree-ssa/foldconst-3.c scan-tree-dump-not optimized "tree_code_type"
    -FAIL: gcc.dg/tree-ssa/vrp15.c scan-tree-dump-times vrp1 "tree_code_length.42." 1
    +PASS: gcc.dg/tree-ssa/vrp15.c scan-tree-dump-times vrp1 "tree_code_length.42." 1
No regressions for ARM and x86.  I also manually confirmed that the
x86-specific tests introduced for PR37248 (gcc.target/i386/pr37248-*.c)
still produce the optimized (intermediate and assembly) code as suggested
in the *.c files.  (So, this optimization is evidently also performed
elsewhere, not only in optimize_bit_field_compare.)  So, is removing it
indeed the way to go?

Instead of removing it completely -- are the diagnostic messages
(warnings) that it emits worth preserving, or are these covered
elsewhere?
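For reference, here is a minimal example of the diagnostic in question (my
own sketch, not from the testsuite): comparing a bit-field against a
constant that does not fit in the field gets folded to a constant result
and warned about.

    struct s
    {
      unsigned int f:2;	/* can only hold 0..3 */
    };

    int
    f (struct s *p)
    {
      /* 5 >> 2 is non-zero, so optimize_bit_field_compare folds this to 0,
         with the warning "comparison is always 0 due to width of
         bit-field".  */
      return p->f == 5;
    }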
gcc/
	* fold-const.c (fold_binary_loc): Don't invoke
	optimize_bit_field_compare.
	(optimize_bit_field_compare): Remove function.
Index: fold-const.c
===================================================================
--- fold-const.c	(revision 186856)
+++ fold-const.c	(working copy)
@@ -103,8 +103,6 @@ static tree pedantic_omit_one_operand_loc (locatio
 static tree distribute_bit_expr (location_t, enum tree_code, tree, tree, tree);
 static tree make_bit_field_ref (location_t, tree, tree,
                                 HOST_WIDE_INT, HOST_WIDE_INT, int);
-static tree optimize_bit_field_compare (location_t, enum tree_code,
-                                        tree, tree, tree);
 static tree decode_field_reference (location_t, tree, HOST_WIDE_INT *,
                                     HOST_WIDE_INT *,
                                     enum machine_mode *, int *, int *,
@@ -3306,182 +3304,6 @@ make_bit_field_ref (location_t loc, tree inner, tr
   return result;
 }
 
-
-/* Optimize a bit-field compare.
-
-   There are two cases:  First is a compare against a constant and the
-   second is a comparison of two items where the fields are at the same
-   bit position relative to the start of a chunk (byte, halfword, word)
-   large enough to contain it.  In these cases we can avoid the shift
-   implicit in bitfield extractions.
-
-   For constants, we emit a compare of the shifted constant with the
-   BIT_AND_EXPR of a mask and a byte, halfword, or word of the operand being
-   compared.  For two fields at the same position, we do the ANDs with the
-   similar mask and compare the result of the ANDs.
-
-   CODE is the comparison code, known to be either NE_EXPR or EQ_EXPR.
-   COMPARE_TYPE is the type of the comparison, and LHS and RHS
-   are the left and right operands of the comparison, respectively.
-
-   If the optimization described above can be done, we return the resulting
-   tree.  Otherwise we return zero.  */
-
-static tree
-optimize_bit_field_compare (location_t loc, enum tree_code code,
-                            tree compare_type, tree lhs, tree rhs)
-{
-  HOST_WIDE_INT lbitpos, lbitsize, rbitpos, rbitsize, nbitpos, nbitsize;
-  tree type = TREE_TYPE (lhs);
-  tree signed_type, unsigned_type;
-  int const_p = TREE_CODE (rhs) == INTEGER_CST;
-  enum machine_mode lmode, rmode, nmode;
-  int lunsignedp, runsignedp;
-  int lvolatilep = 0, rvolatilep = 0;
-  tree linner, rinner = NULL_TREE;
-  tree mask;
-  tree offset;
-
-  /* Get all the information about the extractions being done.  If the bit size
-     if the same as the size of the underlying object, we aren't doing an
-     extraction at all and so can do nothing.  We also don't want to
-     do anything if the inner expression is a PLACEHOLDER_EXPR since we
-     then will no longer be able to replace it.  */
-  linner = get_inner_reference (lhs, &lbitsize, &lbitpos, &offset, &lmode,
-                                &lunsignedp, &lvolatilep, false);
-  if (linner == lhs || lbitsize == GET_MODE_BITSIZE (lmode) || lbitsize < 0
-      || offset != 0 || TREE_CODE (linner) == PLACEHOLDER_EXPR)
-    return 0;
-
-  if (!const_p)
-    {
-      /* If this is not a constant, we can only do something if bit positions,
-         sizes, and signedness are the same.  */
-      rinner = get_inner_reference (rhs, &rbitsize, &rbitpos, &offset, &rmode,
-                                    &runsignedp, &rvolatilep, false);
-
-      if (rinner == rhs || lbitpos != rbitpos || lbitsize != rbitsize
-          || lunsignedp != runsignedp || offset != 0
-          || TREE_CODE (rinner) == PLACEHOLDER_EXPR)
-        return 0;
-    }
-
-  /* See if we can find a mode to refer to this field.  We should be able to,
-     but fail if we can't.  */
-  if (lvolatilep
-      && GET_MODE_BITSIZE (lmode) > 0
-      && flag_strict_volatile_bitfields > 0)
-    nmode = lmode;
-  else
-    nmode = get_best_mode (lbitsize, lbitpos, 0, 0,
-                           const_p ? TYPE_ALIGN (TREE_TYPE (linner))
-                           : MIN (TYPE_ALIGN (TREE_TYPE (linner)),
-                                  TYPE_ALIGN (TREE_TYPE (rinner))),
-                           word_mode, lvolatilep || rvolatilep);
-  if (nmode == VOIDmode)
-    return 0;
-
-  /* Set signed and unsigned types of the precision of this mode for the
-     shifts below.  */
-  signed_type = lang_hooks.types.type_for_mode (nmode, 0);
-  unsigned_type = lang_hooks.types.type_for_mode (nmode, 1);
-
-  /* Compute the bit position and size for the new reference and our offset
-     within it.  If the new reference is the same size as the original, we
-     won't optimize anything, so return zero.  */
-  nbitsize = GET_MODE_BITSIZE (nmode);
-  nbitpos = lbitpos & ~ (nbitsize - 1);
-  lbitpos -= nbitpos;
-  if (nbitsize == lbitsize)
-    return 0;
-
-  if (BYTES_BIG_ENDIAN)
-    lbitpos = nbitsize - lbitsize - lbitpos;
-
-  /* Make the mask to be used against the extracted field.  */
-  mask = build_int_cst_type (unsigned_type, -1);
-  mask = const_binop (LSHIFT_EXPR, mask, size_int (nbitsize - lbitsize));
-  mask = const_binop (RSHIFT_EXPR, mask,
-                      size_int (nbitsize - lbitsize - lbitpos));
-
-  if (! const_p)
-    /* If not comparing with constant, just rework the comparison
-       and return.  */
-    return fold_build2_loc (loc, code, compare_type,
-                            fold_build2_loc (loc, BIT_AND_EXPR, unsigned_type,
-                                             make_bit_field_ref (loc, linner,
-                                                                 unsigned_type,
-                                                                 nbitsize, nbitpos,
-                                                                 1),
-                                             mask),
-                            fold_build2_loc (loc, BIT_AND_EXPR, unsigned_type,
-                                             make_bit_field_ref (loc, rinner,
-                                                                 unsigned_type,
-                                                                 nbitsize, nbitpos,
-                                                                 1),
-                                             mask));
-
-  /* Otherwise, we are handling the constant case.  See if the constant is too
-     big for the field.  Warn and return a tree of for 0 (false) if so.  We do
-     this not only for its own sake, but to avoid having to test for this
-     error case below.  If we didn't, we might generate wrong code.
-
-     For unsigned fields, the constant shifted right by the field length should
-     be all zero.  For signed fields, the high-order bits should agree with
-     the sign bit.  */
-
-  if (lunsignedp)
-    {
-      if (! integer_zerop (const_binop (RSHIFT_EXPR,
-                                        fold_convert_loc (loc,
-                                                          unsigned_type, rhs),
-                                        size_int (lbitsize))))
-        {
-          warning (0, "comparison is always %d due to width of bit-field",
-                   code == NE_EXPR);
-          return constant_boolean_node (code == NE_EXPR, compare_type);
-        }
-    }
-  else
-    {
-      tree tem = const_binop (RSHIFT_EXPR,
-                              fold_convert_loc (loc, signed_type, rhs),
-                              size_int (lbitsize - 1));
-      if (! integer_zerop (tem) && ! integer_all_onesp (tem))
-        {
-          warning (0, "comparison is always %d due to width of bit-field",
-                   code == NE_EXPR);
-          return constant_boolean_node (code == NE_EXPR, compare_type);
-        }
-    }
-
-  /* Single-bit compares should always be against zero.  */
-  if (lbitsize == 1 && ! integer_zerop (rhs))
-    {
-      code = code == EQ_EXPR ? NE_EXPR : EQ_EXPR;
-      rhs = build_int_cst (type, 0);
-    }
-
-  /* Make a new bitfield reference, shift the constant over the
-     appropriate number of bits and mask it with the computed mask
-     (in case this was a signed field).  If we changed it, make a new one.  */
-  lhs = make_bit_field_ref (loc, linner, unsigned_type, nbitsize, nbitpos, 1);
-  if (lvolatilep)
-    {
-      TREE_SIDE_EFFECTS (lhs) = 1;
-      TREE_THIS_VOLATILE (lhs) = 1;
-    }
-
-  rhs = const_binop (BIT_AND_EXPR,
-                     const_binop (LSHIFT_EXPR,
-                                  fold_convert_loc (loc, unsigned_type, rhs),
-                                  size_int (lbitpos)),
-                     mask);
-
-  lhs = build2_loc (loc, code, compare_type,
-                    build2 (BIT_AND_EXPR, unsigned_type, lhs, mask), rhs);
-  return lhs;
-}
 
 /* Subroutine for fold_truth_andor_1: decode a field reference.
@@ -12811,18 +12633,6 @@ fold_binary_loc (location_t loc,
           return omit_one_operand_loc (loc, type, rslt, arg0);
         }
 
-      /* If this is a comparison of a field, we may be able to simplify it.  */
-      if ((TREE_CODE (arg0) == COMPONENT_REF
-           || TREE_CODE (arg0) == BIT_FIELD_REF)
-          /* Handle the constant case even without -O
-             to make sure the warnings are given.  */
-          && (optimize || TREE_CODE (arg1) == INTEGER_CST))
-        {
-          t1 = optimize_bit_field_compare (loc, code, type, arg0, arg1);
-          if (t1)
-            return t1;
-        }
-
       /* Optimize comparisons of strlen vs zero to a compare of the
          first character of the string vs zero.  To wit,
           strlen(ptr) == 0 => *ptr == 0
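
As a sanity check on the removed mask computation, plugging in the SH
values from my test case (nbitsize = 32 for SImode, lbitsize = 8,
lbitpos = 0 on little-endian) reproduces the »& 255« from the dump.  A
quick stand-alone sketch of that arithmetic:

    #include <stdio.h>

    int
    main (void)
    {
      unsigned int nbitsize = 32, lbitsize = 8, lbitpos = 0;
      unsigned int mask = ~0u;                  /* build_int_cst_type (unsigned_type, -1) */
      mask <<= nbitsize - lbitsize;             /* LSHIFT_EXPR: 0xff000000 */
      mask >>= nbitsize - lbitsize - lbitpos;   /* RSHIFT_EXPR: 0x000000ff */
      printf ("mask = %u\n", mask);             /* prints 255 */
      return 0;
    }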
Regards,
Thomas