[Bug middle-end/71002] [6/7 Regression] -fstrict-aliasing breaks Boost's short string optimization implementation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71002 --- Comment #11 from Tavian Barnes --- Yeah I reported the Boost bug as https://svn.boost.org/trac/boost/ticket/12183.
[Bug middle-end/71002] [6/7 Regression] -fstrict-aliasing breaks Boost's short string optimization implementation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71002 --- Comment #10 from Richard Biener --- Hmm, so trying to preserve alias-set zero from the BIT_FIELD_REF folding using a MEM_REF and reference_alias_ptr_type doesn't work as the latter doesn't preserve the langhook effects (duh, that's some interesting thing on its own). Leaves the possibility to not doing the BIT_FIELD_REF producing folding if alias-sets would not agree. The following restores GCC 5 behavior here. Note Boost still is buggy here. Index: gcc/fold-const.c === --- gcc/fold-const.c(revision 236032) +++ gcc/fold-const.c(working copy) @@ -117,8 +117,6 @@ static enum tree_code compcode_to_compar static int operand_equal_for_comparison_p (tree, tree, tree); static int twoval_comparison_p (tree, tree *, tree *, int *); static tree eval_subst (location_t, tree, tree, tree, tree, tree); -static tree make_bit_field_ref (location_t, tree, tree, - HOST_WIDE_INT, HOST_WIDE_INT, int, int); static tree optimize_bit_field_compare (location_t, enum tree_code, tree, tree, tree); static tree decode_field_reference (location_t, tree, HOST_WIDE_INT *, @@ -3859,7 +3857,9 @@ optimize_bit_field_compare (location_t l linner = get_inner_reference (lhs, &lbitsize, &lbitpos, &offset, &lmode, &lunsignedp, &lreversep, &lvolatilep, false); if (linner == lhs || lbitsize == GET_MODE_BITSIZE (lmode) || lbitsize < 0 - || offset != 0 || TREE_CODE (linner) == PLACEHOLDER_EXPR || lvolatilep) + || offset != 0 || TREE_CODE (linner) == PLACEHOLDER_EXPR || lvolatilep + /* Make sure we can preserve alias-sets. */ + || get_alias_set (lhs) != get_alias_set (linner)) return 0; if (const_p) @@ -3874,7 +3874,9 @@ optimize_bit_field_compare (location_t l if (rinner == rhs || lbitpos != rbitpos || lbitsize != rbitsize || lunsignedp != runsignedp || lreversep != rreversep || offset != 0 -|| TREE_CODE (rinner) == PLACEHOLDER_EXPR || rvolatilep) +|| TREE_CODE (rinner) == PLACEHOLDER_EXPR || rvolatilep +/* Make sure we can preserve alias-sets. */ +|| get_alias_set (rhs) != get_alias_set (rinner)) return 0; } @@ -5791,7 +5793,10 @@ fold_truth_andor_1 (location_t loc, enum || ll_unsignedp != lr_unsignedp || rl_unsignedp != rr_unsignedp /* Make sure the two fields on the right correspond to the left without being swapped. */ - || ll_bitpos - rl_bitpos != lr_bitpos - rr_bitpos) + || ll_bitpos - rl_bitpos != lr_bitpos - rr_bitpos + /* Make sure we can preserve alias-sets. */ + || get_alias_set (ll_arg) != get_alias_set (ll_inner) + || get_alias_set (lr_arg) != get_alias_set (lr_inner)) return 0; first_bit = MIN (lr_bitpos, rr_bitpos); @@ -5921,6 +5926,10 @@ fold_truth_andor_1 (location_t loc, enum } } + /* Make sure we can preserve alias-sets. */ + if (get_alias_set (ll_arg) != get_alias_set (ll_inner)) +return NULL_TREE; + /* Construct the expression we will return. First get the component reference we will make. Unless the mask is all ones the width of that field, perform the mask operation. Then compare with the
[Bug middle-end/71002] [6/7 Regression] -fstrict-aliasing breaks Boost's short string optimization implementation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71002 --- Comment #9 from Richard Biener --- So optimize_bit_field_compare does make a difference implementation-wise (but not really fixes the undefinedness in the boost code wrt the access of the non-active union-member when reading from is_short). optimize_bit_field_compare turns ((const struct string *) this)->m_repr.s.h.is_short != 0 into (BIT_FIELD_REF <*(const struct string *) this, 8, 0> & 1) != 0 where the former memory access uses alias-set zero because of c-common.c:c_common_get_alias_set handling of union-accesses: /* Permit type-punning when accessing a union, provided the access is directly through the union. For example, this code does not permit taking the address of a union member and then storing through it. Even the type-punning allowed here is a GCC extension, albeit a common and useful one; the C standard says that such accesses have implementation-defined behavior. */ for (u = t; TREE_CODE (u) == COMPONENT_REF || TREE_CODE (u) == ARRAY_REF; u = TREE_OPERAND (u, 0)) if (TREE_CODE (u) == COMPONENT_REF && TREE_CODE (TREE_TYPE (TREE_OPERAND (u, 0))) == UNION_TYPE) return 0; (which is something I am trying to get rid of for quite some time ...) And the latter gets alias-set (that of type 'string') (which I think is the better behavior, even if "miscompiling" this case). Not applying optimize_bit_field_compare also gets us better optimization, removing dead code already at the early GIMPLE level.
[Bug middle-end/71002] [6/7 Regression] -fstrict-aliasing breaks Boost's short string optimization implementation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71002 --- Comment #8 from Richard Biener --- Note that ultimatively the error is still that is_short () accesses the wrong union member. I'll still see whether there is a bug in optimize_bit_field_compare.
[Bug middle-end/71002] [6/7 Regression] -fstrict-aliasing breaks Boost's short string optimization implementation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71002 Richard Biener changed: What|Removed |Added Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org --- Comment #7 from Richard Biener --- So PRE sees: : # .MEM_3 = VDEF <.MEM_1(D)> MEM[(struct &)this_2(D)] ={v} {CLOBBER}; # .MEM_14 = VDEF <.MEM_3> MEM[(struct &)this_2(D)] ={v} {CLOBBER}; # .MEM_15 = VDEF <.MEM_14> MEM[(struct short_t &)this_2(D)].h.is_short = 1; # .MEM_16 = VDEF <.MEM_15> MEM[(struct short_t &)this_2(D)].h.length = 0; # .MEM_17 = VDEF <.MEM_16> MEM[(struct short_t &)this_2(D)].data[0] = 0; # VUSE <.MEM_17> _19 = BIT_FIELD_REF ; _20 = _19 & 1; if (_20 != 0) goto ; else goto ; : # VUSE <.MEM_17> _21 = BIT_FIELD_REF ; _22 = _21 & 1; if (_22 != 0) goto ; else goto ; : # .MEM_34 = VDEF <.MEM_17> short_backup = MEM[(const struct short_t &)this_2(D)]; # .MEM_35 = VDEF <.MEM_34> MEM[(struct long_t *)this_2(D)] = MEM[(const struct long_t *)str_5(D)]; # .MEM_36 = VDEF <.MEM_35> MEM[(struct short_t *)str_5(D)] = short_backup; # .MEM_37 = VDEF <.MEM_36> short_backup ={v} {CLOBBER}; goto ; : # .MEM_29 = PHI <.MEM_33(13), .MEM_37(5)> # VUSE <.MEM_29> _9 = MEM[(const struct short_t &)this_2(D)].h.length; _10 = (long unsigned int) _9; goto ; and it concludes that on the path bb2 -> bb5 -> bb10 nothing can clobber MEM[(const struct short_t &)this_2(D)].h.length and thus it is safe to replace it with zero (from the init in bb2). The "obvious" clobbering candidate is # .MEM_35 = VDEF <.MEM_34> MEM[(struct long_t *)this_2(D)] = MEM[(const struct long_t *)str_5(D)]; but that is storing using a different type. So the error must happen earlier. It might be as early as the BIT_FIELD_REF folding of the is_short read which does _19 = BIT_FIELD_REF ; It is DOM who threads further reads from the above again not considering the above store clobbering a read via 'struct string' (which does not include long_t in its alias subsets). Thus mine (to fix optimize_bit_field_compare).
[Bug middle-end/71002] [6/7 Regression] -fstrict-aliasing breaks Boost's short string optimization implementation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71002 --- Comment #6 from rguenther at suse dot de --- On Mon, 9 May 2016, tavianator at gmail dot com wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71002 > > --- Comment #5 from Tavian Barnes --- > > But if it is not POD then assuming it gets copied correctly when > > init-constructing a POD union where they placed such object is > > an interesting assumption... > > Hrm? They seem to always copy it manually with long_t's copy constructor: > > ::new(&m_repr.long_repr()) long_t(other.m_repr.long_repr()); I stand corrected. At least they always read from short_t when computing is_short() even if that is not the active union member. Some missed optimizations: foo::foo(string) (struct foo * const this, struct string & restrict str) { ... : MEM[(struct short_t &)this_2(D)].h.is_short = 1; MEM[(struct short_t &)this_2(D)].h.length = 0; MEM[(struct short_t &)this_2(D)].data[0] = 0; _19 = BIT_FIELD_REF ; _20 = _19 & 1; if (_20 != 0) this is fold-const.c at work replacing the load from h.is_short with the BIT_FIELD_REF. But then we go to short_t short_backup(m_repr.short_repr()); m_repr.short_repr().~short_t(); ::new(&m_repr.long_repr()) long_t(other.m_repr.long_repr()); other.m_repr.long_repr().~long_t(); ::new(&other.m_repr.short_repr()) short_t(short_backup); : short_backup = MEM[(const struct short_t &)this_2(D)]; MEM[(struct long_t *)this_2(D)] = MEM[(const struct long_t *)str_5(D)]; MEM[(struct short_t *)str_5(D)] = short_backup; short_backup ={v} {CLOBBER}; goto ; which looks good. But : # prephitmp_52 = PHI <_51(10), 0(5)> goto ; : # iftmp.7_13 = PHI *this_2(D).m_len = iftmp.7_13; stores zero into the length field. Somehow PRE does this transform. I'll investigate some more.
[Bug middle-end/71002] [6/7 Regression] -fstrict-aliasing breaks Boost's short string optimization implementation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71002 --- Comment #5 from Tavian Barnes --- > But if it is not POD then assuming it gets copied correctly when > init-constructing a POD union where they placed such object is > an interesting assumption... Hrm? They seem to always copy it manually with long_t's copy constructor: ::new(&m_repr.long_repr()) long_t(other.m_repr.long_repr());
[Bug middle-end/71002] [6/7 Regression] -fstrict-aliasing breaks Boost's short string optimization implementation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71002 --- Comment #4 from rguenther at suse dot de --- On Mon, 9 May 2016, tavianator at gmail dot com wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71002 > > --- Comment #3 from Tavian Barnes --- > Because their long_t is not POD. I don't know why that is though. It could > be > POD if they removed the default/copy constructors and assignment operator. > > Actually they're probably worried about custom allocators where the pointer > type is not POD. So it couldn't be POD in general, and thus can't appear in a > union directly (in C++03). But if it is not POD then assuming it gets copied correctly when init-constructing a POD union where they placed such object is an interesting assumption...
[Bug middle-end/71002] [6/7 Regression] -fstrict-aliasing breaks Boost's short string optimization implementation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71002 --- Comment #3 from Tavian Barnes --- Because their long_t is not POD. I don't know why that is though. It could be POD if they removed the default/copy constructors and assignment operator. Actually they're probably worried about custom allocators where the pointer type is not POD. So it couldn't be POD in general, and thus can't appear in a union directly (in C++03).
[Bug middle-end/71002] [6/7 Regression] -fstrict-aliasing breaks Boost's short string optimization implementation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71002 --- Comment #2 from Richard Biener --- The issue is similar to that in PR70054 in that for example string::swap_data copy-initializes repr_t which has the long_raw_t member that is not of the type that it is modified as (for some odd reason). (just from looking at the source now, not analyzing what GCC does to it) Why doesn't boost just use union repr_t { long_t l; short_t s; ... ?
[Bug middle-end/71002] [6/7 Regression] -fstrict-aliasing breaks Boost's short string optimization implementation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71002 Richard Biener changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2016-05-09 CC||rguenth at gcc dot gnu.org Known to work||5.3.0 Target Milestone|--- |6.2 Summary|[6 Regression] |[6/7 Regression] |-fstrict-aliasing breaks|-fstrict-aliasing breaks |Boost's short string|Boost's short string |optimization implementation |optimization implementation Ever confirmed|0 |1 Known to fail||6.1.0 --- Comment #1 from Richard Biener --- Confirmed the testcase fails with -O2 for GCC 6 and trunk but not GCC 5.