On Wed, Apr 25, 2012 at 1:27 PM, Thomas Schwinge <tho...@codesourcery.com> wrote:
> Hi!
>
> On Thu, 19 Apr 2012 19:46:17 +0200, I wrote:
>> Here is my current test case, reduced from gcc.dg/tree-ssa/20030922-1.c:
>>
>> extern void abort (void);
>>
>> enum tree_code
>> {
>>   BIND_EXPR,
>> };
>> struct tree_common
>> {
>>   enum tree_code code:8;
>> };
>> void
>> voidify_wrapper_expr (struct tree_common *common)
>> {
>>   switch (common->code)
>>     {
>>     case BIND_EXPR:
>>       if (common->code != BIND_EXPR)
>>         abort ();
>>     }
>> }
>>
>> The diff for -fdump-tree-all between compiling with
>> -fno-strict-volatile-bitfields and -fstrict-volatile-bitfields begins
>> with the following, right in 20030922-1.c.003t.original:
>>
>> diff -ru fnsvb/20030922-1.c.003t.original fsvb/20030922-1.c.003t.original
>> --- fnsvb/20030922-1.c.003t.original    2012-04-19 16:51:18.322150866 +0200
>> +++ fsvb/20030922-1.c.003t.original     2012-04-19 16:49:18.132088498 +0200
>> @@ -7,7 +7,7 @@
>>      switch ((int) common->code)
>>        {
>>          case 0:;
>> -        if (common->code != 0)
>> +        if ((BIT_FIELD_REF <*common, 32, 0> & 255) != 0)
>>            {
>>              abort ();
>>            }
>>
>> That is, for -fno-strict-volatile-bitfields the second instance of
>> »common->code« is a component_ref, whereas for
>> -fstrict-volatile-bitfields it is a bit_field_ref.  The former will be
>> optimized as intended, the latter will not.  So, what should be emitted
>> in the -fstrict-volatile-bitfields case?  A component_ref -- but this is
>> what Bernd's commit r182545 (for PR51200) changed, I think?  Or should
>> later optimization stages learn to better deal with a bit_field_ref, and
>> where would this have to happen?
>
> I figured out why this generic code is behaving differently for SH
> targets.  I compared to ARM as well as x86, for which the optimization
> possibility is indeed not lost even when compiling with
> -fstrict-volatile-bitfields.
>
> The component_ref inside the if statement (but not the one in the switch
> statement) is fed into optimize_bit_field_compare, where it is optimized
> to »BIT_FIELD_REF <*common, 32, 0> & 255« only for SH, because
> get_best_mode for SH returns SImode (ARM, x86: QImode).  This is because
> SH has »#define SLOW_BYTE_ACCESS 1«, and thus the original QImode is
> widened to SImode, hoping for better code generation »since it may
> eliminate subsequent memory access if subsequent accesses occur to other
> fields in the same word of the structure, but to different bytes«.
> (Well, the converse happens here...)  After that, in
> optimize_bit_field_compare, for ARM and x86, »nbitsize == lbitsize«, and
> thus the optimization is aborted, whereas for SH it will be performed,
> that is, the component_ref is replaced with
> »BIT_FIELD_REF <*common, 32, 0> & 255«.
>
> If I manually force SH's SImode back to QImode (that is, abort
> optimize_bit_field_compare), the later optimization passes work as they
> do for ARM and x86.
>
> Before Bernd's r182545, optimize_bit_field_compare returned early (that
> is, aborted this optimization attempt), because »lbitsize ==
> GET_MODE_BITSIZE (lmode)« (8 bits).  (With the current sources, lmode is
> VOIDmode.)
>
> Is emitting »BIT_FIELD_REF <*common, 32, 0> & 255« wrong in this case,
> or should a later optimization pass be able to figure out that
> »BIT_FIELD_REF <*common, 32, 0> & 255« is in fact the same as
> common->code, and then be able to conflate these?  Any suggestions
> where/how to tackle this?
The BIT_FIELD_REF is somewhat of a red herring.  It is created by
fold-const.c in optimize_bit_field_compare, code that I think should be
removed completely.  Or it needs to be made aware of strict-volatile
bitfields and C++ memory model details.

Richard.

> Grüße,
>  Thomas