Hi!

On Thu, 19 Apr 2012 19:46:17 +0200, I wrote:

> Here is my current test case, reduced from gcc.dg/tree-ssa/20030922-1.c:
>
> extern void abort (void);
>
> enum tree_code
> {
>   BIND_EXPR,
> };
> struct tree_common
> {
>   enum tree_code code:8;
> };
> void
> voidify_wrapper_expr (struct tree_common *common)
> {
>   switch (common->code)
>     {
>     case BIND_EXPR:
>       if (common->code != BIND_EXPR)
>         abort ();
>     }
> }
>
> The diff for -fdump-tree-all between compiling with
> -fno-strict-volatile-bitfields and -fstrict-volatile-bitfields begins
> with the following, right in 20030922-1.c.003t.original:
>
> diff -ru fnsvb/20030922-1.c.003t.original fsvb/20030922-1.c.003t.original
> --- fnsvb/20030922-1.c.003t.original  2012-04-19 16:51:18.322150866 +0200
> +++ fsvb/20030922-1.c.003t.original   2012-04-19 16:49:18.132088498 +0200
> @@ -7,7 +7,7 @@
>    switch ((int) common->code)
>      {
>        case 0:;
> -      if (common->code != 0)
> +      if ((BIT_FIELD_REF <*common, 32, 0> & 255) != 0)
>          {
>            abort ();
>          }
>
> That is, for -fno-strict-volatile-bitfields the second instance of
> »common->code« is a COMPONENT_REF, whereas for
> -fstrict-volatile-bitfields it is a BIT_FIELD_REF.  The former will be
> optimized as intended, the latter will not.  So, what should be emitted
> in the -fstrict-volatile-bitfields case?  A COMPONENT_REF -- but this is
> what Bernd's commit r182545 (for PR51200) changed, I think?  Or should
> later optimization stages learn to better deal with a BIT_FIELD_REF, and
> where would this have to happen?
I figured out why this generic code is behaving differently for SH targets.  I compared to ARM as well as x86, for which indeed the optimization possibility is not lost even when compiling with -fstrict-volatile-bitfields.

The COMPONENT_REF inside the if statement (but not the one in the switch statement) is fed into optimize_bit_field_compare, where it is optimized to »BIT_FIELD_REF <*common, 32, 0> & 255« only for SH, because get_best_mode for SH returns SImode (ARM, x86: QImode).  That is because SH has »#define SLOW_BYTE_ACCESS 1«, and thus the original QImode is widened to SImode, hoping for better code generation »since it may eliminate subsequent memory access if subsequent accesses occur to other fields in the same word of the structure, but to different bytes«.  (Well, the converse happens here...)

After that, in optimize_bit_field_compare, for ARM and x86 »nbitsize == lbitsize« holds, and thus the optimization is aborted, whereas for SH it will be performed, that is, the COMPONENT_REF is replaced with »BIT_FIELD_REF <*common, 32, 0> & 255«.  If I manually force SH's SImode back to QImode (that is, abort optimize_bit_field_compare), the later optimization passes work as they do for ARM and x86.

Before Bernd's r182545, optimize_bit_field_compare returned early (that is, aborted this optimization attempt), because »lbitsize == GET_MODE_BITSIZE (lmode)« (8 bits).  (With the current sources, lmode is VOIDmode.)

Is emitting »BIT_FIELD_REF <*common, 32, 0> & 255« wrong in this case, or should a later optimization pass be able to figure out that »BIT_FIELD_REF <*common, 32, 0> & 255« is in fact the same as common->code, and then be able to conflate these?  Any suggestions where/how to tackle this?

Regards,
Thomas