Re: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 1/2
On Mon, Nov 18, 2013 at 1:11 PM, Bernd Edlinger <bernd.edlin...@hotmail.de> wrote:
> Hi,
>
> This modified test case exposes a bug in the already approved part of the
> strict-volatile-bitfields patch:
>
>   #include <stdlib.h>
>
>   typedef struct
>   {
>     char pad;
>     int arr[0];
>   } __attribute__ ((packed)) str;
>
>   str *
>   foo (int *src)
>   {
>     str *s = malloc (sizeof (str) + sizeof (int));
>     s->arr[0] = 0x12345678;
>     asm volatile ("" ::: "memory");
>     *src = s->arr[0];
>     return s;
>   }
>
> As we know, this test case triggered a recursion in store_bit_field on ARM
> and on PowerPC, which is no longer reproducible after this patch is
> applied: http://gcc.gnu.org/ml/gcc-patches/2013-11/msg02025.html
>
> Additionally it triggered a recursion in extract_bit_field, but _only_ on
> my local copy of the trunk.  I had this patch installed, but did not
> expect it to change anything unless the values are volatile.  That was
> caused by this hunk in the strict-volatile-bitfields v4 patch:
>
>   @@ -1691,45 +1736,19 @@ extract_fixed_bit_field (enum machine_mo
>        includes the entire field.  If such a mode would be larger than a
>        word, we won't be doing the extraction the normal way.  */
>
>   -  if (MEM_VOLATILE_P (op0)
>   -      && flag_strict_volatile_bitfields > 0)
>   -    {
>   -      if (GET_MODE_BITSIZE (GET_MODE (op0)) > 0)
>   -        mode = GET_MODE (op0);
>   -      else if (target && GET_MODE_BITSIZE (GET_MODE (target)) > 0)
>   -        mode = GET_MODE (target);
>   -      else
>   -        mode = tmode;
>   -    }
>   -  else
>   -    mode = get_best_mode (bitsize, bitnum, 0, 0,
>   -                          MEM_ALIGN (op0), word_mode, MEM_VOLATILE_P (op0));
>   +  mode = GET_MODE (op0);
>   +  if (GET_MODE_BITSIZE (mode) == 0
>   +      || GET_MODE_BITSIZE (mode) > GET_MODE_BITSIZE (word_mode))
>   +    mode = word_mode;
>   +  mode = get_best_mode (bitsize, bitnum, 0, 0,
>   +                        MEM_ALIGN (op0), mode, MEM_VOLATILE_P (op0));
>
>      if (mode == VOIDmode)
>        /* The only way this should occur is if the field spans word
>           boundaries.  */
>        return extract_split_bit_field (op0, bitsize, bitnum, unsignedp);
>
> So the problem started because initially this function did not look at
> GET_MODE (op0) and always used word_mode.
> That was changed, but now it also affected non-volatile data.
>
> Now, if we solve this differently and install the C++ memory model patch,
> we can avoid introducing the recursion in the extract path, and remove
> these two hunks in the update patch at the same time:
>
>   +  else if (MEM_P (str_rtx)
>   +           && MEM_VOLATILE_P (str_rtx)
>   +           && flag_strict_volatile_bitfields > 0)
>   +    /* This is a case where -fstrict-volatile-bitfields doesn't apply
>   +       because we can't do a single access in the declared mode of the
>   +       field.  Since the incoming STR_RTX has already been adjusted to
>   +       that mode, fall back to word mode for subsequent logic.  */
>   +    str_rtx = adjust_address (str_rtx, word_mode, 0);
>
> Attached you'll find a new version of the bitfields-update patch; it is
> again relative to the already approved version of the volatile-bitfields
> patch v4, part 1/2.
>
> Bootstrapped and regression-tested on x86_64-linux-gnu; additionally
> tested with an ARM cross-compiler.
>
> OK for trunk?

Ok.

Thanks,
Richard.

> Thanks
> Bernd.
RE: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 1/2
Hi,

This modified test case exposes a bug in the already approved part of the strict-volatile-bitfields patch:

  #include <stdlib.h>

  typedef struct
  {
    char pad;
    int arr[0];
  } __attribute__ ((packed)) str;

  str *
  foo (int *src)
  {
    str *s = malloc (sizeof (str) + sizeof (int));
    s->arr[0] = 0x12345678;
    asm volatile ("" ::: "memory");
    *src = s->arr[0];
    return s;
  }

As we know, this test case triggered a recursion in store_bit_field on ARM and on PowerPC, which is no longer reproducible after this patch is applied: http://gcc.gnu.org/ml/gcc-patches/2013-11/msg02025.html

Additionally it triggered a recursion in extract_bit_field, but _only_ on my local copy of the trunk.  I had this patch installed, but did not expect it to change anything unless the values are volatile.  That was caused by this hunk in the strict-volatile-bitfields v4 patch:

  @@ -1691,45 +1736,19 @@ extract_fixed_bit_field (enum machine_mo
       includes the entire field.  If such a mode would be larger than a
       word, we won't be doing the extraction the normal way.  */

  -  if (MEM_VOLATILE_P (op0)
  -      && flag_strict_volatile_bitfields > 0)
  -    {
  -      if (GET_MODE_BITSIZE (GET_MODE (op0)) > 0)
  -        mode = GET_MODE (op0);
  -      else if (target && GET_MODE_BITSIZE (GET_MODE (target)) > 0)
  -        mode = GET_MODE (target);
  -      else
  -        mode = tmode;
  -    }
  -  else
  -    mode = get_best_mode (bitsize, bitnum, 0, 0,
  -                          MEM_ALIGN (op0), word_mode, MEM_VOLATILE_P (op0));
  +  mode = GET_MODE (op0);
  +  if (GET_MODE_BITSIZE (mode) == 0
  +      || GET_MODE_BITSIZE (mode) > GET_MODE_BITSIZE (word_mode))
  +    mode = word_mode;
  +  mode = get_best_mode (bitsize, bitnum, 0, 0,
  +                        MEM_ALIGN (op0), mode, MEM_VOLATILE_P (op0));

     if (mode == VOIDmode)
       /* The only way this should occur is if the field spans word
          boundaries.  */
       return extract_split_bit_field (op0, bitsize, bitnum, unsignedp);

So the problem started because initially this function did not look at GET_MODE (op0) and always used word_mode.  That was changed, but now it also affected non-volatile data.
Now, if we solve this differently and install the C++ memory model patch, we can avoid introducing the recursion in the extract path, and remove these two hunks in the update patch at the same time:

  +  else if (MEM_P (str_rtx)
  +           && MEM_VOLATILE_P (str_rtx)
  +           && flag_strict_volatile_bitfields > 0)
  +    /* This is a case where -fstrict-volatile-bitfields doesn't apply
  +       because we can't do a single access in the declared mode of the
  +       field.  Since the incoming STR_RTX has already been adjusted to
  +       that mode, fall back to word mode for subsequent logic.  */
  +    str_rtx = adjust_address (str_rtx, word_mode, 0);

Attached you'll find a new version of the bitfields-update patch; it is again relative to the already approved version of the volatile-bitfields patch v4, part 1/2.

Bootstrapped and regression-tested on x86_64-linux-gnu; additionally tested with an ARM cross-compiler.

OK for trunk?

Thanks
Bernd.

2013-11-18  Bernd Edlinger  <bernd.edlin...@hotmail.de>
	    Sandra Loosemore  <san...@codesourcery.com>

	PR middle-end/23623
	PR middle-end/48784
	PR middle-end/56341
	PR middle-end/56997
	* expmed.c (strict_volatile_bitfield_p): Add bitregion_start and
	bitregion_end parameters.  Test for compliance with C++ memory
	model.
	(store_bit_field): Adjust call to strict_volatile_bitfield_p.
	Add fallback logic for cases where -fstrict-volatile-bitfields
	is supposed to apply, but cannot.
	(extract_bit_field): Likewise.  Use narrow_bit_field_mem and
	extract_fixed_bit_field_1 to do the extraction.
	(extract_fixed_bit_field): Revert to previous mode selection
	algorithm.  Call extract_fixed_bit_field_1 to do the real work.
	(extract_fixed_bit_field_1): New function.

testsuite:

2013-11-18  Bernd Edlinger  <bernd.edlin...@hotmail.de>
	    Sandra Loosemore  <san...@codesourcery.com>

	* gcc.dg/pr23623.c: Update to test interaction with C++
	memory model.

patch-bitfields-update-1.diff
Description: Binary data
RE: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 1/2
Hi,

On Thu, 14 Nov 2013 16:31:10, Richard Biener wrote:

On Thu, Nov 14, 2013 at 11:16 AM, Bernd Edlinger <bernd.edlin...@hotmail.de> wrote:

Hi,

sorry for the delay.  Sandra seems to be even more busy than me...

Attached is a combined patch of the original part 1 and the update, in diff -up format.

On Mon, 11 Nov 2013 13:10:45, Richard Biener wrote:

On Thu, Oct 31, 2013 at 1:46 AM, Sandra Loosemore <san...@codesourcery.com> wrote:

On 10/29/2013 02:51 AM, Bernd Edlinger wrote:

On Mon, 28 Oct 2013 21:29:24, Sandra Loosemore wrote:

On 10/28/2013 03:20 AM, Bernd Edlinger wrote:

I have attached an update to your patch, that should a) fix the recursion problem, and b) restrict -fstrict-volatile-bitfields to not violate the C++ memory model.

Here's a new version of the update patch.

Alternatively, if strict_volatile_bitfield_p returns false but flag_strict_volatile_bitfields > 0, then always force to word_mode and change the -fstrict-volatile-bitfields documentation to indicate that's the fallback if the insertion/extraction cannot be done in the declared mode, rather than claiming that it tries to do the same thing as if -fstrict-volatile-bitfields were not enabled at all.

I decided that this approach was more expedient, after all.  I've tested this patch (in conjunction with my already-approved but not-yet-applied patch) on mainline for arm-none-eabi, x86_64-linux-gnu, and mips-linux-gnu.  I also backported the entire series to GCC 4.8 and tested there on arm-none-eabi and x86_64-linux-gnu.  OK to apply?

Hm, I can't seem to find the context for

  @@ -923,6 +935,14 @@
         store_fixed_bit_field (str_rtx, bitsize, bitnum, 0, 0, value);
         return;
       }
  +  else if (MEM_P (str_rtx)
  +           && MEM_VOLATILE_P (str_rtx)
  +           && flag_strict_volatile_bitfields > 0)
  +    /* This is a case where -fstrict-volatile-bitfields doesn't apply
  +       because we can't do a single access in the declared mode of the
  +       field.  Since the incoming STR_RTX has already been adjusted to
  +       that mode, fall back to word mode for subsequent logic.
  +       */
  +    str_rtx = adjust_address (str_rtx, word_mode, 0);

     /* Under the C++0x memory model, we must not touch bits outside
        the bit region.  Adjust the address to start at the beginning of the

and the other similar hunk.  I suppose they apply to earlier patches in the series?

I suppose the above applies to store_bit_field (diff -p really helps!).

Why would using word_mode be any good as fallback?  That is, why is "Since the incoming STR_RTX has already been adjusted to that mode" not the thing to fix?

Well, this hunk does not force the access to be in word_mode.  Instead it allows get_best_mode to choose the access to be in any mode from QI to word_mode.  It is there to revert the effect of this weird code in expr.c (expand_assignment):

  if (volatilep && flag_strict_volatile_bitfields > 0)
    to_rtx = adjust_address (to_rtx, mode1, 0);

Note that this does not even check if the access is on a bit-field!

Then why not remove that ...

The problem with strict_volatile_bitfields is that it is used already before the code reaches store_bit_field or extract_bit_field.  It starts in get_inner_reference (which is not only used in expand_assignment and expand_expr_real_1).  Then this:

  if (volatilep && flag_strict_volatile_bitfields > 0)
    op0 = adjust_address (op0, mode1, 0);

... and this ...

and then this:

  /* If the field is volatile, we always want an aligned access.
     Do this in following two situations:
     1. the access is not already naturally aligned, otherwise normal
	(non-bitfield) volatile fields become non-addressable.
     2. the bitsize is narrower than the access size.  Need to extract
	bitfields from the access.  */
  || (volatilep && flag_strict_volatile_bitfields > 0
      && (bitpos % GET_MODE_ALIGNMENT (mode) != 0
	  || (mode1 != BLKmode
	      && bitsize < GET_MODE_SIZE (mode1) * BITS_PER_UNIT)))

... or this ...

As a result, a read access to an unaligned volatile data member does not even reach extract_bit_field if flag_strict_volatile_bitfields >= 0, and instead goes through convert_move (target, op0, unsignedp).
I still believe the proposed patch is guaranteed not to change anything if -fno-strict-volatile-bitfields is used, and even if we cannot guarantee that it creates exactly the same code for cases where strict-volatile-bitfields does not apply, it certainly generates valid code where we had invalid code or ICEs without the patch.

OK for trunk?

Again, most of the patch is ok (and nice); the store_bit_field/extract_bit_field changes point to the above issues, which we should rather fix than trying to revert them after the fact.  Why is that not possible?

That said,

  +  else if (MEM_P (str_rtx)
  +           && MEM_VOLATILE_P (str_rtx)
  +           && flag_strict_volatile_bitfields > 0)
  +    /* This is a case where -fstrict-volatile-bitfields doesn't apply
  +       because we can't do a single access in the declared mode of the
  +       field.  Since the incoming STR_RTX has already been
RE: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 1/2
hmm... the above change is just not aggressive enough.  With a slightly changed test case I have now got a recursion in extract_fixed_bit_field on ARM, but interestingly not on PowerPC:

  #include <stdlib.h>

  typedef struct
  {
    char pad;
    int arr[0];
  } __attribute__ ((packed)) str;

  str *
  foo (int *src)
  {
    str *s = malloc (sizeof (str) + sizeof (int));
    *src = s->arr[0];
    s->arr[0] = 0x12345678;
    return s;
  }

Now I think about reverting that hunk back to what I had in mind initially:

  else if (MEM_P (str_rtx)
	   && GET_MODE_BITSIZE (GET_MODE (str_rtx)) != 0
	   && GET_MODE_BITSIZE (GET_MODE (str_rtx))
	      < GET_MODE_BITSIZE (word_mode)
	   && bitnum % GET_MODE_BITSIZE (GET_MODE (str_rtx)) + bitsize
	      > GET_MODE_BITSIZE (GET_MODE (str_rtx)))
    /* If the mode of str_rtx is too small or unaligned fall back
       to word mode for subsequent logic.  */
    str_rtx = adjust_address (str_rtx, word_mode, 0);

This fixes the recursion on the read side too.  And it is limited to cases where the mode does not match the bitnum and bitsize parameters.

But that's not conditional on -fstrict-volatile-bitfields, and frankly it doesn't make sense to me.  Why is the recursion not there for -fno-strict-volatile-bitfields?

The problem here is different than we thought.  Both recursions were initially observed only with volatile accesses and -fstrict-volatile-bitfields; that's why this looked like the right thing to do.  But PR59134 shows clearly that both recursions have nothing to do with strict-volatile-bitfields.  They happen just because the hardware is not always able to access unaligned data in one instruction, and the mode in str_rtx is sometimes too restrictive.

Now in this test case we have s, a packed structure, but malloc returns BIGGEST_ALIGNMENT-aligned memory, and at -O2 the back-end sees the larger alignment.  But the mode of str_rtx is QImode, because that is the mode of the struct str.  This mode is only good for accessing pad, but not for arr[0].  It never makes sense to access 32 bits in QImode, especially if the memory is unaligned.
That is how this hunk resolves the issue.

Bernd.
Re: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 1/2
On Fri, Nov 15, 2013 at 1:08 PM, Bernd Edlinger <bernd.edlin...@hotmail.de> wrote:

hmm... the above change is just not aggressive enough.  With a slightly changed test case I have now got a recursion in extract_fixed_bit_field on ARM, but interestingly not on PowerPC:

  #include <stdlib.h>

  typedef struct
  {
    char pad;
    int arr[0];
  } __attribute__ ((packed)) str;

  str *
  foo (int *src)
  {
    str *s = malloc (sizeof (str) + sizeof (int));
    *src = s->arr[0];
    s->arr[0] = 0x12345678;
    return s;
  }

Now I think about reverting that hunk back to what I had in mind initially:

  else if (MEM_P (str_rtx)
	   && GET_MODE_BITSIZE (GET_MODE (str_rtx)) != 0
	   && GET_MODE_BITSIZE (GET_MODE (str_rtx))
	      < GET_MODE_BITSIZE (word_mode)
	   && bitnum % GET_MODE_BITSIZE (GET_MODE (str_rtx)) + bitsize
	      > GET_MODE_BITSIZE (GET_MODE (str_rtx)))
    /* If the mode of str_rtx is too small or unaligned fall back
       to word mode for subsequent logic.  */
    str_rtx = adjust_address (str_rtx, word_mode, 0);

This fixes the recursion on the read side too.  And it is limited to cases where the mode does not match the bitnum and bitsize parameters.

But that's not conditional on -fstrict-volatile-bitfields, and frankly it doesn't make sense to me.  Why is the recursion not there for -fno-strict-volatile-bitfields?

The problem here is different than we thought.  Both recursions were initially observed only with volatile accesses and -fstrict-volatile-bitfields; that's why this looked like the right thing to do.  But PR59134 shows clearly that both recursions have nothing to do with strict-volatile-bitfields.  They happen just because the hardware is not always able to access unaligned data in one instruction, and the mode in str_rtx is sometimes too restrictive.

Now in this test case we have s, a packed structure, but malloc returns BIGGEST_ALIGNMENT-aligned memory, and at -O2 the back-end sees the larger alignment.  But the mode of str_rtx is QImode, because that is the mode of the struct str.  This mode is only good for accessing pad, but not for arr[0].
It never makes sense to access 32 bits in QImode, especially if the memory is unaligned.  That is how this hunk resolves the issue.

But then why is the mode QImode in the first place?  The access is definitely of SImode.

Richard.
RE: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 1/2
But then why is the mode QImode in the first place?  The access is definitely of SImode.

That's in the test case:

  s->arr[0] = 0x12345678;

It is QImode from this in expand_assignment:

  to_rtx = expand_expr (tem, NULL_RTX, VOIDmode, EXPAND_WRITE);

tem is s, a MEM_REF, of QImode, perfectly aligned.  The mode is only OK to access *s or s->pad.  It is wrong for s->arr[0].

In store_bit_field the mode is used in store_fixed_bit_field:

  mode = GET_MODE (op0);
  if (GET_MODE_BITSIZE (mode) == 0
      || GET_MODE_BITSIZE (mode) > GET_MODE_BITSIZE (word_mode))
    mode = word_mode;
  mode = get_best_mode (bitsize, bitnum, bitregion_start, bitregion_end,
			MEM_ALIGN (op0), mode, MEM_VOLATILE_P (op0));

  if (mode == VOIDmode)
    goto boom;

So this restricts the possible access mode; word_mode means no restriction.  Everything would be OK if MEM_ALIGN (op0) were byte-aligned, but we have a problem if MEM_ALIGN (op0) >= WORD_MODE.

Do you understand?

Bernd.
Re: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 1/2
On Fri, Nov 15, 2013 at 2:24 PM, Bernd Edlinger <bernd.edlin...@hotmail.de> wrote:

> But then why is the mode QImode in the first place?  The access is
> definitely of SImode.
>
> That's in the test case:
>
>   s->arr[0] = 0x12345678;
>
> It is QImode from this in expand_assignment:
>
>   to_rtx = expand_expr (tem, NULL_RTX, VOIDmode, EXPAND_WRITE);
>
> tem is s, a MEM_REF, of QImode, perfectly aligned.  The mode is only OK
> to access *s or s->pad.  It is wrong for s->arr[0].

I understand that, but why isn't that immediately adjusted to use TYPE_MODE (TREE_TYPE (to))?  The above expands *s, not s->arr[0].  Using the mode of *s if it is not equal to that of the destination doesn't make any sense (usually it will be BLKmode anyway).  This seems to be the source of multiple problems.

> In store_bit_field the mode is used in store_fixed_bit_field:
>
>   mode = GET_MODE (op0);
>   if (GET_MODE_BITSIZE (mode) == 0
>       || GET_MODE_BITSIZE (mode) > GET_MODE_BITSIZE (word_mode))
>     mode = word_mode;
>   mode = get_best_mode (bitsize, bitnum, bitregion_start, bitregion_end,
>			  MEM_ALIGN (op0), mode, MEM_VOLATILE_P (op0));
>
>   if (mode == VOIDmode)
>     goto boom;
>
> So this restricts the possible access mode; word_mode means no
> restriction.  Everything would be OK if MEM_ALIGN (op0) were
> byte-aligned, but we have a problem if MEM_ALIGN (op0) >= WORD_MODE.
>
> Do you understand?
>
> Bernd.
RE: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 1/2
Hi,

sorry for the delay.  Sandra seems to be even more busy than me...

Attached is a combined patch of the original part 1 and the update, in diff -up format.

On Mon, 11 Nov 2013 13:10:45, Richard Biener wrote:

On Thu, Oct 31, 2013 at 1:46 AM, Sandra Loosemore <san...@codesourcery.com> wrote:

On 10/29/2013 02:51 AM, Bernd Edlinger wrote:

On Mon, 28 Oct 2013 21:29:24, Sandra Loosemore wrote:

On 10/28/2013 03:20 AM, Bernd Edlinger wrote:

I have attached an update to your patch, that should a) fix the recursion problem, and b) restrict -fstrict-volatile-bitfields to not violate the C++ memory model.

Here's a new version of the update patch.

Alternatively, if strict_volatile_bitfield_p returns false but flag_strict_volatile_bitfields > 0, then always force to word_mode and change the -fstrict-volatile-bitfields documentation to indicate that's the fallback if the insertion/extraction cannot be done in the declared mode, rather than claiming that it tries to do the same thing as if -fstrict-volatile-bitfields were not enabled at all.

I decided that this approach was more expedient, after all.  I've tested this patch (in conjunction with my already-approved but not-yet-applied patch) on mainline for arm-none-eabi, x86_64-linux-gnu, and mips-linux-gnu.  I also backported the entire series to GCC 4.8 and tested there on arm-none-eabi and x86_64-linux-gnu.  OK to apply?

Hm, I can't seem to find the context for

  @@ -923,6 +935,14 @@
         store_fixed_bit_field (str_rtx, bitsize, bitnum, 0, 0, value);
         return;
       }
  +  else if (MEM_P (str_rtx)
  +           && MEM_VOLATILE_P (str_rtx)
  +           && flag_strict_volatile_bitfields > 0)
  +    /* This is a case where -fstrict-volatile-bitfields doesn't apply
  +       because we can't do a single access in the declared mode of the
  +       field.  Since the incoming STR_RTX has already been adjusted to
  +       that mode, fall back to word mode for subsequent logic.  */
  +    str_rtx = adjust_address (str_rtx, word_mode, 0);

     /* Under the C++0x memory model, we must not touch bits outside
        the bit region.
        Adjust the address to start at the beginning of the

and the other similar hunk.  I suppose they apply to earlier patches in the series?

I suppose the above applies to store_bit_field (diff -p really helps!).

Why would using word_mode be any good as fallback?  That is, why is "Since the incoming STR_RTX has already been adjusted to that mode" not the thing to fix?

Well, this hunk does not force the access to be in word_mode.  Instead it allows get_best_mode to choose the access to be in any mode from QI to word_mode.  It is there to revert the effect of this weird code in expr.c (expand_assignment):

  if (volatilep && flag_strict_volatile_bitfields > 0)
    to_rtx = adjust_address (to_rtx, mode1, 0);

Note that this does not even check if the access is on a bit-field!

The problem with strict_volatile_bitfields is that it is used already before the code reaches store_bit_field or extract_bit_field.  It starts in get_inner_reference (which is not only used in expand_assignment and expand_expr_real_1).  Then this:

  if (volatilep && flag_strict_volatile_bitfields > 0)
    op0 = adjust_address (op0, mode1, 0);

and then this:

  /* If the field is volatile, we always want an aligned access.
     Do this in following two situations:
     1. the access is not already naturally aligned, otherwise normal
	(non-bitfield) volatile fields become non-addressable.
     2. the bitsize is narrower than the access size.  Need to extract
	bitfields from the access.  */
  || (volatilep && flag_strict_volatile_bitfields > 0
      && (bitpos % GET_MODE_ALIGNMENT (mode) != 0
	  || (mode1 != BLKmode
	      && bitsize < GET_MODE_SIZE (mode1) * BITS_PER_UNIT)))

As a result, a read access to an unaligned volatile data member does not even reach extract_bit_field if flag_strict_volatile_bitfields >= 0, and instead goes through convert_move (target, op0, unsignedp).
I still believe the proposed patch is guaranteed not to change anything if -fno-strict-volatile-bitfields is used, and even if we cannot guarantee that it creates exactly the same code for cases where strict-volatile-bitfields does not apply, it certainly generates valid code where we had invalid code or ICEs without the patch.

OK for trunk?

Bernd.

patch-bitfields.diff
Description: Binary data
Re: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 1/2
On Thu, Nov 14, 2013 at 11:16 AM, Bernd Edlinger <bernd.edlin...@hotmail.de> wrote:

Hi,

sorry for the delay.  Sandra seems to be even more busy than me...

Attached is a combined patch of the original part 1 and the update, in diff -up format.

On Mon, 11 Nov 2013 13:10:45, Richard Biener wrote:

On Thu, Oct 31, 2013 at 1:46 AM, Sandra Loosemore <san...@codesourcery.com> wrote:

On 10/29/2013 02:51 AM, Bernd Edlinger wrote:

On Mon, 28 Oct 2013 21:29:24, Sandra Loosemore wrote:

On 10/28/2013 03:20 AM, Bernd Edlinger wrote:

I have attached an update to your patch, that should a) fix the recursion problem, and b) restrict -fstrict-volatile-bitfields to not violate the C++ memory model.

Here's a new version of the update patch.

Alternatively, if strict_volatile_bitfield_p returns false but flag_strict_volatile_bitfields > 0, then always force to word_mode and change the -fstrict-volatile-bitfields documentation to indicate that's the fallback if the insertion/extraction cannot be done in the declared mode, rather than claiming that it tries to do the same thing as if -fstrict-volatile-bitfields were not enabled at all.

I decided that this approach was more expedient, after all.  I've tested this patch (in conjunction with my already-approved but not-yet-applied patch) on mainline for arm-none-eabi, x86_64-linux-gnu, and mips-linux-gnu.  I also backported the entire series to GCC 4.8 and tested there on arm-none-eabi and x86_64-linux-gnu.  OK to apply?

Hm, I can't seem to find the context for

  @@ -923,6 +935,14 @@
         store_fixed_bit_field (str_rtx, bitsize, bitnum, 0, 0, value);
         return;
       }
  +  else if (MEM_P (str_rtx)
  +           && MEM_VOLATILE_P (str_rtx)
  +           && flag_strict_volatile_bitfields > 0)
  +    /* This is a case where -fstrict-volatile-bitfields doesn't apply
  +       because we can't do a single access in the declared mode of the
  +       field.  Since the incoming STR_RTX has already been adjusted to
  +       that mode, fall back to word mode for subsequent logic.
  +       */
  +    str_rtx = adjust_address (str_rtx, word_mode, 0);

     /* Under the C++0x memory model, we must not touch bits outside
        the bit region.  Adjust the address to start at the beginning of the

and the other similar hunk.  I suppose they apply to earlier patches in the series?

I suppose the above applies to store_bit_field (diff -p really helps!).

Why would using word_mode be any good as fallback?  That is, why is "Since the incoming STR_RTX has already been adjusted to that mode" not the thing to fix?

Well, this hunk does not force the access to be in word_mode.  Instead it allows get_best_mode to choose the access to be in any mode from QI to word_mode.  It is there to revert the effect of this weird code in expr.c (expand_assignment):

  if (volatilep && flag_strict_volatile_bitfields > 0)
    to_rtx = adjust_address (to_rtx, mode1, 0);

Note that this does not even check if the access is on a bit-field!

Then why not remove that ...

The problem with strict_volatile_bitfields is that it is used already before the code reaches store_bit_field or extract_bit_field.  It starts in get_inner_reference (which is not only used in expand_assignment and expand_expr_real_1).  Then this:

  if (volatilep && flag_strict_volatile_bitfields > 0)
    op0 = adjust_address (op0, mode1, 0);

... and this ...

and then this:

  /* If the field is volatile, we always want an aligned access.
     Do this in following two situations:
     1. the access is not already naturally aligned, otherwise normal
	(non-bitfield) volatile fields become non-addressable.
     2. the bitsize is narrower than the access size.  Need to extract
	bitfields from the access.  */
  || (volatilep && flag_strict_volatile_bitfields > 0
      && (bitpos % GET_MODE_ALIGNMENT (mode) != 0
	  || (mode1 != BLKmode
	      && bitsize < GET_MODE_SIZE (mode1) * BITS_PER_UNIT)))

... or this ...

As a result, a read access to an unaligned volatile data member does not even reach extract_bit_field if flag_strict_volatile_bitfields >= 0, and instead goes through convert_move (target, op0, unsignedp).
I still believe the proposed patch is guaranteed not to change anything if -fno-strict-volatile-bitfields is used, and even if we cannot guarantee that it creates exactly the same code for cases where strict-volatile-bitfields does not apply, it certainly generates valid code where we had invalid code or ICEs without the patch.

OK for trunk?

Again, most of the patch is ok (and nice); the store_bit_field/extract_bit_field changes point to the above issues, which we should rather fix than trying to revert them after the fact.  Why is that not possible?

That said,

  +  else if (MEM_P (str_rtx)
  +           && MEM_VOLATILE_P (str_rtx)
  +           && flag_strict_volatile_bitfields > 0)
  +    /* This is a case
Re: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 1/2
On Thu, Oct 31, 2013 at 1:46 AM, Sandra Loosemore <san...@codesourcery.com> wrote:

On 10/29/2013 02:51 AM, Bernd Edlinger wrote:

On Mon, 28 Oct 2013 21:29:24, Sandra Loosemore wrote:

On 10/28/2013 03:20 AM, Bernd Edlinger wrote:

I have attached an update to your patch, that should a) fix the recursion problem, and b) restrict -fstrict-volatile-bitfields to not violate the C++ memory model.

Here's a new version of the update patch.

Alternatively, if strict_volatile_bitfield_p returns false but flag_strict_volatile_bitfields > 0, then always force to word_mode and change the -fstrict-volatile-bitfields documentation to indicate that's the fallback if the insertion/extraction cannot be done in the declared mode, rather than claiming that it tries to do the same thing as if -fstrict-volatile-bitfields were not enabled at all.

I decided that this approach was more expedient, after all.  I've tested this patch (in conjunction with my already-approved but not-yet-applied patch) on mainline for arm-none-eabi, x86_64-linux-gnu, and mips-linux-gnu.  I also backported the entire series to GCC 4.8 and tested there on arm-none-eabi and x86_64-linux-gnu.  OK to apply?

Hm, I can't seem to find the context for

  @@ -923,6 +935,14 @@
         store_fixed_bit_field (str_rtx, bitsize, bitnum, 0, 0, value);
         return;
       }
  +  else if (MEM_P (str_rtx)
  +           && MEM_VOLATILE_P (str_rtx)
  +           && flag_strict_volatile_bitfields > 0)
  +    /* This is a case where -fstrict-volatile-bitfields doesn't apply
  +       because we can't do a single access in the declared mode of the
  +       field.  Since the incoming STR_RTX has already been adjusted to
  +       that mode, fall back to word mode for subsequent logic.  */
  +    str_rtx = adjust_address (str_rtx, word_mode, 0);

     /* Under the C++0x memory model, we must not touch bits outside
        the bit region.  Adjust the address to start at the beginning of the

and the other similar hunk.  I suppose they apply to earlier patches in the series?

I suppose the above applies to store_bit_field (diff -p really helps!).
Why would using word_mode be any good as fallback?  That is, why is "Since the incoming STR_RTX has already been adjusted to that mode" not the thing to fix?

Richard.
Re: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 1/2
On 10/29/2013 02:51 AM, Bernd Edlinger wrote:

On Mon, 28 Oct 2013 21:29:24, Sandra Loosemore wrote:

On 10/28/2013 03:20 AM, Bernd Edlinger wrote:

I have attached an update to your patch, that should a) fix the recursion problem, and b) restrict -fstrict-volatile-bitfields to not violate the C++ memory model.

Here's a new version of the update patch.

Alternatively, if strict_volatile_bitfield_p returns false but flag_strict_volatile_bitfields > 0, then always force to word_mode and change the -fstrict-volatile-bitfields documentation to indicate that's the fallback if the insertion/extraction cannot be done in the declared mode, rather than claiming that it tries to do the same thing as if -fstrict-volatile-bitfields were not enabled at all.

I decided that this approach was more expedient, after all.  I've tested this patch (in conjunction with my already-approved but not-yet-applied patch) on mainline for arm-none-eabi, x86_64-linux-gnu, and mips-linux-gnu.  I also backported the entire series to GCC 4.8 and tested there on arm-none-eabi and x86_64-linux-gnu.  OK to apply?

-Sandra

2013-10-30  Bernd Edlinger  <bernd.edlin...@hotmail.de>
	    Sandra Loosemore  <san...@codesourcery.com>

	PR middle-end/23623
	PR middle-end/48784
	PR middle-end/56341
	PR middle-end/56997

	gcc/
	* expmed.c (strict_volatile_bitfield_p): Add bitregion_start and
	bitregion_end parameters.  Test for compliance with C++ memory
	model.
	(store_bit_field): Adjust call to strict_volatile_bitfield_p.
	Add fallback logic for cases where -fstrict-volatile-bitfields
	is supposed to apply, but cannot.
	(extract_bit_field): Likewise.
	* doc/invoke.texi (Code Gen Options): Better document fallback
	for -fstrict-volatile-bitfields.

	gcc/testsuite/
	* gcc.dg/pr23623.c: Update to test interaction with C++ memory
	model.
diff -u gcc/doc/invoke.texi gcc/doc/invoke.texi
--- gcc/doc/invoke.texi	(working copy)
+++ gcc/doc/invoke.texi	(working copy)
@@ -21659,7 +21659,8 @@
 In some cases, such as when the @code{packed} attribute is applied to a
 structure field, it may not be possible to access the field with a single
 read or write that is correctly aligned for the target machine.  In this
-case GCC falls back to generating multiple accesses rather than code that
+case GCC falls back to generating either multiple accesses
+or an access in a larger mode, rather than code that
 will fault or truncate the result at run time.

 The default value of this option is determined by the application binary

diff -u gcc/testsuite/gcc.dg/pr23623.c gcc/testsuite/gcc.dg/pr23623.c
--- gcc/testsuite/gcc.dg/pr23623.c	(revision 0)
+++ gcc/testsuite/gcc.dg/pr23623.c	(revision 0)
@@ -8,16 +8,19 @@
 extern struct
 {
   unsigned int b : 1;
+  unsigned int : 31;
 } bf1;

 extern volatile struct
 {
   unsigned int b : 1;
+  unsigned int : 31;
 } bf2;

 extern struct
 {
   volatile unsigned int b : 1;
+  volatile unsigned int : 31;
 } bf3;

 void writeb(void)

diff -u gcc/expmed.c gcc/expmed.c
--- gcc/expmed.c	(working copy)
+++ gcc/expmed.c	(working copy)
@@ -416,12 +416,17 @@
 }

 /* Return true if -fstrict-volatile-bitfields applies to an access of OP0
-   containing BITSIZE bits starting at BITNUM, with field mode FIELDMODE. */
+   containing BITSIZE bits starting at BITNUM, with field mode FIELDMODE.
+   Return false if the access would touch memory outside the range
+   BITREGION_START to BITREGION_END for conformance to the C++ memory
+   model.
*/ static bool strict_volatile_bitfield_p (rtx op0, unsigned HOST_WIDE_INT bitsize, unsigned HOST_WIDE_INT bitnum, - enum machine_mode fieldmode) + enum machine_mode fieldmode, + unsigned HOST_WIDE_INT bitregion_start, + unsigned HOST_WIDE_INT bitregion_end) { unsigned HOST_WIDE_INT modesize = GET_MODE_BITSIZE (fieldmode); @@ -448,6 +453,12 @@ bitnum % GET_MODE_ALIGNMENT (fieldmode) + bitsize modesize)) return false; + /* Check for cases where the C++ memory model applies. */ + if (bitregion_end != 0 + (bitnum - bitnum % modesize bitregion_start + || bitnum - bitnum % modesize + modesize bitregion_end)) +return false; + return true; } @@ -904,7 +915,8 @@ rtx value) { /* Handle -fstrict-volatile-bitfields in the cases where it applies. */ - if (strict_volatile_bitfield_p (str_rtx, bitsize, bitnum, fieldmode)) + if (strict_volatile_bitfield_p (str_rtx, bitsize, bitnum, fieldmode, + bitregion_start, bitregion_end)) { /* Storing any naturally aligned field can be done with a simple @@ -923,6 +935,14 @@ store_fixed_bit_field (str_rtx, bitsize, bitnum, 0, 0, value); return; } + else if (MEM_P (str_rtx) + MEM_VOLATILE_P (str_rtx) + flag_strict_volatile_bitfields 0) +/* This is a case where -fstrict-volatile-bitfields doesn't apply + because we can't do a single access in the declared mode of the field. + Since the incoming STR_RTX has already been adjusted to that mode, +
RE: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 1/2
Hi,

On Mon, 28 Oct 2013 21:29:24, Sandra Loosemore wrote: On 10/28/2013 03:20 AM, Bernd Edlinger wrote: On Sun, 20 Oct 2013 20:23:49, Sandra Loosemore wrote: I tried a backport to GCC 4.8 and tested on arm-none-eabi. On the new pr56997-1.c testcase, it got stuck in infinite recursion between store_split_bit_field/store_fixed_bit_field and/or extract_split_bit_field/extract_fixed_bit_field. This didn't show up in my previous mainline testing. [snip] I have attached an update to your patch, that should a) fix the recursion problem. b) restrict the -fstrict-volatile-bitfields to not violate the C++ memory model. Thanks for picking up the ball on this -- I've been busy with other tasks for several days. I again tried backporting the patch series along with your fix to GCC 4.8 and tested on arm-none-eabi. I found that it was still getting stuck in infinite recursion unless the test from this patch hunk

Hmm. Actually I used your patch on a clean 4.8.2 and built for --target=arm-eabi. I have seen the recursion on the extract_bit_field, but not on store_bit_field so far; maybe you could give me a hint which test case exposes the other flavour of this recursion problem.

@@ -1699,6 +1711,14 @@ extract_bit_field (rtx str_rtx, unsigned
 {
   enum machine_mode mode1;

+  /* Handle extraction of unaligned fields,
+     this can happen in -fstrict-volatile-bitfields.  */
+  if (GET_MODE_BITSIZE (GET_MODE (str_rtx)) != 0
+      && GET_MODE_BITSIZE (GET_MODE (str_rtx)) < GET_MODE_BITSIZE (word_mode)
+      && bitnum % GET_MODE_BITSIZE (GET_MODE (str_rtx)) + bitsize
+	 > GET_MODE_BITSIZE (GET_MODE (str_rtx)))
+    str_rtx = adjust_address (str_rtx, word_mode, 0);
+
   /* Handle -fstrict-volatile-bitfields in the cases where it applies.  */
   if (GET_MODE_BITSIZE (GET_MODE (str_rtx)) > 0)
     mode1 = GET_MODE (str_rtx);

was also inserted into store_bit_field.
I also think that, minimally, this needs to be rewritten as something like

  if (MEM_P (str_rtx)
      && STRICT_ALIGNMENT
      && GET_MODE_BITSIZE (GET_MODE (str_rtx)) != 0
      && GET_MODE_BITSIZE (GET_MODE (str_rtx)) < GET_MODE_BITSIZE (word_mode)
      && (bitnum % MEM_ALIGN (str_rtx) + bitsize
	  > GET_MODE_BITSIZE (GET_MODE (str_rtx))))
    str_rtx = adjust_address (str_rtx, word_mode, 0);

Yes, that looks fine. Otherwise, we'll end up using word_mode instead of the field mode on targets that support unaligned accesses. I tested that (again, 4.8 and arm-none-eabi) and results looked good, but of course ARM isn't one of the targets that supports unaligned accesses. :-S I'm still not convinced this is the right fix, though. It seems to me that callers of store_bit_field and extract_bit_field in expr.c ought not to be messing with the mode of the rtx when flag_strict_volatile_bitfields > 0, because that is losing information about the original mode that we are trying to patch up here by forcing it to word_mode. Instead, I think the callers ought to be passing the declared field mode into store_bit_field and extract_bit_field, and those functions ought to change the mode of the incoming rtx to match the field mode only if strict_volatile_bitfield_p assures us that the insertion/extraction can be done in that mode. The problem likely starts in expr.c when this is done:

  if (volatilep && flag_strict_volatile_bitfields > 0)
    to_rtx = adjust_address (to_rtx, mode1, 0);

This restricts the possible access mode not only for bit-fields but for all possible volatile members. But -fstrict-volatile-bitfields is supposed to affect bit-fields only.
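The condition being rewritten here reduces to one piece of arithmetic: an access of bitsize bits at bit position bitnum fits in a single modesize-bit container only if it does not cross a container boundary. A minimal sketch of that predicate, with a hypothetical helper name (an illustration, not the GCC source):

```c
#include <assert.h>

/* Nonzero if a field of BITSIZE bits starting at bit BITNUM (relative
   to the containing object) crosses a boundary of a MODESIZE-bit
   container, so a single load/store in that mode cannot reach it.
   Hypothetical helper for illustration only.  */
int
spans_container_p (unsigned bitnum, unsigned bitsize, unsigned modesize)
{
  return bitnum % modesize + bitsize > modesize;
}
```

For the failing case discussed elsewhere in the thread (a 16-bit field at bit offset 8), spans_container_p (8, 16, 16) is true while spans_container_p (8, 16, 32) is false, which is exactly why the fallback widens str_rtx to word_mode.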
Alternatively, if strict_volatile_bitfield_p returns false but flag_strict_volatile_bitfields > 0, then always force to word_mode and change the -fstrict-volatile-bitfields documentation to indicate that's the fallback if the insertion/extraction cannot be done in the declared mode, rather than claiming that it tries to do the same thing as if -fstrict-volatile-bitfields were not enabled at all. Either way, still needs more work, and more thorough testing (more targets, and obviously trunk as well as the 4.8 backport) before I'd consider this ready to commit. I might or might not be able to find some more time to work on this in the next week. Yes. And it would be good to reach Richard's Nov-21 deadline for GCC 4.9. Bernd. -Sandra
Re: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 1/2
On 10/29/2013 02:51 AM, Bernd Edlinger wrote: On Mon, 28 Oct 2013 21:29:24, Sandra Loosemore wrote: I again tried backporting the patch series along with your fix to GCC 4.8 and tested on arm-none-eabi. I found that it was still getting stuck in infinite recursion unless the test from this patch hunk Hmm. Actually I used your patch on a clean 4.8.2 and built for --target=arm-eabi. I have seen the recursion on the extract_bit_field, but not on store_bit_field so far; maybe you could give me a hint which test case exposes the other flavour of this recursion problem. Sorry, I was going to describe that in more detail but I forgot. It was compiling pr56997-1.c with -mthumb, like:

  arm-none-eabi-gcc /path/to/gcc/testsuite/gcc.dg/pr56997-1.c -c -fno-diagnostics-show-caret -fstrict-volatile-bitfields -DSTACK_SIZE=16384 -mthumb

Other testing with -mthumb looked fine. Either way, still needs more work, and more thorough testing (more targets, and obviously trunk as well as the 4.8 backport) before I'd consider this ready to commit. I might or might not be able to find some more time to work on this in the next week Yes. And it would be good to reach Richard's Nov-21 deadline for GCC-4.9 Yeah, I know the clock is ticking. :-( -Sandra
RE: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 1/2
Hi Sandra, On Sun, 20 Oct 2013 20:23:49, Sandra Loosemore wrote: On 10/18/2013 10:38 AM, Richard Biener wrote: Sandra Loosemore san...@codesourcery.com wrote: On 10/18/2013 04:50 AM, Richard Biener wrote: On Sat, Sep 28, 2013 at 4:19 AM, Sandra Loosemore san...@codesourcery.com wrote: This patch fixes various -fstrict-volatile-bitfields wrong-code bugs, including instances where bitfield values were being quietly truncated as well as bugs relating to using the wrong width. The code changes are identical to those in the previous version of this patch series (originally posted in June and re-pinged many times since then), but for this version I have cleaned up the test cases to remove dependencies on header files, as Bernd requested. Ok. Just to clarify, is this approval unconditional, independent of part 2 and other patches or changes still under active discussion? Yes. Hmm. After some further testing, I'm afraid this patch is still buggy. :-( I tried a backport to GCC 4.8 and tested on arm-none-eabi. On the new pr56997-1.c testcase, it got stuck in infinite recursion between store_split_bit_field/store_fixed_bit_field and/or extract_split_bit_field/extract_fixed_bit_field. This didn't show up in my previous mainline testing. The difference between 4.8 and mainline head is the alignment of the incoming str_rtx passed to store_bit_field/extract_bit_field, due to the changes in r199898. The alignment is 8 bits on mainline, and 32 on 4.8. It seems to me that the bitfield code ought to handle store/extract from a more-aligned object and it's probably possible to construct an example that fails in this way on mainline as well. It looks like there are conflicting assumptions between get_best_mode, the places that call it in store_fixed_bit_field and extract_fixed_bit_field, and the code that actually does the splitting -- which uses a unit based on the MEM_ALIGN of the incoming rtx, rather than its mode.
In the case where it's failing, get_best_mode is deciding it can't do a HImode access without splitting, but then the split code is assuming SImode units because of the 32-bit alignment, but not actually changing the mode of the rtx to match that. On top of that, this is one of the cases that strict_volatile_bitfield_p checks for and returns false, but the callers of store_bit_field and extract_bit_field in expr.c have already fiddled with the mode of the incoming rtx assuming that -fstrict-volatile-bitfields does apply. It doesn't get into that infinite recursion if it's compiled with -fno-strict-volatile-bitfields instead; in that case the incoming rtx has BLKmode, get_best_mode chooses SImode, and it's able to do the access without splitting at all. Anyway I tried a couple different bandaids that solved the infinite recursion problem but caused regressions elsewhere, and now I'm not sure of the right place to fix this. Given that there is also still ongoing discussion about making this a 3-way switch (etc) I am going to hold off on committing this patch for now. -Sandra I have attached an update to your patch, that should a) fix the recursion problem. b) restrict the -fstrict-volatile-bitfields to not violate the C++ memory model. Bernd. patch-bitfields-update.diff Description: Binary data
Re: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 1/2
On 10/28/2013 03:20 AM, Bernd Edlinger wrote: On Sun, 20 Oct 2013 20:23:49, Sandra Loosemore wrote: I tried a backport to GCC 4.8 and tested on arm-none-eabi. On the new pr56997-1.c testcase, it got stuck in infinite recursion between store_split_bit_field/store_fixed_bit_field and/or extract_split_bit_field/extract_fixed_bit_field. This didn't show up in my previous mainline testing. [snip] I have attached an update to your patch, that should a) fix the recursion problem. b) restrict the -fstrict-volatile-bitfields to not violate the C++ memory model. Thanks for picking up the ball on this -- I've been busy with other tasks for several days. I again tried backporting the patch series along with your fix to GCC 4.8 and tested on arm-none-eabi. I found that it was still getting stuck in infinite recursion unless the test from this patch hunk

@@ -1699,6 +1711,14 @@ extract_bit_field (rtx str_rtx, unsigned
 {
   enum machine_mode mode1;

+  /* Handle extraction of unaligned fields,
+     this can happen in -fstrict-volatile-bitfields.  */
+  if (GET_MODE_BITSIZE (GET_MODE (str_rtx)) != 0
+      && GET_MODE_BITSIZE (GET_MODE (str_rtx)) < GET_MODE_BITSIZE (word_mode)
+      && bitnum % GET_MODE_BITSIZE (GET_MODE (str_rtx)) + bitsize
+	 > GET_MODE_BITSIZE (GET_MODE (str_rtx)))
+    str_rtx = adjust_address (str_rtx, word_mode, 0);
+
   /* Handle -fstrict-volatile-bitfields in the cases where it applies.  */
   if (GET_MODE_BITSIZE (GET_MODE (str_rtx)) > 0)
     mode1 = GET_MODE (str_rtx);

was also inserted into store_bit_field. I also think that, minimally, this needs to be rewritten as something like

  if (MEM_P (str_rtx)
      && STRICT_ALIGNMENT
      && GET_MODE_BITSIZE (GET_MODE (str_rtx)) != 0
      && GET_MODE_BITSIZE (GET_MODE (str_rtx)) < GET_MODE_BITSIZE (word_mode)
      && (bitnum % MEM_ALIGN (str_rtx) + bitsize
	  > GET_MODE_BITSIZE (GET_MODE (str_rtx))))
    str_rtx = adjust_address (str_rtx, word_mode, 0);

Otherwise, we'll end up using word_mode instead of the field mode on targets that support unaligned accesses.
I tested that (again, 4.8 and arm-none-eabi) and results looked good, but of course ARM isn't one of the targets that supports unaligned accesses. :-S I'm still not convinced this is the right fix, though. It seems to me that callers of store_bit_field and extract_bit_field in expr.c ought not to be messing with the mode of the rtx when flag_strict_volatile_bitfields > 0, because that is losing information about the original mode that we are trying to patch up here by forcing it to word_mode. Instead, I think the callers ought to be passing the declared field mode into store_bit_field and extract_bit_field, and those functions ought to change the mode of the incoming rtx to match the field mode only if strict_volatile_bitfield_p assures us that the insertion/extraction can be done in that mode. Alternatively, if strict_volatile_bitfield_p returns false but flag_strict_volatile_bitfields > 0, then always force to word_mode and change the -fstrict-volatile-bitfields documentation to indicate that's the fallback if the insertion/extraction cannot be done in the declared mode, rather than claiming that it tries to do the same thing as if -fstrict-volatile-bitfields were not enabled at all. Either way, still needs more work, and more thorough testing (more targets, and obviously trunk as well as the 4.8 backport) before I'd consider this ready to commit. I might or might not be able to find some more time to work on this in the next week -Sandra
RE: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 1/2
Hi Richard/Joseph,

I noticed this test case crashes on arm-eabi already without the patch:

extern void abort (void);

#define test_type unsigned short
#define MAGIC (unsigned short)0x102u

typedef struct s {
  unsigned char Prefix[1];
  test_type Type;
} __attribute((__packed__,__aligned__(4))) ss;

volatile ss v;
ss g;

void __attribute__((noinline))
foo (test_type u)
{
  v.Type = u;
}

test_type __attribute__((noinline))
bar (void)
{
  return v.Type;
}

However when compiled with -fno-strict-volatile-bitfields it does not crash, but AFAIK the generated code for foo() violates the C++ memory model:

foo:
	@ Function supports interworking.
	@ args = 0, pretend = 0, frame = 0
	@ frame_needed = 0, uses_anonymous_args = 0
	@ link register save eliminated.
	ldr	r2, .L2
	ldr	r3, [r2]
	bic	r3, r3, #16711680
	bic	r3, r3, #65280
	orr	r3, r3, r0, asl #8
	str	r3, [r2]
	bx	lr

On Intel the generated code uses an unaligned access, but is OK for the memory model:

foo:
.LFB0:
	.cfi_startproc
	movw	%di, v+1(%rip)
	ret

Am I right, or is the code OK for the memory model?

Regards
Bernd.
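For reference, the layout this structure gets under a typical ABI (the concrete numbers below are what arm-eabi and x86-64 produce; treat them as an illustration): Type starts at byte offset 1 inside a 4-byte-aligned, 4-byte object, so a naturally aligned 16-bit access cannot reach it, and a 32-bit container access also covers Prefix.

```c
#include <assert.h>
#include <stddef.h>

/* Same structure as in the test case above, with test_type expanded
   to unsigned short.  Layout numbers assume a typical ABI.  */
typedef struct s {
  unsigned char Prefix[1];
  unsigned short Type;   /* misaligned: byte offset 1 */
} __attribute__((__packed__, __aligned__(4))) ss;
```

The packed attribute places Type at offset 1, while aligned(4) rounds the object size and alignment up to 4, which sets up exactly the conflict between the field's declared mode and its placement that the thread is debating.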
Re: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 1/2
On Wed, Oct 23, 2013 at 9:11 AM, Bernd Edlinger bernd.edlin...@hotmail.de wrote: Hi Richard/Joseph, I noticed this test case crashes on arm-eabi already without the patch:

extern void abort (void);

#define test_type unsigned short
#define MAGIC (unsigned short)0x102u

typedef struct s {
  unsigned char Prefix[1];
  test_type Type;
} __attribute((__packed__,__aligned__(4))) ss;

volatile ss v;
ss g;

void __attribute__((noinline))
foo (test_type u)
{
  v.Type = u;
}

test_type __attribute__((noinline))
bar (void)
{
  return v.Type;
}

However when compiled with -fno-strict-volatile-bitfields it does not crash, but AFAIK the generated code for foo() violates the C++ memory model:

foo:
	@ Function supports interworking.
	@ args = 0, pretend = 0, frame = 0
	@ frame_needed = 0, uses_anonymous_args = 0
	@ link register save eliminated.
	ldr	r2, .L2
	ldr	r3, [r2]
	bic	r3, r3, #16711680
	bic	r3, r3, #65280
	orr	r3, r3, r0, asl #8
	str	r3, [r2]
	bx	lr

On Intel the generated code uses an unaligned access, but is OK for the memory model:

foo:
.LFB0:
	.cfi_startproc
	movw	%di, v+1(%rip)
	ret

Am I right, or is the code OK for the memory model?

The C++ memory model says that you may not introduce a data-race and thus you have to access Type without touching Prefix.

Richard.

Regards Bernd.
RE: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 1/2
Hi,

On Wed, 23 Oct 2013 14:37:43, Richard Biener wrote: The C++ memory model says that you may not introduce a data-race and thus you have to access Type without touching Prefix.

Thanks. Right now I see the following priorities:

1. Make -fno-strict-volatile-bitfields respect the C++ memory model. => I think I know how to do that and can prepare a patch for it. This patch should fix the recursion problem that Sandra pointed out.

2. Make Sandra's -fstrict-volatile-bitfields aware of the C++ memory model. => That means passing bitregion_start/end to strict_volatile_bitfield_p() and returning false if the access is outside the region. (This is -fstrict-volatile-bitfields=gnu.)

3. Another patch can add -fstrict-volatile-bitfields=aapcs later, but I don't think we have to hurry for that.

Bernd.
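Item 2 can be stated as plain arithmetic: widen the access to the container holding the field (modesize bits at a container-aligned position) and refuse when that container reaches outside the bitregion the memory model permits. A sketch with hypothetical names, mirroring the shape of the check added to strict_volatile_bitfield_p in the patch earlier in the thread (not the GCC source itself):

```c
#include <assert.h>

/* Nonzero if the MODESIZE-bit container enclosing bit BITNUM lies
   entirely inside [BITREGION_START, BITREGION_END].  A bitregion_end
   of 0 means no restriction was recorded.  Hypothetical sketch.  */
int
container_in_bitregion_p (unsigned long bitnum, unsigned long modesize,
                          unsigned long bitregion_start,
                          unsigned long bitregion_end)
{
  unsigned long start = bitnum - bitnum % modesize;

  if (bitregion_end == 0)
    return 1;
  return start >= bitregion_start && start + modesize - 1 <= bitregion_end;
}
```

For example, a 1-bit field whose bitregion is a whole 32-bit unsigned int passes for a 32-bit container, while a field whose bitregion covers only bits 8-15 (a packed char neighbor owns the rest) fails, forcing a narrower access.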
RE: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 1/2
Well, one more point where the current patch is probably wrong: the AAPCS states that for a volatile bit-field access: For a write operation the read must always occur even if the entire contents of the container will be replaced. That means

struct s { volatile int a:32; } ss;

ss.a = 1;  /* needs to read the value exactly once and write the new value */

Currently we just store.

Bernd.
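What the quoted AAPCS sentence requires can be modeled directly: the store sequence for a volatile bit-field must load the whole container once, merge, and write it back, even when the new value replaces every bit. A sketch on a plain 32-bit word standing in for the container (a hypothetical model, not generated code):

```c
#include <assert.h>
#include <stdint.h>

/* Store VALUE into the bits selected by MASK inside *CONTAINER,
   performing exactly one read and one write of the container, as the
   AAPCS requires for volatile bit-fields.  Returns the value read so
   that the mandatory load is observable.  Hypothetical model.  */
uint32_t
volatile_bitfield_store (volatile uint32_t *container,
                         uint32_t mask, uint32_t value)
{
  uint32_t old = *container;                  /* the mandated read */
  *container = (old & ~mask) | (value & mask);
  return old;
}
```

For ss.a = 1 with a:32 the mask is 0xffffffff, so the merge changes nothing, yet the load must still be emitted; a compiler that "just stores" skips it.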
Re: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 2/2
have an option for true AAPCS compliance, which will be allowed to break the C++11 memory model and And an option that addresses your requirements, which will _not_ break the C++11 memory model So the problem isn't that what *I* need conflicts with C++11, it's that what AAPCS needs conflicts?
RE: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 1/2
Hi, On Sun, 20 Oct 2013 20:23:49, Sandra Loosemore wrote: Hmm. After some further testing, I'm afraid this patch is still buggy. :-( I tried a backport to GCC 4.8 and tested on arm-none-eabi. On the new pr56997-1.c testcase, it got stuck in infinite recursion between store_split_bit_field/store_fixed_bit_field and/or extract_split_bit_field/extract_fixed_bit_field. This didn't show up in my previous mainline testing. The difference between 4.8 and mainline head is the alignment of the incoming str_rtx passed to store_bit_field/extract_bit_field, due to the changes in r199898. The alignment is 8 bits on mainline, and 32 on 4.8. It seems to me that the bitfield code ought to handle store/extract from a more-aligned object and it's probably possible to construct an example that fails in this way on mainline as well. It looks like there are conflicting assumptions between get_best_mode, the places that call it in store_fixed_bit_field and extract_fixed_bit_field, and the code that actually does the splitting -- which uses a unit based on the MEM_ALIGN of the incoming rtx, rather than its mode. In the case where it's failing, get_best_mode is deciding it can't do a HImode access without splitting, but then the split code is assuming SImode units because of the 32-bit alignment, but not actually changing the mode of the rtx to match that. On top of that, this is one of the cases that strict_volatile_bitfield_p checks for and returns false, but the callers of store_bit_field and extract_bit_field in expr.c have already fiddled with the mode of the incoming rtx assuming that -fstrict-volatile-bitfields does apply. It doesn't get into that infinite recursion if it's compiled with -fno-strict-volatile-bitfields instead; in that case the incoming rtx has BLKmode, get_best_mode chooses SImode, and it's able to do the access without splitting at all.
Anyway I tried a couple different bandaids that solved the infinite recursion problem but caused regressions elsewhere, and now I'm not sure of the right place to fix this. Given that there is also still ongoing discussion about making this a 3-way switch (etc) I am going to hold off on committing this patch for now. -Sandra

You are right; if I modify pr56997-1, the patch crashes on trunk:

typedef struct s {
  char Prefix;
  short Type;
} __attribute__((packed, aligned(4))) ss;

Please add a test case for this. This way we have op0 aligned 32 and HImode selected, bitoffset=8, numbits=16. It crashes only when reading this value, because the access tries to use SImode. For some reason the alignment seems to be wrong in 4.8.

Thanks
Bernd.
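The numbers in this failure mode can be checked with a small get_best_mode-style search (a simplification with hypothetical names, ignoring volatility and target quirks): pick the narrowest power-of-two container that holds the field without crossing its own boundary and does not exceed the known alignment.

```c
#include <assert.h>

/* Width in bits of the narrowest power-of-two container (8..ALIGN_BITS)
   that covers BITSIZE bits at bit offset BITNUM without crossing a
   container boundary; 0 if none fits and the access must be split.
   Simplified sketch of a get_best_mode-style search, not GCC code.  */
unsigned
best_container_bits (unsigned bitnum, unsigned bitsize, unsigned align_bits)
{
  for (unsigned m = 8; m <= align_bits; m *= 2)
    if (bitnum % m + bitsize <= m)
      return m;
  return 0;
}
```

For the modified pr56997-1 case (bitoffset=8, numbits=16, op0 aligned to 32 bits) the search yields 32: HImode would span its container, SImode works. If the access is instead pinned to a 16-bit container, the search fails and the splitting path is entered, which is where the recursion began.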
RE: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 2/2
have an option for true AAPCS compliance, which will be allowed to break the C++11 memory model and And an option that addresses your requirements, which will _not_ break the C++11 memory model So the problem isn't that what *I* need conflicts with C++11, it's that what AAPCS needs conflicts? Yes, there are two written specifications which are in conflict: AAPCS and C++11. We cannot follow both at the same time. But from this discussion I've learned that your target's requirements can easily co-exist with the C++ memory model. Because if you only use well-formed bit-fields, the C++ memory model just allows everything, and we can choose what to do. Regards Bernd.
RE: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 2/2
Bernd Edlinger bernd.edlin...@hotmail.de wrote: Hi, What I would suggest is to have a -fgnu-strict-volatile-bit-fields Why a new option? The -fstrict-volatile-bitfields option is already GCC-specific, and I think it can do what you want anyway. As I understand Richard's comment, he proposes to have an option for true AAPCS compliance, which will be allowed to break the C++11 memory model and which will _not_ be the default on any target. Name it -fstrict-volatile-bitfields. And an option that addresses your requirements, which will _not_ break the C++11 memory model and which will be the default on some targets, dependent on the respective ABI requirements. Name it -fgnu-strict-volatile-bit-fields. Yes. You could also make it -fstrict-volatile-bitfields={off,gnu,aapcs} if you think that's better. Richard. Bernd.
Re: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 1/2
On 10/18/2013 10:38 AM, Richard Biener wrote: Sandra Loosemore san...@codesourcery.com wrote: On 10/18/2013 04:50 AM, Richard Biener wrote: On Sat, Sep 28, 2013 at 4:19 AM, Sandra Loosemore san...@codesourcery.com wrote: This patch fixes various -fstrict-volatile-bitfields wrong-code bugs, including instances where bitfield values were being quietly truncated as well as bugs relating to using the wrong width. The code changes are identical to those in the previous version of this patch series (originally posted in June and re-pinged many times since then), but for this version I have cleaned up the test cases to remove dependencies on header files, as Bernd requested. Ok. Just to clarify, is this approval unconditional, independent of part 2 and other patches or changes still under active discussion? Yes. Hmm. After some further testing, I'm afraid this patch is still buggy. :-( I tried a backport to GCC 4.8 and tested on arm-none-eabi. On the new pr56997-1.c testcase, it got stuck in infinite recursion between store_split_bit_field/store_fixed_bit_field and/or extract_split_bit_field/extract_fixed_bit_field. This didn't show up in my previous mainline testing. The difference between 4.8 and mainline head is the alignment of the incoming str_rtx passed to store_bit_field/extract_bit_field, due to the changes in r199898. The alignment is 8 bits on mainline, and 32 on 4.8. It seems to me that the bitfield code ought to handle store/extract from a more-aligned object and it's probably possible to construct an example that fails in this way on mainline as well. It looks like there are conflicting assumptions between get_best_mode, the places that call it in store_fixed_bit_field and extract_fixed_bit_field, and the code that actually does the splitting -- which uses a unit based on the MEM_ALIGN of the incoming rtx, rather than its mode.
In the case where it's failing, get_best_mode is deciding it can't do a HImode access without splitting, but then the split code is assuming SImode units because of the 32-bit alignment, but not actually changing the mode of the rtx to match that. On top of that, this is one of the cases that strict_volatile_bitfield_p checks for and returns false, but the callers of store_bit_field and extract_bit_field in expr.c have already fiddled with the mode of the incoming rtx assuming that -fstrict-volatile-bitfields does apply. It doesn't get into that infinite recursion if it's compiled with -fno-strict-volatile-bitfields instead; in that case the incoming rtx has BLKmode, get_best_mode chooses SImode, and it's able to do the access without splitting at all. Anyway I tried a couple different bandaids that solved the infinite recursion problem but caused regressions elsewhere, and now I'm not sure of the right place to fix this. Given that there is also still ongoing discussion about making this a 3-way switch (etc) I am going to hold off on committing this patch for now. -Sandra
RE: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 2/2
Hi, What I would suggest is to have a -fgnu-strict-volatile-bit-fields Why a new option? The -fstrict-volatile-bitfields option is already GCC-specific, and I think it can do what you want anyway. As I understand Richard's comment, he proposes to have an option for true AAPCS compliance, which will be allowed to break the C++11 memory model and which will _not_ be the default on any target. Name it -fstrict-volatile-bitfields. And an option that addresses your requirements, which will _not_ break the C++11 memory model and which will be the default on some targets, dependent on the respective ABI requirements. Name it -fgnu-strict-volatile-bit-fields. Bernd.
Re: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 2/2
On Wed, Oct 9, 2013 at 3:09 AM, Bernd Edlinger bernd.edlin...@hotmail.de wrote: Hi, On Mon, 30 Sep 2013 16:18:30, DJ Delorie wrote: As per my previous comments on this patch, I will not approve the changes to the m32c backend, as they will cause real bugs in real hardware, and violate the hardware's ABI. The user may use -fno-strict-volatile-bitfields if they do not desire this behavior and understand the consequences. I am not a maintainer for the rx and h8300 ports, but they are in the same situation. To reiterate my core position: if the user defines a proper volatile int bitfield, and the compiler does anything other than an int-sized access, the compiler is WRONG. Any optimization that changes volatile accesses to something other than what the user specified is a bug that needs to be fixed before this option can be non-default. hmm, I just tried to use the latest 4.9 trunk to compile the example from the AAPCS document:

struct s
{
  volatile int a:8;
  volatile char b:2;
};

struct s ss;

int main ()
{
  ss.a=1;
  ss.b=1;
  return 0;
}

and the resulting code is completely against the written AAPCS specification:

main:
	@ Function supports interworking.
	@ args = 0, pretend = 0, frame = 0
	@ frame_needed = 0, uses_anonymous_args = 0
	@ link register save eliminated.
	ldr	r3, .L2
	ldrh	r2, [r3]
	bic	r2, r2, #254
	orr	r2, r2, #1
	strh	r2, [r3]	@ movhi
	ldrh	r2, [r3]
	bic	r2, r2, #512
	orr	r2, r2, #256
	strh	r2, [r3]	@ movhi
	mov	r0, #0
	bx	lr

two half-word accesses, to my total surprise! As it looks like, the -fstrict-volatile-bitfields are already totally broken, apparently in favour of the C++ memory model, at least at the write-side. Note that the C++ memory model restricts the _maximum_ memory region we may touch - it does not constrain the minimum as AAPCS does.
What I would suggest is to have a -fgnu-strict-volatile-bit-fields (or two levels of it) enabled by default on AAPCS targets which will follow the AAPCS if it doesn't violate the maximum memory region constraints of the C++ memory model. I never claimed that the C++ memory model is good enough for AAPCS but there was consensus that the default on AAPCS should not violate the C++ memory model by default. -fgnu-strict-volatile-bit-fields should be fully implementable in get_best_mode if you pass down the desired AAPCS mode. Richard. These are aligned accesses, not the packed structures, that was the only case where it used to work once. This must be fixed. I do not understand why we cannot agree, that at least the bug-fix part of Sandra's patch needs to be applied. Regards Bernd.
RE: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 2/2
Hi, On Thu, 17 Oct 2013 16:41:13, DJ Delorie wrote: I'm starting from an MCU that doesn't work right if GCC doesn't do what the user tells GCC to do. I added -fstrict-volatile-bitfields to tell gcc that it needs to be more strict than the standard allows for bitfield access, because without that flag, there's no way to force gcc to use a specific access width on bitfields. When I added that flag, some ARM folks chose to enable it in their target, because they felt they needed it. If different ARM folks feel otherwise, that's a target problem. If the user tells gcc that a particular 32-bit memory location should be treated as both a char and a long, then gcc has been given inconsistent information. The spec says it can do what it wants to access that memory, and it does. If the user tells gcc that a particular 16-bit memory location consists of 16-bits worth of unsigned short, and the user has told gcc that it needs to be strict about accessing that field in the type specified, and gcc uses a 32-bit access anyway, gcc is wrong. I will agree with you 100% that gcc can do whatever the spec allows if the user does NOT specify -fstrict-volatile-bitfields, but the flag is there to tell gcc that the user needs stricter control than the spec demands. If the user uses that flag, *and* gives gcc information that is inconsistent with the use of that flag, then the user is wrong. I note that you assume GNU/Linux is involved, perhaps that's part of the problem. Maybe the Linux kernel needs gcc to ignore its bitfield types, but other ARM firmware may have other requirements. If you and the other ARM maintainers want to argue over whether -fstrict-volatile-bitfields is enabled by default for the ARM target, go for it. Just leave my targets alone. where those semantics contradict the semantics of ISO C. Where in the spec does it say that a compiler MUST access an unsigned short bitfield with a 32-bit access? 
I've seen places where it says the compiler MAY or MIGHT do it, but not MUST. Well, I'm starting to think that we can never follow the ARM AAPCS by the letter. But maybe we could implement -fstrict-volatile-bitfields in this way: We use the container mode where possible. It is always possible for well-formed bit-fields. If necessary the user has to add anonymous bit-field members, or convert normal members to bit-fields. But when the bit-field is in conflict with either the hardware's alignment requirements or with the C++11 memory model, we follow the latter. This means that, even for volatile data: 1. we never write anything outside the BIT_FIELD_REPRESENTATIVE. 2. we allow unaligned packed data, but we may use multiple accesses for a single read/write op. Also in this case: no data store races outside the member.

Example:

struct
{
  volatile int a : 24;
  volatile char b;
};

This is not well-formed, and -fstrict-volatile-bitfields will have no effect on it. The user has to re-write this structure to:

struct
{
  volatile int a : 24;
  volatile char b : 8;
};

Maybe a warning for not-well-formed bit-fields could be helpful... What do you think?

Bernd.
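The difference between the two structures above is observable in their layout: under a typical 32-bit-int ABI (the sizes below assume one, e.g. x86-64) both occupy four bytes, but in the first one b is an ordinary member and therefore its own memory location under C++11, while in the second one a and b are adjacent bit-fields sharing one container and one memory location, so a whole-container access is permitted.

```c
#include <assert.h>
#include <stddef.h>

/* Not well-formed for -fstrict-volatile-bitfields: b is an ordinary
   member, a distinct memory location that a 32-bit container access
   to a would also touch.  */
struct ill_formed {
  volatile int a : 24;
  volatile char b;
};

/* Well-formed rewrite: a and b are adjacent bit-fields, forming a
   single memory location covering the whole 32-bit container.  */
struct well_formed {
  volatile int a : 24;
  volatile char b : 8;
};
```

The byte images are identical; only the memory-location structure changes, which is exactly the property the proposed rule keys on.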
Re: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 1/2
On Sat, Sep 28, 2013 at 4:19 AM, Sandra Loosemore san...@codesourcery.com wrote: This patch fixes various -fstrict-volatile-bitfields wrong-code bugs, including instances where bitfield values were being quietly truncated as well as bugs relating to using the wrong width. The code changes are identical to those in the previous version of this patch series (originally posted in June and re-pinged many times since then), but for this version I have cleaned up the test cases to remove dependencies on header files, as Bernd requested. Ok. Thanks, Richard. -Sandra
Re: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 1/2
On 10/18/2013 04:50 AM, Richard Biener wrote: On Sat, Sep 28, 2013 at 4:19 AM, Sandra Loosemore san...@codesourcery.com wrote: This patch fixes various -fstrict-volatile-bitfields wrong-code bugs, including instances where bitfield values were being quietly truncated as well as bugs relating to using the wrong width. The code changes are identical to those in the previous version of this patch series (originally posted in June and re-pinged many times since then), but for this version I have cleaned up the test cases to remove dependencies on header files, as Bernd requested. Ok. Just to clarify, is this approval unconditional, independent of part 2 and other patches or changes still under active discussion? -Sandra
Re: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 1/2
Sandra Loosemore san...@codesourcery.com wrote: On 10/18/2013 04:50 AM, Richard Biener wrote: On Sat, Sep 28, 2013 at 4:19 AM, Sandra Loosemore san...@codesourcery.com wrote: This patch fixes various -fstrict-volatile-bitfields wrong-code bugs, including instances where bitfield values were being quietly truncated as well as bugs relating to using the wrong width. The code changes are identical to those in the previous version of this patch series (originally posted in June and re-pinged many times since then), but for this version I have cleaned up the test cases to remove dependencies on header files, as Bernd requested. Ok. Just to clarify, is this approval unconditional, independent of part 2 and other patches or changes still under active discussion? Yes. Richard. -Sandra
Re: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 2/2
We use the container mode where possible. It is always possible for well-formed bit-fields. This is the only part I really need. If necessary the user has to add anonymous bit field members, or convert normal members to bit-fields. IIRC I added code to support normal members as well; the bitfields part of the option name is not entirely accurate. But when the bit field conflicts with either the hardware's alignment requirements or the C++11 memory model, we follow the latter. Fine with me. 2. we allow unaligned packed data, but we may use multiple accesses for a single read/write op. Also in this case: no data store races outside the member. We should note that volatile != atomic in these cases. Perhaps a warning would be in order? Example: struct { volatile int a : 24; volatile char b; }; This is not well-formed, and -fstrict-volatile-bitfields will have no effect on it. I agree this isn't well-formed, but -fs-v-b could still make a best effort since none of the fields *individually* span a natural boundary. The user has to re-write this structure to: struct { volatile int a : 24; volatile char b : 8; }; More accurate would be these: struct { volatile int a : 24; volatile int b : 8; /* overlap, should be the same type */ }; struct { volatile int a : 24; volatile int : 0; volatile char b; /* no overlap, make the padding explicit */ };
Re: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 2/2
What I would suggest is to have a -fgnu-strict-volatile-bit-fields Why a new option? The -fstrict-volatile-bitfields option is already GCC-specific, and I think it can do what you want anyway.
RE: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 2/2
On Wed, 16 Oct 2013 22:50:20, DJ Delorie wrote: Not all of them can work, because they describe something that can't be done in hardware. For example, the first test has an incomplete bitfield - the fields do not completely describe an int so the structure is smaller (one byte, according to sizeof()) than the access type (2-8 bytes). Where in the C standard did you read the requirement that every bit field must be complete? (This is a serious question). extern struct { volatile unsigned int b : 1; } bf3; On my compiler this structure occupies 4 bytes, and it is aligned at 4 bytes. That is OK for me and AAPCS. But the access bf3.b=1 is SI mode with Sandra's patch (part 1, which just obeys the AAPCS and does nothing else) and QI mode without this patch, which is just a BUG. I am quite surprised how your target manages to avoid it? It is as Sandra said, at least on ARM -fstrict-volatile-bitfields does not function at all. And the C++11 memory model wins all the time. Looking through the tests, most of them combine packed with mismatched types. IMHO, those tests are invalid. I don't think so. They are simply packed. And volatile just says that the value may be changed by a different thread. It has a great impact on loop optimizations. Either way, if -fstrict-volatile-bitfields does not do what it's supposed to do, the correct action is to fix it - not to disable it on targets that rely on it for correct operation. Agreed. That is the number one priority here. ... I've not objected to fixing -fstrict-volatile-bitfields, or making the -fno-strict-volatile-bitfields case match the standard. I've only objected to breaking my targets by making that flag not the default. Fine. Why can't we just get this fixed? Bernd.
Re: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 2/2
Where in the C standard did you read the requirement that every bit field must be complete? (This is a serious question). The spec doesn't say each field must be complete, but neither does it say that the structure must be as big as the type used. If you specify int foo:1 then the compiler is allowed to make the struct smaller than int. extern struct { volatile unsigned int b : 1; } bf3; On my compiler this structure occupies 4 bytes, and it is aligned at 4 bytes. That is OK for me and AAPCS. On my target, the structure occupies 1 byte. I suspect gcc is rounding up to WORD_MODE, which is 4 bytes for you and 1 for me. If so, you're relying on a coincidence, not a standard. But the access bf3.b=1 is SI mode with Sandra's patch (part 1, which just obeys the AAPCS and does nothing else) and QI mode without this patch, which is just a BUG. I am quite surprised how your target manages to avoid it? It doesn't, but I wouldn't expect it to, because IMHO that test is invalid - you have not given a precise enough test to expect consistent results. It is as Sandra said, at least on ARM -fstrict-volatile-bitfields does not function at all. And the C++11 memory model wins all the time. Are we talking about N2429? I read through the changes and it didn't preclude honoring the user's types. Looking through the tests, most of them combine packed with mismatched types. IMHO, those tests are invalid. I don't think so. They are simply packed. And volatile just says that the value may be changed by a different thread. It has a great impact on loop optimizations. Nothing you've said so far precludes honoring the user's types when the user gives you a consistent structure. 
Consider: typedef struct { unsigned char b0:1, b1:1, b2:1, b3:1, b4:1, b5:1, b6:1, b7:1; } Bits; extern volatile struct { Bits reg1; /* read status without resetting them */ Bits reg2; /* read status and atomically reset them */ } interrupt_handler; Without -fstrict-volatile-bitfields, gcc is allowed to use a 16-bit access to read reg1, but this causes an unexpected read to volatile reg2, which may have unintended consequences. The spec doesn't say the compiler *must* read reg2, it just *doesn't* say that it *can't*. The -fstrict-volatile-bitfields flag tells the compiler that it *can't*. IMHO this doesn't violate the spec, it just limits what the compiler can do within the spec. If ARM wants fast word-sized volatile bitfields, use int and structure your bitfields so that no int field overlaps a natural int boundary. When I added the option, I considered making it an error() to define a strict volatile bitfield that spanned the allowed boundary of the type specified, but I figured that would be too much. I've not objected to fixing -fstrict-volatile-bitfields, or making the -fno-strict-volatile-bitfields case match the standard. I've only objected to breaking my targets by making that flag not the default. Fine. Why can't we just get this fixed? Dunno. I'm only opposing the patch that disabled it on my targets.
Re: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 2/2
On Thu, 17 Oct 2013, DJ Delorie wrote: It is as Sandra said, at least on ARM -fstrict-volatile-bitfields does not function at all. And the C++11 memory model wins all the time. Are we talking about N2429? I read through the changes and it didn't preclude honoring the user's types. At least on ARM, you can e.g. have a non-bit-field char that occupies part of the same 4-byte unit as an int bit-field. And the C11/C++11 memory model prohibits stores to that bit-field from doing read/modify/write on the whole 4-byte unit, as that's a store data race. If the user wants to allow that data race, they can rewrite their code by changing the char into a bit-field. I believe that in all cases where strict-volatile-bitfields ABIs are incompatible with the default --param allow-store-data-races=0, the structure can be similarly rewritten by the user so the adjacent fields become bit-fields, if they want the strict-volatile-bitfields requirements for their code to be compatible with the memory model. -- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 2/2
At least on ARM, you can e.g. have a non-bit-field char that occupies part of the same 4-byte unit as an int bit-field. Ok, when the user doesn't give the compiler conflicting requirements. (which is why I contemplated making those conflicts errors at first) I'm a big fan of blaming the user when they mix things together and expect the compiler to read their mind.
Re: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 2/2
On Thu, 17 Oct 2013, DJ Delorie wrote: At least on ARM, you can e.g. have a non-bit-field char that occupies part of the same 4-byte unit as an int bit-field. Ok, when the user doesn't give the compiler conflicting requirements. (which is why I contemplated making those conflicts errors at first) I don't consider them conflicting. My default starting point is that the user has a GNU C program, using target-independent semantics of GNU C, and expects that program to be compiled and executed according to those semantics. (In this case, it can be an ISO C program; no GNU extensions are required.) Those semantics include the memory model. The fact that the user happens to be compiling for an ARM processor shouldn't change those semantics. If, say, a GNU/Linux distribution contains code expecting the memory model, we shouldn't make the distributor use special options for building that code on ARM; building for a new architecture should, as far as possible, just work (and new architecture porters are hardly likely to know if one of thousands of packages might have such a requirement). You seem to be starting from a user having a program that instead of expecting the normal target-independent semantics expects some other semantics specified by an ABI document, where those semantics contradict the semantics of ISO C. -- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 2/2
I'm starting from an MCU that doesn't work right if GCC doesn't do what the user tells GCC to do. I added -fstrict-volatile-bitfields to tell gcc that it needs to be more strict than the standard allows for bitfield access, because without that flag, there's no way to force gcc to use a specific access width on bitfields. When I added that flag, some ARM folks chose to enable it in their target, because they felt they needed it. If different ARM folks feel otherwise, that's a target problem. If the user tells gcc that a particular 32-bit memory location should be treated as both a char and a long, then gcc has been given inconsistent information. The spec says it can do what it wants to access that memory, and it does. If the user tells gcc that a particular 16-bit memory location consists of 16-bits worth of unsigned short, and the user has told gcc that it needs to be strict about accessing that field in the type specified, and gcc uses a 32-bit access anyway, gcc is wrong. I will agree with you 100% that gcc can do whatever the spec allows if the user does NOT specify -fstrict-volatile-bitfields, but the flag is there to tell gcc that the user needs stricter control than the spec demands. If the user uses that flag, *and* gives gcc information that is inconsistent with the use of that flag, then the user is wrong. I note that you assume GNU/Linux is involved, perhaps that's part of the problem. Maybe the Linux kernel needs gcc to ignore its bitfield types, but other ARM firmware may have other requirements. If you and the other ARM maintainers want to argue over whether -fstrict-volatile-bitfields is enabled by default for the ARM target, go for it. Just leave my targets alone. where those semantics contradict the semantics of ISO C. Where in the spec does it say that a compiler MUST access an unsigned short bitfield with a 32-bit access? I've seen places where it says the compiler MAY or MIGHT do it, but not MUST.
Re: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 2/2
As it looks like, the -fstrict-volatile-bitfields are already totally broken, I tested your example on rl78-elf, with and without -fstrict-volatile-bitfields, and it generates correct code with it and incorrect code without it. Same for m32c-elf. Same for h8300-elf. Seems to be working OK for me. Either way, if -fstrict-volatile-bitfields does not do what it's supposed to do, the correct action is to fix it - not to disable it on targets that rely on it for correct operation.
Re: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 2/2
On 10/16/2013 05:46 PM, DJ Delorie wrote: As it looks like, the -fstrict-volatile-bitfields are already totally broken, I tested your example on rl78-elf, with and without -fstrict-volatile-bitfields, and it generates correct code with it and incorrect code without it. Same for m32c-elf. Same for h8300-elf. Seems to be working OK for me. I'm curious; do all the test cases included in http://gcc.gnu.org/ml/gcc-patches/2013-09/msg02058.html work for you on those targets as well (without applying the rest of the patch)? Either way, if -fstrict-volatile-bitfields does not do what it's supposed to do, the correct action is to fix it - not to disable it on targets that rely on it for correct operation. Unfortunately, fixing -fstrict-volatile-bitfields to conform to the AAPCS breaks conformance with the C11/C++11 standards, and we seem to be at an impasse where global reviewers refuse to consider anything that breaks language standard conformance and target maintainers refuse to consider anything that breaks ABI conformance. Meanwhile, while we're bickering about what the default should be, ARM users (in particular) are stuck in a situation where GCC's default behavior is to generate unaligned accesses that fault at runtime or wrong code that silently truncates bitfields, and there is *no* command-line option or even build-time option they can provide to make GCC generate code that is correct according to the AAPCS. FWIW, I don't care much myself in an abstract way about whether language standard conformance or ABI conformance should rule here. But, looking at it practically, I think the battle over correctness was already over when the C and C++ standards committees voted to specify this behavior in a particular way, and that in time ABIs that specify some other behavior are going to be either revised or considered archaic or obsolete. 
Anyway, at this point I've given up on thinking either side is going to convince the other in this debate, and I haven't seen much evidence that either side would be willing to compromise, either (e.g., Bernd's patch to add warnings in cases where the behavior differs, or my earlier proposal to make the defaults depend on the selected language standard as well as the ABI, don't seem to have placated either side). About the only way forward I can see is if the GCC steering committee steps in with a general policy statement about what to do when language standards conflict with a target ABI. And, of course, the patch itself still needs review. -Sandra, who is feeling pretty discouraged about how hard it is to contribute to GCC :-(
Re: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 2/2
I'm curious; do all the test cases included in http://gcc.gnu.org/ml/gcc-patches/2013-09/msg02058.html work for you on those targets as well (without applying the rest of the patch)? Not all of them can work, because they describe something that can't be done in hardware. For example, the first test has an incomplete bitfield - the fields do not completely describe an int so the structure is smaller (one byte, according to sizeof()) than the access type (2-8 bytes). Looking through the tests, most of them combine packed with mismatched types. IMHO, those tests are invalid. Either way, if -fstrict-volatile-bitfields does not do what it's supposed to do, the correct action is to fix it - not to disable it on targets that rely on it for correct operation. Unfortunately, fixing -fstrict-volatile-bitfields to conform to the AAPCS breaks conformance with the C11/C++11 standards, The global default is -fno-strict-volatile-bitfields. The flag was added specifically to force the compiler to always use the user-specified type when accessing bitfields, at least when it was physically possible. I don't see why there's a problem. Without the flag, you follow the standard. With the flag, you do what the user told you to do. It's up to the targets to decide which way the flag defaults if the user doesn't specify. For my targets, the default must be to do what the user told you to do, regardless of the standards, because the standards lead to hardware failure. Meanwhile, while we're bickering about what the default should be, ARM users (in particular) are stuck in a situation where GCC's default behavior is to generate unaligned accesses that fault at runtime or wrong code that silently truncates bitfields, Note that if the user defines a structure that intentionally mixes types in a hardware word, the compiler can't do what it's told. It's still up to the user to write well-defined structures in those cases. Mixing packed, incomplete bitfields, and multiple types (i.e. 
pr48784-1.c in that patch) is an example of a case where gcc can't do what the user asked, because what the user asked is nonsense. and there is *no* command-line option or even build-time option they can provide to make GCC generate code that is correct according to the AAPCS. At the time I added -fstrict-volatile-bitfields, the ARM maintainers seemed to agree that the new functionality was what they needed. So, -fstrict-volatile-bitfields *is* that command line option, it's just either broken for them, or they're trying to do something not physically possible. But, looking at it practically, I think the battle over correctness was already over when the C and C++ standards committees voted to specify this behavior in a particular way, and that in time ABIs that specify some other behavior are going to be either revised or considered archaic or obsolete. My problem isn't some arbitrary ABI, it's a hardware peripheral register that just can't be accessed in the wrong mode, because the peripheral won't work if you do that. Most MCU peripherals work this way. Just reading a byte or word outside the intended one could have real consequences in the hardware. About the only way forward I can see is if the GCC steering committee steps in with a general policy statement about what to do when language standards conflict with a target ABI. I've not objected to fixing -fstrict-volatile-bitfields, or making the -fno-strict-volatile-bitfields case match the standard. I've only objected to breaking my targets by making that flag not the default.
RE: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 2/2
Hi, On Mon, 30 Sep 2013 16:18:30, DJ Delorie wrote: As per my previous comments on this patch, I will not approve the changes to the m32c backend, as they will cause real bugs in real hardware, and violate the hardware's ABI. The user may use -fno-strict-volatile-bitfields if they do not desire this behavior and understand the consequences. I am not a maintainer for the rx and h8300 ports, but they are in the same situation. To reiterate my core position: if the user defines a proper volatile int bitfield, and the compiler does anything other than an int-sized access, the compiler is WRONG. Any optimization that changes volatile accesses to something other than what the user specified is a bug that needs to be fixed before this option can be non-default. Hmm, I just tried to use the latest 4.9 trunk to compile the example from the AAPCS document:

struct s { volatile int a:8; volatile char b:2; };
struct s ss;
int main () { ss.a=1; ss.b=1; return 0; }

and the resulting code is completely against the written AAPCS specification:

main:
	@ Function supports interworking.
	@ args = 0, pretend = 0, frame = 0
	@ frame_needed = 0, uses_anonymous_args = 0
	@ link register save eliminated.
	ldr	r3, .L2
	ldrh	r2, [r3]
	bic	r2, r2, #254
	orr	r2, r2, #1
	strh	r2, [r3]	@ movhi
	ldrh	r2, [r3]
	bic	r2, r2, #512
	orr	r2, r2, #256
	strh	r2, [r3]	@ movhi
	mov	r0, #0
	bx	lr

Two half-word accesses, to my total surprise! As it looks, -fstrict-volatile-bitfields is already totally broken, apparently in favour of the C++ memory model, at least on the write side. These are aligned accesses, not packed structures, and aligned accesses were the only case where it ever worked. This must be fixed. I do not understand why we cannot agree that at least the bug-fix part of Sandra's patch needs to be applied. Regards Bernd.
Re: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 2/2
As per my previous comments on this patch, I will not approve the changes to the m32c backend, as they will cause real bugs in real hardware, and violate the hardware's ABI. The user may use -fno-strict-volatile-bitfields if they do not desire this behavior and understand the consequences. I am not a maintainer for the rx and h8300 ports, but they are in the same situation. To reiterate my core position: if the user defines a proper volatile int bitfield, and the compiler does anything other than an int-sized access, the compiler is WRONG. Any optimization that changes volatile accesses to something other than what the user specified is a bug that needs to be fixed before this option can be non-default.
[PATCH] reimplement -fstrict-volatile-bitfields v4, part 0/2
Here is the latest version of my -fstrict-volatile-bitfields patches. I have gone ahead and committed part 1 of the previous version of this patch series, which was approved back in mid-June. That part just removes the warning about conflicts with packed structs (which everybody seemed to agree was a bad idea) but doesn't otherwise change the behavior of -fstrict-volatile-bitfields. The code changes for the rest of the patch series are unchanged, but I've updated the test cases to remove dependencies on header files. I refreshed the patches against mainline head and retested all parts of the patch series last week. Part 1 of the current patch series incorporates parts 2 and 3 of the previous version (code changes and test cases for the various bugs that have been reported against -fstrict-volatile-bitfields). Part 2 is the same as part 4 of the previous version (changing the target-specific defaults). Hopefully having fewer parts to keep track of will make it easier to review. -Sandra
[PATCH] reimplement -fstrict-volatile-bitfields v4, part 1/2
This patch fixes various -fstrict-volatile-bitfields wrong-code bugs, including instances where bitfield values were being quietly truncated as well as bugs relating to using the wrong width. The code changes are identical to those in the previous version of this patch series (originally posted in June and re-pinged many times since then), but for this version I have cleaned up the test cases to remove dependencies on header files, as Bernd requested. -Sandra 2013-09-28 Sandra Loosemore san...@codesourcery.com PR middle-end/23623 PR middle-end/48784 PR middle-end/56341 PR middle-end/56997 gcc/ * expmed.c (strict_volatile_bitfield_p): New function. (store_bit_field_1): Don't special-case strict volatile bitfields here. (store_bit_field): Handle strict volatile bitfields here instead. (store_fixed_bit_field): Don't special-case strict volatile bitfields here. (extract_bit_field_1): Don't special-case strict volatile bitfields here. (extract_bit_field): Handle strict volatile bitfields here instead. (extract_fixed_bit_field): Don't special-case strict volatile bitfields here. Simplify surrounding code to resemble that in store_fixed_bit_field. * doc/invoke.texi (Code Gen Options): Update -fstrict-volatile-bitfields description. gcc/testsuite/ * gcc.dg/pr23623.c: New test. * gcc.dg/pr48784-1.c: New test. * gcc.dg/pr48784-2.c: New test. * gcc.dg/pr56341-1.c: New test. * gcc.dg/pr56341-2.c: New test. * gcc.dg/pr56997-1.c: New test. * gcc.dg/pr56997-2.c: New test. * gcc.dg/pr56997-3.c: New test. Index: gcc/expmed.c === --- gcc/expmed.c (revision 203003) +++ gcc/expmed.c (working copy) @@ -415,6 +415,42 @@ lowpart_bit_field_p (unsigned HOST_WIDE_ return bitnum % BITS_PER_WORD == 0; } +/* Return true if -fstrict-volatile-bitfields applies to an access of OP0 + containing BITSIZE bits starting at BITNUM, with field mode FIELDMODE. 
*/
+
+static bool
+strict_volatile_bitfield_p (rtx op0, unsigned HOST_WIDE_INT bitsize,
+			    unsigned HOST_WIDE_INT bitnum,
+			    enum machine_mode fieldmode)
+{
+  unsigned HOST_WIDE_INT modesize = GET_MODE_BITSIZE (fieldmode);
+
+  /* -fstrict-volatile-bitfields must be enabled and we must have a
+     volatile MEM.  */
+  if (!MEM_P (op0)
+      || !MEM_VOLATILE_P (op0)
+      || flag_strict_volatile_bitfields <= 0)
+    return false;
+
+  /* Non-integral modes likely only happen with packed structures.
+     Punt.  */
+  if (!SCALAR_INT_MODE_P (fieldmode))
+    return false;
+
+  /* The bit size must not be larger than the field mode, and
+     the field mode must not be larger than a word.  */
+  if (bitsize > modesize || modesize > BITS_PER_WORD)
+    return false;
+
+  /* Check for cases of unaligned fields that must be split.  */
+  if (bitnum % BITS_PER_UNIT + bitsize > modesize
+      || (STRICT_ALIGNMENT
+	  && bitnum % GET_MODE_ALIGNMENT (fieldmode) + bitsize > modesize))
+    return false;
+
+  return true;
+}
+
 /* Return true if OP is a memory and if a bitfield of size BITSIZE at
    bit number BITNUM can be treated as a simple value of mode MODE.  */

@@ -813,12 +849,8 @@ store_bit_field_1 (rtx str_rtx, unsigned
      cheap register alternative is available.  */
   if (MEM_P (op0))
     {
-      /* Do not use unaligned memory insvs for volatile bitfields when
-	 -fstrict-volatile-bitfields is in effect.  */
-      if (!(MEM_VOLATILE_P (op0)
-	    && flag_strict_volatile_bitfields > 0)
-	  && get_best_mem_extraction_insn (&insv, EP_insv, bitsize, bitnum,
-					   fieldmode)
+      if (get_best_mem_extraction_insn (&insv, EP_insv, bitsize, bitnum,
+					fieldmode)
 	  && store_bit_field_using_insv (&insv, op0, bitsize, bitnum, value))
 	return true;

@@ -871,6 +903,27 @@ store_bit_field (rtx str_rtx, unsigned H
 		 enum machine_mode fieldmode,
 		 rtx value)
 {
+  /* Handle -fstrict-volatile-bitfields in the cases where it applies.  */
+  if (strict_volatile_bitfield_p (str_rtx, bitsize, bitnum, fieldmode))
+    {
+      /* Storing any naturally aligned field can be done with a simple
+	 store.  For targets that support fast unaligned memory, any
+	 naturally sized, unit aligned field can be done directly.  */
+      if (simple_mem_bitfield_p (str_rtx, bitsize, bitnum, fieldmode))
+	{
+	  str_rtx = adjust_bitfield_address (str_rtx, fieldmode,
+					     bitnum / BITS_PER_UNIT);
+	  emit_move_insn (str_rtx, value);
+	}
+      else
+	/* Explicitly override the C/C++ memory model; ignore the
+	   bit range so that we can do the access in the mode mandated
+	   by -fstrict-volatile-bitfields instead.  */
+	store_fixed_bit_field (str_rtx, bitsize, bitnum, 0, 0, value);
+      return;
+    }
+
   /* Under the C++0x memory model, we must not touch bits outside
      the bit region.  Adjust the address to start at the beginning of
      the bit region.  */
@@ -923,29 +976,12 @@ store_fixed_bit_field (rtx op0, unsigned
   if (MEM_P (op0))
     {
-      unsigned HOST_WIDE_INT
[PATCH] reimplement -fstrict-volatile-bitfields v4, part 2/2
This patch makes -fstrict-volatile-bitfields not be the default on any target. It is unchanged from part 4 of the previous version (v3) of this patch set that I originally posted back in June and have been re-pinging many times since then. Some reviewers of my original patch series argued quite strongly that the C/C++ language semantics ought to take precedence over any target-specific ABI in cases where they conflict. I am neutral on this change myself, but if it is a requirement for approval of the other part of the patch that fixes the wrong-code bugs, I think users on targets such as ARM that currently default to enabling this flag would be better off specifying the flag explicitly to get code that does what they want, than getting broken code by default and no way to tell GCC to DTRT. :-( And that's the behavior we're stuck with if we do nothing or can't reach a consensus about what to do. :-( Bernd Edlinger has been working on a followup patch to issue warnings in cases where -fstrict-volatile-bitfields behavior conflicts with the new C/C++ memory model, which might help to ease the transition in the change of defaults. I believe this was the last version posted: http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00100.html -Sandra 2013-09-28 Sandra Loosemore san...@codesourcery.com gcc/ * config/aarch64/aarch64.c (aarch64_override_options): Don't override flag_strict_volatile_bitfields. * config/arm/arm.c (arm_option_override): Likewise. * config/h8300/h8300.c (h8300_option_override): Likewise. * config/m32c/m32c.c (m32c_option_override): Likewise. * config/rx/rx.c (rx_option_override): Likewise. * config/sh/sh.c (sh_option_override): Likewise. * doc/invoke.texi (Code Gen Options): Document that -fstrict-volatile-bitfields is no longer the default on any target. gcc/testsuite/ * gcc.target/arm/volatile-bitfields-1.c: Add explicit -fstrict-volatile-bitfields. * gcc.target/arm/volatile-bitfields-2.c: Likewise. * gcc.target/arm/volatile-bitfields-3.c: Likewise. 
* gcc.target/arm/volatile-bitfields-4.c: Likewise.

Index: gcc/config/aarch64/aarch64.c
===
--- gcc/config/aarch64/aarch64.c (revision 203002)
+++ gcc/config/aarch64/aarch64.c (working copy)
@@ -5141,10 +5141,6 @@ aarch64_override_options (void)
   aarch64_build_bitmask_table ();

-  /* This target defaults to strict volatile bitfields.  */
-  if (flag_strict_volatile_bitfields < 0 && abi_version_at_least (2))
-    flag_strict_volatile_bitfields = 1;
-
   /* If the user did not specify a processor, choose the default one for
      them.  This will be the CPU set during configuration using
      --with-cpu, otherwise it is generic.  */

Index: gcc/config/arm/arm.c
===
--- gcc/config/arm/arm.c (revision 203002)
+++ gcc/config/arm/arm.c (working copy)
@@ -2136,11 +2136,6 @@ arm_option_override (void)
 			   global_options.x_param_values,
 			   global_options_set.x_param_values);

-  /* ARM EABI defaults to strict volatile bitfields.  */
-  if (TARGET_AAPCS_BASED && flag_strict_volatile_bitfields < 0
-      && abi_version_at_least(2))
-    flag_strict_volatile_bitfields = 1;
-
   /* Enable sw prefetching at -O3 for CPUS that have prefetch, and we
      have deemed it beneficial (signified by setting
      num_prefetch_slots to 1 or more.)  */
   if (flag_prefetch_loop_arrays < 0

Index: gcc/config/h8300/h8300.c
===
--- gcc/config/h8300/h8300.c (revision 203002)
+++ gcc/config/h8300/h8300.c (working copy)
@@ -437,10 +437,6 @@ h8300_option_override (void)
 	 restore er6 though, so bump up the cost.  */
       h8300_move_ratio = 6;
     }
-
-  /* This target defaults to strict volatile bitfields.  */
-  if (flag_strict_volatile_bitfields < 0 && abi_version_at_least(2))
-    flag_strict_volatile_bitfields = 1;
 }

 /* Return the byte register name for a register rtx X.  B should be 0

Index: gcc/config/m32c/m32c.c
===
--- gcc/config/m32c/m32c.c (revision 203002)
+++ gcc/config/m32c/m32c.c (working copy)
@@ -416,10 +416,6 @@ m32c_option_override (void)
   if (TARGET_A24)
     flag_ivopts = 0;

-  /* This target defaults to strict volatile bitfields.  */
-  if (flag_strict_volatile_bitfields < 0 && abi_version_at_least(2))
-    flag_strict_volatile_bitfields = 1;
-
   /* r8c/m16c have no 16-bit indirect call, so thunks are involved.
      This is always worse than an absolute call.  */
   if (TARGET_A16)

Index: gcc/config/rx/rx.c
===
--- gcc/config/rx/rx.c (revision 203002)
+++ gcc/config/rx/rx.c (working copy)
@@ -2691,10 +2691,6 @@ rx_option_override (void)
 	}
     }

-  /* This target defaults to strict volatile bitfields.  */
-  if (flag_strict_volatile_bitfields < 0 && abi_version_at_least(2))
-
Re: [patch] reimplement -fstrict-volatile-bitfields
On Fri, Jun 14, 2013 at 6:31 PM, Sandra Loosemore san...@codesourcery.com wrote:

On 06/14/2013 06:31 AM, Richard Biener wrote:

I think we can split the patch up, so let me do piecewise approval of changes. The changes that remove the packedp flag passing around and remove the warning code are ok. The store_bit_field_1 change is ok, as is the similar extract_bit_field_1 change (guard the other register copying path). Please split those out and commit them separately (I suppose they are strict progressions in testing).

I will work on splitting the patch, but I feel a little uncomfortable about starting to commit it piecewise without some more clear indication that this is the right direction. Otherwise we'll just be back in the situation where things are still inconsistent and broken, but in a different way than before.

That leaves a much smaller set of changes for which I have one comment for now:

@@ -4508,6 +4508,16 @@ get_bit_range (unsigned HOST_WIDE_INT *b
 
   gcc_assert (TREE_CODE (exp) == COMPONENT_REF);
 
+  /* If -fstrict-volatile-bitfields was given and this is a volatile
+     access, then we must ignore any DECL_BIT_FIELD_REPRESENTATIVE and
+     do the access in the mode of the field.  */
+  if (TREE_THIS_VOLATILE (exp)
+      && flag_strict_volatile_bitfields > 0)
+    {
+      *bitstart = *bitend = 0;
+      return;
+    }

With the past reviews where I said the implementation was broken anyway, I was hinting that DECL_BIT_FIELD_REPRESENTATIVE should simply be chosen in a way that adheres to flag_strict_volatile_bitfields. I had the impression that this alone should be enough to implement strict volatile bitfields correctly, and that no code in the backend would need to check for flag_strict_volatile_bitfields. I may of course be wrong here, as -fstrict-volatile-bitfields may apply to cases where the bitfield is _not_ declared volatile but only the decl? Thus:

struct X { int i : 3; };
volatile struct X x;

Is x.i an access that needs to adhere to strict volatile bitfield accesses?
Yes, exactly; see the new pr23623.c test case included with the patch, for instance. Or the volatile bitfield access could be through a volatile-qualified pointer, as in the volatile-bitfields-3.c test case. Bernd Schmidt and I had some internal discussion about this and neither of us saw how messing with DECL_BIT_FIELD_REPRESENTATIVE could help where the field is not declared volatile but the object being referenced is. Yeah, I can see that. Note that IIRC whether you short-cut get_bit_range in the above way you still get the access to use the mode of the FIELD_DECL. As I noted in my previous message, the pr23623.c test case was failing on all the targets I tried without this patch hunk, so the access was clearly not using the mode of the FIELD_DECL without it. If the above constitutes a strict volatile bitfield access then the very correct implementation of strict volatile bitfield accesses is to adjust the memory accesses the frontends generate (consider LTO and a TU compiled with -fstrict-volatile-bitfields and another TU compiled with -fno-strict-volatile-bitfields). Which it probably does reasonably well already (setting TREE_THIS_VOLATILE on the outermost bit-field COMPONENT_REF). If that is not preserved then converting the accesses to BIT_FIELD_REFs is another possibility. That said, the behavior of GIMPLE IL should not change dependent on a compile flag. I am kind of lost, here. -fstrict-volatile-bitfields is a code generation option, not a front end option, and it seems like LTO should not need any special handling here any more than it does for any of the gazillion other code generation options supported by GCC. LTO doesn't have any handling of code-generation options. And it cannot reliably. It has a few hacks to support mixed -fstrict-aliasing units for example, but it will be completely lost if you mix -fstrict-volatile-bitfields with -fno-strict-volatile-bitfields TUs. 
Of course the easiest solution here is to always adhere to the ABI and simply drop the command-line flag, making it a target hook instead. That said, with the goal of having bitfield accesses lowered to their final memory access patterns well before RTL expansion, we'll end up using BIT_FIELD_REF-like accesses for the strict volatile bitfields at some point anyway. Richard. -Sandra
Re: [patch] reimplement -fstrict-volatile-bitfields
On Wed, Jun 12, 2013 at 11:59 PM, Sandra Loosemore san...@codesourcery.com wrote: Background: on ARM and some other targets, the ABI requires that volatile bit-fields be accessed atomically in a mode that corresponds to the declared type of the field, which conflicts with GCC's normal behavior of doing accesses in a mode that might correspond to the size of a general-purpose register, the size of the bit-field, or the bit range corresponding to the C++ memory model. This is what the -fstrict-volatile-bitfields flag does, and it is the default on ARM and other targets where the ABI requires this behavior. Both the original patch that added -fstrict-volatile-bitfields support and a subsequent followup patch that tried to unbreak handling of packed structures (where fields might not be properly aligned to do the single access otherwise mandated by -fstrict-volatile-bitfields) only handled bit-field reads, and not writes. Last year I submitted a patch we've had locally for some time to extend the existing implementation to writes, but it was rejected on the grounds that the current implementation is too broken to fix or extend. I didn't have time then to start over from scratch, and meanwhile, the bug reports about -fstrict-volatile-bitfields have continued to pile up. So let's try again to fix this, this time working from the ground up. From last year's discussion, it seemed that there were two primary objections to the current implementation: (1) It was seen as inappropriate that warnings about conflicts between unaligned fields and -fstrict-volatile-bitfields were being emitted during expand. It was suggested that any diagnostics ought to be emitted by the various language front ends instead. (2) The way packed fields are being detected is buggy and an abstraction violation, and passing around a packedp flag to all the bit-field expand functions is ugly. 
And, my own complaints about the current implementation: (3) Users expect packed structures to work even on targets where -fstrict-volatile-bitfields is the default, so the compiler shouldn't generate code for accesses to unaligned fields that either faults at run time due to the unaligned access or silently produces an incorrect result (e.g., by only accessing part of the bit-field), with or without a warning at compile time. (4) There's pointless divergence between the bit-field store and extract code that has led to a number of bugs. I've come up with a new patch that tries to address all these issues. For problem (1), I've eliminated the warnings from expand. I'm not opposed to adding them back to the front ends, as previously suggested, but given that they were only previously implemented for reads and not writes and that it was getting the different-warning-for-packed-fields behavior wrong in some cases, getting rid of the warnings is at least as correct as adding them for bit-field writes, too. ;-) I've killed the packedp flag from item (2) completely too. For problem (3), my reading of the ARM ABI document is that the requirements for atomic access to volatile bit-fields only apply to bit-fields that are aligned according to the ABI requirements. If a user has used GCC extensions to create a non-ABI-compliant packed structure where an atomic bit-field access of the correct size is not possible or valid on the target, then GCC ought to define some reasonable access behavior for those bit-fields too as a further extension -- whether or not it complies with the ABI requirements for unpacked structures -- rather than just generating invalid code. In particular, generating the access using whatever technique it would fall back to if -fstrict-volatile-bitfields didn't apply, in cases where it *cannot* be applied, seems perfectly reasonable to me. 
To address problem (4), I've tried to make the code for handling -fstrict-volatile-bitfields similar in the read and write cases. I think there is probably more that can be done here in terms of refactoring some of the now-common code and changing the interfaces to be more consistent as well, but I think it would be more clear to separate changes that are just code cleanup from those that are intended to change behavior. I'm willing to work on code refactoring as a followup patch if the maintainers recommend or require that.

I've regression tested the attached patch on arm-none-eabi as well as bootstrapping and regression testing on x86_64-linux-gnu. I also did some spot testing on mipsisa32r2-sde-elf. I verified that all the new test cases pass on these targets with this patch. Without the patch, I saw these failures:

pr23623.c: ARM, x86_64, MIPS
pr48784-1.c: ARM, x86_64, MIPS
pr48784-2.c: none
pr56341-1.c: ARM, MIPS
pr56341-2.c: none
pr56997-1.c: ARM
pr56997-2.c: ARM, MIPS
pr56997-3.c: ARM, MIPS
volatile-bitfields-3.c: ARM, x86_64, MIPS

Here are some comments on specific parts of the patch. The meat
Re: [patch] reimplement -fstrict-volatile-bitfields
On 06/14/2013 06:31 AM, Richard Biener wrote:

I think we can split the patch up, so let me do piecewise approval of changes. The changes that remove the packedp flag passing around and remove the warning code are ok. The store_bit_field_1 change is ok, as is the similar extract_bit_field_1 change (guard the other register copying path). Please split those out and commit them separately (I suppose they are strict progressions in testing).

I will work on splitting the patch, but I feel a little uncomfortable about starting to commit it piecewise without some more clear indication that this is the right direction. Otherwise we'll just be back in the situation where things are still inconsistent and broken, but in a different way than before.

That leaves a much smaller set of changes for which I have one comment for now:

@@ -4508,6 +4508,16 @@ get_bit_range (unsigned HOST_WIDE_INT *b
 
   gcc_assert (TREE_CODE (exp) == COMPONENT_REF);
 
+  /* If -fstrict-volatile-bitfields was given and this is a volatile
+     access, then we must ignore any DECL_BIT_FIELD_REPRESENTATIVE and
+     do the access in the mode of the field.  */
+  if (TREE_THIS_VOLATILE (exp)
+      && flag_strict_volatile_bitfields > 0)
+    {
+      *bitstart = *bitend = 0;
+      return;
+    }

With the past reviews where I said the implementation was broken anyway, I was hinting that DECL_BIT_FIELD_REPRESENTATIVE should simply be chosen in a way that adheres to flag_strict_volatile_bitfields. I had the impression that this alone should be enough to implement strict volatile bitfields correctly, and that no code in the backend would need to check for flag_strict_volatile_bitfields. I may of course be wrong here, as -fstrict-volatile-bitfields may apply to cases where the bitfield is _not_ declared volatile but only the decl? Thus:

struct X { int i : 3; };
volatile struct X x;

Is x.i an access that needs to adhere to strict volatile bitfield accesses?

Yes, exactly; see the new pr23623.c test case included with the patch, for instance.
Or the volatile bitfield access could be through a volatile-qualified pointer, as in the volatile-bitfields-3.c test case. Bernd Schmidt and I had some internal discussion about this and neither of us saw how messing with DECL_BIT_FIELD_REPRESENTATIVE could help where the field is not declared volatile but the object being referenced is. Note that IIRC whether you short-cut get_bit_range in the above way you still get the access to use the mode of the FIELD_DECL. As I noted in my previous message, the pr23623.c test case was failing on all the targets I tried without this patch hunk, so the access was clearly not using the mode of the FIELD_DECL without it. If the above constitutes a strict volatile bitfield access then the very correct implementation of strict volatile bitfield accesses is to adjust the memory accesses the frontends generate (consider LTO and a TU compiled with -fstrict-volatile-bitfields and another TU compiled with -fno-strict-volatile-bitfields). Which it probably does reasonably well already (setting TREE_THIS_VOLATILE on the outermost bit-field COMPONENT_REF). If that is not preserved then converting the accesses to BIT_FIELD_REFs is another possibility. That said, the behavior of GIMPLE IL should not change dependent on a compile flag. I am kind of lost, here. -fstrict-volatile-bitfields is a code generation option, not a front end option, and it seems like LTO should not need any special handling here any more than it does for any of the gazillion other code generation options supported by GCC. -Sandra
[patch] reimplement -fstrict-volatile-bitfields
Background: on ARM and some other targets, the ABI requires that volatile bit-fields be accessed atomically in a mode that corresponds to the declared type of the field, which conflicts with GCC's normal behavior of doing accesses in a mode that might correspond to the size of a general-purpose register, the size of the bit-field, or the bit range corresponding to the C++ memory model. This is what the -fstrict-volatile-bitfields flag does, and it is the default on ARM and other targets where the ABI requires this behavior. Both the original patch that added -fstrict-volatile-bitfields support and a subsequent followup patch that tried to unbreak handling of packed structures (where fields might not be properly aligned to do the single access otherwise mandated by -fstrict-volatile-bitfields) only handled bit-field reads, and not writes. Last year I submitted a patch we've had locally for some time to extend the existing implementation to writes, but it was rejected on the grounds that the current implementation is too broken to fix or extend. I didn't have time then to start over from scratch, and meanwhile, the bug reports about -fstrict-volatile-bitfields have continued to pile up. So let's try again to fix this, this time working from the ground up. From last year's discussion, it seemed that there were two primary objections to the current implementation: (1) It was seen as inappropriate that warnings about conflicts between unaligned fields and -fstrict-volatile-bitfields were being emitted during expand. It was suggested that any diagnostics ought to be emitted by the various language front ends instead. (2) The way packed fields are being detected is buggy and an abstraction violation, and passing around a packedp flag to all the bit-field expand functions is ugly. 
And, my own complaints about the current implementation: (3) Users expect packed structures to work even on targets where -fstrict-volatile-bitfields is the default, so the compiler shouldn't generate code for accesses to unaligned fields that either faults at run time due to the unaligned access or silently produces an incorrect result (e.g., by only accessing part of the bit-field), with or without a warning at compile time. (4) There's pointless divergence between the bit-field store and extract code that has led to a number of bugs. I've come up with a new patch that tries to address all these issues. For problem (1), I've eliminated the warnings from expand. I'm not opposed to adding them back to the front ends, as previously suggested, but given that they were only previously implemented for reads and not writes and that it was getting the different-warning-for-packed-fields behavior wrong in some cases, getting rid of the warnings is at least as correct as adding them for bit-field writes, too. ;-) I've killed the packedp flag from item (2) completely too. For problem (3), my reading of the ARM ABI document is that the requirements for atomic access to volatile bit-fields only apply to bit-fields that are aligned according to the ABI requirements. If a user has used GCC extensions to create a non-ABI-compliant packed structure where an atomic bit-field access of the correct size is not possible or valid on the target, then GCC ought to define some reasonable access behavior for those bit-fields too as a further extension -- whether or not it complies with the ABI requirements for unpacked structures -- rather than just generating invalid code. In particular, generating the access using whatever technique it would fall back to if -fstrict-volatile-bitfields didn't apply, in cases where it *cannot* be applied, seems perfectly reasonable to me. 
To address problem (4), I've tried to make the code for handling -fstrict-volatile-bitfields similar in the read and write cases. I think there is probably more that can be done here in terms of refactoring some of the now-common code and changing the interfaces to be more consistent as well, but I think it would be more clear to separate changes that are just code cleanup from those that are intended to change behavior. I'm willing to work on code refactoring as a followup patch if the maintainers recommend or require that.

I've regression tested the attached patch on arm-none-eabi as well as bootstrapping and regression testing on x86_64-linux-gnu. I also did some spot testing on mipsisa32r2-sde-elf. I verified that all the new test cases pass on these targets with this patch. Without the patch, I saw these failures:

pr23623.c: ARM, x86_64, MIPS
pr48784-1.c: ARM, x86_64, MIPS
pr48784-2.c: none
pr56341-1.c: ARM, MIPS
pr56341-2.c: none
pr56997-1.c: ARM
pr56997-2.c: ARM, MIPS
pr56997-3.c: ARM, MIPS
volatile-bitfields-3.c: ARM, x86_64, MIPS

Here are some comments on specific parts of the patch. The meat of the patch is rewriting the logic for -fstrict-volatile-bitfields in