Re: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields
On Mon, Mar 16, 2015 at 11:53 AM, Bernd Edlinger <bernd.edlin...@hotmail.de> wrote:
> Hi,
>
> when looking at the m68k I realized the following, which is a general
> problem...
>
> If the alignment of the structure is less than sizeof(field), the strict
> volatile bitfields code may read beyond the end of the structure!
>
> Consider this example:
>
> struct s
> {
>   char x : 8;
>   volatile unsigned int y : 31;
>   volatile unsigned int z : 1;
> } __attribute__((packed));
>
> struct s global;
>
> Here we have sizeof(struct s) == 5, alignment(global) == 1.  However when
> we access global.z we read a 32-bit word at offset 4, which touches 3 bytes
> that are not safe to use.
>
> Something like that never happens with -fno-strict-volatile-bitfields,
> because IIRC, with the only exception of the simple_mem_bitfield_p code
> path, there is never an access mode used which is larger than MEM_ALIGN(x).

Are you sure?  There is still PR36043 ...

> In this example, if I want to use the packed attribute, I also have to use
> the aligned(4) attribute; this satisfies the check
> "MEM_ALIGN (op0) >= modesize", which is IMO always necessary for strict
> volatile bitfields, not only on STRICT_ALIGNMENT targets.

But your patch still somehow special-cases them.

> On a target that has BIGGEST_ALIGNMENT < BITS_PER_WORD, to use the strict
> volatile bitfields, you have to add __attribute__((aligned(4))) to the
> structure.  I had to do that on the pr23623.c test case, to have it pass on
> m68k for instance.
>
> I have attached the updated patch.  As explained before, the check
> "MEM_ALIGN (op0) >= modesize" should always be done in
> strict_volatile_bitfield_p.
>
> For the targets that usually enable -fstrict-volatile-bitfields, nothing
> changes, except when we use packed on the structure, we also need to add an
> aligned(4) attribute.  For m68k, where the natural alignment of any
> structure is <= 2, we need to force aligned(4) if we want to ensure the
> access is in SImode.
>
> Boot-strapped and reg-tested on x86_64-linux-gnu.
> OK for trunk?

So - shouldn't the check be

  if (MEM_ALIGN (op0) < GET_MODE_ALIGNMENT (fieldmode))
    return false;

instead?  And looking at

   /* Check for cases of unaligned fields that must be split.  */
-  if (bitnum % BITS_PER_UNIT + bitsize > modesize
-      || (STRICT_ALIGNMENT
-          && bitnum % GET_MODE_ALIGNMENT (fieldmode) + bitsize > modesize))
+  if (bitnum % (STRICT_ALIGNMENT ? modesize : BITS_PER_UNIT)
+      + bitsize > modesize)
     return false;

I wonder what the required semantics are - are they not that we need to
access the whole underlying field with the access (the C++ memory model only
requires we don't go beyond the field)?  It seems that information isn't
readily available here though.  So the check verifies that we can access the
field with a single access using fieldmode.  Which means (again),

  if (bitnum % (STRICT_ALIGNMENT
                ? GET_MODE_ALIGNMENT (fieldmode) : BITS_PER_UNIT)
      + bitsize > modesize)

Also this means that for !STRICT_ALIGNMENT platforms the MEM_ALIGN check
isn't sufficient, which is what the other hunks in the patch are about to
fix?

It seems at least the

@@ -988,6 +995,16 @@ store_bit_field (rtx str_rtx, unsigned HOST_WIDE_I
       str_rtx = narrow_bit_field_mem (str_rtx, fieldmode, bitsize, bitnum,
                                       &bitnum);
+      if (!STRICT_ALIGNMENT
+          && bitnum + bitsize > GET_MODE_BITSIZE (fieldmode))
+        {
+          unsigned HOST_WIDE_INT offset;
+          offset = (bitnum + bitsize + BITS_PER_UNIT - 1
+                    - GET_MODE_BITSIZE (fieldmode)) / BITS_PER_UNIT;
+          str_rtx = adjust_bitfield_address (str_rtx, fieldmode, offset);
+          bitnum -= offset * BITS_PER_UNIT;
+        }
+      gcc_assert (bitnum + bitsize <= GET_MODE_BITSIZE (fieldmode));

hunks could do with a comment.

That said, I fail to realize how the issue is specific to strict-volatile
bitfields?  I also hope somebody else will also look at the patch - I'm not
feeling like the appropriate person to review changes in this area (even if I
did so in the past).

Thanks,
Richard.

> Thanks
> Bernd.
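[To make the overread concrete, here is a minimal standalone sketch, not from
the thread itself; it uses the struct from the mail above and assumes a
typical target with 32-bit int:]

  /* Compile with -fstrict-volatile-bitfields.  sizeof(struct s) is 5 and
     the object is only byte-aligned, so an SImode read/write of global.z
     at byte offset 4 would touch bytes 5..7, past the end of the object.  */
  #include <stdio.h>

  struct s
  {
    char x : 8;
    volatile unsigned int y : 31;
    volatile unsigned int z : 1;
  } __attribute__((packed));

  struct s global;

  int
  main (void)
  {
    printf ("sizeof = %u, align = %u\n",
            (unsigned) sizeof (struct s),
            (unsigned) __alignof__ (global));   /* prints: 5, 1 */
    global.z = 1;   /* the problematic volatile bit-field store */
    return global.z != 1;
  }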
RE: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields
Hi,

On Mon, 30 Mar 2015 12:33:47, Richard Biener wrote:
> On Mon, Mar 16, 2015 at 11:53 AM, Bernd Edlinger <bernd.edlin...@hotmail.de> wrote:
>> Hi,
>>
>> when looking at the m68k I realized the following, which is a general
>> problem...
>>
>> If the alignment of the structure is less than sizeof(field), the strict
>> volatile bitfields code may read beyond the end of the structure!
>>
>> Consider this example:
>>
>> struct s
>> {
>>   char x : 8;
>>   volatile unsigned int y : 31;
>>   volatile unsigned int z : 1;
>> } __attribute__((packed));
>>
>> struct s global;
>>
>> Here we have sizeof(struct s) == 5, alignment(global) == 1.  However when
>> we access global.z we read a 32-bit word at offset 4, which touches 3
>> bytes that are not safe to use.
>>
>> Something like that never happens with -fno-strict-volatile-bitfields,
>> because IIRC, with the only exception of the simple_mem_bitfield_p code
>> path, there is never an access mode used which is larger than
>> MEM_ALIGN(x).
>
> Are you sure?  There is still PR36043 ...

I meant extract_bit_field/store_bit_field; anything can happen if you use
emit_move_insn directly, but extract_bit_field should not do that.

>> In this example, if I want to use the packed attribute, I also have to use
>> the aligned(4) attribute; this satisfies the check
>> "MEM_ALIGN (op0) >= modesize", which is IMO always necessary for strict
>> volatile bitfields, not only on STRICT_ALIGNMENT targets.
>
> But your patch still somehow special-cases them.

Yes.  My original intention was to by-pass the strict-volatile-bitfields code
path altogether, for everything that it can not safely handle.  That would
mean:

--- gcc/expmed.c        (revision 221427)
+++ gcc/expmed.c        (working copy)
@@ -472,11 +472,16 @@ strict_volatile_bitfield_p (rtx op0, unsigned HOST
     return false;
 
   /* Check for cases of unaligned fields that must be split.  */
-  if (bitnum % BITS_PER_UNIT + bitsize > modesize
-      || (STRICT_ALIGNMENT
-          && bitnum % GET_MODE_ALIGNMENT (fieldmode) + bitsize > modesize))
+  if (bitnum % modesize + bitsize > modesize)
     return false;
 
+  /* The memory must be sufficiently aligned for a MODESIZE access.
+     This condition guarantees, that the memory access will not
+     touch anything after the end of the structure.  */
+  if (MEM_ALIGN (op0) < modesize)
+    return false;
+
   /* Check for cases where the C++ memory model applies.  */
   if (bitregion_end != 0
       && (bitnum - bitnum % modesize < bitregion_start

and _removing_ that:

@@ -988,6 +995,16 @@ store_bit_field (rtx str_rtx, unsigned HOST_WIDE_I
       str_rtx = narrow_bit_field_mem (str_rtx, fieldmode, bitsize, bitnum,
                                       &bitnum);
+      if (!STRICT_ALIGNMENT
+          && bitnum + bitsize > GET_MODE_BITSIZE (fieldmode))
+        {
+          unsigned HOST_WIDE_INT offset;
+          offset = (bitnum + bitsize + BITS_PER_UNIT - 1
+                    - GET_MODE_BITSIZE (fieldmode)) / BITS_PER_UNIT;
+          str_rtx = adjust_bitfield_address (str_rtx, fieldmode, offset);
+          bitnum -= offset * BITS_PER_UNIT;
+        }
+      gcc_assert (bitnum + bitsize <= GET_MODE_BITSIZE (fieldmode));
 
       temp = copy_to_reg (str_rtx);
       if (!store_bit_field_1 (temp, bitsize, bitnum, 0, 0,
                               fieldmode, value, true))

and _removing_ that too:

     }
 
   str_rtx = narrow_bit_field_mem (str_rtx, mode1, bitsize, bitnum, &bitnum);
+  if (!STRICT_ALIGNMENT
+      && bitnum + bitsize > GET_MODE_BITSIZE (mode1))
+    {
+      unsigned HOST_WIDE_INT offset;
+      offset = (bitnum + bitsize + BITS_PER_UNIT - 1
+                - GET_MODE_BITSIZE (mode1)) / BITS_PER_UNIT;
+      str_rtx = adjust_bitfield_address (str_rtx, mode1, offset);
+      bitnum -= offset * BITS_PER_UNIT;
+    }
+  gcc_assert (bitnum + bitsize <= GET_MODE_BITSIZE (mode1));
   str_rtx = copy_to_reg (str_rtx);
 }

I would keep just the added assertions in the expansion path.

However, I added that to do what you requested in a previous e-mail:

>> every access is split in 4 QImode accesses.  but that is as expected,
>> because the structure is byte aligned.
>
> No, it is not expected because the CPU can handle unaligned SImode
> reads/writes just fine (even if not as an atomic operation).  The C++
> memory model allows an SImode read to s.y (-fstrict-volatile-bitfields
> would, as well, but the CPU doesn't guarantee atomic operation here)
>
> Richard.

... but now I am not too happy with it either.

What we need for the device register accesses is just a predictable access at
a predictable offset.  And it should better be aligned to MODE_BITSIZE; it is
irrelevant for device registers what the CPU can access.

Furthermore, if we have a structure like { int x:16; int y:16; } when
MODE_ALIGNMENT(SImode) == 16, it would be possible to access y as SImode at
offset 0, or at offset 2; both would be equally possible.  The
RE: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields
Hi,

On Mon, 16 Mar 2015 20:31:53, Hans-Peter Nilsson wrote:
> On Mon, 16 Mar 2015, Eric Botcazou wrote:
>>> If we have BIGGEST_ALIGNMENT=16 that means we have likely a 16 bit
>>> architecture.  I doubt that the strict alignment code makes any sense for
>>> modesize > BIGGEST_ALIGNMENT.
>>
>> Note that m68k is a 32-bit port (UNITS_PER_WORD is 4) but, by definition
>> of BIGGEST_ALIGNMENT, not honoring an alignment larger than it is
>> perfectly OK.
>
> The context of the above statement is somewhat ambiguous: if the statement
> is taken to include "no matter what is specified in the program, including
> use of __attribute__ ((__aligned__ ...))" then I have to object.  (I guess
> Eric, you didn't actually mean that, though.)
>
> The definition of BIGGEST_ALIGNMENT is (tm.texi.in): "Biggest alignment
> that any data type can require on this machine, in bits.  Note that this is
> not the biggest alignment that is supported, just the biggest alignment
> that, when violated, may cause a fault."
>
> So, IMNSHO we'd better *support* a *larger* alignment, as in "if the code
> specified it by explicit means", at least up to MAX_OFILE_ALIGNMENT.  But,
> "perfectly OK" when the code didn't explicitly say anything else.

Absolutely.  The idea of this patch is to *require* the use of
__attribute__((__aligned__(x))) for all structures that need the strict
volatile bitfields semantics, at least if the ABI has
BIGGEST_ALIGNMENT < BITS_PER_WORD.  I don't see any alternative.

Thanks
Bernd.
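[A sketch of that requirement, based on the struct from the start of the
thread rather than a testcase from the patch: combining packed with an
explicit aligned(4) keeps the field layout dense while guaranteeing that a
32-bit access cannot run past the object.]

  struct s
  {
    char x : 8;
    volatile unsigned int y : 31;
    volatile unsigned int z : 1;
  } __attribute__((packed, aligned(4)));

  struct s global;

  int
  main (void)
  {
    /* sizeof(struct s) is now 8: packed keeps the bit-field layout dense,
       aligned(4) pads the tail and raises the object alignment, so the
       SImode access for z at byte offset 4 stays inside the object.  */
    return sizeof (struct s) != 8;
  }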
Re: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields
On Mon, 16 Mar 2015, Eric Botcazou wrote:
>> If we have BIGGEST_ALIGNMENT=16 that means we have likely a 16 bit
>> architecture.  I doubt that the strict alignment code makes any sense for
>> modesize > BIGGEST_ALIGNMENT.
>
> Note that m68k is a 32-bit port (UNITS_PER_WORD is 4) but, by definition of
> BIGGEST_ALIGNMENT, not honoring an alignment larger than it is perfectly OK.

The context of the above statement is somewhat ambiguous: if the statement is
taken to include "no matter what is specified in the program, including use
of __attribute__ ((__aligned__ ...))" then I have to object.  (I guess Eric,
you didn't actually mean that, though.)

The definition of BIGGEST_ALIGNMENT is (tm.texi.in): "Biggest alignment that
any data type can require on this machine, in bits.  Note that this is not
the biggest alignment that is supported, just the biggest alignment that,
when violated, may cause a fault."

So, IMNSHO we'd better *support* a *larger* alignment, as in "if the code
specified it by explicit means", at least up to MAX_OFILE_ALIGNMENT.  But,
"perfectly OK" when the code didn't explicitly say anything else.

BTW, unfortunately, BIGGEST_ALIGNMENT can't be blindly followed for atomic
accesses to "undecorated" (without explicit alignment specifiers) code
either.  Rather, the minimum alignment is the natural alignment *absolutely
everywhere no matter what target*, which matters for targets where
BIGGEST_ALIGNMENT < natural_alignment(all supported types) (well, as long as
the natural alignment is smaller than the target page-size or cache-line,
where at least one of the concepts is usually applicable).

Q: So why not adjust the BIGGEST_ALIGNMENT definition in such targets to be
at least the natural alignment of supported atomic types?
A: Because that unfortunately has bad effects on generated code for all
accesses to all sizes now <= BIGGEST_ALIGNMENT.

brgds, H-P
PS. It's very unfortunate that there's now __BIGGEST_ALIGNMENT__ visible to
programs.
Re: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields
Hans-Peter Nilsson <h...@bitrange.com> writes:
> Q: So why not adjust the BIGGEST_ALIGNMENT definition in such targets to be
> at least the natural alignment of supported atomic types?

A: Because it's an ABI change.

Andreas.

--
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
Re: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields
On Tue, 17 Mar 2015, Andreas Schwab wrote:
> Hans-Peter Nilsson <h...@bitrange.com> writes:
>> Q: So why not adjust the BIGGEST_ALIGNMENT definition in such targets to
>> be at least the natural alignment of supported atomic types?
>
> A: Because it's an ABI change.

I intended that to be included in "bad effects":

  A: Because that unfortunately has bad effects on generated code for all
  accesses to all sizes now <= BIGGEST_ALIGNMENT.

but thanks for being specific (and I didn't remember exactly *what* bad
effects it was that I saw when I tried that long ago :)

brgds, H-P
RE: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields
Hi,

when looking at the m68k I realized the following, which is a general
problem...

If the alignment of the structure is less than sizeof(field), the strict
volatile bitfields code may read beyond the end of the structure!

Consider this example:

struct s
{
  char x : 8;
  volatile unsigned int y : 31;
  volatile unsigned int z : 1;
} __attribute__((packed));

struct s global;

Here we have sizeof(struct s) == 5, alignment(global) == 1.  However when we
access global.z we read a 32-bit word at offset 4, which touches 3 bytes that
are not safe to use.  Something like that never happens with
-fno-strict-volatile-bitfields, because IIRC, with the only exception of the
simple_mem_bitfield_p code path, there is never an access mode used which is
larger than MEM_ALIGN(x).

In this example, if I want to use the packed attribute, I also have to use
the aligned(4) attribute; this satisfies the check
"MEM_ALIGN (op0) >= modesize", which is IMO always necessary for strict
volatile bitfields, not only on STRICT_ALIGNMENT targets.

On a target that has BIGGEST_ALIGNMENT < BITS_PER_WORD, to use the strict
volatile bitfields, you have to add __attribute__((aligned(4))) to the
structure.  I had to do that on the pr23623.c test case, to have it pass on
m68k for instance.

I have attached the updated patch.  As explained before, the check
"MEM_ALIGN (op0) >= modesize" should always be done in
strict_volatile_bitfield_p.  For the targets that usually enable
-fstrict-volatile-bitfields, nothing changes, except when we use packed on
the structure, we also need to add an aligned(4) attribute.  For m68k, where
the natural alignment of any structure is <= 2, we need to force aligned(4)
if we want to ensure the access is in SImode.

Boot-strapped and reg-tested on x86_64-linux-gnu.
OK for trunk?

Thanks
Bernd.

gcc:
2015-03-15  Bernd Edlinger  <bernd.edlin...@hotmail.de>

        * expmed.c (strict_volatile_bitfield_p): Check that MEM_ALIGN
        allows a MODESIZE access.
        (store_bit_field, extract_bit_field): For !STRICT_ALIGNMENT
        explicitly generate an unaligned access if the field crosses a
        word boundary.

testsuite:
2015-03-15  Bernd Edlinger  <bernd.edlin...@hotmail.de>

        * gcc.dg/pr23623.c: Added aligned attribute.
        * gcc.dg/20141029-1.c: Likewise.
        * gcc.dg/20150306-1.c: New test.

patch-volatile-bitfields-1.diff
Description: Binary data
Re: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields
> If we have BIGGEST_ALIGNMENT=16 that means we have likely a 16 bit
> architecture.  I doubt that the strict alignment code makes any sense for
> modesize > BIGGEST_ALIGNMENT.

Note that m68k is a 32-bit port (UNITS_PER_WORD is 4) but, by definition of
BIGGEST_ALIGNMENT, not honoring an alignment larger than it is perfectly OK.

--
Eric Botcazou
RE: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields
Hi,

On Sat, 14 Mar 2015 13:24:33, Mikael Pettersson wrote:
> Bernd Edlinger writes:
>> Hi,
>>
>> are there any more comments on this?
>>
>> I would like to apply the patch as is, unless we find a way to get to a
>> test case, maybe with a cross-compiler, where the MODE_ALIGNMENT is
>> different from MODE_BITSIZE.  Currently, I think that does not happen.
>
> On m68k-linux GET_MODE_ALIGNMENT (SImode) == 16 while
> GET_MODE_BITSIZE (SImode) == 32.
>
> I don't know what that means for your patch, just wanted to inform you that
> such targets do exist.

Oh I see, thanks.

This is due to BIGGEST_ALIGNMENT=16, STRICT_ALIGNMENT=1 by default on that
architecture.

If I change this check as suggested:

  if (bitnum % (STRICT_ALIGNMENT
                ? GET_MODE_ALIGNMENT (fieldmode) : BITS_PER_UNIT)
      + bitsize > modesize
      || (STRICT_ALIGNMENT
          && MEM_ALIGN (op0) < GET_MODE_ALIGNMENT (fieldmode)))
    return false;

then I can get the assertion failed in store_bit_field:

  gcc_assert (bitnum + bitsize <= GET_MODE_BITSIZE (fieldmode));

with this example:

cat test.c
struct s
{
  short y:16;
  int x:31;
};

void
f(volatile struct s* z, int x)
{
  z->x=x;
}

m68k-elf-gcc -fstrict-volatile-bitfields -mstrict-align -mno-align-int -O2 -S test.c
test.c: In function 'f':
test.c:10:7: internal compiler error: in store_bit_field, at expmed.c:1005
   z->x=x;
       ^

As I said before, without the patch the test case generates just invalid code
which assigns only 16 bits.

There is also a problem in this check.  (I had to make "short y:16" a bit
field to bypass it; initially I wrote "short y;".)

  /* Check for cases where the C++ memory model applies.  */
  if (bitregion_end != 0
      && (bitnum - bitnum % modesize < bitregion_start
          || bitnum - bitnum % modesize + modesize - 1 > bitregion_end))
    return false;

This assumes also that the access is at a modesize boundary.

If we have BIGGEST_ALIGNMENT=16 that means we have likely a 16 bit
architecture.  I doubt that the strict alignment code makes any sense for
modesize > BIGGEST_ALIGNMENT.

I think I should change this check

  /* The bit size must not be larger than the field mode, and
     the field mode must not be larger than a word.  */
  if (bitsize > modesize || modesize > BITS_PER_WORD)
    return false;

to this:

  if (bitsize > modesize || modesize > BITS_PER_WORD
      || modesize > BIGGEST_ALIGNMENT)
    return false;

This should avoid these oddities.

Bernd.
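[To make the failure mode concrete, here is my own worked arithmetic, not
from the mail: for the struct above, x starts at bit 16 and is 31 bits wide,
so a single SImode (32-bit) access cannot cover it, yet the relaxed check
accepts it because GET_MODE_ALIGNMENT (SImode) is only 16 here.]

  /* Standalone recomputation of the two conditions.  */
  #include <assert.h>

  int
  main (void)
  {
    unsigned bitnum = 16, bitsize = 31;       /* field x */
    unsigned modesize = 32, mode_align = 16;  /* SImode on this m68k config */

    /* Relaxed check: 16 % 16 + 31 = 31, not > 32, so it passes...  */
    assert (!(bitnum % mode_align + bitsize > modesize));
    /* ...but the later assert needs the whole field in one SImode word:
       16 + 31 = 47 > 32, hence the ICE in store_bit_field.  */
    assert (!(bitnum + bitsize <= modesize));
    return 0;
  }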
RE: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields
Bernd Edlinger writes:
> Hi,
>
> are there any more comments on this?
>
> I would like to apply the patch as is, unless we find a way to get to a
> test case, maybe with a cross-compiler, where the MODE_ALIGNMENT is
> different from MODE_BITSIZE.  Currently, I think that does not happen.

On m68k-linux GET_MODE_ALIGNMENT (SImode) == 16 while
GET_MODE_BITSIZE (SImode) == 32.

I don't know what that means for your patch, just wanted to inform you that
such targets do exist.

/Mikael

> Thanks
> Bernd.
>
> Date: Tue, 10 Mar 2015 14:40:52 +0100
>
> Hi Richard and Eric,
>
> On Mon, 9 Mar 2015 15:30:31, Richard Biener wrote:
>>> Reg-tested on x86_64 successfully and ARM is still running.
>
> ARM completed without regressions meanwhile.
>
>>> Is it OK for trunk?
>>
>> Looks ok to me apart from
>>
>>    /* Check for cases of unaligned fields that must be split.  */
>> -  if (bitnum % BITS_PER_UNIT + bitsize > modesize
>> -      || (STRICT_ALIGNMENT
>> -          && bitnum % GET_MODE_ALIGNMENT (fieldmode) + bitsize > modesize))
>> +  if (bitnum % (STRICT_ALIGNMENT ? modesize : BITS_PER_UNIT)
>> +      + bitsize > modesize
>> +      || (STRICT_ALIGNMENT && MEM_ALIGN (op0) < modesize))
>>      return false;
>>
>> where I'd use GET_MODE_ALIGNMENT (fieldmode) rather than modesize (in both
>> places).  Please leave Eric the chance to comment.
>
> Just to clarify a few things here:
>
> I try to make the checks in strict_volatile_bitfield_p consistent with the
> strict volatile bit-field code path that follows if we return true here.
>
> I would summarize the current implementation of the strict volatile
> bit-field code as follows:
>
> For strict alignment, we access the structure as if it were an array of
> fieldmode.  A multiple of modesize is added to the base address, and one
> single read and/or write access must be sufficient to do the job.  The
> access must be OK regarding the target's alignment restrictions.  That does
> not change; what changed with the previous patch is a missed optimization
> with the EP_insv code pattern.
>
> For non strict alignment, a multiple of modesize is added to the base
> address, but if the range [bitnum:bitnum+bitsize-1] spans two fieldmode
> words, which should only happen if we use packed structures, a byte offset
> is added to the base address.  The byte offset is chosen as small as
> possible, to not reach beyond the bit field region.  That is new.  This
> change is irrelevant for the use case of accessing a device register, but
> the generated code is more compact.
>
> Usually we have GET_MODE_ALIGNMENT(fieldmode) == modesize for all
> SCALAR_INT_MODE_P(fieldmode).  The only exceptions are complex numbers, and
> targets where ADJUST_ALIGNMENT is used in the modes.def, right?  The
> targets that do that are few, and the modes are mostly vector modes.  So I
> did not find any target where the MODE_ALIGNMENT would make a difference
> here.  Therefore I think it is more or less a matter of taste.  But please
> correct me if I am wrong.
>
> If there are cases where MODE_ALIGNMENT < MODE_BITSIZE, changing the second
> condition "MEM_ALIGN(op0) < modesize" to
> "MEM_ALIGN(op0) < GET_MODE_ALIGNMENT(fieldmode)" would still be consistent,
> and probably be more correct.  But I think changing the first condition
> would allow cases where this assertion in the patch does no longer hold:
>
>   gcc_assert (bitnum + bitsize <= GET_MODE_BITSIZE (fieldmode));
>
> Thanks
> Bernd.

--
RE: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields
Hi,

are there any more comments on this?

I would like to apply the patch as is, unless we find a way to get to a test
case, maybe with a cross-compiler, where the MODE_ALIGNMENT is different from
MODE_BITSIZE.  Currently, I think that does not happen.

Thanks
Bernd.

> Date: Tue, 10 Mar 2015 14:40:52 +0100
>
> Hi Richard and Eric,
>
> On Mon, 9 Mar 2015 15:30:31, Richard Biener wrote:
>>> Reg-tested on x86_64 successfully and ARM is still running.
>
> ARM completed without regressions meanwhile.
>
>>> Is it OK for trunk?
>>
>> Looks ok to me apart from
>>
>>    /* Check for cases of unaligned fields that must be split.  */
>> -  if (bitnum % BITS_PER_UNIT + bitsize > modesize
>> -      || (STRICT_ALIGNMENT
>> -          && bitnum % GET_MODE_ALIGNMENT (fieldmode) + bitsize > modesize))
>> +  if (bitnum % (STRICT_ALIGNMENT ? modesize : BITS_PER_UNIT)
>> +      + bitsize > modesize
>> +      || (STRICT_ALIGNMENT && MEM_ALIGN (op0) < modesize))
>>      return false;
>>
>> where I'd use GET_MODE_ALIGNMENT (fieldmode) rather than modesize (in both
>> places).  Please leave Eric the chance to comment.
>
> Just to clarify a few things here:
>
> I try to make the checks in strict_volatile_bitfield_p consistent with the
> strict volatile bit-field code path that follows if we return true here.
>
> I would summarize the current implementation of the strict volatile
> bit-field code as follows:
>
> For strict alignment, we access the structure as if it were an array of
> fieldmode.  A multiple of modesize is added to the base address, and one
> single read and/or write access must be sufficient to do the job.  The
> access must be OK regarding the target's alignment restrictions.  That does
> not change; what changed with the previous patch is a missed optimization
> with the EP_insv code pattern.
>
> For non strict alignment, a multiple of modesize is added to the base
> address, but if the range [bitnum:bitnum+bitsize-1] spans two fieldmode
> words, which should only happen if we use packed structures, a byte offset
> is added to the base address.  The byte offset is chosen as small as
> possible, to not reach beyond the bit field region.  That is new.  This
> change is irrelevant for the use case of accessing a device register, but
> the generated code is more compact.
>
> Usually we have GET_MODE_ALIGNMENT(fieldmode) == modesize for all
> SCALAR_INT_MODE_P(fieldmode).  The only exceptions are complex numbers, and
> targets where ADJUST_ALIGNMENT is used in the modes.def, right?  The
> targets that do that are few, and the modes are mostly vector modes.  So I
> did not find any target where the MODE_ALIGNMENT would make a difference
> here.  Therefore I think it is more or less a matter of taste.  But please
> correct me if I am wrong.
>
> If there are cases where MODE_ALIGNMENT < MODE_BITSIZE, changing the second
> condition "MEM_ALIGN(op0) < modesize" to
> "MEM_ALIGN(op0) < GET_MODE_ALIGNMENT(fieldmode)" would still be consistent,
> and probably be more correct.  But I think changing the first condition
> would allow cases where this assertion in the patch does no longer hold:
>
>   gcc_assert (bitnum + bitsize <= GET_MODE_BITSIZE (fieldmode));
>
> Thanks
> Bernd.
RE: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields
Hi Richard and Eric,

On Mon, 9 Mar 2015 15:30:31, Richard Biener wrote:
>> Reg-tested on x86_64 successfully and ARM is still running.

ARM completed without regressions meanwhile.

>> Is it OK for trunk?
>
> Looks ok to me apart from
>
>    /* Check for cases of unaligned fields that must be split.  */
> -  if (bitnum % BITS_PER_UNIT + bitsize > modesize
> -      || (STRICT_ALIGNMENT
> -          && bitnum % GET_MODE_ALIGNMENT (fieldmode) + bitsize > modesize))
> +  if (bitnum % (STRICT_ALIGNMENT ? modesize : BITS_PER_UNIT)
> +      + bitsize > modesize
> +      || (STRICT_ALIGNMENT && MEM_ALIGN (op0) < modesize))
>      return false;
>
> where I'd use GET_MODE_ALIGNMENT (fieldmode) rather than modesize (in both
> places).  Please leave Eric the chance to comment.

Just to clarify a few things here:

I try to make the checks in strict_volatile_bitfield_p consistent with the
strict volatile bit-field code path that follows if we return true here.

I would summarize the current implementation of the strict volatile bit-field
code as follows:

For strict alignment, we access the structure as if it were an array of
fieldmode.  A multiple of modesize is added to the base address, and one
single read and/or write access must be sufficient to do the job.  The access
must be OK regarding the target's alignment restrictions.  That does not
change; what changed with the previous patch is a missed optimization with
the EP_insv code pattern.

For non strict alignment, a multiple of modesize is added to the base
address, but if the range [bitnum:bitnum+bitsize-1] spans two fieldmode
words, which should only happen if we use packed structures, a byte offset is
added to the base address.  The byte offset is chosen as small as possible,
to not reach beyond the bit field region.  That is new.  This change is
irrelevant for the use case of accessing a device register, but the generated
code is more compact.

Usually we have GET_MODE_ALIGNMENT(fieldmode) == modesize for all
SCALAR_INT_MODE_P(fieldmode).  The only exceptions are complex numbers, and
targets where ADJUST_ALIGNMENT is used in the modes.def, right?  The targets
that do that are few, and the modes are mostly vector modes.  So I did not
find any target where the MODE_ALIGNMENT would make a difference here.
Therefore I think it is more or less a matter of taste.  But please correct
me if I am wrong.

If there are cases where MODE_ALIGNMENT < MODE_BITSIZE, changing the second
condition "MEM_ALIGN(op0) < modesize" to
"MEM_ALIGN(op0) < GET_MODE_ALIGNMENT(fieldmode)" would still be consistent,
and probably be more correct.  But I think changing the first condition would
allow cases where this assertion in the patch does no longer hold:

  gcc_assert (bitnum + bitsize <= GET_MODE_BITSIZE (fieldmode));

Thanks
Bernd.
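[To see what that "smallest byte offset" choice does in practice, here is a
standalone recomputation of my own; the formula is the one from the patch
hunk, and the values are the ones from the 20150304-1.c discussion later in
the thread: bitnum = 8, bitsize = 31, SImode = 32 bits.]

  #include <stdio.h>

  #define BITS_PER_UNIT 8

  int
  main (void)
  {
    unsigned bitnum = 8, bitsize = 31, modesize = 32;
    if (bitnum + bitsize > modesize)   /* field overhangs the word */
      {
        /* Smallest byte offset that makes [bitnum, bitnum+bitsize) fit
           into a single modesize word.  */
        unsigned offset = (bitnum + bitsize + BITS_PER_UNIT - 1 - modesize)
                          / BITS_PER_UNIT;
        bitnum -= offset * BITS_PER_UNIT;
        printf ("offset = %u byte(s), new bitnum = %u\n", offset, bitnum);
        /* Prints "offset = 1 byte(s), new bitnum = 0": one unaligned
           SImode access at byte offset 1 covers the whole 31-bit field.  */
      }
    return 0;
  }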
Re: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields
On Fri, Mar 6, 2015 at 12:48 PM, Bernd Edlinger <bernd.edlin...@hotmail.de> wrote:
> Hi Richard,
>
> here is my new proposal, it addresses your objections and generates better
> code for this test case:
>
> main:
> .LFB0:
>         .cfi_startproc
>         pushq   %rbp
>         .cfi_def_cfa_offset 16
>         .cfi_offset 6, -16
>         movq    %rsp, %rbp
>         .cfi_def_cfa_register 6
>         movl    global+1(%rip), %eax
>         orl     $2147483647, %eax
>         movl    %eax, global+1(%rip)
>         movl    global+1(%rip), %eax
>         andl    $2147483647, %eax
>         cmpl    $2147483647, %eax
>         je      .L2
>         call    abort
> .L2:
>         movl    $0, %eax
>         popq    %rbp
>         .cfi_def_cfa 7, 8
>         ret
>         .cfi_endproc
>
> I also tried to fix the comments.
>
> Reg-tested on x86_64 successfully and ARM is still running.
>
> Is it OK for trunk?

Looks ok to me apart from

   /* Check for cases of unaligned fields that must be split.  */
-  if (bitnum % BITS_PER_UNIT + bitsize > modesize
-      || (STRICT_ALIGNMENT
-          && bitnum % GET_MODE_ALIGNMENT (fieldmode) + bitsize > modesize))
+  if (bitnum % (STRICT_ALIGNMENT ? modesize : BITS_PER_UNIT)
+      + bitsize > modesize
+      || (STRICT_ALIGNMENT && MEM_ALIGN (op0) < modesize))
     return false;

where I'd use GET_MODE_ALIGNMENT (fieldmode) rather than modesize (in both
places).  Please leave Eric the chance to comment.

Thanks,
Richard.

> Thanks
> Bernd.
RE: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields
Hi,

On Thu, 5 Mar 2015 16:36:48, Richard Biener wrote:
> On Thu, Mar 5, 2015 at 4:05 PM, Bernd Edlinger <bernd.edlin...@hotmail.de> wrote:
>> every access is split in 4 QImode accesses.  but that is as expected,
>> because the structure is byte aligned.
>
> No, it is not expected because the CPU can handle unaligned SImode
> reads/writes just fine (even if not as an atomic operation).  The C++
> memory model allows an SImode read to s.y (-fstrict-volatile-bitfields
> would, as well, but the CPU doesn't guarantee atomic operation here)

Hmm, well.  I understand.  However this is the normal way how the
non-strict-volatile-bitfields work.  But I can probably work out a patch that
enables the strict-volatile-bitfields to generate un-aligned SImode accesses
when necessary on !STRICT_ALIGNMENT targets.

Now, I checked again the ARM port, and I am a bit surprised, because with gcc
4.9.0 this structure gets 4 QImode accesses, which is correct, but with
recent trunk (gcc version 5.0.0 20150301) I get SImode accesses, although the
structure is not aligned and the compiler can't possibly know how the memory
will be aligned.  Something must have changed in the meantime, but it wasn't
by me.  IIRC the field mode in this example was QImode but now it seems to be
SImode.

struct s
{
  unsigned int y:31;
} __attribute__((packed));

int test (volatile struct s* x)
{
  x->y=0x7FFF;
  return x->y;
}

So what would you think of this change at strict_volatile_bitfield_p?

diff -up expmed.c.jj expmed.c
--- expmed.c.jj 2015-01-16 11:20:40.0 +0100
+++ expmed.c    2015-03-06 10:07:14.362383274 +0100
@@ -472,9 +472,9 @@ strict_volatile_bitfield_p (rtx op0, uns
     return false;
 
   /* Check for cases of unaligned fields that must be split.  */
-  if (bitnum % BITS_PER_UNIT + bitsize > modesize
-      || (STRICT_ALIGNMENT
-          && bitnum % GET_MODE_ALIGNMENT (fieldmode) + bitsize > modesize))
+  if (bitnum % (STRICT_ALIGNMENT ? modesize : BITS_PER_UNIT)
+      + bitsize > modesize
+      || (STRICT_ALIGNMENT && MEM_ALIGN (op0) < modesize))
     return false;
 
   /* Check for cases where the C++ memory model applies.  */

Of course this is incomplete, and needs special handling for
!STRICT_ALIGNMENT in the strict-volatile-bitfields code path later.

Thanks
Bernd.
RE: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields
Hi Richard,

here is my new proposal, it addresses your objections and generates better
code for this test case:

main:
.LFB0:
        .cfi_startproc
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset 6, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register 6
        movl    global+1(%rip), %eax
        orl     $2147483647, %eax
        movl    %eax, global+1(%rip)
        movl    global+1(%rip), %eax
        andl    $2147483647, %eax
        cmpl    $2147483647, %eax
        je      .L2
        call    abort
.L2:
        movl    $0, %eax
        popq    %rbp
        .cfi_def_cfa 7, 8
        ret
        .cfi_endproc

I also tried to fix the comments.

Reg-tested on x86_64 successfully and ARM is still running.

Is it OK for trunk?

Thanks
Bernd.

gcc:
2015-03-06  Bernd Edlinger  <bernd.edlin...@hotmail.de>

        * expmed.c (strict_volatile_bitfield_p): For STRICT_ALIGNMENT
        check that MEM_ALIGN (op0) allows a MODESIZE access.
        (store_bit_field, extract_bit_field): For !STRICT_ALIGNMENT
        explicitly generate an unaligned access if the field crosses a
        word boundary.

testsuite:
2015-03-06  Bernd Edlinger  <bernd.edlin...@hotmail.de>

        * gcc.dg/20150306-1.c: New test.

patch-volatile-bitfields-1.diff
Description: Binary data
Re: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields
On Wed, Mar 4, 2015 at 3:13 PM, Bernd Edlinger <bernd.edlin...@hotmail.de> wrote:
> bounced... again, without html.
>
> Hi Richard,
>
> while working on another bug in the area of -fstrict-volatile-bitfields I
> became aware of another example where -fstrict-volatile-bitfields may
> generate wrong code.  This is reproducible on a !STRICT_ALIGNMENT target
> like x86_64.  The problem is that strict_volatile_bitfield_p tries to allow
> more than necessary if !STRICT_ALIGNMENT.  Everything works OK on ARM for
> instance.
>
> If this function returns true, we may later call narrow_bit_field_mem, and
> the check in strict_volatile_bitfield_p should mirror the logic there:
> narrow_bit_field_mem just uses GET_MODE_BITSIZE (mode) and does not care
> about STRICT_ALIGNMENT, and in the end *new_bitnum + bitsize may reach
> beyond the end of the region.  This causes store_fixed_bit_field_1 to
> silently fail to generate correct code.

Hmm, but the comment sounds like using GET_MODE_ALIGNMENT would be more
correct (even for !strict-alignment) - if mode is SImode and mode alignment
is 16 (HImode aligned) then we don't need to split the load if bitnum is 16
and bitsize is 32.  So maybe narrow_bit_field_mem needs to be fixed as well?

Thanks,
Richard.

> The attached patch was boot-strapped and regression-tested on
> x86_64-linux-gnu.
>
> OK for trunk and 4.9?
>
> Thanks
> Bernd.
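[Spelling out Richard's example with numbers - my own arithmetic, using the
same convention as strict_volatile_bitfield_p: SImode is 32 bits wide but
only 16-bit aligned, and the field sits at bitnum = 16 with bitsize = 32,
i.e. at byte offset 2, which is a properly aligned SImode position on such a
target:]

  bitnum % GET_MODE_ALIGNMENT (SImode) + bitsize = 16 % 16 + 32 = 32
  32 <= GET_MODE_BITSIZE (SImode) = 32      /* fits: no split needed */

whereas taking the remainder modulo GET_MODE_BITSIZE instead
(16 % 32 + 32 = 48 > 32) would force an unnecessary split.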
Re: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields
On Thu, Mar 5, 2015 at 4:05 PM, Bernd Edlinger <bernd.edlin...@hotmail.de> wrote:
> Hi,
>
> On Thu, 5 Mar 2015 12:24:56, Richard Biener wrote:
>> On Thu, Mar 5, 2015 at 12:00 PM, Bernd Edlinger <bernd.edlin...@hotmail.de> wrote:
>>> Hi,
>>>
>>> On Thu, 5 Mar 2015 10:00:26, Richard Biener wrote:
>>>> On Wed, Mar 4, 2015 at 3:13 PM, Bernd Edlinger
>>>
>>> Maybe one thing is missing from strict_volatile_bitfield_p, I am not
>>> sure.  Maybe it should check that
>>> MEM_ALIGN (op0) >= GET_MODE_ALIGNMENT (fieldmode), because the strict
>>> volatile bitfields handling will inevitably try to use the fieldmode to
>>> access the memory.
>>>
>>> Or would it be better to say
>>> MEM_ALIGN (op0) >= GET_MODE_BITSIZE (fieldmode), because it is easier to
>>> explain when some one asks when we guarantee the semantics of strict
>>> volatile bitfield?
>>
>> But on non-strict-align targets you can even for 1-byte aligned MEMs
>> access an SImode field directly.  So the old code looks correct to me
>> here and the fix needs to be done somewhere else.
>
> But this SImode access is split up in several QImode or HImode accesses,
> in the processor's execution pipeline; finally on an external bus like AXI
> all memory transactions are aligned.  The difference is just that some
> processors can split up the unaligned accesses and some need help from the
> compiler, but even on an x86 we have a different semantics for unaligned
> accesses, that is they are no longer atomic, while an aligned access is
> always executed as an atomic transaction on an x86 processor.
>
>>> Probably there is already something in the logic in expr.c that prevents
>>> these cases, because otherwise it would be way too easy to find an
>>> example for unaligned accesses to unaligned memory on STRICT_ALIGNMENT
>>> targets.
>>>
>>> Ok, what would you think about this variant?
>>>
>>> --- expmed.c.jj 2015-01-16 11:20:40.0 +0100
>>> +++ expmed.c    2015-03-05 11:50:09.40016 +0100
>>> @@ -472,9 +472,11 @@ strict_volatile_bitfield_p (rtx op0, uns
>>>      return false;
>>>
>>>    /* Check for cases of unaligned fields that must be split.  */
>>> -  if (bitnum % BITS_PER_UNIT + bitsize > modesize
>>> -      || (STRICT_ALIGNMENT
>>> -          && bitnum % GET_MODE_ALIGNMENT (fieldmode) + bitsize > modesize))
>>> +  if (bitnum % modesize + bitsize > modesize)
>>> +    return false;
>>> +
>>> +  /* Check that the memory is sufficiently aligned.  */
>>> +  if (MEM_ALIGN (op0) < modesize)
>>
>> I think that only applies to strict-align targets and then only for
>> GET_MODE_ALIGNMENT (modesize).  And of course what matters is the
>> alignment at bitnum - even though op0 may be not sufficiently aligned it
>> may have known misalignment so that op0 + bitnum is sufficiently aligned.
>>
>> Testcases would need to annotate structs with packed/aligned attributes
>> to get at these cases.
>>
>> For the testcase included in the patch, what does the patch end up doing?
>> Not going the strict-volatile bitfield expansion path?  That looks
>> unnecessary on !strict-alignment targets but reasonable on strict-align
>> targets where the access would need to be split.  So, why does it end up
>> being split on !strict-align targets?
>
> gcc -fstrict-volatile-bitfields -S 20150304-1.c
>
> without patch we find this in 20150304-1.s:
>
> main:
> .LFB0:
>         .cfi_startproc
>         pushq   %rbp
>         .cfi_def_cfa_offset 16
>         .cfi_offset 6, -16
>         movq    %rsp, %rbp
>         .cfi_def_cfa_register 6
>         movzbl  global+1(%rip), %eax
>         orl     $-1, %eax
>         movb    %al, global+1(%rip)
>         movzbl  global+2(%rip), %eax
>         orl     $-1, %eax
>         movb    %al, global+2(%rip)
>         movzbl  global+3(%rip), %eax
>         orl     $-1, %eax
>         movb    %al, global+3(%rip)
>         movzbl  global+4(%rip), %eax
>         orl     $127, %eax
>         movb    %al, global+4(%rip)
>         movl    global(%rip), %eax
>         shrl    $8, %eax
>         andl    $2147483647, %eax
>         cmpl    $2147483647, %eax
>         je      .L2
>         call    abort
> .L2:
>         movl    $0, %eax
>         popq    %rbp
>         .cfi_def_cfa 7, 8
>         ret
>         .cfi_endproc
>
> so the write path is OK, because strict_volatile_bitfield_p returns false,
> because
>
>   /* Check for cases where the C++ memory model applies.  */
>   if (bitregion_end != 0
>       && (bitnum - bitnum % modesize < bitregion_start
>           || bitnum - bitnum % modesize + modesize - 1 > bitregion_end))
>     return false;
>
> bitregion_start = 8, bitregion_end = 39
> bitnum - bitnum % modesize = 0
> bitnum - bitnum % modesize + modesize - 1 = 31
>
> This does not happen in the read path, and the problem is the access here
> does not work:
>
>   str_rtx = narrow_bit_field_mem (str_rtx, fieldmode, bitsize, bitnum,
>                                   &bitnum);
>
>   /* Explicitly override the C/C++ memory model; ignore the bit range so
>      that we can do the access in the mode mandated by
>      -fstrict-volatile-bitfields instead.  */
>   store_fixed_bit_field_1 (str_rtx, bitsize, bitnum, value);
>
> str_rtx = unchanged, bitnum = 8, bitsize = 31, but
Re: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields
On Thu, Mar 5, 2015 at 12:00 PM, Bernd Edlinger <bernd.edlin...@hotmail.de> wrote:
> Hi,
>
> On Thu, 5 Mar 2015 10:00:26, Richard Biener wrote:
>> On Wed, Mar 4, 2015 at 3:13 PM, Bernd Edlinger <bernd.edlin...@hotmail.de> wrote:
>>> bounced... again, without html.
>>>
>>> Hi Richard,
>>>
>>> while working on another bug in the area of -fstrict-volatile-bitfields
>>> I became aware of another example where -fstrict-volatile-bitfields may
>>> generate wrong code.  This is reproducible on a !STRICT_ALIGNMENT target
>>> like x86_64.  The problem is that strict_volatile_bitfield_p tries to
>>> allow more than necessary if !STRICT_ALIGNMENT.  Everything works OK on
>>> ARM for instance.
>>>
>>> If this function returns true, we may later call narrow_bit_field_mem,
>>> and the check in strict_volatile_bitfield_p should mirror the logic
>>> there: narrow_bit_field_mem just uses GET_MODE_BITSIZE (mode) and does
>>> not care about STRICT_ALIGNMENT, and in the end *new_bitnum + bitsize
>>> may reach beyond the end of the region.  This causes
>>> store_fixed_bit_field_1 to silently fail to generate correct code.
>>
>> Hmm, but the comment sounds like using GET_MODE_ALIGNMENT would be more
>> correct (even for !strict-alignment) - if mode is SImode and mode
>> alignment is 16 (HImode aligned) then we don't need to split the load if
>> bitnum is 16 and bitsize is 32.  So maybe narrow_bit_field_mem needs to
>> be fixed as well?
>
> I'd rather not touch that function.
>
> In the whole expmed.c, the only place where GET_MODE_ALIGNMENT is used is
> in simple_mem_bitfield_p, but only if SLOW_UNALIGNED_ACCESS returns 1,
> which is only the case on few targets.
>
> Do you know any targets where GET_MODE_ALIGNMENT may be less than
> GET_MODE_BITSIZE?

DImode on i?86, I suppose any mode on targets like AVR.

> Maybe one thing is missing from strict_volatile_bitfield_p, I am not sure.
> Maybe it should check that MEM_ALIGN (op0) >= GET_MODE_ALIGNMENT
> (fieldmode), because the strict volatile bitfields handling will
> inevitably try to use the fieldmode to access the memory.
>
> Or would it be better to say MEM_ALIGN (op0) >= GET_MODE_BITSIZE
> (fieldmode), because it is easier to explain when some one asks when we
> guarantee the semantics of strict volatile bitfield?

But on non-strict-align targets you can even for 1-byte aligned MEMs access
an SImode field directly.  So the old code looks correct to me here and the
fix needs to be done somewhere else.

> Probably there is already something in the logic in expr.c that prevents
> these cases, because otherwise it would be way too easy to find an example
> for unaligned accesses to unaligned memory on STRICT_ALIGNMENT targets.
>
> Ok, what would you think about this variant?
>
> --- expmed.c.jj 2015-01-16 11:20:40.0 +0100
> +++ expmed.c    2015-03-05 11:50:09.40016 +0100
> @@ -472,9 +472,11 @@ strict_volatile_bitfield_p (rtx op0, uns
>      return false;
>
>    /* Check for cases of unaligned fields that must be split.  */
> -  if (bitnum % BITS_PER_UNIT + bitsize > modesize
> -      || (STRICT_ALIGNMENT
> -          && bitnum % GET_MODE_ALIGNMENT (fieldmode) + bitsize > modesize))
> +  if (bitnum % modesize + bitsize > modesize)
> +    return false;
> +
> +  /* Check that the memory is sufficiently aligned.  */
> +  if (MEM_ALIGN (op0) < modesize)

I think that only applies to strict-align targets and then only for
GET_MODE_ALIGNMENT (modesize).  And of course what matters is the alignment
at bitnum - even though op0 may be not sufficiently aligned it may have known
misalignment so that op0 + bitnum is sufficiently aligned.

Testcases would need to annotate structs with packed/aligned attributes to
get at these cases.

For the testcase included in the patch, what does the patch end up doing?
Not going the strict-volatile bitfield expansion path?  That looks
unnecessary on !strict-alignment targets but reasonable on strict-align
targets where the access would need to be split.  So, why does it end up
being split on !strict-align targets?

Richard.

>      return false;
>
>    /* Check for cases where the C++ memory model applies.  */
>
> Trying to use an atomic access to a device register is pointless if the
> memory context is not aligned to the MODE_BITSIZE; that has nothing to do
> with MODE_ALIGNMENT, right?
>
> Thanks
> Bernd.
RE: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields
Hi,

On Thu, 5 Mar 2015 10:00:26, Richard Biener wrote:
> On Wed, Mar 4, 2015 at 3:13 PM, Bernd Edlinger <bernd.edlin...@hotmail.de> wrote:
>> bounced... again, without html.
>>
>> Hi Richard,
>>
>> while working on another bug in the area of -fstrict-volatile-bitfields I
>> became aware of another example where -fstrict-volatile-bitfields may
>> generate wrong code.  This is reproducible on a !STRICT_ALIGNMENT target
>> like x86_64.  The problem is that strict_volatile_bitfield_p tries to
>> allow more than necessary if !STRICT_ALIGNMENT.  Everything works OK on
>> ARM for instance.
>>
>> If this function returns true, we may later call narrow_bit_field_mem,
>> and the check in strict_volatile_bitfield_p should mirror the logic
>> there: narrow_bit_field_mem just uses GET_MODE_BITSIZE (mode) and does
>> not care about STRICT_ALIGNMENT, and in the end *new_bitnum + bitsize may
>> reach beyond the end of the region.  This causes store_fixed_bit_field_1
>> to silently fail to generate correct code.
>
> Hmm, but the comment sounds like using GET_MODE_ALIGNMENT would be more
> correct (even for !strict-alignment) - if mode is SImode and mode alignment
> is 16 (HImode aligned) then we don't need to split the load if bitnum is 16
> and bitsize is 32.  So maybe narrow_bit_field_mem needs to be fixed as
> well?

I'd rather not touch that function.

In the whole expmed.c, the only place where GET_MODE_ALIGNMENT is used is in
simple_mem_bitfield_p, but only if SLOW_UNALIGNED_ACCESS returns 1, which is
only the case on few targets.

Do you know any targets where GET_MODE_ALIGNMENT may be less than
GET_MODE_BITSIZE?

Maybe one thing is missing from strict_volatile_bitfield_p, I am not sure.
Maybe it should check that MEM_ALIGN (op0) >= GET_MODE_ALIGNMENT (fieldmode),
because the strict volatile bitfields handling will inevitably try to use the
fieldmode to access the memory.

Or would it be better to say MEM_ALIGN (op0) >= GET_MODE_BITSIZE (fieldmode),
because it is easier to explain when some one asks when we guarantee the
semantics of strict volatile bitfield?

Probably there is already something in the logic in expr.c that prevents
these cases, because otherwise it would be way too easy to find an example
for unaligned accesses to unaligned memory on STRICT_ALIGNMENT targets.

Ok, what would you think about this variant?

--- expmed.c.jj 2015-01-16 11:20:40.0 +0100
+++ expmed.c    2015-03-05 11:50:09.40016 +0100
@@ -472,9 +472,11 @@ strict_volatile_bitfield_p (rtx op0, uns
     return false;
 
   /* Check for cases of unaligned fields that must be split.  */
-  if (bitnum % BITS_PER_UNIT + bitsize > modesize
-      || (STRICT_ALIGNMENT
-          && bitnum % GET_MODE_ALIGNMENT (fieldmode) + bitsize > modesize))
+  if (bitnum % modesize + bitsize > modesize)
+    return false;
+
+  /* Check that the memory is sufficiently aligned.  */
+  if (MEM_ALIGN (op0) < modesize)
     return false;
 
   /* Check for cases where the C++ memory model applies.  */

Trying to use an atomic access to a device register is pointless if the
memory context is not aligned to the MODE_BITSIZE; that has nothing to do
with MODE_ALIGNMENT, right?

Thanks
Bernd.
RE: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields
Hi,

On Thu, 5 Mar 2015 12:24:56, Richard Biener wrote:
> On Thu, Mar 5, 2015 at 12:00 PM, Bernd Edlinger <bernd.edlin...@hotmail.de> wrote:
>> Hi,
>>
>> On Thu, 5 Mar 2015 10:00:26, Richard Biener wrote:
>>> On Wed, Mar 4, 2015 at 3:13 PM, Bernd Edlinger
>>
>> Maybe one thing is missing from strict_volatile_bitfield_p, I am not
>> sure.  Maybe it should check that MEM_ALIGN (op0) >= GET_MODE_ALIGNMENT
>> (fieldmode), because the strict volatile bitfields handling will
>> inevitably try to use the fieldmode to access the memory.
>>
>> Or would it be better to say MEM_ALIGN (op0) >= GET_MODE_BITSIZE
>> (fieldmode), because it is easier to explain when some one asks when we
>> guarantee the semantics of strict volatile bitfield?
>
> But on non-strict-align targets you can even for 1-byte aligned MEMs
> access an SImode field directly.  So the old code looks correct to me here
> and the fix needs to be done somewhere else.

But this SImode access is split up in several QImode or HImode accesses, in
the processor's execution pipeline; finally on an external bus like AXI all
memory transactions are aligned.  The difference is just that some processors
can split up the unaligned accesses and some need help from the compiler, but
even on an x86 we have a different semantics for unaligned accesses, that is
they are no longer atomic, while an aligned access is always executed as an
atomic transaction on an x86 processor.

>> Probably there is already something in the logic in expr.c that prevents
>> these cases, because otherwise it would be way too easy to find an
>> example for unaligned accesses to unaligned memory on STRICT_ALIGNMENT
>> targets.
>>
>> Ok, what would you think about this variant?
>>
>> --- expmed.c.jj 2015-01-16 11:20:40.0 +0100
>> +++ expmed.c    2015-03-05 11:50:09.40016 +0100
>> @@ -472,9 +472,11 @@ strict_volatile_bitfield_p (rtx op0, uns
>>      return false;
>>
>>    /* Check for cases of unaligned fields that must be split.  */
>> -  if (bitnum % BITS_PER_UNIT + bitsize > modesize
>> -      || (STRICT_ALIGNMENT
>> -          && bitnum % GET_MODE_ALIGNMENT (fieldmode) + bitsize > modesize))
>> +  if (bitnum % modesize + bitsize > modesize)
>> +    return false;
>> +
>> +  /* Check that the memory is sufficiently aligned.  */
>> +  if (MEM_ALIGN (op0) < modesize)
>
> I think that only applies to strict-align targets and then only for
> GET_MODE_ALIGNMENT (modesize).  And of course what matters is the
> alignment at bitnum - even though op0 may be not sufficiently aligned it
> may have known misalignment so that op0 + bitnum is sufficiently aligned.
>
> Testcases would need to annotate structs with packed/aligned attributes to
> get at these cases.
>
> For the testcase included in the patch, what does the patch end up doing?
> Not going the strict-volatile bitfield expansion path?  That looks
> unnecessary on !strict-alignment targets but reasonable on strict-align
> targets where the access would need to be split.  So, why does it end up
> being split on !strict-align targets?

gcc -fstrict-volatile-bitfields -S 20150304-1.c

without patch we find this in 20150304-1.s:

main:
.LFB0:
        .cfi_startproc
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset 6, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register 6
        movzbl  global+1(%rip), %eax
        orl     $-1, %eax
        movb    %al, global+1(%rip)
        movzbl  global+2(%rip), %eax
        orl     $-1, %eax
        movb    %al, global+2(%rip)
        movzbl  global+3(%rip), %eax
        orl     $-1, %eax
        movb    %al, global+3(%rip)
        movzbl  global+4(%rip), %eax
        orl     $127, %eax
        movb    %al, global+4(%rip)
        movl    global(%rip), %eax
        shrl    $8, %eax
        andl    $2147483647, %eax
        cmpl    $2147483647, %eax
        je      .L2
        call    abort
.L2:
        movl    $0, %eax
        popq    %rbp
        .cfi_def_cfa 7, 8
        ret
        .cfi_endproc

so the write path is OK, because strict_volatile_bitfield_p returns false,
because

  /* Check for cases where the C++ memory model applies.  */
  if (bitregion_end != 0
      && (bitnum - bitnum % modesize < bitregion_start
          || bitnum - bitnum % modesize + modesize - 1 > bitregion_end))
    return false;

bitregion_start = 8, bitregion_end = 39
bitnum - bitnum % modesize = 0
bitnum - bitnum % modesize + modesize - 1 = 31

This does not happen in the read path, and the problem is the access here
does not work:

  str_rtx = narrow_bit_field_mem (str_rtx, fieldmode, bitsize, bitnum,
                                  &bitnum);

  /* Explicitly override the C/C++ memory model; ignore the bit range so
     that we can do the access in the mode mandated by
     -fstrict-volatile-bitfields instead.  */
  store_fixed_bit_field_1 (str_rtx, bitsize, bitnum, value);

str_rtx = unchanged, bitnum = 8, bitsize = 31, but store_fixed_bit_field_1
can not handle that.

BTW: I can make the write code path also fail if I change this bit in the
test case add :8 to char
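[A standalone recomputation of the write-path numbers quoted above - my own
sketch; the constants come from the mail: modesize = 32, bitnum = 8, and a
bit region of 8..39.]

  /* Evaluates the "C++ memory model" guard from strict_volatile_bitfield_p
     for the write path of the 20150304-1.c case.  */
  #include <stdio.h>

  int
  main (void)
  {
    unsigned bitnum = 8, modesize = 32;
    unsigned bitregion_start = 8, bitregion_end = 39;
    unsigned lo = bitnum - bitnum % modesize;   /* 0  */
    unsigned hi = lo + modesize - 1;            /* 31 */
    int split = lo < bitregion_start || hi > bitregion_end;
    printf ("lo=%u hi=%u -> %s\n", lo, hi,
            split ? "return false (write path OK)" : "strict path taken");
    /* lo = 0 < bitregion_start = 8, so the guard fires and the write
       path falls back to the C++ memory model expansion; the read path
       has bitregion_end == 0, so the guard is skipped there.  */
    return 0;
  }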
RE: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields
bounced... again, without html.

Hi Richard,

while working on another bug in the area of -fstrict-volatile-bitfields I
became aware of another example where -fstrict-volatile-bitfields may
generate wrong code.  This is reproducible on a !STRICT_ALIGNMENT target like
x86_64.  The problem is that strict_volatile_bitfield_p tries to allow more
than necessary if !STRICT_ALIGNMENT.  Everything works OK on ARM for
instance.

If this function returns true, we may later call narrow_bit_field_mem, and
the check in strict_volatile_bitfield_p should mirror the logic there:
narrow_bit_field_mem just uses GET_MODE_BITSIZE (mode) and does not care
about STRICT_ALIGNMENT, and in the end *new_bitnum + bitsize may reach beyond
the end of the region.  This causes store_fixed_bit_field_1 to silently fail
to generate correct code.

The attached patch was boot-strapped and regression-tested on
x86_64-linux-gnu.

OK for trunk and 4.9?

Thanks
Bernd.

gcc:
2015-03-04  Bernd Edlinger  <bernd.edlin...@hotmail.de>

        * expmed.c (strict_volatile_bitfield_p): Don't return different
        results if !STRICT_ALIGNMENT.  Use GET_MODE_BITSIZE instead of
        GET_MODE_ALIGNMENT here.

testsuite:
2015-03-04  Bernd Edlinger  <bernd.edlin...@hotmail.de>

        * gcc.dg/20150304-1.c: New test.

patch-volatile-bitfields-1.diff
Description: Binary data
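[A hedged model of the address arithmetic being described - plain C of my
own, simplified from what narrow_bit_field_mem does for an integer mode of
`unit` bits; the variable names are mine, and the values are the ones from
the 20150304-1.c read path (bitnum = 8, bitsize = 31, SImode).]

  #include <stdio.h>

  int
  main (void)
  {
    unsigned bitnum = 8, bitsize = 31, unit = 32;
    /* The access is moved to the largest unit-aligned position at or
       below bitnum; the remainder becomes the new bit offset.  */
    unsigned new_bitnum = bitnum % unit;               /* 8 */
    unsigned byte_offset = (bitnum - new_bitnum) / 8;  /* 0 */
    printf ("offset=%u new_bitnum=%u end=%u\n",
            byte_offset, new_bitnum, new_bitnum + bitsize);
    /* end = 39 > 32: the narrowed reference cannot hold the whole field,
       which is exactly the overflow strict_volatile_bitfield_p has to
       rule out before taking the strict volatile path.  */
    return 0;
  }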