Re: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields

2015-03-30 Thread Richard Biener
On Mon, Mar 16, 2015 at 11:53 AM, Bernd Edlinger
bernd.edlin...@hotmail.de wrote:

 Hi,


 when looking at the m68k I realized the following, which is
 a general problem...

 If the alignment of the structure is less than sizeof(field), the
 strict volatile bitfields code may read beyond the end of the
 structure!

 Consider this example:

 struct s
 {
   char x : 8;
   volatile unsigned int y : 31;
   volatile unsigned int z : 1;
 } __attribute__((packed));

 struct s global;


 Here we have sizeof(struct s) = 5, alignment(global) == 1.
 However, when we access global.z we read a 32-bit word
 at offset 4, which touches 3 bytes that are not safe to use.

 Something like that never happens with -fno-strict-volatile-bitfields,
 because IIRC, with the only exception of the simple_mem_bitfield_p code path,
 there is never an access mode used which is larger than MEM_ALIGN(x).

Are you sure?  There is still PR36043 ...

 In this example, if I want to use the packed attribute,
 I also have to use the aligned(4) attribute; this satisfies the
 check MEM_ALIGN (op0) >= modesize, which is IMO always necessary
 for strict volatile bitfields, not only on STRICT_ALIGNMENT targets.

But your patch still somehow special-cases them.

 On a target that has BIGGEST_ALIGNMENT < BITS_PER_WORD,
 to use the strict volatile bitfields, you have to add the 
 __attribute__((aligned(4)))
 to the structure.

 I had to do that on the pr23623.c test case, to have it pass on m68k for
 instance.


 I have attached the updated patch.  As explained before, the check
 MEM_ALIGN (op0) >= modesize should always be done in
 strict_volatile_bitfield_p.

 For the targets that usually enable -fstrict-volatile-bitfields, nothing
 changes, except that when we use packed on the structure, we also need to
 add an aligned(4) attribute.  For m68k, where the natural alignment of any
 structure is <= 2, we need to force aligned(4) if we want to ensure the
 access is in SImode.

 Boot-strapped and reg-tested on x86_64-linux-gnu.
 OK for trunk?

So - shouldn't the check be

  if (MEM_ALIGN (op0) < GET_MODE_ALIGNMENT (fieldmode))
return false;

instead?  And looking at

   /* Check for cases of unaligned fields that must be split.  */
-  if (bitnum % BITS_PER_UNIT + bitsize > modesize
-      || (STRICT_ALIGNMENT
-          && bitnum % GET_MODE_ALIGNMENT (fieldmode) + bitsize > modesize))
+  if (bitnum % (STRICT_ALIGNMENT ? modesize : BITS_PER_UNIT)
+      + bitsize > modesize)
 return false;

I wonder what the required semantics are - are they not that we need to access
the whole underlying field with the access (the C++ memory model only
requires we don't go beyond the field)?  It seems that information isn't readily
available here though.  So the check checks that we can access the field
with a single access using fieldmode.  Which means (again),

  if (bitnum % (STRICT_ALIGNMENT ? GET_MODE_ALIGNMENT (fieldmode) : BITS_PER_UNIT)
      + bitsize > modesize)

Also this means that for !STRICT_ALIGNMENT platforms the MEM_ALIGN check isn't
sufficient which is what the other hunks in the patch are about to fix?

It seems at least the

@@ -988,6 +995,16 @@ store_bit_field (rtx str_rtx, unsigned HOST_WIDE_I

  str_rtx = narrow_bit_field_mem (str_rtx, fieldmode, bitsize, bitnum,
                                  &bitnum);
+ if (!STRICT_ALIGNMENT
+     && bitnum + bitsize > GET_MODE_BITSIZE (fieldmode))
+   {
+     unsigned HOST_WIDE_INT offset;
+     offset = (bitnum + bitsize + BITS_PER_UNIT - 1
+               - GET_MODE_BITSIZE (fieldmode)) / BITS_PER_UNIT;
+     str_rtx = adjust_bitfield_address (str_rtx, fieldmode, offset);
+     bitnum -= offset * BITS_PER_UNIT;
+   }
+ gcc_assert (bitnum + bitsize <= GET_MODE_BITSIZE (fieldmode));

hunks could do with a comment.
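
(To illustrate what such a comment could spell out -- the numbers below are
only a worked example for storing to global.y of the packed struct s above,
not text taken from the patch:)

  /* Illustrative only: on a !STRICT_ALIGNMENT target, after
     narrow_bit_field_mem we have bitnum = 8, bitsize = 31 and
     GET_MODE_BITSIZE (fieldmode) = 32, so bitnum + bitsize = 39 > 32 and a
     single SImode access at the current address would run past the word.
     The hunk computes
       offset = (8 + 31 + 8 - 1 - 32) / 8 = 1 byte,
     advances the address to global+1 and sets bitnum = 0, so that
     bitnum + bitsize = 31 <= 32 and one unaligned SImode access covers the
     whole field.  */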

That said, I fail to realize how the issue is specific to
strict-volatile bitfields?

I also hope somebody else will also look at the patch - I'm not feeling like the
appropriate person to review changes in this area (even if I did so in
the past).

Thanks,
Richard.


 Thanks
 Bernd.



RE: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields

2015-03-30 Thread Bernd Edlinger
Hi,

On Mon, 30 Mar 2015 12:33:47, Richard Biener wrote:

 On Mon, Mar 16, 2015 at 11:53 AM, Bernd Edlinger
 bernd.edlin...@hotmail.de wrote:

 Hi,


 when looking at the m68k I realized the following, which is
 a general problem...

 If the alignment of the structure is less than sizeof(field), the
 strict volatile bitfields code may read beyond the end of the
 structure!

 Consider this example:

 struct s
 {
 char x : 8;
 volatile unsigned int y : 31;
 volatile unsigned int z : 1;
 } __attribute__((packed));

 struct s global;


 Here we have sizeof(struct s) = 5, alignment(global) == 1.
 However, when we access global.z we read a 32-bit word
 at offset 4, which touches 3 bytes that are not safe to use.

 Something like that never happens with -fno-strict-volatile-bitfields,
 because IIRC, with the only exception of the simple_mem_bitfield_p code path,
 there is never an access mode used which is larger than MEM_ALIGN(x).

 Are you sure? There is still PR36043 ...


I meant the extract_bit_field/store_bit_field,
anything can happen if you use emit_move_insn directly,
but extract_bit_field should not do that.

 In this example, if I want to use the packed attribute,
 I also have to use the aligned(4) attribute, this satisfies the
 check MEM_ALIGN (op0) >= modesize, which is IMO always necessary
 for strict volatile bitfields, not only on STRICT_ALIGNMENT targets.

 But your patch still somehow special-cases them.

Yes. My original intention was to bypass the strict-volatile-bitfields code
path altogether, for everything that it cannot safely handle.

that would mean:

--- gcc/expmed.c    (revision 221427)
+++ gcc/expmed.c    (working copy)
@@ -472,11 +472,16 @@ strict_volatile_bitfield_p (rtx op0, unsigned HOST
 return false;

   /* Check for cases of unaligned fields that must be split.  */
-  if (bitnum % BITS_PER_UNIT + bitsize > modesize
-      || (STRICT_ALIGNMENT
-          && bitnum % GET_MODE_ALIGNMENT (fieldmode) + bitsize > modesize))
+  if (bitnum % modesize + bitsize > modesize)
     return false;

+  /* The memory must be sufficiently aligned for a MODESIZE access.
+     This condition guarantees, that the memory access will not
+     touch anything after the end of the structure.  */
+  if (MEM_ALIGN (op0) < modesize)
+    return false;
+
   /* Check for cases where the C++ memory model applies.  */
   if (bitregion_end != 0
       && (bitnum - bitnum % modesize < bitregion_start


and _removing_ that:

@@ -988,6 +995,16 @@ store_bit_field (rtx str_rtx, unsigned HOST_WIDE_I

  str_rtx = narrow_bit_field_mem (str_rtx, fieldmode, bitsize, bitnum,
                                  &bitnum);
+ if (!STRICT_ALIGNMENT
+     && bitnum + bitsize > GET_MODE_BITSIZE (fieldmode))
+   {
+     unsigned HOST_WIDE_INT offset;
+     offset = (bitnum + bitsize + BITS_PER_UNIT - 1
+               - GET_MODE_BITSIZE (fieldmode)) / BITS_PER_UNIT;
+     str_rtx = adjust_bitfield_address (str_rtx, fieldmode, offset);
+     bitnum -= offset * BITS_PER_UNIT;
+   }
+ gcc_assert (bitnum + bitsize <= GET_MODE_BITSIZE (fieldmode));
  temp = copy_to_reg (str_rtx);
  if (!store_bit_field_1 (temp, bitsize, bitnum, 0, 0,
  fieldmode, value, true))

and _removing_ that too:

    }

   str_rtx = narrow_bit_field_mem (str_rtx, mode1, bitsize, bitnum,
                                   &bitnum);
+  if (!STRICT_ALIGNMENT
+      && bitnum + bitsize > GET_MODE_BITSIZE (mode1))
+    {
+      unsigned HOST_WIDE_INT offset;
+      offset = (bitnum + bitsize + BITS_PER_UNIT - 1
+                - GET_MODE_BITSIZE (mode1)) / BITS_PER_UNIT;
+      str_rtx = adjust_bitfield_address (str_rtx, mode1, offset);
+      bitnum -= offset * BITS_PER_UNIT;
+    }
+  gcc_assert (bitnum + bitsize <= GET_MODE_BITSIZE (mode1));
   str_rtx = copy_to_reg (str_rtx);
 }


I would keep just the added assertions in the expansion path.

However, I added that to do what you requested in a previous e-mail:

 every access is split in 4 QImode accesses. but that is as
 expected, because the structure is byte aligned.
  
 No, it is not expected because the CPU can handle unaligned SImode
 reads/writes just fine (even if not as an atomic operation).
 The C++ memory model allows an SImode read to s.y (-fstrict-volatile-bitfields
 would, as well, but the CPU doesn't guarantee atomic operation here)
  
 Richard.

... but now I am not too happy with it either.
What we need for the device register accesses is just a predictable access
at a predictable offset.  And it had better be aligned to MODE_BITSIZE;
it is irrelevant for device registers what the CPU can access.

Furthermore, if we have a structure like { int x:16; int y:16; }
when MODE_ALIGNMENT(SImode)=16, it would be possible
to access y as SImode at offset 0 or at offset 2; both would be equally
possible.
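
(A small sketch of that ambiguity, assuming a target where
MODE_ALIGNMENT (SImode) is 16; the struct name is mine, only for
illustration:)

struct t
{
  int x : 16;
  int y : 16;
};

/* With MODE_ALIGNMENT (SImode) == 16, an SImode access for y would be
   sufficiently aligned whether it starts at byte offset 0 (covering x and y)
   or at byte offset 2; the offset of the access is not predictable, which is
   exactly what a device-register mapping cannot tolerate.  */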

The 

RE: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields

2015-03-18 Thread Bernd Edlinger
Hi,


On Mon, 16 Mar 2015 20:31:53, Hans-Peter Nilsson wrote:

 On Mon, 16 Mar 2015, Eric Botcazou wrote:
 If we have BIGGEST_ALIGNMENT=16 that means we have likely a 16 bit
 architecture. I doubt that the strict alignment code makes any sense for
 modesize > BIGGEST_ALIGNMENT.

 Note that m68k is a 32-bit port (UNITS_PER_WORD is 4) but, by definition of
 BIGGEST_ALIGNMENT, not honoring an alignment larger than it is perfectly OK.

 The context of the above statement is somewhat ambiguous: if the
 statement is taken to include no matter what is specified in
 the program, including use of __attribute__ ((__aligned__ ...))
 then I have to object. (I guess Eric, you didn't actually mean
 that, though.)

 The definition of BIGGEST_ALIGNMENT is (tm.texi.in):
 Biggest alignment that any data type can require on this
 machine, in bits. Note that this is not the biggest alignment
 that is supported, just the biggest alignment that, when
 violated, may cause a fault.

 So, IMNSHO we'd better *support* a *larger* alignment, as in if
 the code specified it by explicit means at least up to
 MAX_OFILE_ALIGNMENT. But, perfectly OK when the code didn't
 explicitly say anything else.


Absolutely.

The idea of this patch is to *require* the use of
__attribute__((__aligned__(x)))
for all structures that need the strict volatile bitfields semantics.
At least if the ABI has BIGGEST_ALIGNMENT < BITS_PER_WORD.

I don't see any alternative.


Thanks
Bernd.
  

Re: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields

2015-03-16 Thread Hans-Peter Nilsson
On Mon, 16 Mar 2015, Eric Botcazou wrote:
  If we have BIGGEST_ALIGNMENT=16 that means we have likely a 16 bit
  architecture. I doubt that the strict alignment code makes any sense for
  modesize > BIGGEST_ALIGNMENT.

 Note that m68k is a 32-bit port (UNITS_PER_WORD is 4) but, by definition of
 BIGGEST_ALIGNMENT, not honoring an alignment larger than it is perfectly OK.

The context of the above statement is somewhat ambiguous: if the
statement is taken to include no matter what is specified in
the program, including use of __attribute__ ((__aligned__ ...))
then I have to object.  (I guess Eric, you didn't actually mean
that, though.)

The definition of BIGGEST_ALIGNMENT is (tm.texi.in):
Biggest alignment that any data type can require on this
machine, in bits.  Note that this is not the biggest alignment
that is supported, just the biggest alignment that, when
violated, may cause a fault.

So, IMNSHO we'd better *support* a *larger* alignment, as in if
the code specified it by explicit means at least up to
MAX_OFILE_ALIGNMENT.  But, perfectly OK when the code didn't
explicitly say anything else.

BTW, unfortunately, BIGGEST_ALIGNMENT can't be blindly followed
for atomic accesses to undecorated (without explicit alignment
specifiers) code either.  Rather the minimum alignment is the
natural alignment *absolutely everywhere no matter what target*
which matters for targets where
 BIGGEST_ALIGNMENT < natural_alignment(all supported types)
(well, as long as the natural alignment is smaller than the
target page-size or cache-line, where at least one of the
concepts is usually applicable).
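
(A minimal sketch of what that means at the source level; the use of _Alignas
below is my own illustration of requesting the natural alignment explicitly,
not something taken from this thread:)

  #include <stdatomic.h>

  /* On a target whose ABI alignment for int is only 2 bytes, an object that
     is accessed atomically still wants its natural 4-byte alignment.  */
  static _Alignas (4) atomic_int counter;

  int
  read_counter (void)
  {
    return atomic_load (&counter);
  }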

Q: So why not adjust the BIGGEST_ALIGNMENT definition in such
targets to be at least the natural alignment of supported
atomic types?
A: Because that unfortunately has bad effects on generated code
for all accesses to all sizes now <= BIGGEST_ALIGNMENT.

brgds, H-P
PS. It's very unfortunate that there's now __BIGGEST_ALIGNMENT__
visible to programs.


Re: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields

2015-03-16 Thread Andreas Schwab
Hans-Peter Nilsson h...@bitrange.com writes:

 Q: So why not adjust the BIGGEST_ALIGNMENT definition in such
 targets to be at least the natural alignment of supported
 atomic types?

A: Because it's an ABI change.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
And now for something completely different.


Re: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields

2015-03-16 Thread Hans-Peter Nilsson
On Tue, 17 Mar 2015, Andreas Schwab wrote:
 Hans-Peter Nilsson h...@bitrange.com writes:

  Q: So why not adjust the BIGGEST_ALIGNMENT definition in such
  targets to be at least the natural alignment of supported
  atomic types?

 A: Because it's an ABI change.

I intended that to be included in bad effects;
A: Because that unfortunately has bad effects on generated code
for all accesses to all sizes now <= BIGGEST_ALIGNMENT.

but thanks for being specific (and I didn't remember exactly
*what* bad effects it was that I saw when I tried that long ago :)

brgds, H-P


RE: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields

2015-03-16 Thread Bernd Edlinger

Hi,


when looking at the m68k I realized the following, which is
a general problem...

If the alignment of the structure is less than sizeof(field), the
strict volatile bitfields code may read beyond the end of the
structure!

Consider this example:

struct s
{
  char x : 8;
  volatile unsigned int y : 31;
  volatile unsigned int z : 1;
} __attribute__((packed));

struct s global;


Here we have sizeof(struct s) = 5, alignment(global) == 1.
However, when we access global.z we read a 32-bit word
at offset 4, which touches 3 bytes that are not safe to use. 
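
(To make that concrete, a sketch of the packed layout; the byte offsets are
my own annotation, derived from the declaration above:)

  /* struct s, packed, 5 bytes:
       byte 0      : x (8 bits)
       bytes 1..4  : y (31 bits) followed by z (1 bit)
     With -fstrict-volatile-bitfields, z is accessed in SImode: a 4-byte read
     starting at byte offset 4, i.e. bytes 4..7, of which only byte 4 belongs
     to the object.  */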

Something like that never happens with -fno-strict-volatile-bitfields,
because IIRC, with the only exception of the simple_mem_bitfield_p code path,
there is never an access mode used which is larger than MEM_ALIGN(x).

In this example, if I want to use the packed attribute,
I also have to use the aligned(4) attribute; this satisfies the
check MEM_ALIGN (op0) >= modesize, which is IMO always necessary
for strict volatile bitfields, not only on STRICT_ALIGNMENT targets.

On a target that has BIGGEST_ALIGNMENT < BITS_PER_WORD,
to use the strict volatile bitfields, you have to add the 
__attribute__((aligned(4)))
to the structure.
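
(A minimal sketch of the required annotation; this exact combination is only
an illustration of the rule, not a line taken from the patch or the
testsuite:)

struct s
{
  char x : 8;
  volatile unsigned int y : 31;
  volatile unsigned int z : 1;
} __attribute__((packed, aligned(4)));

/* sizeof(struct s) is now 8 and every instance is 4-byte aligned, so the
   SImode accesses generated for y and z stay within the object.  */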

I had to do that on the pr23623.c test case, to have it pass on m68k for
instance.


I have attached the updated patch.  As explained before, the check
MEM_ALIGN (op0) >= modesize should always be done in strict_volatile_bitfield_p.

For the targets that usually enable -fstrict-volatile-bitfields, nothing
changes, except that when we use packed on the structure, we also need to add
an aligned(4) attribute.  For m68k, where the natural alignment of any
structure is <= 2, we need to force aligned(4) if we want to ensure the
access is in SImode.

Boot-strapped and reg-tested on x86_64-linux-gnu.
OK for trunk?


Thanks
Bernd.
  gcc:
2015-03-15  Bernd Edlinger  bernd.edlin...@hotmail.de

* expmed.c (strict_volatile_bitfield_p): Check that MEM_ALIGN allows
a MODESIZE access.
(store_bit_field, extract_bit_field): For !STRICT_ALIGNMENT explicitly
generate an unaligned access if the field crosses a word boundary.

testsuite:
2015-03-15  Bernd Edlinger  bernd.edlin...@hotmail.de

* gcc.dg/pr23623.c: Added aligned attribute.
* gcc.dg/20141029-1.c: Likewise.
* gcc.dg/20150306-1.c: New test.


patch-volatile-bitfields-1.diff
Description: Binary data


Re: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields

2015-03-16 Thread Eric Botcazou
 If we have BIGGEST_ALIGNMENT=16 that means we have likely a 16 bit
 architecture. I doubt that the strict alignment code makes any sense for
 modesize > BIGGEST_ALIGNMENT.

Note that m68k is a 32-bit port (UNITS_PER_WORD is 4) but, by definition of 
BIGGEST_ALIGNMENT, not honoring an alignment larger than it is perfectly OK.

-- 
Eric Botcazou


RE: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields

2015-03-14 Thread Bernd Edlinger

Hi,

On Sat, 14 Mar 2015 13:24:33, Mikael Pettersson wrote:

 Bernd Edlinger writes:
 Hi,

 are there any more comments on this?

 I would like to apply the patch as is, unless we find a
 a way to get to a test case, maybe with a cross-compiler,
 where the MODE_ALIGNMENT is different from MODE_BITSIZE.

 Currently, I think that does not happen.

 On m68k-linux GET_MODE_ALIGNMENT (SImode) == 16 while
 GET_MODE_BITSIZE (SImode) == 32.

 I don't know what that means for your patch, just wanted
 to inform you that such targets do exist.


Oh I see, thanks.

This is due to BIGGEST_ALIGNMENT=16, STRICT_ALIGNMENT=1 by default on
that architecture.

If I change this check as suggested:

  if (bitnum % (STRICT_ALIGNMENT ? GET_MODE_ALIGNMENT (fieldmode) : BITS_PER_UNIT)
      + bitsize > modesize
      || (STRICT_ALIGNMENT && MEM_ALIGN (op0) < GET_MODE_ALIGNMENT (fieldmode)))
    return false;


Then I can get the assertion in store_bit_field to fail:
  gcc_assert (bitnum + bitsize <= GET_MODE_BITSIZE (fieldmode));

With this example:

cat test.c
struct s
{
  short y:16;
  int x:31;
};

void
f(volatile struct s* z, int x)
{
  z->x=x;
}

m68k-elf-gcc -fstrict-volatile-bitfields -mstrict-align -mno-align-int -O2 -S test.c
test.c: In function 'f':
test.c:10:7: internal compiler error: in store_bit_field, at expmed.c:1005
   z->x=x;
   ^

what I said before...

without the patch the test case generates
just invalid code which assigns only 16 bits.

There is also a problem in this check.  I had to make
short y:16 a bit field to bypass that; initially I wrote short y;

  /* Check for cases where the C++ memory model applies.  */
  if (bitregion_end != 0
      && (bitnum - bitnum % modesize < bitregion_start
          || bitnum - bitnum % modesize + modesize - 1 > bitregion_end))
    return false;

This also assumes that the access is at a modesize boundary.

If we have BIGGEST_ALIGNMENT=16 that means we have likely a 16 bit
architecture. I doubt that the strict alignment code makes any sense for
modesize > BIGGEST_ALIGNMENT.

I think I should change this check

  /* The bit size must not be larger than the field mode, and
 the field mode must not be larger than a word.  */
  if (bitsize > modesize || modesize > BITS_PER_WORD)
    return false;

to this:

  if (bitsize > modesize || modesize > BITS_PER_WORD
      || modesize > BIGGEST_ALIGNMENT)
    return false;
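
(As a sanity check of that condition, a small self-contained sketch with the
m68k numbers discussed in this thread -- BIGGEST_ALIGNMENT == 16,
BITS_PER_WORD == 32 and an SImode bit-field access; the program and its
constants are mine, not part of the patch:)

  #include <stdio.h>

  int
  main (void)
  {
    unsigned bitsize = 31;            /* e.g. int x : 31 */
    unsigned modesize = 32;           /* GET_MODE_BITSIZE (SImode) */
    unsigned bits_per_word = 32;      /* UNITS_PER_WORD == 4 */
    unsigned biggest_alignment = 16;  /* m68k default */

    int rejected = bitsize > modesize
                   || modesize > bits_per_word
                   || modesize > biggest_alignment;
    printf ("strict volatile bit-field path used: %s\n",
            rejected ? "no" : "yes");
    return 0;
  }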

This should avoid these oddities.


Bernd.
  

RE: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields

2015-03-14 Thread Mikael Pettersson
Bernd Edlinger writes:
  Hi,
  
  are there any more comments on this?
  
  I would like to apply the patch as is, unless we find
  a way to get to a test case, maybe with a cross-compiler,
  where the MODE_ALIGNMENT is different from MODE_BITSIZE.
  
  Currently, I think that does not happen.

On m68k-linux GET_MODE_ALIGNMENT (SImode) == 16 while
GET_MODE_BITSIZE (SImode) == 32.

I don't know what that means for your patch, just wanted
to inform you that such targets do exist.

/Mikael

  
  Thanks
  Bernd.
  
   Date: Tue, 10 Mar 2015 14:40:52 +0100
  
   Hi Richard and Eric,
  
   On Mon, 9 Mar 2015 15:30:31, Richard Biener wrote:
   Reg-tested on x86_64 successfully and ARM is still running.
  
   ARM completed without regressions meanwhile.
  
  
   Is it OK for trunk?
  
   Looks ok to me apart from
  
   /* Check for cases of unaligned fields that must be split. */
   - if (bitnum % BITS_PER_UNIT + bitsize > modesize
   - || (STRICT_ALIGNMENT
   -     && bitnum % GET_MODE_ALIGNMENT (fieldmode) + bitsize > modesize))
   + if (bitnum % (STRICT_ALIGNMENT ? modesize : BITS_PER_UNIT)
   +     + bitsize > modesize
   +     || (STRICT_ALIGNMENT && MEM_ALIGN (op0) < modesize))
   return false;
  
   where I'd use GET_MODE_ALIGNMENT (fieldmode) rather than modesize
   (in both places).
  
   Please leave Eric the chance to comment.
  
  
   Just to clarify a few things here:
  
   I try to make the checks in strict_volatile_bitfield_p
   to be consistent with the strict volatile bit-field code path
   that follows if we return true here.
  
   I would summarize the current implementation of the
   strict volatile bit-field code as follows:
  
   For strict alignment, we access the structure as
   if it were an array of fieldmode.  A multiple of modesize
   is added to the base address, and one single read and/or
   write access must be sufficient to do the job.  The access
   must be Ok regarding the target's alignment restrictions.
   That does not change, what changed with the previous patch
   is a missed optimization with the EP_insv code pattern.
  
   For non strict alignment, a multiple of modesize is added
   to the base address, but if the range [bitnum:bitnum+bitsize-1]
   spans two fieldmode words, which should only happen if we
   use packed structures, a byte offset is added to the base address.
   The byte offset is chosen as small as possible, to not reach beyond
   the bit field region.  That is new.  This change is irrelevant for the use 
   case
   of accessing a device register, but the generated code is more compact.
  
   Usually we have GET_MODE_ALIGNMENT(fieldmode)==modesize
   for all SCALAR_INT_MODE_P(fieldmode).
  
   The only exceptions are complex numbers, and targets where
   ADJUST_ALIGNMENT is used in the modes.def, right?
  
   The targets that do that are few, and the modes are mostly vector modes.
   So I did not find any target where the MODE_ALIGNMENT would make
   a difference here.  Therefore I think it is more or less a matter of taste.
   But please correct me if I am wrong.
  
   If there are cases, where MODE_ALIGNMENT < MODE_BITSIZE,
   changing the second condition MEM_ALIGN(op0) < modesize to
   MEM_ALIGN(op0) < GET_MODE_ALIGNMENT(fieldmode) would
   still be consistent, and probably be more correct.
  
   But I think changing the first condition would allow cases where this
   assertion in the patch no longer holds:
   gcc_assert (bitnum + bitsize <= GET_MODE_BITSIZE (fieldmode));
  
  
  
   Thanks
   Bernd.
  
 
-- 


RE: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields

2015-03-13 Thread Bernd Edlinger
Hi,

are there any more comments on this?

I would like to apply the patch as is, unless we find
a way to get to a test case, maybe with a cross-compiler,
where the MODE_ALIGNMENT is different from MODE_BITSIZE.

Currently, I think that does not happen.

Thanks
Bernd.

 Date: Tue, 10 Mar 2015 14:40:52 +0100

 Hi Richard and Eric,

 On Mon, 9 Mar 2015 15:30:31, Richard Biener wrote:
 Reg-tested on x86_64 successfully and ARM is still running.

 ARM completed without regressions meanwhile.


 Is it OK for trunk?

 Looks ok to me apart from

 /* Check for cases of unaligned fields that must be split. */
 - if (bitnum % BITS_PER_UNIT + bitsize > modesize
 - || (STRICT_ALIGNMENT
 -     && bitnum % GET_MODE_ALIGNMENT (fieldmode) + bitsize > modesize))
 + if (bitnum % (STRICT_ALIGNMENT ? modesize : BITS_PER_UNIT)
 +     + bitsize > modesize
 +     || (STRICT_ALIGNMENT && MEM_ALIGN (op0) < modesize))
 return false;

 where I'd use GET_MODE_ALIGNMENT (fieldmode) rather than modesize
 (in both places).

 Please leave Eric the chance to comment.


 Just to clarify a few things here:

 I try to make the checks in strict_volatile_bitfield_p
 to be consistent with the strict volatile bit-field code path
 that follows if we return true here.

 I would summarize the current implementation of the
 strict volatile bit-field code as follows:

 For strict alignment, we access the structure as
 if it were an array of fieldmode.  A multiple of modesize
 is added to the base address, and one single read and/or
 write access must be sufficient to do the job.  The access
 must be Ok regarding the target's alignment restrictions.
 That does not change, what changed with the previous patch
 is a missed optimization with the EP_insv code pattern.

 For non strict alignment, a multiple of modesize is added
 to the base address, but if the range [bitnum:bitnum+bitsize-1]
 spans two fieldmode words, which should only happen if we
 use packed structures, a byte offset is added to the base address.
 The byte offset is chosen as small as possible, to not reach beyond
 the bit field region.  That is new.  This change is irrelevant for the use 
 case
 of accessing a device register, but the generated code is more compact.

 Usually we have GET_MODE_ALIGNMENT(fieldmode)==modesize
 for all SCALAR_INT_MODE_P(fieldmode).

 The only exceptions are complex numbers, and targets where
 ADJUST_ALIGNMENT is used in the modes.def, right?

 The targets that do that are few, and the modes are mostly vector modes.
 So I did not find any target where the MODE_ALIGNMENT would make
 a difference here.  Therefore I think it is more or less a matter of taste.
 But please correct me if I am wrong.

 If there are cases, where MODE_ALIGNMENT < MODE_BITSIZE,
 changing the second condition MEM_ALIGN(op0) < modesize to
 MEM_ALIGN(op0) < GET_MODE_ALIGNMENT(fieldmode) would
 still be consistent, and probably be more correct.

 But I think changing the first condition would allow cases where this
 assertion in the patch no longer holds:
 gcc_assert (bitnum + bitsize <= GET_MODE_BITSIZE (fieldmode));



 Thanks
 Bernd.

  

RE: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields

2015-03-10 Thread Bernd Edlinger
Hi Richard and Eric,

On Mon, 9 Mar 2015 15:30:31, Richard Biener wrote:
 Reg-tested on x86_64 successfully and ARM is still running.

ARM completed without regressions meanwhile.


 Is it OK for trunk?

 Looks ok to me apart from

 /* Check for cases of unaligned fields that must be split. */
 - if (bitnum % BITS_PER_UNIT + bitsize > modesize
 - || (STRICT_ALIGNMENT
 -     && bitnum % GET_MODE_ALIGNMENT (fieldmode) + bitsize > modesize))
 + if (bitnum % (STRICT_ALIGNMENT ? modesize : BITS_PER_UNIT)
 +     + bitsize > modesize
 +     || (STRICT_ALIGNMENT && MEM_ALIGN (op0) < modesize))
 return false;

 where I'd use GET_MODE_ALIGNMENT (fieldmode) rather than modesize
 (in both places).

 Please leave Eric the chance to comment.


Just to clarify a few things here:

I try to make the checks in strict_volatile_bitfield_p
to be consistent with the strict volatile bit-field code path
that follows if we return true here.

I would summarize the current implementation of the
strict volatile bit-field code as follows:

For strict alignment, we access the structure as
if it were an array of fieldmode.  A multiple of modesize
is added to the base address, and one single read and/or
write access must be sufficient to do the job.  The access
must be Ok regarding the target's alignment restrictions.
That does not change, what changed with the previous patch
is a missed optimization with the EP_insv code pattern.

For non strict alignment, a multiple of modesize is added
to the base address, but if the range [bitnum:bitnum+bitsize-1]
spans two fieldmode words, which should only happen if we
use packed structures, a byte offset is added to the base address.
The byte offset is chosen as small as possible, to not reach beyond
the bit field region.  That is new.  This change is irrelevant for the use case
of accessing a device register, but the generated code is more compact.

Usually we have GET_MODE_ALIGNMENT(fieldmode)==modesize
for all SCALAR_INT_MODE_P(fieldmode).

The only exceptions are complex numbers, and targets where 
ADJUST_ALIGNMENT is used in the modes.def, right?

The targets that do that are few, and the modes are mostly vector modes.
So I did not find any target where the MODE_ALIGNMENT would make
a difference here.  Therefore I think it is more or less a matter of taste.
But please correct me if I am wrong.

If there are cases, where MODE_ALIGNMENT < MODE_BITSIZE,
changing the second condition MEM_ALIGN(op0) < modesize to
MEM_ALIGN(op0) < GET_MODE_ALIGNMENT(fieldmode) would
still be consistent, and probably be more correct.

But I think changing the first condition would allow cases where this
assertion in the patch no longer holds:
gcc_assert (bitnum + bitsize <= GET_MODE_BITSIZE (fieldmode));



Thanks
Bernd.
  

Re: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields

2015-03-09 Thread Richard Biener
On Fri, Mar 6, 2015 at 12:48 PM, Bernd Edlinger
bernd.edlin...@hotmail.de wrote:
 Hi Richard,

 here is my new proposal, it addresses your objections and generates
 better code for this test case:

 main:
 .LFB0:
 .cfi_startproc
 pushq   %rbp
 .cfi_def_cfa_offset 16
 .cfi_offset 6, -16
 movq    %rsp, %rbp
 .cfi_def_cfa_register 6
 movl    global+1(%rip), %eax
 orl     $2147483647, %eax
 movl    %eax, global+1(%rip)
 movl    global+1(%rip), %eax
 andl    $2147483647, %eax
 cmpl    $2147483647, %eax
 je      .L2
 call    abort
 .L2:
 movl    $0, %eax
 popq    %rbp
 .cfi_def_cfa 7, 8
 ret
 .cfi_endproc


 I also tried to fix the comments.

 Reg-tested on x86_64 successfully and ARM is still running.

 Is it OK for trunk?

Looks ok to me apart from

   /* Check for cases of unaligned fields that must be split.  */
-  if (bitnum % BITS_PER_UNIT + bitsize > modesize
-      || (STRICT_ALIGNMENT
-          && bitnum % GET_MODE_ALIGNMENT (fieldmode) + bitsize > modesize))
+  if (bitnum % (STRICT_ALIGNMENT ? modesize : BITS_PER_UNIT)
+      + bitsize > modesize
+      || (STRICT_ALIGNMENT && MEM_ALIGN (op0) < modesize))
 return false;

where I'd use GET_MODE_ALIGNMENT (fieldmode) rather than modesize
(in both places).

Please leave Eric the chance to comment.

Thanks,
Richard.



 Thanks
 Bernd.



RE: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields

2015-03-06 Thread Bernd Edlinger
Hi,

On Thu, 5 Mar 2015 16:36:48, Richard Biener wrote:

 On Thu, Mar 5, 2015 at 4:05 PM, Bernd Edlinger
 bernd.edlin...@hotmail.de wrote:

 every access is split in 4 QImode accesses. but that is as
 expected, because the structure is byte aligned.

 No, it is not expected because the CPU can handle unaligned SImode
 reads/writes just fine (even if not as an atomic operation).
 The C++ memory model allows an SImode read to s.y (-fstrict-volatile-bitfields
 would, as well, but the CPU doesn't guarantee atomic operation here)


Hmm, well.  I understand.

However, this is the normal way the non-strict-volatile-bitfields code
works.

But I can probably work out a patch that enables the strict-volatile-bitfields
code to generate unaligned SImode accesses when necessary on !STRICT_ALIGNMENT
targets.

Now, I checked the ARM port again, and I am a bit surprised, because
with gcc 4.9.0 this structure gets 4 QImode accesses, which is correct,
but with recent trunk (gcc version 5.0.0 20150301) I get SImode accesses,
although the structure is not aligned and the compiler can't possibly know
how the memory will be aligned.  Something must have changed in the meantime,
but it wasn't by me.  IIRC the field mode in this example was QImode but now
it seems to be SImode.


struct s
{
  unsigned int y:31;
} __attribute__((packed));

int
test (volatile struct s* x)
{
  x->y=0x7FFF;
  return x->y;
}


So what would you think of this change at strict_volatile_bitfield_p?

diff -up expmed.c.jj expmed.c
--- expmed.c.jj    2015-01-16 11:20:40.0 +0100
+++ expmed.c    2015-03-06 10:07:14.362383274 +0100
@@ -472,9 +472,9 @@ strict_volatile_bitfield_p (rtx op0, uns
 return false;
 
   /* Check for cases of unaligned fields that must be split.  */
-  if (bitnum % BITS_PER_UNIT + bitsize > modesize
-      || (STRICT_ALIGNMENT
-          && bitnum % GET_MODE_ALIGNMENT (fieldmode) + bitsize > modesize))
+  if (bitnum % (STRICT_ALIGNMENT ? modesize : BITS_PER_UNIT)
+      + bitsize > modesize
+      || (STRICT_ALIGNMENT && MEM_ALIGN (op0) < modesize))
 return false;
 
   /* Check for cases where the C++ memory model applies.  */


of course this is incomplete, and needs special handling for !STRICT_ALIGNMENT
in the strict-volatile-bitfields code path later.



Thanks
Bernd.
  

RE: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields

2015-03-06 Thread Bernd Edlinger
Hi Richard,

here is my new proposal, it addresses your objections and generates
better code for this test case:

main:
.LFB0:
    .cfi_startproc
    pushq    %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    movl    global+1(%rip), %eax
    orl    $2147483647, %eax
    movl    %eax, global+1(%rip)
    movl    global+1(%rip), %eax
    andl    $2147483647, %eax
    cmpl    $2147483647, %eax
    je    .L2
    call    abort
.L2:
    movl    $0, %eax
    popq    %rbp
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc


I also tried to fix the comments.

Reg-tested on x86_64 successfully and ARM is still running.

Is it OK for trunk?



Thanks
Bernd.
  gcc:
2015-03-06  Bernd Edlinger  bernd.edlin...@hotmail.de

* expmed.c (strict_volatile_bitfield_p): For STRICT_ALIGNMENT
check that MEM_ALIGN (op0) allows a MODESIZE access.
(store_bit_field, extract_bit_field): For !STRICT_ALIGNMENT explicitly
generate an unaligned access if the field crosses a word boundary.

testsuite:
2015-03-06  Bernd Edlinger  bernd.edlin...@hotmail.de

* gcc.dg/20150306-1.c: New test.


patch-volatile-bitfields-1.diff
Description: Binary data


Re: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields

2015-03-05 Thread Richard Biener
On Wed, Mar 4, 2015 at 3:13 PM, Bernd Edlinger
bernd.edlin...@hotmail.de wrote:
 bounced... again, without html.


 Hi Richard,

 while working on another bug in the area of -fstrict-volatile-bitfields
 I became aware of another example where -fstrict-volatile-bitfields may
 generate wrong code.  This is reproducible on a !STRICT_ALIGNMENT target like x86_64.

 The problem is that strict_volatile_bitfield_p tries to allow more than
 necessary if !STRICT_ALIGNMENT.  Everything works OK on ARM for instance.

 If this function returns true, we may later call narrow_bit_field_mem, and
 the check in strict_volatile_bitfield_p should mirror the logic there:
 narrow_bit_field_mem just uses GET_MODE_BITSIZE (mode) and does not
 care about STRICT_ALIGNMENT, and in the end  *new_bitnum + bitsize may
 reach beyond the end of the region. This causes store_fixed_bit_field_1
 to silently fail to generate correct code.

Hmm, but the comment sounds like if using GET_MODE_ALIGNMENT is
more correct (even for !strict-alignment) - if mode is SImode and mode
alignment is 16 (HImode aligned) then we don't need to split the load
if bitnum is 16 and bitsize is 32.

So maybe narrow_bit_field_mem needs to be fixed as well?

Thanks,
Richard.

 The attached patch was   boot-strapped and
 regression-tested on x86_64-linux-gnu.

 OK for trunk and 4.9?


 Thanks
 Bernd.



Re: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields

2015-03-05 Thread Richard Biener
On Thu, Mar 5, 2015 at 4:05 PM, Bernd Edlinger
bernd.edlin...@hotmail.de wrote:
 Hi,

 On Thu, 5 Mar 2015 12:24:56, Richard Biener wrote:

 On Thu, Mar 5, 2015 at 12:00 PM, Bernd Edlinger
 bernd.edlin...@hotmail.de wrote:
 Hi,

 On Thu, 5 Mar 2015 10:00:26, Richard Biener wrote:

 On Wed, Mar 4, 2015 at 3:13 PM, Bernd Edlinger
 Maybe one thing is missing from strict_volatile_bitfield_p, I am not sure.

 Maybe it should check that MEM_ALIGN (op0) >= GET_MODE_ALIGNMENT (fieldmode)
 Because the strict volatile bitfields handling will inevitably try to use
 the fieldmode to access the memory.

 Or would it be better to say MEM_ALIGN (op0) >= GET_MODE_BITSIZE (fieldmode),
 because it is easier to explain when someone asks when we guarantee the
 semantics of strict volatile bitfields?

 But on non-strict-align targets you can even for 1-byte aligned MEMs
 access an SImode field directly. So the old code looks correct to me
 here and the fix needs to be done somewhere else.


 But this SImode access is split up into several QImode or HImode accesses
 in the processor's execution pipeline; finally, on an external bus like AXI,
 all memory transactions are aligned.  The difference is just that some
 processors can split up the unaligned accesses and some need help from the
 compiler, but even on an x86 we have different semantics for unaligned
 accesses, that is, they are no longer atomic, while an aligned access is
 always executed as an atomic transaction on an x86 processor.

 Probably there is already something in the logic in expr.c that prevents
 these cases, because otherwise it would be way too easy to find an example
 for unaligned accesses to unaligned memory on STRICT_ALIGNMENT targets.


 Ok, what would you think about this variant?

 --- expmed.c.jj 2015-01-16 11:20:40.0 +0100
 +++ expmed.c 2015-03-05 11:50:09.40016 +0100
 @@ -472,9 +472,11 @@ strict_volatile_bitfield_p (rtx op0, uns
 return false;

 /* Check for cases of unaligned fields that must be split. */
 - if (bitnum % BITS_PER_UNIT + bitsize > modesize
 - || (STRICT_ALIGNMENT
 -     && bitnum % GET_MODE_ALIGNMENT (fieldmode) + bitsize > modesize))
 + if (bitnum % modesize + bitsize > modesize)
 + return false;
 +
 + /* Check that the memory is sufficiently aligned. */
 + if (MEM_ALIGN (op0) < modesize)

 I think that only applies to strict-align targets and then only for
 GET_MODE_ALIGNMENT (modesize). And of course what matters
 is the alignment at bitnum - even though op0 may be not sufficiently
 aligned it may have known misalignment so that op0 + bitnum is
 sufficiently aligned.

 Testcases would need to annotate structs with packed/aligned attributes
 to get at these cases.

 For the testcase included in the patch, what does the patch end up doing?
 Not going the strict-volatile bitfield expansion path? That looks unnecessary
 on !strict-alignment targets but reasonable on strict-align targets where the
 access would need to be split. So, why does it end up being split
 on !strict-align targets?


 gcc -fstrict-volatile-bitfields -S 20150304-1.c
 without patch we find this in 20150304-1.s:

 main:
 .LFB0:
 .cfi_startproc
 pushq   %rbp
 .cfi_def_cfa_offset 16
 .cfi_offset 6, -16
 movq    %rsp, %rbp
 .cfi_def_cfa_register 6
 movzbl  global+1(%rip), %eax
 orl     $-1, %eax
 movb    %al, global+1(%rip)
 movzbl  global+2(%rip), %eax
 orl     $-1, %eax
 movb    %al, global+2(%rip)
 movzbl  global+3(%rip), %eax
 orl     $-1, %eax
 movb    %al, global+3(%rip)
 movzbl  global+4(%rip), %eax
 orl     $127, %eax
 movb    %al, global+4(%rip)
 movl    global(%rip), %eax
 shrl    $8, %eax
 andl    $2147483647, %eax
 cmpl    $2147483647, %eax
 je      .L2
 call    abort
 .L2:
 movl    $0, %eax
 popq    %rbp
 .cfi_def_cfa 7, 8
 ret
 .cfi_endproc


 so the write path is OK, because strict_volatile_bitfield_p returns false,
 because

   /* Check for cases where the C++ memory model applies.  */
   if (bitregion_end != 0
       && (bitnum - bitnum % modesize < bitregion_start
           || bitnum - bitnum % modesize + modesize - 1 > bitregion_end))
 return false;


 bitregion_start=8, bitregion_end=39
 bitnum - bitnum % modesize = 0
 bitnum - bitnum % modesize + modesize - 1 = 31


 this does not happen in the read path, and the problem is the access
 here does not work:

   str_rtx = narrow_bit_field_mem (str_rtx, fieldmode, bitsize, bitnum,
                                   &bitnum);
   /* Explicitly override the C/C++ memory model; ignore the
  bit range so that we can do the access in the mode mandated
  by -fstrict-volatile-bitfields instead.  */
   store_fixed_bit_field_1 (str_rtx, bitsize, bitnum, value);


 str_rtx = unchanged, bitnum = 8, bitsize= 31, but 

Re: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields

2015-03-05 Thread Richard Biener
On Thu, Mar 5, 2015 at 12:00 PM, Bernd Edlinger
bernd.edlin...@hotmail.de wrote:
 Hi,

 On Thu, 5 Mar 2015 10:00:26, Richard Biener wrote:

 On Wed, Mar 4, 2015 at 3:13 PM, Bernd Edlinger
 bernd.edlin...@hotmail.de wrote:
 bounced... again, without html.


 Hi Richard,

 while working on another bug in the area of -fstrict-volatile-bitfields
 I became aware of another example where -fstrict-volatile-bitfields may 
 generate
 wrong code. This is reproducible on a !STRICT_ALIGNMENT target like x86_64.

 The problem is that strict_volatile_bitfield_p tries to allow more than 
 necessary
 if !STRICT_ALIGNMENT. Everything works OK on ARM for instance.

 If this function returns true, we may later call narrow_bit_field_mem, and
 the check in strict_volatile_bitfield_p should mirror the logic there:
 narrow_bit_field_mem just uses GET_MODE_BITSIZE (mode) and does not
 care about STRICT_ALIGNMENT, and in the end *new_bitnum + bitsize may
 reach beyond the end of the region. This causes store_fixed_bit_field_1
 to silently fail to generate correct code.

 Hmm, but the comment sounds like if using GET_MODE_ALIGNMENT is
 more correct (even for !strict-alignment) - if mode is SImode and mode
 alignment is 16 (HImode aligned) then we don't need to split the load
 if bitnum is 16 and bitsize is 32.

 So maybe narrow_bit_field_mem needs to be fixed as well?


 I'd rather not touch that function

 In the whole expmed.c the only place where  GET_MODE_ALIGNMENT
 is used, is in simple_mem_bitfield_p but only if SLOW_UNALIGNED_ACCESS
 returns 1, which is only the case on few targets.
 Do you know any targets, where GET_MODE_ALIGNMENT may be less than
 GET_MODE_BITSIZE?

DImode on i?86, I suppose any mode on targets like AVR.

 Maybe one thing is missing from strict_volatile_bitfield_p, I am not sure.

 Maybe it should check that MEM_ALIGN (op0) >= GET_MODE_ALIGNMENT (fieldmode)
 Because the strict volatile bitfields handling will inevitably try to use
 the fieldmode to access the memory.

 Or would it be better to say MEM_ALIGN (op0) >= GET_MODE_BITSIZE (fieldmode),
 because it is easier to explain when someone asks when we guarantee the
 semantics of strict volatile bitfields?

But on non-strict-align targets you can even for 1-byte aligned MEMs
access an SImode field directly.  So the old code looks correct to me
here and the fix needs to be done somewhere else.

 Probably there is already something in the logic in expr.c that prevents
 these cases, because otherwise it would be way too easy to find an example
 for unaligned accesses to unaligned memory on STRICT_ALIGNMENT targets.


 Ok, what would you think about this variant?

 --- expmed.c.jj2015-01-16 11:20:40.0 +0100
 +++ expmed.c2015-03-05 11:50:09.40016 +0100
 @@ -472,9 +472,11 @@ strict_volatile_bitfield_p (rtx op0, uns
  return false;

/* Check for cases of unaligned fields that must be split.  */
 -  if (bitnum % BITS_PER_UNIT + bitsize > modesize
 -      || (STRICT_ALIGNMENT
 -          && bitnum % GET_MODE_ALIGNMENT (fieldmode) + bitsize > modesize))
 +  if (bitnum % modesize + bitsize > modesize)
 +    return false;
 +
 +  /* Check that the memory is sufficiently aligned.  */
 +  if (MEM_ALIGN (op0) < modesize)

I think that only applies to strict-align targets and then only for
GET_MODE_ALIGNMENT (modesize).  And of course what matters
is the alignment at bitnum - even though op0 may be not sufficiently
aligned it may have known misalignment so that op0 + bitnum is
sufficiently aligned.

Testcases would need to annotate structs with packed/aligned attributes
to get at these cases.

For the testcase included in the patch, what does the patch end up doing?
Not going the strict-volatile bitfield expansion path?  That looks unnecessary
on !strict-alignment targets but reasonable on strict-align targets where the
access would need to be split.  So, why does it end up being split
on !strict-align targets?

Richard.


  return false;

/* Check for cases where the C++ memory model applies.  */


 Trying to use an atomic access to a device register is pointless if the
 memory context is not aligned to the MODE_BITSIZE, that has nothing
 to do with MODE_ALIGNMENT, right?


 Thanks
 Bernd.




RE: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields

2015-03-05 Thread Bernd Edlinger
Hi,

On Thu, 5 Mar 2015 10:00:26, Richard Biener wrote:

 On Wed, Mar 4, 2015 at 3:13 PM, Bernd Edlinger
 bernd.edlin...@hotmail.de wrote:
 bounced... again, without html.


 Hi Richard,

 while working on another bug in the area of -fstrict-volatile-bitfields
 I became aware of another example where -fstrict-volatile-bitfields may 
 generate
 wrong code. This is reproducible on a !STRICT_ALIGNMENT target like x86_64.

 The problem is that strict_volatile_bitfield_p tries to allow more than 
 necessary
 if !STRICT_ALIGNMENT. Everything works OK on ARM for instance.

 If this function returns true, we may later call narrow_bit_field_mem, and
 the check in strict_volatile_bitfield_p should mirror the logic there:
 narrow_bit_field_mem just uses GET_MODE_BITSIZE (mode) and does not
 care about STRICT_ALIGNMENT, and in the end *new_bitnum + bitsize may
 reach beyond the end of the region. This causes store_fixed_bit_field_1
 to silently fail to generate correct code.

 Hmm, but the comment sounds like if using GET_MODE_ALIGNMENT is
 more correct (even for !strict-alignment) - if mode is SImode and mode
 alignment is 16 (HImode aligned) then we don't need to split the load
 if bitnum is 16 and bitsize is 32.

 So maybe narrow_bit_field_mem needs to be fixed as well?


I'd rather not touch that function

In the whole expmed.c the only place where  GET_MODE_ALIGNMENT
is used, is in simple_mem_bitfield_p but only if SLOW_UNALIGNED_ACCESS
returns 1, which is only the case on few targets.
Do you know any targets, where GET_MODE_ALIGNMENT may be less than
GET_MODE_BITSIZE?

Maybe one thing is missing from strict_volatile_bitfield_p, I am not sure.

Maybe it should check that MEM_ALIGN (op0) >= GET_MODE_ALIGNMENT (fieldmode)
Because the strict volatile bitfields handling will inevitably try to use
the fieldmode to access the memory.

Or would it be better to say MEM_ALIGN (op0) >= GET_MODE_BITSIZE (fieldmode),
because it is easier to explain when someone asks when we guarantee the
semantics of strict volatile bitfields?

Probably there is already something in the logic in expr.c that prevents
these cases, because otherwise it would be way too easy to find an example
for unaligned accesses to unaligned memory on STRICT_ALIGNMENT targets.


Ok, what would you think about this variant?

--- expmed.c.jj    2015-01-16 11:20:40.0 +0100
+++ expmed.c    2015-03-05 11:50:09.40016 +0100
@@ -472,9 +472,11 @@ strict_volatile_bitfield_p (rtx op0, uns
 return false;
 
   /* Check for cases of unaligned fields that must be split.  */
-  if (bitnum % BITS_PER_UNIT + bitsize > modesize
-      || (STRICT_ALIGNMENT
-          && bitnum % GET_MODE_ALIGNMENT (fieldmode) + bitsize > modesize))
+  if (bitnum % modesize + bitsize > modesize)
+    return false;
+
+  /* Check that the memory is sufficiently aligned.  */
+  if (MEM_ALIGN (op0) < modesize)
 return false;
 
   /* Check for cases where the C++ memory model applies.  */


Trying to use an atomic access to a device register is pointless if the
memory context is not aligned to the MODE_BITSIZE, that has nothing
to do with MODE_ALIGNMENT, right?


Thanks
Bernd.

  

RE: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields

2015-03-05 Thread Bernd Edlinger
Hi,

On Thu, 5 Mar 2015 12:24:56, Richard Biener wrote:

 On Thu, Mar 5, 2015 at 12:00 PM, Bernd Edlinger
 bernd.edlin...@hotmail.de wrote:
 Hi,

 On Thu, 5 Mar 2015 10:00:26, Richard Biener wrote:

 On Wed, Mar 4, 2015 at 3:13 PM, Bernd Edlinger
 Maybe one thing is missing from strict_volatile_bitfield_p, I am not sure.

 Maybe it should check that MEM_ALIGN (op0) >= GET_MODE_ALIGNMENT (fieldmode)
 Because the strict volatile bitfields handling will inevitably try to use
 the fieldmode to access the memory.

 Or would it be better to say MEM_ALIGN (op0) >= GET_MODE_BITSIZE (fieldmode),
 because it is easier to explain when someone asks when we guarantee the
 semantics of strict volatile bitfields?

 But on non-strict-align targets you can even for 1-byte aligned MEMs
 access an SImode field directly. So the old code looks correct to me
 here and the fix needs to be done somewhere else.


But this SImode access is split up into several QImode or HImode accesses
in the processor's execution pipeline; finally, on an external bus like AXI,
all memory transactions are aligned.  The difference is just that some
processors can split up the unaligned accesses and some need help from the
compiler, but even on an x86 we have different semantics for unaligned
accesses, that is, they are no longer atomic, while an aligned access is
always executed as an atomic transaction on an x86 processor.

 Probably there is already something in the logic in expr.c that prevents
 these cases, because otherwise it would be way too easy to find an example
 for unaligned accesses to unaligned memory on STRICT_ALIGNMENT targets.


 Ok, what would you think about this variant?

 --- expmed.c.jj 2015-01-16 11:20:40.0 +0100
 +++ expmed.c 2015-03-05 11:50:09.40016 +0100
 @@ -472,9 +472,11 @@ strict_volatile_bitfield_p (rtx op0, uns
 return false;

 /* Check for cases of unaligned fields that must be split. */
 - if (bitnum % BITS_PER_UNIT + bitsize > modesize
 - || (STRICT_ALIGNMENT
 -     && bitnum % GET_MODE_ALIGNMENT (fieldmode) + bitsize > modesize))
 + if (bitnum % modesize + bitsize > modesize)
 + return false;
 +
 + /* Check that the memory is sufficiently aligned. */
 + if (MEM_ALIGN (op0) < modesize)

 I think that only applies to strict-align targets and then only for
 GET_MODE_ALIGNMENT (modesize). And of course what matters
 is the alignment at bitnum - even though op0 may be not sufficiently
 aligned it may have known misalignment so that op0 + bitnum is
 sufficiently aligned.

 Testcases would need to annotate structs with packed/aligned attributes
 to get at these cases.

 For the testcase included in the patch, what does the patch end up doing?
 Not going the strict-volatile bitfield expansion path? That looks unnecessary
 on !strict-alignment targets but reasonable on strict-align targets where the
 access would need to be split. So, why does it end up being split
 on !strict-align targets?


gcc -fstrict-volatile-bitfields -S 20150304-1.c
without patch we find this in 20150304-1.s:

main:
.LFB0:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    movzbl  global+1(%rip), %eax
    orl $-1, %eax
    movb    %al, global+1(%rip)
    movzbl  global+2(%rip), %eax
    orl $-1, %eax
    movb    %al, global+2(%rip)
    movzbl  global+3(%rip), %eax
    orl $-1, %eax
    movb    %al, global+3(%rip)
    movzbl  global+4(%rip), %eax
    orl $127, %eax
    movb    %al, global+4(%rip)
    movl    global(%rip), %eax
    shrl    $8, %eax
    andl    $2147483647, %eax
    cmpl    $2147483647, %eax
    je  .L2
    call    abort
.L2:
    movl    $0, %eax
    popq    %rbp
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc


so the write path is OK, because strict_volatile_bitfield_p returns false,
because

  /* Check for cases where the C++ memory model applies.  */
  if (bitregion_end != 0
      && (bitnum - bitnum % modesize < bitregion_start
          || bitnum - bitnum % modesize + modesize - 1 > bitregion_end))
    return false;


bitregion_start=8, bitregion_end=39
bitnum - bitnum % modesize = 0
bitnum - bitnum % modesize + modesize - 1 = 31


this does not happen in the read path, and the problem is the access
here does not work:

  str_rtx = narrow_bit_field_mem (str_rtx, fieldmode, bitsize, bitnum,
                                  &bitnum);
  /* Explicitly override the C/C++ memory model; ignore the
 bit range so that we can do the access in the mode mandated
 by -fstrict-volatile-bitfields instead.  */
  store_fixed_bit_field_1 (str_rtx, bitsize, bitnum, value);


str_rtx = unchanged, bitnum = 8, bitsize = 31, but store_fixed_bit_field_1
can not handle that.

BTW: I can make the write code path also fail if I change this bit
in the test case add :8 to char 

RE: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields

2015-03-04 Thread Bernd Edlinger
bounced... again, without html.


Hi Richard,

while working on another bug in the area of -fstrict-volatile-bitfields
I became aware of another example where -fstrict-volatile-bitfields may
generate wrong code.  This is reproducible on a !STRICT_ALIGNMENT target like x86_64.

The problem is that strict_volatile_bitfield_p tries to allow more than
necessary if !STRICT_ALIGNMENT.  Everything works OK on ARM for instance.

If this function returns true, we may later call narrow_bit_field_mem, and  
the check in strict_volatile_bitfield_p should mirror the logic there:  
narrow_bit_field_mem just uses GET_MODE_BITSIZE (mode) and does not  
care about STRICT_ALIGNMENT, and in the end  *new_bitnum + bitsize may  
reach beyond the end of the region. This causes store_fixed_bit_field_1  
to silently fail to generate correct code.

The attached patch was   boot-strapped and
regression-tested on x86_64-linux-gnu.

OK for trunk and 4.9?


Thanks
Bernd. 
  gcc:
2015-03-04  Bernd Edlinger  bernd.edlin...@hotmail.de

* expmed.c (strict_volatile_bitfield_p): Don't return different
results if !STRICT_ALIGNMENT.
Use GET_MODE_BITSIZE instead of GET_MODE_ALIGNMENT here.

testsuite:
2015-03-04  Bernd Edlinger  bernd.edlin...@hotmail.de

* gcc.dg/20150304-1.c: New test.


patch-volatile-bitfields-1.diff
Description: Binary data