Re: aliasing

2024-03-18 Thread David Brown

On 18/03/2024 16:00, Martin Uecker via Gcc wrote:

Am Montag, dem 18.03.2024 um 14:29 +0100 schrieb David Brown:


On 18/03/2024 12:41, Martin Uecker wrote:



Hi David,

Am Montag, dem 18.03.2024 um 10:00 +0100 schrieb David Brown:

Hi,

I would very glad to see this change in the standards.


Should "byte type" include all character types (signed, unsigned and
plain), or should it be restricted to "unsigned char" since that is the
"byte" type ?  (I think allowing all character types makes sense, but
only unsigned char is guaranteed to be suitable for general object
backing store.)


At the moment, the special type that can access all others are
all non-atomic character types.  So for symmetry reasons, it
seems that this is also what we want for backing store.

I am not sure what you mean by "only unsigned char". Are you talking
about C++?  "unsigned char" has no special role in C.



"unsigned char" does have a special role in C - in 6.2.6.1p4 it
describes any object as being able to be copied to an array of unsigned
char to get the "object representation".
  The same is not true for an
array of "signed char".  I think it would be possible to have an
implementation where "signed char" was 8-bit two's complement except
that 0x80 would be a trap representation rather than -128.  I am not
sure of the consequences of such an implementation (assuming I am even
correct in it being allowed).


Yes, but with C23 this is not possible anymore. I think signed
char or char should work equally well now.


I have just noticed that in C23, the SCHAR_MIN is -128 (or -2 ^ (N-1) in 
general), eliminating the possibility of having a trap value for signed 
char (or any other integer type without padding bits).  There's always a 
bit of jumping around in the C standards to get the complete picture!


But as I said in another post, I still worry a little about the unsigned 
to signed conversion being implementation-defined, and therefore not 
guaranteed to work in a way that preserves the underlying object 
representation.  I think it should be possible to make a small change to 
the description of unsigned to signed conversions to eliminate that.





I see people making a lot of assumptions in their embedded programming
that are not fully justified in the C standards.  Sometimes the
assumptions are just bad, or it would be easy to write code without the
assumptions.  But at other times it would be very awkward or inefficient
to write code that is completely "safe" (in terms of having fully
defined behaviour from the C standards or from implementation-dependent
behaviour).  Making your own dynamic memory allocation functions is one
such case.  So I have a tendency to jump on any suggestion of changes to
the C (or C++) standards that could let people write such essential code
in a safer or more efficient manner.


That something is undefined does not automatically mean it is
forbidden or unsafe.  It simply means it is not portable.  


That is the case for things that are not defined in the C standards, but 
defined elsewhere.  If the behaviour of a piece of code is not defined 
anywhere for the toolchain you are using, then it is inherently unsafe 
to use.  ("Forbidden" is another matter.  It might be "forbidden" by 
your coding standards, or your boss, but the language and tools don't 
forbid things!)


Something that is not defined in the C standards, but defined in your 
compiler manual or additional standards (such as POSIX) is safe to use 
but limited in portability.


And of course something that is "inherently unsafe" may be considered 
safe in practice, by analysing the generated object code or doing 
exhaustive testing.



I think
in the embedded space it will be difficult to make everything well
defined.  


Yes, that is absolutely true.  (And it is even more difficult if you try 
to restrict yourself to things with full definitions in the C standards 
or explicit implementation-defined behaviour documented by toolchains. 
You almost invariably need some degree of compiler extensions for parts 
of the code.)


But I want to reduce to the smallest practical level the amount of code 
that "works in practice" rather than "known to work by design".



But I fully agree that widely used techniques should
ideally be based on defined behavior and we should  change the
standard accordingly.



Yes, where possible and practical, the standard provide the guarantees 
that programmers need.  Failing that, compiler extensions are good too - 
I'd be very happy with a GCC variable __attribute__ "backing_store" that 
could be applied to allocator backing stores and provide the aliasing 
guarantees needed.  (It might even be needed anyway, to work well with 
the "malloc" attribute, even with your change to the standard.)



David



Re: aliasing

2024-03-18 Thread David Brown

On 18/03/2024 17:46, David Brown via Gcc wrote:

On 18/03/2024 14:54, Andreas Schwab via Gcc wrote:

On Mär 18 2024, David Brown wrote:


I think it would be possible to have an implementation where "signed
char" was 8-bit two's complement except that 0x80 would be a trap
representation rather than -128.


signed char cannot have padding bits, thus it cannot have a trap
representation.



The premise is correct (no padding bits are allowed in signed char), but 
does it follow that it cannot have a trap representation?


5.2.4.2.1p3 in C23 makes the range of a signed integer type go from
- (2 ^ (N-1)) to (2 ^ (N-1)) - 1, which means all values are valid and 
there can be no trap value if there are no padding bits.


I don't think 
the standards are clear either way here - I think the committee missed a 
chance to tidy up the description a bit more when C23 removed formats 
other than two's complement for signed integer types.


I also feel slightly uneasy using signed char for accessing object 
representations since the object representation is defined in terms of 
an unsigned char array, and conversion from unsigned char to signed char 
is implementation-defined.  (This too could have been tightened in C23, 
as there is unlikely to be any implementation that does not do the 
conversion in the obvious manner.)


But I am perhaps worrying too much here.









Re: aliasing

2024-03-18 Thread David Brown via Gcc

On 18/03/2024 14:54, Andreas Schwab via Gcc wrote:

On Mär 18 2024, David Brown wrote:


I think it would be possible to have an implementation where "signed
char" was 8-bit two's complement except that 0x80 would be a trap
representation rather than -128.


signed char cannot have padding bits, thus it cannot have a trap
representation.



The premise is correct (no padding bits are allowed in signed char), but 
does it follow that it cannot have a trap representation?  I don't think 
the standards are clear either way here - I think the committee missed a 
chance to tidy up the description a bit more when C23 removed formats 
other than two's complement for signed integer types.


I also feel slightly uneasy using signed char for accessing object 
representations since the object representation is defined in terms of 
an unsigned char array, and conversion from unsigned char to signed char 
is implementation-defined.  (This too could have been tightened in C23, 
as there is unlikely to be any implementation that does not do the 
conversion in the obvious manner.)


But I am perhaps worrying too much here.






Re: aliasing

2024-03-18 Thread Martin Uecker via Gcc
Am Montag, dem 18.03.2024 um 14:21 +0100 schrieb Richard Biener:
> On Mon, Mar 18, 2024 at 12:56 PM Martin Uecker  wrote:
> > 
> > Am Montag, dem 18.03.2024 um 11:55 +0100 schrieb Martin Uecker:
> > > Am Montag, dem 18.03.2024 um 09:26 +0100 schrieb Richard Biener:
> > > > On Mon, Mar 18, 2024 at 8:03 AM Martin Uecker  wrote:
> > > > 
> > > 
> > > > 
> > > > Let me give you an complication example made valid in C++:
> > > > 
> > > > struct B { float x; float y; };
> > > > struct X { int n; char buf[8]; } x, y;
> > > > 
> > > > void foo(struct B *b)
> > > > {
> > > >   memcpy (x.buf, b, sizeof (struct B)); // in C++:  new (x.buf) B (*b);
> > > 
> > > Let's make it an explicit store for the moment
> > > (should not make a difference though):
> > > 
> > > *(struct B*)x.buf = *b;
> > > 
> > > >   y = x; // (*)
> > > > }
> > > > 
> > > > What's the effective type of 'x' in the 'y = x' copy?
> > > 
> > > Good point. The existing wording would take the declared
> > > type of x as the effective type, but this may not be
> > > what you are interested in. Let's assume that x has no declared
> > > type but that it had effective type struct X before the
> > > store to x.buf (because of an even earlier store to
> > > x with type struct X).
> > > 
> > > There is a general question how stores to subobjects
> > > affect effective types and I do not think this is clear
> > > even before this proposed change.
> > 
> > Actually, I think this is not allowed because:
> > 
> > "An object shall have its stored value accessed only by an
> > lvalue expression that has one of the following types:
> > 
> > — a type compatible with the effective type of the object,
> > ...
> > — an aggregate or union type that includes one of the
> > aforementioned types among its members (including,
> > recursively, a member of a subaggregate or contained union), or
> > 
> > — a character type."
> > 
> > ... and we would need to move "a character type" above
> > in the list to make it defined.
> 
> So after
> 
> *(struct B*)x.buf = *b;
> 
> 'x' cannot be used to access itself?  In particular also
> an access to 'x.n' is affected by this?

According to the current wording and assuming x has no
a declared type,  x.buf would acquire an effective 
type of struct B. Then if  x.buf is read as part of 
a x it is accessed with an lvalue of struct X (which
does not include a struct B but a character buffer).

So yes, currently it would  be undefined behavior 
and the proposed wording would not change this. Clearly,
we should include an additional change to fix this.

> 
> You are right that the current wording of the standard doesn't
> clarify any of this but this kind of storage abstraction is used
> commonly in the embedded world when there's no runtime
> library providing allocation.  And you said you want to make
> the standard closer to implementation practice ...

Well, we are working on it... Any help is much appreciated.

> 
> Elsewhere when doing 'y = x' people refer to the wording that
> aggregates are copied elementwise but it's not specified how
> those elementwise accesses work - the lvalues are still of type
> X here or are new lvalues implicitly formed and fall under the
> other wordings? 

I think there is no wording for elementwise copy.

My understanding is that the 

"...an aggregate or union type that includes..."

wording above is supposed to define this via an lvalue
access with aggregate or union type.  It blesses the
implied access to the elements via the access with 
an lvalue which has the type of the aggregate.  


>  Can I thus form an effective type of X by
> storing it's subobjects at respective offsets (ignoring padding,
> for example) and can I then use an lvalue of type 'X' to access
> the whole aggregate?

I think this is defined behavior.  The subjects get
their effective types via the individual stores and then 
the access using lvalue of type 'X' is ok according to
the "..an aggregate or union type that includes.."
rule.


Martin




Re: aliasing

2024-03-18 Thread Martin Uecker via Gcc
Am Montag, dem 18.03.2024 um 14:29 +0100 schrieb David Brown:
> 
> On 18/03/2024 12:41, Martin Uecker wrote:
> > 
> > 
> > Hi David,
> > 
> > Am Montag, dem 18.03.2024 um 10:00 +0100 schrieb David Brown:
> > > Hi,
> > > 
> > > I would very glad to see this change in the standards.
> > > 
> > > 
> > > Should "byte type" include all character types (signed, unsigned and
> > > plain), or should it be restricted to "unsigned char" since that is the
> > > "byte" type ?  (I think allowing all character types makes sense, but
> > > only unsigned char is guaranteed to be suitable for general object
> > > backing store.)
> > 
> > At the moment, the special type that can access all others are
> > all non-atomic character types.  So for symmetry reasons, it
> > seems that this is also what we want for backing store.
> > 
> > I am not sure what you mean by "only unsigned char". Are you talking
> > about C++?  "unsigned char" has no special role in C.
> > 
> 
> "unsigned char" does have a special role in C - in 6.2.6.1p4 it 
> describes any object as being able to be copied to an array of unsigned 
> char to get the "object representation". 
>  The same is not true for an 
> array of "signed char".  I think it would be possible to have an 
> implementation where "signed char" was 8-bit two's complement except 
> that 0x80 would be a trap representation rather than -128.  I am not 
> sure of the consequences of such an implementation (assuming I am even 
> correct in it being allowed).

Yes, but with C23 this is not possible anymore. I think signed
char or char should work equally well now. 

> 
> > > 
> > > Should it also include "uint8_t" (if it exists) ?  "uint8_t" is often an
> > > alias for "unsigned char", but it could be something different, like an
> > > alias for __UINT8_TYPE__, or "unsigned int
> > > __attribute__((mode(QImode)))", which is used in the AVR gcc port.
> > 
> > I think this might be a reason to not include it, as it could
> > affect aliasing analysis. At least, this would be a different
> > independent change to consider.
> > 
> 
> I think it is important that there is a guarantee here, because people 
> do use uint8_t as a generic "raw memory" type.  Embedded standards like 
> MISRA strongly discourage the use of "unsized" types such as "unsigned 
> char", and it is generally assumed that "uint8_t" has the aliasing 
> superpowers of a character type.  But it is possible that the a change 
> would be better put in the library section on  rather than 
> this section.
> 
> > > 
> > > In my line of work - small-systems embedded development - it is common
> > > to have "home-made" or specialised memory allocation systems rather than
> > > relying on a generic heap.  This is, I think, some of the "existing
> > > practice" that you are considering here - there is a "backing store" of
> > > some sort that can be allocated and used as objects of a type other than
> > > the declared type of the backing store.  While a simple unsigned char
> > > array is a very common kind of backing store, there are others that are
> > > used, and it would be good to be sure of the correctness guarantees for
> > > these.  Possibilities that I have seen include:
> > > 
> > > unsigned char heap1[N];
> > > 
> > > uint8_t heap2[N];
> > > 
> > > union {
> > >   double dummy_for_alignment;
> > >   char heap[N];
> > > } heap3;
> > > 
> > > struct {
> > >   uint32_t capacity;
> > >   uint8_t * p_next_free;
> > >   uint8_t heap[N];
> > > } heap4;
> > > 
> > > uint32_t heap5[N];
> > > 
> > > Apart from this last one, if "uint8_t" is guaranteed to be a "byte
> > > type", then I believe your wording means that these unions and structs
> > > would also work as "byte arrays".  But it might be useful to add a
> > > footnote clarifying that.
> > > 
> > 
> > I need to think about this.
> > 
> 
> Thank you.
> 
> I see people making a lot of assumptions in their embedded programming 
> that are not fully justified in the C standards.  Sometimes the 
> assumptions are just bad, or it would be easy to write code without the 
> assumptions.  But at other times it would be very awkward or inefficient 
> to write code that is completely "safe" (in terms of having fully 
> defined behaviour from the C standards or from implementation-dependent 
> behaviour).  Making your own dynamic memory allocation functions is one 
> such case.  So I have a tendency to jump on any suggestion of changes to 
> the C (or C++) standards that could let people write such essential code 
> in a safer or more efficient manner.

That something is undefined does not automatically mean it is 
forbidden or unsafe.  It simply means it is not portable.  I think
in the embedded space it will be difficult to make everything well
defined.  But I fully agree that widely used techniques should
ideally be based on defined behavior and we should  change the
standard accordingly.

> 
> > > (It is also not uncommon to have the backing space allocated by the
> > > linker, but then 

Re: aliasing

2024-03-18 Thread Andreas Schwab via Gcc
On Mär 18 2024, David Brown wrote:

> I think it would be possible to have an implementation where "signed
> char" was 8-bit two's complement except that 0x80 would be a trap
> representation rather than -128.

signed char cannot have padding bits, thus it cannot have a trap
representation.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: aliasing

2024-03-18 Thread David Brown




On 18/03/2024 12:41, Martin Uecker wrote:



Hi David,

Am Montag, dem 18.03.2024 um 10:00 +0100 schrieb David Brown:

Hi,

I would very glad to see this change in the standards.


Should "byte type" include all character types (signed, unsigned and
plain), or should it be restricted to "unsigned char" since that is the
"byte" type ?  (I think allowing all character types makes sense, but
only unsigned char is guaranteed to be suitable for general object
backing store.)


At the moment, the special type that can access all others are
all non-atomic character types.  So for symmetry reasons, it
seems that this is also what we want for backing store.

I am not sure what you mean by "only unsigned char". Are you talking
about C++?  "unsigned char" has no special role in C.



"unsigned char" does have a special role in C - in 6.2.6.1p4 it 
describes any object as being able to be copied to an array of unsigned 
char to get the "object representation".  The same is not true for an 
array of "signed char".  I think it would be possible to have an 
implementation where "signed char" was 8-bit two's complement except 
that 0x80 would be a trap representation rather than -128.  I am not 
sure of the consequences of such an implementation (assuming I am even 
correct in it being allowed).




Should it also include "uint8_t" (if it exists) ?  "uint8_t" is often an
alias for "unsigned char", but it could be something different, like an
alias for __UINT8_TYPE__, or "unsigned int
__attribute__((mode(QImode)))", which is used in the AVR gcc port.


I think this might be a reason to not include it, as it could
affect aliasing analysis. At least, this would be a different
independent change to consider.



I think it is important that there is a guarantee here, because people 
do use uint8_t as a generic "raw memory" type.  Embedded standards like 
MISRA strongly discourage the use of "unsized" types such as "unsigned 
char", and it is generally assumed that "uint8_t" has the aliasing 
superpowers of a character type.  But it is possible that the a change 
would be better put in the library section on  rather than 
this section.




In my line of work - small-systems embedded development - it is common
to have "home-made" or specialised memory allocation systems rather than
relying on a generic heap.  This is, I think, some of the "existing
practice" that you are considering here - there is a "backing store" of
some sort that can be allocated and used as objects of a type other than
the declared type of the backing store.  While a simple unsigned char
array is a very common kind of backing store, there are others that are
used, and it would be good to be sure of the correctness guarantees for
these.  Possibilities that I have seen include:

unsigned char heap1[N];

uint8_t heap2[N];

union {
double dummy_for_alignment;
char heap[N];
} heap3;

struct {
uint32_t capacity;
uint8_t * p_next_free;
uint8_t heap[N];
} heap4;

uint32_t heap5[N];

Apart from this last one, if "uint8_t" is guaranteed to be a "byte
type", then I believe your wording means that these unions and structs
would also work as "byte arrays".  But it might be useful to add a
footnote clarifying that.



I need to think about this.



Thank you.

I see people making a lot of assumptions in their embedded programming 
that are not fully justified in the C standards.  Sometimes the 
assumptions are just bad, or it would be easy to write code without the 
assumptions.  But at other times it would be very awkward or inefficient 
to write code that is completely "safe" (in terms of having fully 
defined behaviour from the C standards or from implementation-dependent 
behaviour).  Making your own dynamic memory allocation functions is one 
such case.  So I have a tendency to jump on any suggestion of changes to 
the C (or C++) standards that could let people write such essential code 
in a safer or more efficient manner.



(It is also not uncommon to have the backing space allocated by the
linker, but then it falls under the existing "no declared type" case.)


Yes, although with the change we would make the "no declared type" also
be byte arrays, so there is then simply no difference anymore.



Fair enough.  (Linker-defined storage does not just have no declared 
type, it has no directly declared size or other properties either.  The 
start and the stop of the storage area is typically declared as "extern 
uint8_t __space_start[], __space_stop[];", or perhaps as single 
characters or uint32_t types.  The space in between is just calculated 
as the difference between pointers to these.)





I would not want uint32_t to be considered an "alias anything" type, but
I have occasionally seen such types used for memory store backings.  It
is perhaps worth considering defining "byte type" as "non-atomic
character type, [u]int8_t (if they exist), or other
implementation-defined types".


This could make sense, the question 

Re: aliasing

2024-03-18 Thread Richard Biener via Gcc
On Mon, Mar 18, 2024 at 12:56 PM Martin Uecker  wrote:
>
> Am Montag, dem 18.03.2024 um 11:55 +0100 schrieb Martin Uecker:
> > Am Montag, dem 18.03.2024 um 09:26 +0100 schrieb Richard Biener:
> > > On Mon, Mar 18, 2024 at 8:03 AM Martin Uecker  wrote:
> > >
> >
> > >
> > > Let me give you an complication example made valid in C++:
> > >
> > > struct B { float x; float y; };
> > > struct X { int n; char buf[8]; } x, y;
> > >
> > > void foo(struct B *b)
> > > {
> > >   memcpy (x.buf, b, sizeof (struct B)); // in C++:  new (x.buf) B (*b);
> >
> > Let's make it an explicit store for the moment
> > (should not make a difference though):
> >
> > *(struct B*)x.buf = *b;
> >
> > >   y = x; // (*)
> > > }
> > >
> > > What's the effective type of 'x' in the 'y = x' copy?
> >
> > Good point. The existing wording would take the declared
> > type of x as the effective type, but this may not be
> > what you are interested in. Let's assume that x has no declared
> > type but that it had effective type struct X before the
> > store to x.buf (because of an even earlier store to
> > x with type struct X).
> >
> > There is a general question how stores to subobjects
> > affect effective types and I do not think this is clear
> > even before this proposed change.
>
> Actually, I think this is not allowed because:
>
> "An object shall have its stored value accessed only by an
> lvalue expression that has one of the following types:
>
> — a type compatible with the effective type of the object,
> ...
> — an aggregate or union type that includes one of the
> aforementioned types among its members (including,
> recursively, a member of a subaggregate or contained union), or
>
> — a character type."
>
> ... and we would need to move "a character type" above
> in the list to make it defined.

So after

*(struct B*)x.buf = *b;

'x' cannot be used to access itself?  In particular also
an access to 'x.n' is affected by this?

You are right that the current wording of the standard doesn't
clarify any of this but this kind of storage abstraction is used
commonly in the embedded world when there's no runtime
library providing allocation.  And you said you want to make
the standard closer to implementation practice ...

Elsewhere when doing 'y = x' people refer to the wording that
aggregates are copied elementwise but it's not specified how
those elementwise accesses work - the lvalues are still of type
X here or are new lvalues implicitly formed and fall under the
other wordings?  Can I thus form an effective type of X by
storing it's subobjects at respective offsets (ignoring padding,
for example) and can I then use an lvalue of type 'X' to access
the whole aggregate?

Richard.

> Martin
>
>


Re: aliasing

2024-03-18 Thread Martin Uecker via Gcc
Am Montag, dem 18.03.2024 um 11:55 +0100 schrieb Martin Uecker:
> Am Montag, dem 18.03.2024 um 09:26 +0100 schrieb Richard Biener:
> > On Mon, Mar 18, 2024 at 8:03 AM Martin Uecker  wrote:
> > 
> 
> > 
> > Let me give you an complication example made valid in C++:
> > 
> > struct B { float x; float y; };
> > struct X { int n; char buf[8]; } x, y;
> > 
> > void foo(struct B *b)
> > {
> >   memcpy (x.buf, b, sizeof (struct B)); // in C++:  new (x.buf) B (*b);
> 
> Let's make it an explicit store for the moment
> (should not make a difference though):
> 
> *(struct B*)x.buf = *b;
> 
> >   y = x; // (*)
> > }
> > 
> > What's the effective type of 'x' in the 'y = x' copy? 
> 
> Good point. The existing wording would take the declared
> type of x as the effective type, but this may not be
> what you are interested in. Let's assume that x has no declared
> type but that it had effective type struct X before the
> store to x.buf (because of an even earlier store to 
> x with type struct X).
> 
> There is a general question how stores to subobjects
> affect effective types and I do not think this is clear
> even before this proposed change.

Actually, I think this is not allowed because:

"An object shall have its stored value accessed only by an
lvalue expression that has one of the following types:

— a type compatible with the effective type of the object,
...
— an aggregate or union type that includes one of the
aforementioned types among its members (including,
recursively, a member of a subaggregate or contained union), or

— a character type."

... and we would need to move "a character type" above
in the list to make it defined.

Martin




Re: aliasing

2024-03-18 Thread Martin Uecker via Gcc



Hi David,

Am Montag, dem 18.03.2024 um 10:00 +0100 schrieb David Brown:
> Hi,
> 
> I would very glad to see this change in the standards.
> 
> 
> Should "byte type" include all character types (signed, unsigned and 
> plain), or should it be restricted to "unsigned char" since that is the 
> "byte" type ?  (I think allowing all character types makes sense, but 
> only unsigned char is guaranteed to be suitable for general object 
> backing store.)

At the moment, the special type that can access all others are
all non-atomic character types.  So for symmetry reasons, it
seems that this is also what we want for backing store.

I am not sure what you mean by "only unsigned char". Are you talking
about C++?  "unsigned char" has no special role in C.

> 
> Should it also include "uint8_t" (if it exists) ?  "uint8_t" is often an 
> alias for "unsigned char", but it could be something different, like an 
> alias for __UINT8_TYPE__, or "unsigned int 
> __attribute__((mode(QImode)))", which is used in the AVR gcc port.

I think this might be a reason to not include it, as it could
affect aliasing analysis. At least, this would be a different
independent change to consider.

> 
> In my line of work - small-systems embedded development - it is common 
> to have "home-made" or specialised memory allocation systems rather than 
> relying on a generic heap.  This is, I think, some of the "existing 
> practice" that you are considering here - there is a "backing store" of 
> some sort that can be allocated and used as objects of a type other than 
> the declared type of the backing store.  While a simple unsigned char 
> array is a very common kind of backing store, there are others that are 
> used, and it would be good to be sure of the correctness guarantees for 
> these.  Possibilities that I have seen include:
> 
> unsigned char heap1[N];
> 
> uint8_t heap2[N];
> 
> union {
>   double dummy_for_alignment;
>   char heap[N];
> } heap3;
> 
> struct {
>   uint32_t capacity;
>   uint8_t * p_next_free;
>   uint8_t heap[N];
> } heap4;
> 
> uint32_t heap5[N];
> 
> Apart from this last one, if "uint8_t" is guaranteed to be a "byte 
> type", then I believe your wording means that these unions and structs 
> would also work as "byte arrays".  But it might be useful to add a 
> footnote clarifying that.
> 

I need to think about this. 

> (It is also not uncommon to have the backing space allocated by the 
> linker, but then it falls under the existing "no declared type" case.)

Yes, although with the change we would make the "no declared type" also 
be byte arrays, so there is then simply no difference anymore.

> 
> 
> I would not want uint32_t to be considered an "alias anything" type, but 
> I have occasionally seen such types used for memory store backings.  It 
> is perhaps worth considering defining "byte type" as "non-atomic 
> character type, [u]int8_t (if they exist), or other 
> implementation-defined types".

This could make sense, the question is whether we want to encourage
the use of other types for this use case, as this would then not
be portable.

Are there important reason for not using "unsigned char" ?

> 
> Some other compilers might guarantee not to do type-based alias analysis 
> and thus view all types as "byte types" in this way.  For gcc, there 
> could be a kind of reverse "may_alias" type attribute to create such types.
> 
> 
> 
> There are a number of other features that could make allocation 
> functions more efficient and safer in use, and which could be ideally be 
> standardised in the C standards or at least added as gcc extensions, but 
> I think that's more than you are looking for here!

It is possible to submit proposal to WG14.

Martin


> 
> David
> 
> 
> 
> On 18/03/2024 08:03, Martin Uecker via Gcc wrote:
> > 
> > Hi,
> > 
> > can you please take a quick look at this? This is intended to align
> > the C standard with existing practice with respect to aliasing by
> > removing the special rules for "objects with no declared type" and
> > making it fully symmetric and only based on types with non-atomic
> > character types being able to alias everything.
> > 
> > 
> > Unrelated to this change, I have another question:  I wonder if GCC
> > (or any other compiler) actually exploits the " or is copied as an
> > array of  byte type, " rule to  make assumptions about the effective
> > types of the target array? I know compilers do this work memcpy...
> > Maybe also if a loop is transformed to memcpy?
> > 
> > Martin
> > 
> > 
> > Add the following definition after 3.5, paragraph 2:
> > 
> > byte array
> > object having either no declared type or an array of objects declared with 
> > a byte type
> > 
> > byte type
> > non-atomic character type
> > 
> > Modify 6.5,paragraph 6:
> > The effective type of an object that is not a byte array, for an access to 
> > its
> > stored value, is the declared type of the object.97) If a value is
> > stored into a byte array through 

Re: aliasing

2024-03-18 Thread Martin Uecker via Gcc
Am Montag, dem 18.03.2024 um 09:26 +0100 schrieb Richard Biener:
> On Mon, Mar 18, 2024 at 8:03 AM Martin Uecker  wrote:
> > 
> > 
> > Hi,
> > 
> > can you please take a quick look at this? This is intended to align
> > the C standard with existing practice with respect to aliasing by
> > removing the special rules for "objects with no declared type" and
> > making it fully symmetric and only based on types with non-atomic
> > character types being able to alias everything.
> > 
> > 
> > Unrelated to this change, I have another question:  I wonder if GCC
> > (or any other compiler) actually exploits the " or is copied as an
> > array of  byte type, " rule to  make assumptions about the effective
> > types of the target array?
> 
> We do not make assumptions about this anymore.  We did in the
> past (might be a distant past) transform say
> 
> struct X { int i; float f; } a, b;
> 
> void foo ()
> {
>   __builtin_memcpy (, , sizeof (struct X));
> }
> 
> into
> 
>   a = b;
> 
> which has an lvalue of type struct X.  But this assumed b's effective
> type was X.  Nowadays we treat the copy as using alias set zero.
> That effectively means the destination gets its effective type "cleared"
> (all subsequent accesses are valid to access storage with the effective
> type of a byte array).

Ok, thanks!  I wonder whether we should remove this special rule
from the standard.  I mostly worried about the "copied as an
array of  byte type" wording which seems difficult to precisely
define.

> 
> > I know compilers do this work memcpy...
> > Maybe also if a loop is transformed to memcpy?
> 
> We currently do not preserve the original effective type of the destination
> (or the effective type used to access the source) when doing this.  With
> some tricks we could (we also lose aligment guarantees of the original
> accesses).
> 
> > Martin
> > 
> > 
> > Add the following definition after 3.5, paragraph 2:
> > 
> > byte array
> > object having either no declared type or an array of objects declared with 
> > a byte type
> > 
> > byte type
> > non-atomic character type

This essentially becomes the "alias anything" type.

> > 
> > Modify 6.5,paragraph 6:
> > The effective type of an object that is not a byte array, for an access to 
> > its
> > stored value, is the declared type of the object.97) If a value is
> > stored into a byte array through an lvalue having a byte type, then
> > the type of the lvalue becomes the effective type of the object for that
> > access and for subsequent accesses that do not modify the stored value.
> > If a value is copied into a byte array using memcpy or memmove, or is
> > copied as an array of byte type, then the effective type of the
> > modified object for that access and for subsequent accesses that do not
> > modify the value is the effective type of the object from which the
> > value is copied, if it has one. For all other accesses to a byte array,
> > the effective type of the object is simply the type of the lvalue used
> > for the access.
> 
> What's the purpose of this change?  To me this reads more confusing and
> complicated than what I find in the c23 draft from April last year.

Note that C23 has been finalized. This change is proposed for the
revision after c23. 

> 
> I'll note that GCC does not take advantage of "The effective type of an
> object for an access to its stored value is the declard type of the object",
> instead it always relies on the type of the lvalue (treating non-atomic
> character types specially, as well as treating all string ops like memcpy
> or strcpy as using a character type for the access) and the effective type
> of the object for that access and for subsequent accesses that do not
> modify the stored value always becomes that of the lvalue type used for
> the access.

Understood.

> 
> Let me give you an complication example made valid in C++:
> 
> struct B { float x; float y; };
> struct X { int n; char buf[8]; } x, y;
> 
> void foo(struct B *b)
> {
>   memcpy (x.buf, b, sizeof (struct B)); // in C++:  new (x.buf) B (*b);

Let's make it an explicit store for the moment
(should not make a difference though):

*(struct B*)x.buf = *b;

>   y = x; // (*)
> }
> 
> What's the effective type of 'x' in the 'y = x' copy? 

Good point. The existing wording would take the declared
type of x as the effective type, but this may not be
what you are interested in. Let's assume that x has no declared
type but that it had effective type struct X before the
store to x.buf (because of an even earlier store to 
x with type struct X).

There is a general question how stores to subobjects
affect effective types and I do not think this is clear
even before this proposed change.


>  With your new
> wording, does 'B' transfer to x.buf with memcpy?  

Yes, it would. At least this is the intention.

Note that this would currently be undefined behavior
because x.buf has a declared type. So this is main
thing we want to change, i.e. making this defined.

> What's the

Re: aliasing

2024-03-18 Thread Jonathan Wakely via Gcc
On Mon, 18 Mar 2024 at 09:01, David Brown wrote:
> Should it also include "uint8_t" (if it exists) ?  "uint8_t" is often an
> alias for "unsigned char", but it could be something different, like an
> alias for __UINT8_TYPE__, or "unsigned int
> __attribute__((mode(QImode)))", which is used in the AVR gcc port.

N.B. __UINT8_TYPE__ is not a type, it's just a macro that expands to
something else (like unsigned char).


Re: aliasing

2024-03-18 Thread David Brown

Hi,

I would very glad to see this change in the standards.


Should "byte type" include all character types (signed, unsigned and 
plain), or should it be restricted to "unsigned char" since that is the 
"byte" type ?  (I think allowing all character types makes sense, but 
only unsigned char is guaranteed to be suitable for general object 
backing store.)


Should it also include "uint8_t" (if it exists) ?  "uint8_t" is often an 
alias for "unsigned char", but it could be something different, like an 
alias for __UINT8_TYPE__, or "unsigned int 
__attribute__((mode(QImode)))", which is used in the AVR gcc port.


In my line of work - small-systems embedded development - it is common 
to have "home-made" or specialised memory allocation systems rather than 
relying on a generic heap.  This is, I think, some of the "existing 
practice" that you are considering here - there is a "backing store" of 
some sort that can be allocated and used as objects of a type other than 
the declared type of the backing store.  While a simple unsigned char 
array is a very common kind of backing store, there are others that are 
used, and it would be good to be sure of the correctness guarantees for 
these.  Possibilities that I have seen include:


unsigned char heap1[N];

uint8_t heap2[N];

union {
double dummy_for_alignment;
char heap[N];
} heap3;

struct {
uint32_t capacity;
uint8_t * p_next_free;
uint8_t heap[N];
} heap4;

uint32_t heap5[N];

Apart from this last one, if "uint8_t" is guaranteed to be a "byte 
type", then I believe your wording means that these unions and structs 
would also work as "byte arrays".  But it might be useful to add a 
footnote clarifying that.


(It is also not uncommon to have the backing space allocated by the 
linker, but then it falls under the existing "no declared type" case.)



I would not want uint32_t to be considered an "alias anything" type, but 
I have occasionally seen such types used for memory store backings.  It 
is perhaps worth considering defining "byte type" as "non-atomic 
character type, [u]int8_t (if they exist), or other 
implementation-defined types".


Some other compilers might guarantee not to do type-based alias analysis 
and thus view all types as "byte types" in this way.  For gcc, there 
could be a kind of reverse "may_alias" type attribute to create such types.




There are a number of other features that could make allocation 
functions more efficient and safer in use, and which could be ideally be 
standardised in the C standards or at least added as gcc extensions, but 
I think that's more than you are looking for here!


David



On 18/03/2024 08:03, Martin Uecker via Gcc wrote:


Hi,

can you please take a quick look at this? This is intended to align
the C standard with existing practice with respect to aliasing by
removing the special rules for "objects with no declared type" and
making it fully symmetric and only based on types with non-atomic
character types being able to alias everything.


Unrelated to this change, I have another question:  I wonder if GCC
(or any other compiler) actually exploits the " or is copied as an
array of  byte type, " rule to  make assumptions about the effective
types of the target array? I know compilers do this work memcpy...
Maybe also if a loop is transformed to memcpy?

Martin


Add the following definition after 3.5, paragraph 2:

byte array
object having either no declared type or an array of objects declared with a 
byte type

byte type
non-atomic character type

Modify 6.5,paragraph 6:
The effective type of an object that is not a byte array, for an access to its
stored value, is the declared type of the object.97) If a value is
stored into a byte array through an lvalue having a byte type, then
the type of the lvalue becomes the effective type of the object for that
access and for subsequent accesses that do not modify the stored value.
If a value is copied into a byte array using memcpy or memmove, or is
copied as an array of byte type, then the effective type of the
modified object for that access and for subsequent accesses that do not
modify the value is the effective type of the object from which the
value is copied, if it has one. For all other accesses to a byte array,
the effective type of the object is simply the type of the lvalue used
for the access.

https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3230.pdf







Re: aliasing

2024-03-18 Thread Richard Biener via Gcc
On Mon, Mar 18, 2024 at 8:03 AM Martin Uecker  wrote:
>
>
> Hi,
>
> can you please take a quick look at this? This is intended to align
> the C standard with existing practice with respect to aliasing by
> removing the special rules for "objects with no declared type" and
> making it fully symmetric and only based on types with non-atomic
> character types being able to alias everything.
>
>
> Unrelated to this change, I have another question:  I wonder if GCC
> (or any other compiler) actually exploits the " or is copied as an
> array of  byte type, " rule to  make assumptions about the effective
> types of the target array?

We do not make assumptions about this anymore.  We did in the
past (might be a distant past) transform say

struct X { int i; float f; } a, b;

void foo ()
{
  __builtin_memcpy (, , sizeof (struct X));
}

into

  a = b;

which has an lvalue of type struct X.  But this assumed b's effective
type was X.  Nowadays we treat the copy as using alias set zero.
That effectively means the destination gets its effective type "cleared"
(all subsequent accesses are valid to access storage with the effective
type of a byte array).

> I know compilers do this work memcpy...
> Maybe also if a loop is transformed to memcpy?

We currently do not preserve the original effective type of the destination
(or the effective type used to access the source) when doing this.  With
some tricks we could (we also lose aligment guarantees of the original
accesses).

> Martin
>
>
> Add the following definition after 3.5, paragraph 2:
>
> byte array
> object having either no declared type or an array of objects declared with a 
> byte type
>
> byte type
> non-atomic character type
>
> Modify 6.5,paragraph 6:
> The effective type of an object that is not a byte array, for an access to its
> stored value, is the declared type of the object.97) If a value is
> stored into a byte array through an lvalue having a byte type, then
> the type of the lvalue becomes the effective type of the object for that
> access and for subsequent accesses that do not modify the stored value.
> If a value is copied into a byte array using memcpy or memmove, or is
> copied as an array of byte type, then the effective type of the
> modified object for that access and for subsequent accesses that do not
> modify the value is the effective type of the object from which the
> value is copied, if it has one. For all other accesses to a byte array,
> the effective type of the object is simply the type of the lvalue used
> for the access.

What's the purpose of this change?  To me this reads more confusing and
complicated than what I find in the c23 draft from April last year.

I'll note that GCC does not take advantage of "The effective type of an
object for an access to its stored value is the declard type of the object",
instead it always relies on the type of the lvalue (treating non-atomic
character types specially, as well as treating all string ops like memcpy
or strcpy as using a character type for the access) and the effective type
of the object for that access and for subsequent accesses that do not
modify the stored value always becomes that of the lvalue type used for
the access.

Let me give you an complication example made valid in C++:

struct B { float x; float y; };
struct X { int n; char buf[8]; } x, y;

void foo(struct B *b)
{
  memcpy (x.buf, b, sizeof (struct B)); // in C++:  new (x.buf) B (*b);
  y = x; // (*)
}

What's the effective type of 'x' in the 'y = x' copy?  With your new
wording, does 'B' transfer to x.buf with memcpy?  What's the
frankenstein effective type of 'x' then?  What's the effective type
of 'y' after the copy?  Can an lvalue of type 'B' access y.buf?

Richard.

> https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3230.pdf
>
>


Re: Aliasing rules for unannotated SYMBOL_REFs

2020-02-03 Thread Richard Sandiford
Thanks for the answer, and sorry for slow follow-up.  Got distracted by
other things...

Jeff Law  writes:
> On Sat, 2020-01-25 at 09:31 +, Richard Sandiford wrote:
>> TL;DR: if we have two bare SYMBOL_REFs X and Y, neither of which have an
>> associated source-level decl and neither of which are in an anchor block:
>> 
>> (Q1) can a valid byte access at X+C alias a valid byte access at Y+C?
>> 
>> (Q2) can a valid byte access at X+C1 alias a valid byte access at Y+C2,
>>  C1 != C2?
>> 
>> Also:
>> 
>> (Q3) If X has a source-level decl and Y doesn't, and neither of them are
>>  in an anchor block, can valid accesses based on X alias valid accesses
>>  based on Y?
> So what are the  cases where Y won't have a source level decl but we
> have a decl in RTL?  anchors, other cases? 

Not really sure why I wrote "source-level" TBH.  I was really talking
about any symbol that has a SYMBOL_REF_DECL.

I think there are three "interesting" cases:

- symbols with a SYMBOL_REF_DECL
- anchor symbols
- bare symbols (i.e. everything else)

Bare symbols are hopefully rare these days.

>> (well, OK, that wasn't too short either...)
> I would have thought the answer would be "no" across the board.  But
> the code clearly indicates otherwise.
>
> Interposition clearly complicates things as do explicit aliases though.
>
>> This part seems obvious enough.  But then, apart from the special case of
>> forced address alignment, we use an offset-based check even for cmp==-1:
>> 
>>   /* Assume a potential overlap for symbolic addresses that went
>>   through alignment adjustments (i.e., that have negative
>>   sizes), because we can't know how far they are from each
>>   other.  */
>>   if (maybe_lt (xsize, 0) || maybe_lt (ysize, 0))
>>  return -1;
>>   /* If decls are different or we know by offsets that there is no 
>> overlap,
>>   we win.  */
>>   if (!cmp || !offset_overlap_p (c, xsize, ysize))
>>  return 0;
>> 
>> So we seem to be taking cmp==-1 to mean that although we don't know
>> the relationship between the symbols, it must be the case that either
>> (a) the symbols are equal (e.g. via aliasing) or (b) the accesses are
>> to non-overlapping objects.  In other words, one of the situations
>> described by cmp==1 or cmp==0 must be true, but we don't know which
>> at compile time.
> Right.  That was the conclusion I came to.  If a  SYMBOL_REF has an
> alias, the alias must have the same value as the SYMBOL_REF.  So their
> either equal or there's no valid case for overlap.
>
>> 
>> This means that in practice, the answer to (Q1) appears to be "yes"
>> but the answer to (Q2) appears to be "no".
> That would be my understanding once aliases/interpositioning come into
> play.
>
>> 
>> This somewhat contradicts:
>> 
>>   /* In general we assume that memory locations pointed to by different 
>> labels
>>  may overlap in undefined ways.  */
>>   return -1;
>> 
>> at the end of compare_base_symbol_refs, which seems to be saying
>> that the answer to (Q2) ought to be "yes" instead.  Which is right?
> I'm not sure how we could get to yes in that case.  A symbol alias or
> interposition ultimately still results in two symbols having the same
> final address.  Thus for a byte access if C1 != C2, then we can't have
> an overlap.

I think it's handling cases in which one symbol is a bare symbol (has no
decl and isn't an anchor).  I assumed the idea was that we could have a
decl-less SYMBOL_REF for the start of a particular section, or things
like that.

>> In PR92294 we have a symbol X at ANCHOR+OFFSET that's preemptible.
>> Under the (Q1)==yes/(Q2)==no assumption, cmp==-1 means that either
>> (a) X = ANCHOR+OFFSET or (b) X and ANCHOR reference non-overlapping
>> objects.  So we should take the offset into account when doing:
>> 
>>   if (!cmp || !offset_overlap_p (c, xsize, ysize))
>>  return 0;
>> 
>> Let's call this FIX1.
> So this is a really interesting wrinkle.  Doesn't this change Q2 to a
> yes?  In particular it changes the "invariant" that the symbols have
> the same address in the event of an symbol alias or interposition.  Of
> course one could ask the question of whether or not we should handle
> cases with anchors specially.

This wouldn't come under Q2, since that was about symbols that aren't in
an anchor block.  I think it just means we need to generalise the three
cases that don't involve bare symbols from:

  - known equal
  - independent
  - equal or independent

to:

  - known distance apart
  - independent
  - known distance apart or independent

It's fortunate that anchors themselves can't be interposed. :-)

>> But that then brings us to: why does memrefs_conflict_p return -1
>> when one symbol X has a decl and the other symbol Y doesn't, and neither
>> of them are block symbols?  Is the answer to (Q3) that we allow equality
>> but not overlap here too?  E.g. a linker script could define Y to X but
>> not to a region that contains X at a nonzero 

Re: Aliasing rules for unannotated SYMBOL_REFs

2020-01-27 Thread Jeff Law
On Sat, 2020-01-25 at 09:31 +, Richard Sandiford wrote:
> TL;DR: if we have two bare SYMBOL_REFs X and Y, neither of which have an
> associated source-level decl and neither of which are in an anchor block:
> 
> (Q1) can a valid byte access at X+C alias a valid byte access at Y+C?
> 
> (Q2) can a valid byte access at X+C1 alias a valid byte access at Y+C2,
>  C1 != C2?
> 
> Also:
> 
> (Q3) If X has a source-level decl and Y doesn't, and neither of them are
>  in an anchor block, can valid accesses based on X alias valid accesses
>  based on Y?
So what are the  cases where Y won't have a source level decl but we
have a decl in RTL?  anchors, other cases? 


> 
> (well, OK, that wasn't too short either...)
I would have thought the answer would be "no" across the board.  But
the code clearly indicates otherwise.

Interposition clearly complicates things as do explicit aliases though.



> 
> This part seems obvious enough.  But then, apart from the special case of
> forced address alignment, we use an offset-based check even for cmp==-1:
> 
>   /* Assume a potential overlap for symbolic addresses that went
>through alignment adjustments (i.e., that have negative
>sizes), because we can't know how far they are from each
>other.  */
>   if (maybe_lt (xsize, 0) || maybe_lt (ysize, 0))
>   return -1;
>   /* If decls are different or we know by offsets that there is no 
> overlap,
>we win.  */
>   if (!cmp || !offset_overlap_p (c, xsize, ysize))
>   return 0;
> 
> So we seem to be taking cmp==-1 to mean that although we don't know
> the relationship between the symbols, it must be the case that either
> (a) the symbols are equal (e.g. via aliasing) or (b) the accesses are
> to non-overlapping objects.  In other words, one of the situations
> described by cmp==1 or cmp==0 must be true, but we don't know which
> at compile time.
Right.  That was the conclusion I came to.  If a  SYMBOL_REF has an
alias, the alias must have the same value as the SYMBOL_REF.  So their
either equal or there's no valid case for overlap.

> 
> This means that in practice, the answer to (Q1) appears to be "yes"
> but the answer to (Q2) appears to be "no".
That would be my understanding once aliases/interpositioning come into
play.

> 
> This somewhat contradicts:
> 
>   /* In general we assume that memory locations pointed to by different labels
>  may overlap in undefined ways.  */
>   return -1;
> 
> at the end of compare_base_symbol_refs, which seems to be saying
> that the answer to (Q2) ought to be "yes" instead.  Which is right?
I'm not sure how we could get to yes in that case.  A symbol alias or
interposition ultimately still results in two symbols having the same
final address.  Thus for a byte access if C1 != C2, then we can't have
an overlap.


> 
> In PR92294 we have a symbol X at ANCHOR+OFFSET that's preemptible.
> Under the (Q1)==yes/(Q2)==no assumption, cmp==-1 means that either
> (a) X = ANCHOR+OFFSET or (b) X and ANCHOR reference non-overlapping
> objects.  So we should take the offset into account when doing:
> 
>   if (!cmp || !offset_overlap_p (c, xsize, ysize))
>   return 0;
> 
> Let's call this FIX1.
So this is a really interesting wrinkle.  Doesn't this change Q2 to a
yes?  In particular it changes the "invariant" that the symbols have
the same address in the event of an symbol alias or interposition.  Of
course one could ask the question of whether or not we should handle
cases with anchors specially.


> 
> But that then brings us to: why does memrefs_conflict_p return -1
> when one symbol X has a decl and the other symbol Y doesn't, and neither
> of them are block symbols?  Is the answer to (Q3) that we allow equality
> but not overlap here too?  E.g. a linker script could define Y to X but
> not to a region that contains X at a nonzero offset?
Does digging into the history provide any insights here?

I'm not sure given the issues you've introduced if I could actually
fill out the matrix of answers without more underlying information. 
ie, when can we get symbols without source level decls, 
anchors+interposition issues, etc.

Jeff
> 



Re: aliasing between internal zero-length-arrays and other members

2018-06-05 Thread Jakub Jelinek
On Tue, Jun 05, 2018 at 01:38:21PM +0200, Richard Biener wrote:
> On Tue, Jun 5, 2018 at 1:39 AM Martin Sebor  wrote:
> >
> > GCC silently (without -Wpedantic) accepts declarations of zero
> > length arrays that are followed by other members in the same
> > struct, such as in:
> >
> >struct A { char a, b[0], c; };
> >
> > Is it intended that accesses to elements of such arrays that
> > alias other members be well-defined?
> 
> The middle-end assumes that fields in a structure do not overlap.
> For overlaps you have to use a union.
> 
> In C++ I guess the rule that sizeof() of anything is at least 1 saves
> you here so IMHO this is a C FE bug and we should probably simply
> reject non-trailing empty arrays.

The above is not flexible array member, but zero sized array, that is just
fine anywhere, at the toplevel as well as anywhere inside of the struct.
And yes, accessing it out of bounds is invalid, unless it is flexible-like,
i.e. at the end of the struct.

Jakub


Re: aliasing between internal zero-length-arrays and other members

2018-06-05 Thread Richard Biener
On Tue, Jun 5, 2018 at 1:39 AM Martin Sebor  wrote:
>
> GCC silently (without -Wpedantic) accepts declarations of zero
> length arrays that are followed by other members in the same
> struct, such as in:
>
>struct A { char a, b[0], c; };
>
> Is it intended that accesses to elements of such arrays that
> alias other members be well-defined?

The middle-end assumes that fields in a structure do not overlap.
For overlaps you have to use a union.

In C++ I guess the rule that sizeof() of anything is at least 1 saves
you here so IMHO this is a C FE bug and we should probably simply
reject non-trailing empty arrays.

Note since b has size zero there isn't any real overlap, so ...

> In my tests, GCC assumes that neither read nor write accesses
> to elements of internal zero-length arrays alias other members,
> so assuming those aren't bugs I wonder if the documentation
> should be updated to make that clear and a warning added for
> such declarations (and perhaps also accesses).
>
> For example, the test in the following function is eliminated,
> implying that GCC assumes that the access to p->b does not modify
> p->c, even though with i set to 0 it would:
>
>void f (struct A *p, int i)
>{
>  int x = p->c;
>  p->b[i] = 1;
>  if (x != p->c)
>__builtin_abort ();

... your testcase simply invokes undefined behavior by accessing
b out of bounds.

Richard.

>}
>
> Martin


Re: Aliasing 'this' in a C++ constructor

2018-05-18 Thread Nathan Sidwell

On 05/18/2018 10:12 AM, Jakub Jelinek wrote:


Or is it invalid to dereference s before it has been constructed?


Yes, Marc found:

"During the construction of an object, if the value of the object or any 
of its subobjects is accessed through a glvalue that is not obtained, 
directly or indirectly, from the constructor’s this pointer, the value 
of the object or subobject thus obtained is unspecified."


nathan
--
Nathan Sidwell


Re: Aliasing 'this' in a C++ constructor

2018-05-18 Thread Jakub Jelinek
On Fri, May 18, 2018 at 09:57:42AM -0400, Jason Merrill wrote:
> > But what is invalid on:
> > struct S { int foo (S *); int a; } s { 2 };
> > int S::foo (S *x)
> > {
> >   int b = this->a;
> >   x->a = 5;
> >   b += this->a;
> >   return b;
> > }
> > int main ()
> > {
> >   if (s.foo () != 7)
> > abort ();
> > }
> >
> > I think if you make this a restrict pointer, this will be miscompiled.
> 
> We're only talking about the 'this' pointer in a constructor, not a
> normal member function like foo.

Oops, sorry, missed that.
Make it then:

struct S { S (S *); int a; } s ();
S::S (S *s)
{
  this->a = 2;
  s->a = 3;
  if (this->a != 3)
abort ();
}

Or is it invalid to dereference s before it has been constructed?

Jakub


Re: Aliasing 'this' in a C++ constructor

2018-05-18 Thread Nathan Sidwell

On 05/18/2018 09:21 AM, Marc Glisse wrote:

what about comparisons to this?  I thought restrict implied such a 
comparison was 'never the same'?


ie. if the ctor was:
 selfie (selfie *ptr) : me (ptr==this ? 0 : ptr) {}


It is tempting to think so, but I don't see any language to that effect.


Ok then, thanks for the explanations.

nathan

--
Nathan Sidwell


Re: Aliasing 'this' in a C++ constructor

2018-05-18 Thread Marc Glisse

On Fri, 18 May 2018, Jakub Jelinek wrote:


On Fri, May 18, 2018 at 09:02:08AM -0400, Nathan Sidwell wrote:

On 05/18/2018 08:53 AM, Marc Glisse wrote:


As long as you do not dereference ptr in the constructor, that shouldn't
contradict 'restrict'. The PR gives this quote from the standard:

"During the construction of an object, if the value of the object or any
of its subobjects is accessed through a glvalue that is not obtained,
directly or indirectly, from the constructor’s this pointer, the value
of the object or subobject thus obtained is unspecified."

which reads quite close to saying that 'this' is restrict.

Indeed it is, thanks.

what about comparisons to this?  I thought restrict implied such a
comparison was 'never the same'?

ie. if the ctor was:
  selfie (selfie *ptr) : me (ptr==this ? 0 : ptr) {}


But what is invalid on:
struct S { int foo (S *); int a; } s { 2 };
int S::foo (S *x)
{
 int b = this->a;
 x->a = 5;
 b += this->a;
 return b;
}
int main ()
{
 if (s.foo () != 7)
   abort ();
}

I think if you make this a restrict pointer, this will be miscompiled.


The patch is only for constructors, not for all member functions.

--
Marc Glisse


Re: Aliasing 'this' in a C++ constructor

2018-05-18 Thread Jason Merrill
On Fri, May 18, 2018 at 9:53 AM, Jakub Jelinek  wrote:
> On Fri, May 18, 2018 at 09:02:08AM -0400, Nathan Sidwell wrote:
>> On 05/18/2018 08:53 AM, Marc Glisse wrote:
>>
>> > As long as you do not dereference ptr in the constructor, that shouldn't
>> > contradict 'restrict'. The PR gives this quote from the standard:
>> >
>> > "During the construction of an object, if the value of the object or any
>> > of its subobjects is accessed through a glvalue that is not obtained,
>> > directly or indirectly, from the constructor’s this pointer, the value
>> > of the object or subobject thus obtained is unspecified."
>> >
>> > which reads quite close to saying that 'this' is restrict.
>> Indeed it is, thanks.
>>
>> what about comparisons to this?  I thought restrict implied such a
>> comparison was 'never the same'?
>>
>> ie. if the ctor was:
>>   selfie (selfie *ptr) : me (ptr==this ? 0 : ptr) {}
>
> But what is invalid on:
> struct S { int foo (S *); int a; } s { 2 };
> int S::foo (S *x)
> {
>   int b = this->a;
>   x->a = 5;
>   b += this->a;
>   return b;
> }
> int main ()
> {
>   if (s.foo () != 7)
> abort ();
> }
>
> I think if you make this a restrict pointer, this will be miscompiled.

We're only talking about the 'this' pointer in a constructor, not a
normal member function like foo.

Jason


Re: Aliasing 'this' in a C++ constructor

2018-05-18 Thread Jakub Jelinek
On Fri, May 18, 2018 at 09:02:08AM -0400, Nathan Sidwell wrote:
> On 05/18/2018 08:53 AM, Marc Glisse wrote:
> 
> > As long as you do not dereference ptr in the constructor, that shouldn't
> > contradict 'restrict'. The PR gives this quote from the standard:
> > 
> > "During the construction of an object, if the value of the object or any
> > of its subobjects is accessed through a glvalue that is not obtained,
> > directly or indirectly, from the constructor’s this pointer, the value
> > of the object or subobject thus obtained is unspecified."
> > 
> > which reads quite close to saying that 'this' is restrict.
> Indeed it is, thanks.
> 
> what about comparisons to this?  I thought restrict implied such a
> comparison was 'never the same'?
> 
> ie. if the ctor was:
>   selfie (selfie *ptr) : me (ptr==this ? 0 : ptr) {}

But what is invalid on:
struct S { int foo (S *); int a; } s { 2 };
int S::foo (S *x)
{
  int b = this->a;
  x->a = 5;
  b += this->a;
  return b;
}
int main ()
{
  if (s.foo () != 7)
abort ();
}

I think if you make this a restrict pointer, this will be miscompiled.

Jakub


Re: Aliasing 'this' in a C++ constructor

2018-05-18 Thread Jason Merrill
On Fri, May 18, 2018 at 8:34 AM, Marc Glisse  wrote:
> On Fri, 18 May 2018, Richard Biener wrote:
>> On Fri, May 18, 2018 at 2:25 PM Marc Glisse  wrote:
>>
>>> this lets alias analysis handle the implicit 'this' parameter in C++
>>> constructors as if it was restrict.
>>
>> OK.  Please give C++ folks the chance to chime in on the semantics.
>
> (Cc: said C++ folks)

I suppose it's technically valid to write something like

void f()
{
  A a;
  a.~A();
  new () A(a);
}

but that's pretty pathological and useless, so I'm not concerned about
messing with it.

Jason


Re: Aliasing 'this' in a C++ constructor

2018-05-18 Thread Marc Glisse

On Fri, 18 May 2018, Nathan Sidwell wrote:


On 05/18/2018 08:53 AM, Marc Glisse wrote:

As long as you do not dereference ptr in the constructor, that shouldn't 
contradict 'restrict'. The PR gives this quote from the standard:


"During the construction of an object, if the value of the object or any of 
its subobjects is accessed through a glvalue that is not obtained, directly 
or indirectly, from the constructor’s this pointer, the value of the object 
or subobject thus obtained is unspecified."


which reads quite close to saying that 'this' is restrict.

Indeed it is, thanks.

what about comparisons to this?  I thought restrict implied such a comparison 
was 'never the same'?


ie. if the ctor was:
 selfie (selfie *ptr) : me (ptr==this ? 0 : ptr) {}


It is tempting to think so, but I don't see any language to that effect.

C11 has this example:

void h(int n, int * restrict p, int * restrict q, int * restrict r)
{
  int i;
  for (i = 0; i < n; i++)
p[i] = q[i] + r[i];
}
h(100, a, b, b) --> valid (because no write)

We have https://gcc.gnu.org/bugzilla/show_bug.cgi?id=13962 about using 
alias information to simplify pointer comparisons.


--
Marc Glisse


Re: Aliasing 'this' in a C++ constructor

2018-05-18 Thread Nathan Sidwell

On 05/18/2018 08:53 AM, Marc Glisse wrote:

As long as you do not dereference ptr in the constructor, that shouldn't 
contradict 'restrict'. The PR gives this quote from the standard:


"During the construction of an object, if the value of the object or any 
of its subobjects is accessed through a glvalue that is not obtained, 
directly or indirectly, from the constructor’s this pointer, the value 
of the object or subobject thus obtained is unspecified."


which reads quite close to saying that 'this' is restrict.

Indeed it is, thanks.

what about comparisons to this?  I thought restrict implied such a 
comparison was 'never the same'?


ie. if the ctor was:
  selfie (selfie *ptr) : me (ptr==this ? 0 : ptr) {}

nathan

--
Nathan Sidwell


Re: Aliasing 'this' in a C++ constructor

2018-05-18 Thread Marc Glisse

On Fri, 18 May 2018, Nathan Sidwell wrote:


On 05/18/2018 08:34 AM, Marc Glisse wrote:

(Cc: said C++ folks)

On Fri, 18 May 2018, Richard Biener wrote:


On Fri, May 18, 2018 at 2:25 PM Marc Glisse  wrote:


Hello,



this lets alias analysis handle the implicit 'this' parameter in C++
constructors as if it was restrict.


I think the following bizarre code is nevertheless well-formed:

struct selfie {
 selfie *me;
 selfie (selfie *ptr) : me (ptr) {}
};

selfie bob ();


As long as you do not dereference ptr in the constructor, that shouldn't 
contradict 'restrict'. The PR gives this quote from the standard:


"During the construction of an object, if the value of the object or any 
of its subobjects is accessed through a glvalue that is not obtained, 
directly or indirectly, from the constructor’s this pointer, the value of 
the object or subobject thus obtained is unspecified."


which reads quite close to saying that 'this' is restrict.

--
Marc Glisse


Re: Aliasing 'this' in a C++ constructor

2018-05-18 Thread Nathan Sidwell

On 05/18/2018 08:34 AM, Marc Glisse wrote:

(Cc: said C++ folks)

On Fri, 18 May 2018, Richard Biener wrote:


On Fri, May 18, 2018 at 2:25 PM Marc Glisse  wrote:


Hello,



this lets alias analysis handle the implicit 'this' parameter in C++
constructors as if it was restrict.


I think the following bizarre code is nevertheless well-formed:

struct selfie {
  selfie *me;
  selfie (selfie *ptr) : me (ptr) {}
};

selfie bob ();

nathan

--
Nathan Sidwell


Re: Aliasing 'this' in a C++ constructor

2018-05-18 Thread Marc Glisse

(Cc: said C++ folks)

On Fri, 18 May 2018, Richard Biener wrote:


On Fri, May 18, 2018 at 2:25 PM Marc Glisse  wrote:


Hello,



this lets alias analysis handle the implicit 'this' parameter in C++
constructors as if it was restrict.



Bootstrap+regtest on powerpc64le-unknown-linux-gnu.


OK.  Please give C++ folks the chance to chime in on the semantics.

Thanks,
Richard.


2018-05-18  Marc Glisse  



 PR c++/82899
gcc/
 * tree-ssa-structalias.c (create_variable_info_for_1): Extra

argument.

 (intra_create_variable_infos): Handle C++ constructors.



gcc/testsuite/
 * g++.dg/pr82899.C: New testcase.



--
Marc Glisse


--
Marc Glisse


Re: Aliasing 'this' in a C++ constructor

2018-05-18 Thread Richard Biener
On Fri, May 18, 2018 at 2:25 PM Marc Glisse  wrote:

> Hello,

> this lets alias analysis handle the implicit 'this' parameter in C++
> constructors as if it was restrict.

> Bootstrap+regtest on powerpc64le-unknown-linux-gnu.

OK.  Please give C++ folks the chance to chime in on the semantics.

Thanks,
Richard.

> 2018-05-18  Marc Glisse  

>  PR c++/82899
> gcc/
>  * tree-ssa-structalias.c (create_variable_info_for_1): Extra
argument.
>  (intra_create_variable_infos): Handle C++ constructors.

> gcc/testsuite/
>  * g++.dg/pr82899.C: New testcase.

> --
> Marc Glisse


Re: Aliasing of arrays

2016-12-01 Thread Richard Biener
On Wed, Nov 30, 2016 at 6:29 PM, Joseph Myers  wrote:
> On Wed, 30 Nov 2016, Richard Biener wrote:
>
>> but that probably shouldn't apply to array types.  The idea is that
>> objects of the same type cannot overlap.  Maybe Joseph can clarify whether
>> and array object of known size really constitutes an object in that sense.
>
> This is one of the ambiguous cases about which objects are relevant for
> type-based aliasing rules.  I think it's best not to optimize this (that
> is, to treat the objects as being the individual array elements, so the
> arrays can overlap by an exact multiple of the element size - meaning the
> ultimate element size in the case of multidimensional arrays, so e.g. two
> int[4][5] arrays could be offset by one int from each other).

Ok, and I agree.

Testing the attached patch.

Richard.

2016-12-01  Richard Biener  

* tree-ssa-alias.c (indirect_refs_may_alias_p): Do not
treat arrays with same type as objects that cannot overlap.

* gcc.dg/torture/alias-2.c: New testcase.


p3
Description: Binary data


Re: Aliasing of arrays

2016-11-30 Thread Joseph Myers
On Wed, 30 Nov 2016, Richard Biener wrote:

> but that probably shouldn't apply to array types.  The idea is that
> objects of the same type cannot overlap.  Maybe Joseph can clarify whether
> and array object of known size really constitutes an object in that sense.

This is one of the ambiguous cases about which objects are relevant for 
type-based aliasing rules.  I think it's best not to optimize this (that 
is, to treat the objects as being the individual array elements, so the 
arrays can overlap by an exact multiple of the element size - meaning the 
ultimate element size in the case of multidimensional arrays, so e.g. two 
int[4][5] arrays could be offset by one int from each other).

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: Aliasing of arrays

2016-11-30 Thread Richard Biener
On Wed, Nov 30, 2016 at 2:19 PM, Alexander Cherepanov
 wrote:
> Hi!
>
> Pascal Cuoq communicated to me the following example:
>
> int ar1(int (*p)[3], int (*q)[3])
> {
>   (*p)[0] = 1;
>   (*q)[1] = 2;
>   return (*p)[0];
> }
>
> gcc of versions 4.9.2 and 7.0.0 20161129 optimize it with -O2 on the premise
> that elements with different indices don't alias:
>
>  :
>0:   c7 47 0c 01 00 00 00movl   $0x1,0xc(%rdi)
>7:   b8 01 00 00 00  mov$0x1,%eax
>c:   c7 46 10 02 00 00 00movl   $0x2,0x10(%rsi)
>   13:   c3  retq
>
> That's fine. But then I expect that gcc will also assume that arrays of
> different known lengths don't alias, i.e. that gcc will optimize this
> example:
>
> int ar2(int (*p)[8], int (*q)[7]) {
>   (*p)[3] = 1;
>   (*q)[3] = 2;
>   return (*p)[3];
> }
>
> But this is not optimized:
>
> 0020 :
>   20:   c7 47 0c 01 00 00 00movl   $0x1,0xc(%rdi)
>   27:   c7 46 0c 02 00 00 00movl   $0x2,0xc(%rsi)
>   2e:   8b 47 0cmov0xc(%rdi),%eax
>
> Is this behavior fully intentional, is the first example optimized too
> aggressively, is an optimization missed in the second example, or is the
> situation more complex?

I think it's a bug that we optimize ar1.  We run into

static bool
indirect_refs_may_alias_p (tree ref1 ATTRIBUTE_UNUSED, tree base1,
...
  /* If both references are through the same type, they do not alias
 if the accesses do not overlap.  This does extra disambiguation
 for mixed/pointer accesses but requires strict aliasing.  */
  if ((TREE_CODE (base1) != TARGET_MEM_REF
   || (!TMR_INDEX (base1) && !TMR_INDEX2 (base1)))
  && (TREE_CODE (base2) != TARGET_MEM_REF
  || (!TMR_INDEX (base2) && !TMR_INDEX2 (base2)))
  && same_type_for_tbaa (TREE_TYPE (base1), TREE_TYPE (ptrtype1)) == 1
  && same_type_for_tbaa (TREE_TYPE (base2), TREE_TYPE (ptrtype2)) == 1
  && same_type_for_tbaa (TREE_TYPE (ptrtype1),
 TREE_TYPE (ptrtype2)) == 1)
return ranges_overlap_p (offset1, max_size1, offset2, max_size2);

but that probably shouldn't apply to array types.  The idea is that
objects of the same type cannot overlap.  Maybe Joseph can clarify whether
and array object of known size really constitutes an object in that sense.

For the ar2 case the types are not equal and thus we do not use that trick
(and all array types irrespective of size get the same alias set, so
TBAA doesn't
apply here).

Richard.

>
> --
> Alexander Cherepanov


Re: Aliasing: look through pointer's def stmt

2013-11-05 Thread Marc Glisse

On Mon, 4 Nov 2013, Richard Biener wrote:


Marc Glisse marc.gli...@inria.fr wrote:

On Mon, 4 Nov 2013, Richard Biener wrote:


Well, host_integer_p (, 0) looks correct to me.  But ...


Ok, I'll put it back.


Er, actually host_integerp(, 0) can't work, because arithmetic on pointers 
is done with unsigned integers and for x86_64 it would return false for 
(size_t)-4. So it would need to be host_integerp(, 1) instead. I still 
test TREE_CODE == INTEGER_CST in the current patch, let me know if you 
want host_integerp(, 1).



Index: gcc/tree-ssa-alias.h
===
--- gcc/tree-ssa-alias.h(revision 204267)
+++ gcc/tree-ssa-alias.h(working copy)
@@ -139,30 +139,30 @@ extern void pt_solution_set_var (struct

 extern void dump_pta_stats (FILE *);

 extern GTY(()) struct pt_solution ipa_escaped_pt;

 /* Return true, if the two ranges [POS1, SIZE1] and [POS2, SIZE2]
overlap.  SIZE1 and/or SIZE2 can be (unsigned)-1 in which case

the

range is open-ended.  Otherwise return false.  */

 static inline bool
-ranges_overlap_p (unsigned HOST_WIDE_INT pos1,
- unsigned HOST_WIDE_INT size1,
- unsigned HOST_WIDE_INT pos2,
- unsigned HOST_WIDE_INT size2)
+ranges_overlap_p (HOST_WIDE_INT pos1,
+ HOST_WIDE_INT size1,
+ HOST_WIDE_INT pos2,
+ HOST_WIDE_INT size2)


I think size[12] should still be unsigned (we don't allow negative
sizes but only the special value -1U which we special-case only to
avoid pos + size to wrap)


But all we do with size[12] is compare it to -1 and perform arithmetic
with pos[12]. At least for the second one, we need to cast size to a
signed type so the comparisons make sense (unless we decide to use a
double_int there). So I thought I would do the cast in the argument.
Otherwise, I'll start the function with:
HOST_WIDE_INT ssize1 = (HOST_WIDE_INT)size1;
and replace size1 with ssize1 through the function.

Is the reason of keeping size[12] unsigned for documentation? Or am I
missing a reason why I may be breaking things?


It is mostly documentation indeed.


Ok, then here it is (bootstrap+testsuite on x86_64-unknown-linux-gnu).

2013-11-05  Marc Glisse  marc.gli...@inria.fr

gcc/
* tree-ssa-alias.h (ranges_overlap_p): Handle negative offsets.
* tree-ssa-alias.c (ao_ref_init_from_ptr_and_size): Likewise.

gcc/testsuite/
* gcc.dg/tree-ssa/alias-26.c: New file.

--
Marc GlisseIndex: gcc/testsuite/gcc.dg/tree-ssa/alias-26.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/alias-26.c(revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/alias-26.c(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options -O2 -fdump-tree-optimized } */
+
+void f (const char *c, int *i)
+{
+  *i = 42;
+  __builtin_memcpy (i - 1, c, sizeof (int));
+  if (*i != 42) __builtin_abort();
+}
+
+/* { dg-final { scan-tree-dump-not abort optimized } } */
+/* { dg-final { cleanup-tree-dump optimized } } */
+

Property changes on: gcc/testsuite/gcc.dg/tree-ssa/alias-26.c
___
Added: svn:keywords
## -0,0 +1 ##
+Author Date Id Revision URL
\ No newline at end of property
Added: svn:eol-style
## -0,0 +1 ##
+native
\ No newline at end of property
Index: gcc/tree-ssa-alias.c
===
--- gcc/tree-ssa-alias.c(revision 204381)
+++ gcc/tree-ssa-alias.c(working copy)
@@ -552,42 +552,42 @@ ao_ref_base_alias_set (ao_ref *ref)
 alias_set_type
 ao_ref_alias_set (ao_ref *ref)
 {
   if (ref-ref_alias_set != -1)
 return ref-ref_alias_set;
   ref-ref_alias_set = get_alias_set (ref-ref);
   return ref-ref_alias_set;
 }
 
 /* Init an alias-oracle reference representation from a gimple pointer
-   PTR and a gimple size SIZE in bytes.  If SIZE is NULL_TREE the the
+   PTR and a gimple size SIZE in bytes.  If SIZE is NULL_TREE then the
size is assumed to be unknown.  The access is assumed to be only
to or after of the pointer target, not before it.  */
 
 void
 ao_ref_init_from_ptr_and_size (ao_ref *ref, tree ptr, tree size)
 {
   HOST_WIDE_INT t1, t2, extra_offset = 0;
   ref-ref = NULL_TREE;
   if (TREE_CODE (ptr) == SSA_NAME)
 {
   gimple stmt = SSA_NAME_DEF_STMT (ptr);
   if (gimple_assign_single_p (stmt)
   gimple_assign_rhs_code (stmt) == ADDR_EXPR)
ptr = gimple_assign_rhs1 (stmt);
   else if (is_gimple_assign (stmt)
gimple_assign_rhs_code (stmt) == POINTER_PLUS_EXPR
-   host_integerp (gimple_assign_rhs2 (stmt), 0)
-   (t1 = int_cst_value (gimple_assign_rhs2 (stmt))) = 0)
+   TREE_CODE (gimple_assign_rhs2 (stmt)) == INTEGER_CST)
{
  ptr = gimple_assign_rhs1 (stmt);
- extra_offset = BITS_PER_UNIT * t1;
+ extra_offset = 

Re: Aliasing: look through pointer's def stmt

2013-11-05 Thread Richard Biener
On Tue, Nov 5, 2013 at 11:50 AM, Marc Glisse marc.gli...@inria.fr wrote:
 On Mon, 4 Nov 2013, Richard Biener wrote:

 Marc Glisse marc.gli...@inria.fr wrote:

 On Mon, 4 Nov 2013, Richard Biener wrote:

 Well, host_integer_p (, 0) looks correct to me.  But ...


 Ok, I'll put it back.


 Er, actually host_integerp(, 0) can't work, because arithmetic on pointers
 is done with unsigned integers and for x86_64 it would return false for
 (size_t)-4. So it would need to be host_integerp(, 1) instead. I still test
 TREE_CODE == INTEGER_CST in the current patch, let me know if you want
 host_integerp(, 1).

Hmm, indeed there isn't an exact match to what int_cst_value would
require.  Just testing INTEGER_CST should work, we know the
constant is of sizetype which needs to fit a HWI (in this case meaning
TREE_INT_CST_HIGH is either zero or -1).

 Index: gcc/tree-ssa-alias.h
 ===
 --- gcc/tree-ssa-alias.h(revision 204267)
 +++ gcc/tree-ssa-alias.h(working copy)
 @@ -139,30 +139,30 @@ extern void pt_solution_set_var (struct

  extern void dump_pta_stats (FILE *);

  extern GTY(()) struct pt_solution ipa_escaped_pt;

  /* Return true, if the two ranges [POS1, SIZE1] and [POS2, SIZE2]
 overlap.  SIZE1 and/or SIZE2 can be (unsigned)-1 in which case

 the

 range is open-ended.  Otherwise return false.  */

  static inline bool
 -ranges_overlap_p (unsigned HOST_WIDE_INT pos1,
 - unsigned HOST_WIDE_INT size1,
 - unsigned HOST_WIDE_INT pos2,
 - unsigned HOST_WIDE_INT size2)
 +ranges_overlap_p (HOST_WIDE_INT pos1,
 + HOST_WIDE_INT size1,
 + HOST_WIDE_INT pos2,
 + HOST_WIDE_INT size2)


 I think size[12] should still be unsigned (we don't allow negative
 sizes but only the special value -1U which we special-case only to
 avoid pos + size to wrap)


 But all we do with size[12] is compare it to -1 and perform arithmetic
 with pos[12]. At least for the second one, we need to cast size to a
 signed type so the comparisons make sense (unless we decide to use a
 double_int there). So I thought I would do the cast in the argument.
 Otherwise, I'll start the function with:
 HOST_WIDE_INT ssize1 = (HOST_WIDE_INT)size1;
 and replace size1 with ssize1 through the function.

 Is the reason of keeping size[12] unsigned for documentation? Or am I
 missing a reason why I may be breaking things?


 It is mostly documentation indeed.


 Ok, then here it is (bootstrap+testsuite on x86_64-unknown-linux-gnu).

Ok.

Thanks,
Richard.

 2013-11-05  Marc Glisse  marc.gli...@inria.fr


 gcc/
 * tree-ssa-alias.h (ranges_overlap_p): Handle negative offsets.
 * tree-ssa-alias.c (ao_ref_init_from_ptr_and_size): Likewise.

 gcc/testsuite/
 * gcc.dg/tree-ssa/alias-26.c: New file.

 --
 Marc Glisse

 Index: gcc/testsuite/gcc.dg/tree-ssa/alias-26.c
 ===
 --- gcc/testsuite/gcc.dg/tree-ssa/alias-26.c(revision 0)
 +++ gcc/testsuite/gcc.dg/tree-ssa/alias-26.c(working copy)
 @@ -0,0 +1,13 @@
 +/* { dg-do compile } */
 +/* { dg-options -O2 -fdump-tree-optimized } */
 +
 +void f (const char *c, int *i)
 +{
 +  *i = 42;
 +  __builtin_memcpy (i - 1, c, sizeof (int));
 +  if (*i != 42) __builtin_abort();
 +}
 +
 +/* { dg-final { scan-tree-dump-not abort optimized } } */
 +/* { dg-final { cleanup-tree-dump optimized } } */
 +

 Property changes on: gcc/testsuite/gcc.dg/tree-ssa/alias-26.c
 ___
 Added: svn:keywords
 ## -0,0 +1 ##
 +Author Date Id Revision URL
 \ No newline at end of property
 Added: svn:eol-style
 ## -0,0 +1 ##
 +native
 \ No newline at end of property
 Index: gcc/tree-ssa-alias.c
 ===
 --- gcc/tree-ssa-alias.c(revision 204381)
 +++ gcc/tree-ssa-alias.c(working copy)
 @@ -552,42 +552,42 @@ ao_ref_base_alias_set (ao_ref *ref)
  alias_set_type
  ao_ref_alias_set (ao_ref *ref)
  {
if (ref-ref_alias_set != -1)
  return ref-ref_alias_set;
ref-ref_alias_set = get_alias_set (ref-ref);
return ref-ref_alias_set;
  }

  /* Init an alias-oracle reference representation from a gimple pointer
 -   PTR and a gimple size SIZE in bytes.  If SIZE is NULL_TREE the the
 +   PTR and a gimple size SIZE in bytes.  If SIZE is NULL_TREE then the
 size is assumed to be unknown.  The access is assumed to be only
 to or after of the pointer target, not before it.  */

  void
  ao_ref_init_from_ptr_and_size (ao_ref *ref, tree ptr, tree size)
  {
HOST_WIDE_INT t1, t2, extra_offset = 0;
ref-ref = NULL_TREE;
if (TREE_CODE (ptr) == SSA_NAME)
  {
gimple stmt = SSA_NAME_DEF_STMT (ptr);
if (gimple_assign_single_p (stmt)
gimple_assign_rhs_code (stmt) == ADDR_EXPR)
 ptr = gimple_assign_rhs1 

Re: Aliasing: look through pointer's def stmt

2013-11-04 Thread Richard Biener
On Thu, Oct 31, 2013 at 6:11 PM, Marc Glisse marc.gli...@inria.fr wrote:
 On Wed, 30 Oct 2013, Richard Biener wrote:

 --- gcc/tree-ssa-alias.c(revision 204188)
 +++ gcc/tree-ssa-alias.c(working copy)
 @@ -567,20 +567,29 @@ void
  ao_ref_init_from_ptr_and_size (ao_ref *ref, tree ptr, tree size)
  {
HOST_WIDE_INT t1, t2;
ref-ref = NULL_TREE;
if (TREE_CODE (ptr) == SSA_NAME)
  {
gimple stmt = SSA_NAME_DEF_STMT (ptr);
if (gimple_assign_single_p (stmt)
gimple_assign_rhs_code (stmt) == ADDR_EXPR)
 ptr = gimple_assign_rhs1 (stmt);
 +  else if (is_gimple_assign (stmt)
 +   gimple_assign_rhs_code (stmt) == POINTER_PLUS_EXPR
 +   host_integerp (gimple_assign_rhs2 (stmt), 0)
 +   (t1 = int_cst_value (gimple_assign_rhs2 (stmt))) = 0)


 No need to restrict this to positive offsets I think.


 Here is the follow-up patch to remove this restriction (bootstrap+testsuite
 on x86_64-unknown-linux-gnu).
 I don't really know how safe it is.

 I couldn't use host_integerp(,1) since it returns false for (size_t)(-1) on
 x86_64, and host_integerp(,0) seemed strange, but I could use it if you
 want.

Well, host_integer_p (, 0) looks correct to me.  But ...

 Index: gcc/tree-ssa-alias.h
 ===
 --- gcc/tree-ssa-alias.h(revision 204267)
 +++ gcc/tree-ssa-alias.h(working copy)
 @@ -139,30 +139,30 @@ extern void pt_solution_set_var (struct

  extern void dump_pta_stats (FILE *);

  extern GTY(()) struct pt_solution ipa_escaped_pt;

  /* Return true, if the two ranges [POS1, SIZE1] and [POS2, SIZE2]
 overlap.  SIZE1 and/or SIZE2 can be (unsigned)-1 in which case the
 range is open-ended.  Otherwise return false.  */

  static inline bool
 -ranges_overlap_p (unsigned HOST_WIDE_INT pos1,
 - unsigned HOST_WIDE_INT size1,
 - unsigned HOST_WIDE_INT pos2,
 - unsigned HOST_WIDE_INT size2)
 +ranges_overlap_p (HOST_WIDE_INT pos1,
 + HOST_WIDE_INT size1,
 + HOST_WIDE_INT pos2,
 + HOST_WIDE_INT size2)

I think size[12] should still be unsigned (we don't allow negative
sizes but only the special value -1U which we special-case only to
avoid pos + size to wrap)

Otherwise ok.

Thanks,
Richard.

  {
if (pos1 = pos2
 -   (size2 == (unsigned HOST_WIDE_INT)-1
 +   (size2 == -1
   || pos1  (pos2 + size2)))
  return true;
if (pos2 = pos1
 -   (size1 == (unsigned HOST_WIDE_INT)-1
 +   (size1 == -1
   || pos2  (pos1 + size1)))
  return true;

return false;
  }





  #endif /* TREE_SSA_ALIAS_H  */



Re: Aliasing: look through pointer's def stmt

2013-11-04 Thread Marc Glisse

On Mon, 4 Nov 2013, Richard Biener wrote:


Well, host_integer_p (, 0) looks correct to me.  But ...


Ok, I'll put it back.


Index: gcc/tree-ssa-alias.h
===
--- gcc/tree-ssa-alias.h(revision 204267)
+++ gcc/tree-ssa-alias.h(working copy)
@@ -139,30 +139,30 @@ extern void pt_solution_set_var (struct

 extern void dump_pta_stats (FILE *);

 extern GTY(()) struct pt_solution ipa_escaped_pt;

 /* Return true, if the two ranges [POS1, SIZE1] and [POS2, SIZE2]
overlap.  SIZE1 and/or SIZE2 can be (unsigned)-1 in which case the
range is open-ended.  Otherwise return false.  */

 static inline bool
-ranges_overlap_p (unsigned HOST_WIDE_INT pos1,
- unsigned HOST_WIDE_INT size1,
- unsigned HOST_WIDE_INT pos2,
- unsigned HOST_WIDE_INT size2)
+ranges_overlap_p (HOST_WIDE_INT pos1,
+ HOST_WIDE_INT size1,
+ HOST_WIDE_INT pos2,
+ HOST_WIDE_INT size2)


I think size[12] should still be unsigned (we don't allow negative
sizes but only the special value -1U which we special-case only to
avoid pos + size to wrap)


But all we do with size[12] is compare it to -1 and perform arithmetic 
with pos[12]. At least for the second one, we need to cast size to a 
signed type so the comparisons make sense (unless we decide to use a 
double_int there). So I thought I would do the cast in the argument. 
Otherwise, I'll start the function with:

HOST_WIDE_INT ssize1 = (HOST_WIDE_INT)size1;
and replace size1 with ssize1 through the function.

Is the reason of keeping size[12] unsigned for documentation? Or am I 
missing a reason why I may be breaking things?


--
Marc Glisse


Re: Aliasing: look through pointer's def stmt

2013-11-04 Thread Richard Biener
Marc Glisse marc.gli...@inria.fr wrote:
On Mon, 4 Nov 2013, Richard Biener wrote:

 Well, host_integer_p (, 0) looks correct to me.  But ...

Ok, I'll put it back.

 Index: gcc/tree-ssa-alias.h
 ===
 --- gcc/tree-ssa-alias.h(revision 204267)
 +++ gcc/tree-ssa-alias.h(working copy)
 @@ -139,30 +139,30 @@ extern void pt_solution_set_var (struct

  extern void dump_pta_stats (FILE *);

  extern GTY(()) struct pt_solution ipa_escaped_pt;

  /* Return true, if the two ranges [POS1, SIZE1] and [POS2, SIZE2]
 overlap.  SIZE1 and/or SIZE2 can be (unsigned)-1 in which case
the
 range is open-ended.  Otherwise return false.  */

  static inline bool
 -ranges_overlap_p (unsigned HOST_WIDE_INT pos1,
 - unsigned HOST_WIDE_INT size1,
 - unsigned HOST_WIDE_INT pos2,
 - unsigned HOST_WIDE_INT size2)
 +ranges_overlap_p (HOST_WIDE_INT pos1,
 + HOST_WIDE_INT size1,
 + HOST_WIDE_INT pos2,
 + HOST_WIDE_INT size2)

 I think size[12] should still be unsigned (we don't allow negative
 sizes but only the special value -1U which we special-case only to
 avoid pos + size to wrap)

But all we do with size[12] is compare it to -1 and perform arithmetic 
with pos[12]. At least for the second one, we need to cast size to a 
signed type so the comparisons make sense (unless we decide to use a 
double_int there). So I thought I would do the cast in the argument. 
Otherwise, I'll start the function with:
HOST_WIDE_INT ssize1 = (HOST_WIDE_INT)size1;
and replace size1 with ssize1 through the function.

Is the reason of keeping size[12] unsigned for documentation? Or am I 
missing a reason why I may be breaking things?

It is mostly documentation indeed.

Richard.



Re: Aliasing: look through pointer's def stmt

2013-10-31 Thread Marc Glisse

On Wed, 30 Oct 2013, Richard Biener wrote:


--- gcc/tree-ssa-alias.c(revision 204188)
+++ gcc/tree-ssa-alias.c(working copy)
@@ -567,20 +567,29 @@ void
 ao_ref_init_from_ptr_and_size (ao_ref *ref, tree ptr, tree size)
 {
   HOST_WIDE_INT t1, t2;
   ref-ref = NULL_TREE;
   if (TREE_CODE (ptr) == SSA_NAME)
 {
   gimple stmt = SSA_NAME_DEF_STMT (ptr);
   if (gimple_assign_single_p (stmt)
   gimple_assign_rhs_code (stmt) == ADDR_EXPR)
ptr = gimple_assign_rhs1 (stmt);
+  else if (is_gimple_assign (stmt)
+   gimple_assign_rhs_code (stmt) == POINTER_PLUS_EXPR
+   host_integerp (gimple_assign_rhs2 (stmt), 0)
+   (t1 = int_cst_value (gimple_assign_rhs2 (stmt))) = 0)


No need to restrict this to positive offsets I think.


Here is the follow-up patch to remove this restriction 
(bootstrap+testsuite on x86_64-unknown-linux-gnu).

I don't really know how safe it is.

I couldn't use host_integerp(,1) since it returns false for (size_t)(-1) 
on x86_64, and host_integerp(,0) seemed strange, but I could use it if you 
want.


2013-11-01  Marc Glisse  marc.gli...@inria.fr

gcc/
* tree-ssa-alias.h (ranges_overlap_p): Handle negative offsets.
* tree-ssa-alias.c (ao_ref_init_from_ptr_and_size): Likewise.

gcc/testsuite/
* gcc.dg/tree-ssa/alias-26.c: New file.

--
Marc GlisseIndex: gcc/testsuite/gcc.dg/tree-ssa/alias-26.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/alias-26.c(revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/alias-26.c(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options -O2 -fdump-tree-optimized } */
+
+void f (const char *c, int *i)
+{
+  *i = 42;
+  __builtin_memcpy (i - 1, c, sizeof (int));
+  if (*i != 42) __builtin_abort();
+}
+
+/* { dg-final { scan-tree-dump-not abort optimized } } */
+/* { dg-final { cleanup-tree-dump optimized } } */
+

Property changes on: gcc/testsuite/gcc.dg/tree-ssa/alias-26.c
___
Added: svn:keywords
## -0,0 +1 ##
+Author Date Id Revision URL
\ No newline at end of property
Added: svn:eol-style
## -0,0 +1 ##
+native
\ No newline at end of property
Index: gcc/tree-ssa-alias.c
===
--- gcc/tree-ssa-alias.c(revision 204267)
+++ gcc/tree-ssa-alias.c(working copy)
@@ -552,42 +552,42 @@ ao_ref_base_alias_set (ao_ref *ref)
 alias_set_type
 ao_ref_alias_set (ao_ref *ref)
 {
   if (ref-ref_alias_set != -1)
 return ref-ref_alias_set;
   ref-ref_alias_set = get_alias_set (ref-ref);
   return ref-ref_alias_set;
 }
 
 /* Init an alias-oracle reference representation from a gimple pointer
-   PTR and a gimple size SIZE in bytes.  If SIZE is NULL_TREE the the
+   PTR and a gimple size SIZE in bytes.  If SIZE is NULL_TREE then the
size is assumed to be unknown.  The access is assumed to be only
to or after of the pointer target, not before it.  */
 
 void
 ao_ref_init_from_ptr_and_size (ao_ref *ref, tree ptr, tree size)
 {
   HOST_WIDE_INT t1, t2, extra_offset = 0;
   ref-ref = NULL_TREE;
   if (TREE_CODE (ptr) == SSA_NAME)
 {
   gimple stmt = SSA_NAME_DEF_STMT (ptr);
   if (gimple_assign_single_p (stmt)
   gimple_assign_rhs_code (stmt) == ADDR_EXPR)
ptr = gimple_assign_rhs1 (stmt);
   else if (is_gimple_assign (stmt)
gimple_assign_rhs_code (stmt) == POINTER_PLUS_EXPR
-   host_integerp (gimple_assign_rhs2 (stmt), 0)
-   (t1 = int_cst_value (gimple_assign_rhs2 (stmt))) = 0)
+   TREE_CODE (gimple_assign_rhs2 (stmt)) == INTEGER_CST)
{
  ptr = gimple_assign_rhs1 (stmt);
- extra_offset = BITS_PER_UNIT * t1;
+ extra_offset = BITS_PER_UNIT
+* int_cst_value (gimple_assign_rhs2 (stmt));
}
 }
 
   if (TREE_CODE (ptr) == ADDR_EXPR)
 ref-base = get_ref_base_and_extent (TREE_OPERAND (ptr, 0),
 ref-offset, t1, t2);
   else
 {
   ref-base = build2 (MEM_REF, char_type_node,
  ptr, null_pointer_node);
Index: gcc/tree-ssa-alias.h
===
--- gcc/tree-ssa-alias.h(revision 204267)
+++ gcc/tree-ssa-alias.h(working copy)
@@ -139,30 +139,30 @@ extern void pt_solution_set_var (struct
 
 extern void dump_pta_stats (FILE *);
 
 extern GTY(()) struct pt_solution ipa_escaped_pt;
 
 /* Return true, if the two ranges [POS1, SIZE1] and [POS2, SIZE2]
overlap.  SIZE1 and/or SIZE2 can be (unsigned)-1 in which case the
range is open-ended.  Otherwise return false.  */
 
 static inline bool
-ranges_overlap_p (unsigned HOST_WIDE_INT pos1,
- unsigned HOST_WIDE_INT size1,
- unsigned HOST_WIDE_INT pos2,
- unsigned HOST_WIDE_INT size2)

Re: Aliasing: look through pointer's def stmt

2013-10-30 Thread Richard Biener
On Wed, Oct 30, 2013 at 1:09 AM, Marc Glisse marc.gli...@inria.fr wrote:
 On Tue, 29 Oct 2013, Richard Biener wrote:

 For the POINTER_PLUS_EXPR offset argument you should use
 int_cst_value () to access it (it will unconditionally sign-extend)
 and use host_integerp (..., 0).  That leaves the overflow possibility
 in place (and you should multiply by BITS_PER_UNIT) which we
 ignore in enough other places similar to this to ignore ...


 Like this? (passes bootstrap+testsuite on x86_64-linux-gnu)

 2013-10-30  Marc Glisse  marc.gli...@inria.fr


 gcc/
 * tree-ssa-alias.c (ao_ref_init_from_ptr_and_size): Look for a
 POINTER_PLUS_EXPR in the defining statement.

 gcc/testsuite/
 * gcc.dg/tree-ssa/alias-24.c: New file.


 --
 Marc Glisse

 Index: gcc/testsuite/gcc.dg/tree-ssa/alias-24.c
 ===
 --- gcc/testsuite/gcc.dg/tree-ssa/alias-24.c(revision 0)
 +++ gcc/testsuite/gcc.dg/tree-ssa/alias-24.c(working copy)
 @@ -0,0 +1,22 @@
 +/* { dg-do compile } */
 +/* { dg-options -O2 -fdump-tree-optimized } */
 +
 +void f (const char *c, int *i)
 +{
 +  *i = 42;
 +  __builtin_memcpy (i + 1, c, sizeof (int));
 +  if (*i != 42) __builtin_abort();
 +}
 +
 +extern void keepit ();
 +void g (const char *c, int *i)
 +{
 +  *i = 33;
 +  __builtin_memcpy (i - 1, c, 3 * sizeof (int));
 +  if (*i != 33) keepit();
 +}
 +
 +/* { dg-final { scan-tree-dump-not abort optimized } } */
 +/* { dg-final { scan-tree-dump keepit optimized } } */
 +/* { dg-final { cleanup-tree-dump optimized } } */
 +

 Property changes on: gcc/testsuite/gcc.dg/tree-ssa/alias-24.c
 ___
 Added: svn:keywords
 ## -0,0 +1 ##
 +Author Date Id Revision URL
 \ No newline at end of property
 Added: svn:eol-style
 ## -0,0 +1 ##
 +native
 \ No newline at end of property
 Index: gcc/tree-ssa-alias.c
 ===
 --- gcc/tree-ssa-alias.c(revision 204188)
 +++ gcc/tree-ssa-alias.c(working copy)
 @@ -567,20 +567,29 @@ void
  ao_ref_init_from_ptr_and_size (ao_ref *ref, tree ptr, tree size)
  {
HOST_WIDE_INT t1, t2;
ref-ref = NULL_TREE;
if (TREE_CODE (ptr) == SSA_NAME)
  {
gimple stmt = SSA_NAME_DEF_STMT (ptr);
if (gimple_assign_single_p (stmt)
gimple_assign_rhs_code (stmt) == ADDR_EXPR)
 ptr = gimple_assign_rhs1 (stmt);
 +  else if (is_gimple_assign (stmt)
 +   gimple_assign_rhs_code (stmt) == POINTER_PLUS_EXPR
 +   host_integerp (gimple_assign_rhs2 (stmt), 0)
 +   (t1 = int_cst_value (gimple_assign_rhs2 (stmt))) = 0)

No need to restrict this to positive offsets I think.

 +   {
 + ao_ref_init_from_ptr_and_size (ref, gimple_assign_rhs1 (stmt),
 size);

Please don't recurse - forwprop combines series of POINTER_PLUS_EXPRs
and MEM[ptr, offset] so it shouldn't be necessary.

 + ref-offset += 8 * t1;

BITS_PER_UNIT instead of 8.  I'd say just have a 0-initialized
additional_offset var that you unconditionally add ...

 + return;
 +   }
  }

if (TREE_CODE (ptr) == ADDR_EXPR)
  ref-base = get_ref_base_and_extent (TREE_OPERAND (ptr, 0),
  ref-offset, t1, t2);
else
  {
ref-base = build2 (MEM_REF, char_type_node,
   ptr, null_pointer_node);
ref-offset = 0;

... here at the end.

Thanks,
Richard.


Re: Aliasing: look through pointer's def stmt

2013-10-30 Thread Marc Glisse

On Wed, 30 Oct 2013, Richard Biener wrote:


On Wed, Oct 30, 2013 at 1:09 AM, Marc Glisse marc.gli...@inria.fr wrote:

On Tue, 29 Oct 2013, Richard Biener wrote:


For the POINTER_PLUS_EXPR offset argument you should use
int_cst_value () to access it (it will unconditionally sign-extend)
and use host_integerp (..., 0).  That leaves the overflow possibility
in place (and you should multiply by BITS_PER_UNIT) which we
ignore in enough other places similar to this to ignore ...



Like this? (passes bootstrap+testsuite on x86_64-linux-gnu)

2013-10-30  Marc Glisse  marc.gli...@inria.fr


gcc/
* tree-ssa-alias.c (ao_ref_init_from_ptr_and_size): Look for a
POINTER_PLUS_EXPR in the defining statement.

gcc/testsuite/
* gcc.dg/tree-ssa/alias-24.c: New file.


--
Marc Glisse

Index: gcc/testsuite/gcc.dg/tree-ssa/alias-24.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/alias-24.c(revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/alias-24.c(working copy)
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options -O2 -fdump-tree-optimized } */
+
+void f (const char *c, int *i)
+{
+  *i = 42;
+  __builtin_memcpy (i + 1, c, sizeof (int));
+  if (*i != 42) __builtin_abort();
+}
+
+extern void keepit ();
+void g (const char *c, int *i)
+{
+  *i = 33;
+  __builtin_memcpy (i - 1, c, 3 * sizeof (int));
+  if (*i != 33) keepit();
+}
+
+/* { dg-final { scan-tree-dump-not abort optimized } } */
+/* { dg-final { scan-tree-dump keepit optimized } } */
+/* { dg-final { cleanup-tree-dump optimized } } */
+

Property changes on: gcc/testsuite/gcc.dg/tree-ssa/alias-24.c
___
Added: svn:keywords
## -0,0 +1 ##
+Author Date Id Revision URL
\ No newline at end of property
Added: svn:eol-style
## -0,0 +1 ##
+native
\ No newline at end of property
Index: gcc/tree-ssa-alias.c
===
--- gcc/tree-ssa-alias.c(revision 204188)
+++ gcc/tree-ssa-alias.c(working copy)
@@ -567,20 +567,29 @@ void
 ao_ref_init_from_ptr_and_size (ao_ref *ref, tree ptr, tree size)
 {
   HOST_WIDE_INT t1, t2;
   ref-ref = NULL_TREE;
   if (TREE_CODE (ptr) == SSA_NAME)
 {
   gimple stmt = SSA_NAME_DEF_STMT (ptr);
   if (gimple_assign_single_p (stmt)
   gimple_assign_rhs_code (stmt) == ADDR_EXPR)
ptr = gimple_assign_rhs1 (stmt);
+  else if (is_gimple_assign (stmt)
+   gimple_assign_rhs_code (stmt) == POINTER_PLUS_EXPR
+   host_integerp (gimple_assign_rhs2 (stmt), 0)
+   (t1 = int_cst_value (gimple_assign_rhs2 (stmt))) = 0)


No need to restrict this to positive offsets I think.


Actually there is. The function documents that it only handles nonnegative 
offsets, and if we ignore that, the keepit part of the testcase fails 
because ranges_overlap_p works with unsigned offsets. I can try to change 
ranges_overlap_p and remove this restriction from 
ao_ref_init_from_ptr_and_size, but that's more risky so I'd prefer to do 
that as a separate follow-up patch if that's ok.



+   {
+ ao_ref_init_from_ptr_and_size (ref, gimple_assign_rhs1 (stmt),
size);


Please don't recurse - forwprop combines series of POINTER_PLUS_EXPRs
and MEM[ptr, offset] so it shouldn't be necessary.


Done. (note that if it isn't necessary, then it doesn't hurt either ;-)


+ ref-offset += 8 * t1;


BITS_PER_UNIT instead of 8.  I'd say just have a 0-initialized
additional_offset var that you unconditionally add ...


I also changed a few other 8s.


+ return;
+   }
 }

   if (TREE_CODE (ptr) == ADDR_EXPR)
 ref-base = get_ref_base_and_extent (TREE_OPERAND (ptr, 0),
 ref-offset, t1, t2);
   else
 {
   ref-base = build2 (MEM_REF, char_type_node,
  ptr, null_pointer_node);
   ref-offset = 0;


... here at the end.


Here is a new version, same testing, same ChangeLog.

--
Marc GlisseIndex: gcc/testsuite/gcc.dg/tree-ssa/alias-24.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/alias-24.c(revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/alias-24.c(working copy)
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options -O2 -fdump-tree-optimized } */
+
+void f (const char *c, int *i)
+{
+  *i = 42;
+  __builtin_memcpy (i + 1, c, sizeof (int));
+  if (*i != 42) __builtin_abort();
+}
+
+extern void keepit ();
+void g (const char *c, int *i)
+{
+  *i = 33;
+  __builtin_memcpy (i - 1, c, 3 * sizeof (int));
+  if (*i != 33) keepit();
+}
+
+/* { dg-final { scan-tree-dump-not abort optimized } } */
+/* { dg-final { scan-tree-dump keepit optimized } } */
+/* { dg-final { cleanup-tree-dump optimized } } */
+

Property changes on: gcc/testsuite/gcc.dg/tree-ssa/alias-24.c
___
Added: svn:keywords

Re: Aliasing: look through pointer's def stmt

2013-10-30 Thread Richard Biener
On Wed, Oct 30, 2013 at 1:43 PM, Marc Glisse marc.gli...@inria.fr wrote:
 On Wed, 30 Oct 2013, Richard Biener wrote:

 On Wed, Oct 30, 2013 at 1:09 AM, Marc Glisse marc.gli...@inria.fr wrote:

 On Tue, 29 Oct 2013, Richard Biener wrote:

 For the POINTER_PLUS_EXPR offset argument you should use
 int_cst_value () to access it (it will unconditionally sign-extend)
 and use host_integerp (..., 0).  That leaves the overflow possibility
 in place (and you should multiply by BITS_PER_UNIT) which we
 ignore in enough other places similar to this to ignore ...



 Like this? (passes bootstrap+testsuite on x86_64-linux-gnu)

 2013-10-30  Marc Glisse  marc.gli...@inria.fr


 gcc/
 * tree-ssa-alias.c (ao_ref_init_from_ptr_and_size): Look for a
 POINTER_PLUS_EXPR in the defining statement.

 gcc/testsuite/
 * gcc.dg/tree-ssa/alias-24.c: New file.


 --
 Marc Glisse

 Index: gcc/testsuite/gcc.dg/tree-ssa/alias-24.c
 ===
 --- gcc/testsuite/gcc.dg/tree-ssa/alias-24.c(revision 0)
 +++ gcc/testsuite/gcc.dg/tree-ssa/alias-24.c(working copy)
 @@ -0,0 +1,22 @@
 +/* { dg-do compile } */
 +/* { dg-options -O2 -fdump-tree-optimized } */
 +
 +void f (const char *c, int *i)
 +{
 +  *i = 42;
 +  __builtin_memcpy (i + 1, c, sizeof (int));
 +  if (*i != 42) __builtin_abort();
 +}
 +
 +extern void keepit ();
 +void g (const char *c, int *i)
 +{
 +  *i = 33;
 +  __builtin_memcpy (i - 1, c, 3 * sizeof (int));
 +  if (*i != 33) keepit();
 +}
 +
 +/* { dg-final { scan-tree-dump-not abort optimized } } */
 +/* { dg-final { scan-tree-dump keepit optimized } } */
 +/* { dg-final { cleanup-tree-dump optimized } } */
 +

 Property changes on: gcc/testsuite/gcc.dg/tree-ssa/alias-24.c
 ___
 Added: svn:keywords
 ## -0,0 +1 ##
 +Author Date Id Revision URL
 \ No newline at end of property
 Added: svn:eol-style
 ## -0,0 +1 ##
 +native
 \ No newline at end of property
 Index: gcc/tree-ssa-alias.c
 ===
 --- gcc/tree-ssa-alias.c(revision 204188)
 +++ gcc/tree-ssa-alias.c(working copy)
 @@ -567,20 +567,29 @@ void
  ao_ref_init_from_ptr_and_size (ao_ref *ref, tree ptr, tree size)
  {
HOST_WIDE_INT t1, t2;
ref-ref = NULL_TREE;
if (TREE_CODE (ptr) == SSA_NAME)
  {
gimple stmt = SSA_NAME_DEF_STMT (ptr);
if (gimple_assign_single_p (stmt)
gimple_assign_rhs_code (stmt) == ADDR_EXPR)
 ptr = gimple_assign_rhs1 (stmt);
 +  else if (is_gimple_assign (stmt)
 +   gimple_assign_rhs_code (stmt) == POINTER_PLUS_EXPR
 +   host_integerp (gimple_assign_rhs2 (stmt), 0)
 +   (t1 = int_cst_value (gimple_assign_rhs2 (stmt))) = 0)


 No need to restrict this to positive offsets I think.


 Actually there is. The function documents that it only handles nonnegative
 offsets, and if we ignore that, the keepit part of the testcase fails
 because ranges_overlap_p works with unsigned offsets. I can try to change
 ranges_overlap_p and remove this restriction from
 ao_ref_init_from_ptr_and_size, but that's more risky so I'd prefer to do
 that as a separate follow-up patch if that's ok.


 +   {
 + ao_ref_init_from_ptr_and_size (ref, gimple_assign_rhs1 (stmt),
 size);


 Please don't recurse - forwprop combines series of POINTER_PLUS_EXPRs
 and MEM[ptr, offset] so it shouldn't be necessary.


 Done. (note that if it isn't necessary, then it doesn't hurt either ;-)


 + ref-offset += 8 * t1;


 BITS_PER_UNIT instead of 8.  I'd say just have a 0-initialized
 additional_offset var that you unconditionally add ...


 I also changed a few other 8s.


 + return;
 +   }
  }

if (TREE_CODE (ptr) == ADDR_EXPR)
  ref-base = get_ref_base_and_extent (TREE_OPERAND (ptr, 0),
  ref-offset, t1, t2);
else
  {
ref-base = build2 (MEM_REF, char_type_node,
   ptr, null_pointer_node);
ref-offset = 0;


 ... here at the end.


 Here is a new version, same testing, same ChangeLog.

Ok.

Thanks,
Richard.

 --
 Marc Glisse
 Index: gcc/testsuite/gcc.dg/tree-ssa/alias-24.c
 ===
 --- gcc/testsuite/gcc.dg/tree-ssa/alias-24.c(revision 0)
 +++ gcc/testsuite/gcc.dg/tree-ssa/alias-24.c(working copy)
 @@ -0,0 +1,22 @@
 +/* { dg-do compile } */
 +/* { dg-options -O2 -fdump-tree-optimized } */
 +
 +void f (const char *c, int *i)
 +{
 +  *i = 42;
 +  __builtin_memcpy (i + 1, c, sizeof (int));
 +  if (*i != 42) __builtin_abort();
 +}
 +
 +extern void keepit ();
 +void g (const char *c, int *i)
 +{
 +  *i = 33;
 +  __builtin_memcpy (i - 1, c, 3 * sizeof (int));
 +  if (*i != 33) keepit();
 +}
 +
 +/* { dg-final { scan-tree-dump-not abort optimized } } */
 +/* { dg-final { scan-tree-dump keepit 

Re: Aliasing: look through pointer's def stmt

2013-10-29 Thread Richard Biener
On Sat, Oct 26, 2013 at 7:07 PM, Marc Glisse marc.gli...@inria.fr wrote:
 On Fri, 25 Oct 2013, Marc Glisse wrote:

 On Fri, 25 Oct 2013, Richard Biener wrote:

 you can followup with handling POINTER_PLUS_EXPR if you like.


 Like this? (bootstrap+testsuite on x86_64-unknown-linux-gnu)

 2013-10-26  Marc Glisse  marc.gli...@inria.fr

 gcc/
 * tree-ssa-alias.c (ao_ref_init_from_ptr_and_size): Look for a
 POINTER_PLUS_EXPR in the defining statement.


 This has some issues with the type of the offsets. First, they might
 overflow in some extreme cases (that could probably already happen without
 the patch). But mostly, negative offsets are not supported. There is a
 comment to that effect before ao_ref_init_from_ptr_and_size, and
 ranges_overlap_p takes the offsets as unsigned HOST_WIDE_INT. Currently it
 does indeed seem hard to produce a negative offset there, but handling
 POINTER_PLUS_EXPR (definitely a good thing) would obviously change that.

For the POINTER_PLUS_EXPR offset argument you should use
int_cst_value () to access it (it will unconditionally sign-extend)
and use host_integerp (..., 0).  That leaves the overflow possibility
in place (and you should multiply by BITS_PER_UNIT) which we
ignore in enough other places similar to this to ignore ...
(side-note: for quite some time on my TODO is to make the
gimple alias-machinery use byte-offsets instead of bit-offsets
which wouldn't cause regressions if we finally lowered bitfield
accesses ...).  A way to not ignore it is to do

  off = double_int_from_tree (ptr-plus-offset);
  off = double_int_sext (off, TYPE_PRECISION (...))
  off = double_int_mul (off, BITS_PER_UNIT);
  if (off.fits_shwi ())
...

Richard.


 --
 Marc Glisse


Re: Aliasing: look through pointer's def stmt

2013-10-29 Thread Marc Glisse

On Tue, 29 Oct 2013, Richard Biener wrote:


For the POINTER_PLUS_EXPR offset argument you should use
int_cst_value () to access it (it will unconditionally sign-extend)
and use host_integerp (..., 0).  That leaves the overflow possibility
in place (and you should multiply by BITS_PER_UNIT) which we
ignore in enough other places similar to this to ignore ...


Like this? (passes bootstrap+testsuite on x86_64-linux-gnu)

2013-10-30  Marc Glisse  marc.gli...@inria.fr

gcc/
* tree-ssa-alias.c (ao_ref_init_from_ptr_and_size): Look for a
POINTER_PLUS_EXPR in the defining statement.

gcc/testsuite/
* gcc.dg/tree-ssa/alias-24.c: New file.


--
Marc GlisseIndex: gcc/testsuite/gcc.dg/tree-ssa/alias-24.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/alias-24.c(revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/alias-24.c(working copy)
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options -O2 -fdump-tree-optimized } */
+
+void f (const char *c, int *i)
+{
+  *i = 42;
+  __builtin_memcpy (i + 1, c, sizeof (int));
+  if (*i != 42) __builtin_abort();
+}
+
+extern void keepit ();
+void g (const char *c, int *i)
+{
+  *i = 33;
+  __builtin_memcpy (i - 1, c, 3 * sizeof (int));
+  if (*i != 33) keepit();
+}
+
+/* { dg-final { scan-tree-dump-not abort optimized } } */
+/* { dg-final { scan-tree-dump keepit optimized } } */
+/* { dg-final { cleanup-tree-dump optimized } } */
+

Property changes on: gcc/testsuite/gcc.dg/tree-ssa/alias-24.c
___
Added: svn:keywords
## -0,0 +1 ##
+Author Date Id Revision URL
\ No newline at end of property
Added: svn:eol-style
## -0,0 +1 ##
+native
\ No newline at end of property
Index: gcc/tree-ssa-alias.c
===
--- gcc/tree-ssa-alias.c(revision 204188)
+++ gcc/tree-ssa-alias.c(working copy)
@@ -567,20 +567,29 @@ void
 ao_ref_init_from_ptr_and_size (ao_ref *ref, tree ptr, tree size)
 {
   HOST_WIDE_INT t1, t2;
   ref-ref = NULL_TREE;
   if (TREE_CODE (ptr) == SSA_NAME)
 {
   gimple stmt = SSA_NAME_DEF_STMT (ptr);
   if (gimple_assign_single_p (stmt)
   gimple_assign_rhs_code (stmt) == ADDR_EXPR)
ptr = gimple_assign_rhs1 (stmt);
+  else if (is_gimple_assign (stmt)
+   gimple_assign_rhs_code (stmt) == POINTER_PLUS_EXPR
+   host_integerp (gimple_assign_rhs2 (stmt), 0)
+   (t1 = int_cst_value (gimple_assign_rhs2 (stmt))) = 0)
+   {
+ ao_ref_init_from_ptr_and_size (ref, gimple_assign_rhs1 (stmt), size);
+ ref-offset += 8 * t1;
+ return;
+   }
 }
 
   if (TREE_CODE (ptr) == ADDR_EXPR)
 ref-base = get_ref_base_and_extent (TREE_OPERAND (ptr, 0),
 ref-offset, t1, t2);
   else
 {
   ref-base = build2 (MEM_REF, char_type_node,
  ptr, null_pointer_node);
   ref-offset = 0;


Re: Aliasing: look through pointer's def stmt

2013-10-26 Thread Marc Glisse

On Fri, 25 Oct 2013, Marc Glisse wrote:


On Fri, 25 Oct 2013, Richard Biener wrote:


you can followup with handling POINTER_PLUS_EXPR if you like.


Like this? (bootstrap+testsuite on x86_64-unknown-linux-gnu)

2013-10-26  Marc Glisse  marc.gli...@inria.fr

gcc/
* tree-ssa-alias.c (ao_ref_init_from_ptr_and_size): Look for a
POINTER_PLUS_EXPR in the defining statement.


This has some issues with the type of the offsets. First, they might 
overflow in some extreme cases (that could probably already happen without 
the patch). But mostly, negative offsets are not supported. There is a 
comment to that effect before ao_ref_init_from_ptr_and_size, and 
ranges_overlap_p takes the offsets as unsigned HOST_WIDE_INT. Currently it 
does indeed seem hard to produce a negative offset there, but handling 
POINTER_PLUS_EXPR (definitely a good thing) would obviously change that.


--
Marc Glisse


Re: Aliasing: look through pointer's def stmt

2013-10-25 Thread Marc Glisse

On Thu, 24 Oct 2013, Jeff Law wrote:


On 10/24/13 23:23, Marc Glisse wrote:

Hello,

I noticed that in some cases we were failing to find aliasing
information because we were only looking at an SSA_NAME variable,
missing the fact that it was really an ADDR_EXPR. The attached patch
passes bootstrap+testsuite, does it make sense? (I am a bit afraid of
losing some type information for instance)

I didn't investigate the 2 tests where I had to remove dg-bogus, because
removing dg-bogus sounds like a bonus...

2013-10-25  Marc Glisse  marc.gli...@inria.fr

gcc/
 * tree-ssa-alias.c (ao_ref_init_from_ptr_and_size): Look for an
 ADDR_EXPR in the defining statement.

Shouldn't the ADDR_EXPR have been propagated into the use?


Maybe when the address is a constant, but here it comes from malloc.

--
Marc Glisse


Re: Aliasing: look through pointer's def stmt

2013-10-25 Thread Richard Biener
On Fri, Oct 25, 2013 at 8:11 AM, Marc Glisse marc.gli...@inria.fr wrote:
 On Thu, 24 Oct 2013, Jeff Law wrote:

 On 10/24/13 23:23, Marc Glisse wrote:

 Hello,

 I noticed that in some cases we were failing to find aliasing
 information because we were only looking at an SSA_NAME variable,
 missing the fact that it was really an ADDR_EXPR. The attached patch
 passes bootstrap+testsuite, does it make sense? (I am a bit afraid of
 losing some type information for instance)

 I didn't investigate the 2 tests where I had to remove dg-bogus, because
 removing dg-bogus sounds like a bonus...

 2013-10-25  Marc Glisse  marc.gli...@inria.fr

 gcc/
  * tree-ssa-alias.c (ao_ref_init_from_ptr_and_size): Look for an
  ADDR_EXPR in the defining statement.

 Shouldn't the ADDR_EXPR have been propagated into the use?


 Maybe when the address is a constant, but here it comes from malloc.

points-to should have propagated the alias info, so no, looking at
def-stmts in random places like this isn't ok.  Where does alias info
get lost?

Thanks,
Richard.

 --
 Marc Glisse


Re: Aliasing: look through pointer's def stmt

2013-10-25 Thread Richard Biener
On Fri, Oct 25, 2013 at 10:59 AM, Richard Biener
richard.guent...@gmail.com wrote:
 On Fri, Oct 25, 2013 at 8:11 AM, Marc Glisse marc.gli...@inria.fr wrote:
 On Thu, 24 Oct 2013, Jeff Law wrote:

 On 10/24/13 23:23, Marc Glisse wrote:

 Hello,

 I noticed that in some cases we were failing to find aliasing
 information because we were only looking at an SSA_NAME variable,
 missing the fact that it was really an ADDR_EXPR. The attached patch
 passes bootstrap+testsuite, does it make sense? (I am a bit afraid of
 losing some type information for instance)

 I didn't investigate the 2 tests where I had to remove dg-bogus, because
 removing dg-bogus sounds like a bonus...

 2013-10-25  Marc Glisse  marc.gli...@inria.fr

 gcc/
  * tree-ssa-alias.c (ao_ref_init_from_ptr_and_size): Look for an
  ADDR_EXPR in the defining statement.

 Shouldn't the ADDR_EXPR have been propagated into the use?


 Maybe when the address is a constant, but here it comes from malloc.

 points-to should have propagated the alias info, so no, looking at
 def-stmts in random places like this isn't ok.  Where does alias info
 get lost?

It doesn't get lost, but this is the missed optimization that malloc
results still alias with global pointers (here a function argument).
Taking into account the offset shouldn't change anything.

I suppose you are looking for the memcpy to be folded into an
assignment?  Which ao_ref_from_ptr_and_size is critical for
the transform - it doesn't seem to be the one in memory op folding.

Richard.

 Thanks,
 Richard.

 --
 Marc Glisse


Re: Aliasing: look through pointer's def stmt

2013-10-25 Thread Marc Glisse

On Fri, 25 Oct 2013, Richard Biener wrote:


On Fri, Oct 25, 2013 at 10:59 AM, Richard Biener
richard.guent...@gmail.com wrote:

On Fri, Oct 25, 2013 at 8:11 AM, Marc Glisse marc.gli...@inria.fr wrote:

On Thu, 24 Oct 2013, Jeff Law wrote:


On 10/24/13 23:23, Marc Glisse wrote:


Hello,

I noticed that in some cases we were failing to find aliasing
information because we were only looking at an SSA_NAME variable,
missing the fact that it was really an ADDR_EXPR. The attached patch
passes bootstrap+testsuite, does it make sense? (I am a bit afraid of
losing some type information for instance)

I didn't investigate the 2 tests where I had to remove dg-bogus, because
removing dg-bogus sounds like a bonus...

2013-10-25  Marc Glisse  marc.gli...@inria.fr

gcc/
 * tree-ssa-alias.c (ao_ref_init_from_ptr_and_size): Look for an
 ADDR_EXPR in the defining statement.


Shouldn't the ADDR_EXPR have been propagated into the use?



Maybe when the address is a constant, but here it comes from malloc.


points-to should have propagated the alias info, so no, looking at
def-stmts in random places like this isn't ok.  Where does alias info
get lost?


By points-to, do you mean in pt_solution? That's very rough information,
and in particular here it can't tell me that 2 fields of the same struct
don't alias, can it?


It doesn't get lost, but this is the missed optimization that malloc
results still alias with global pointers (here a function argument).
Taking into account the offset shouldn't change anything.


I think this is a different issue (but if you say improving malloc would 
help, maybe).



I suppose you are looking for the memcpy to be folded into an
assignment?  Which ao_ref_from_ptr_and_size is critical for
the transform - it doesn't seem to be the one in memory op folding.


No, I only used memcpy as an example, my goal is for the compiler to 
notice that the call (memcpy here, possibly other functions with some new 
attribute later) doesn't clobber the counter (i) and thus the counter 
value is the same after the call as it was before.


If memcpy gets turned to an assignment (which alignment should prevent), 
my testcase becomes useless.


--
Marc Glisse


Re: Aliasing: look through pointer's def stmt

2013-10-25 Thread Richard Biener
On Fri, Oct 25, 2013 at 11:29 AM, Marc Glisse marc.gli...@inria.fr wrote:
 On Fri, 25 Oct 2013, Richard Biener wrote:

 On Fri, Oct 25, 2013 at 10:59 AM, Richard Biener
 richard.guent...@gmail.com wrote:

 On Fri, Oct 25, 2013 at 8:11 AM, Marc Glisse marc.gli...@inria.fr
 wrote:

 On Thu, 24 Oct 2013, Jeff Law wrote:

 On 10/24/13 23:23, Marc Glisse wrote:


 Hello,

 I noticed that in some cases we were failing to find aliasing
 information because we were only looking at an SSA_NAME variable,
 missing the fact that it was really an ADDR_EXPR. The attached patch
 passes bootstrap+testsuite, does it make sense? (I am a bit afraid of
 losing some type information for instance)

 I didn't investigate the 2 tests where I had to remove dg-bogus,
 because
 removing dg-bogus sounds like a bonus...

 2013-10-25  Marc Glisse  marc.gli...@inria.fr

 gcc/
  * tree-ssa-alias.c (ao_ref_init_from_ptr_and_size): Look for an
  ADDR_EXPR in the defining statement.


 Shouldn't the ADDR_EXPR have been propagated into the use?



 Maybe when the address is a constant, but here it comes from malloc.


 points-to should have propagated the alias info, so no, looking at
 def-stmts in random places like this isn't ok.  Where does alias info
 get lost?


 By points-to, do you mean in pt_solution? That's very rough information,
 and in particular here it can't tell me that 2 fields of the same struct
 don't alias, can it?

No, it cannot.

 It doesn't get lost, but this is the missed optimization that malloc
 results still alias with global pointers (here a function argument).
 Taking into account the offset shouldn't change anything.


 I think this is a different issue (but if you say improving malloc would
 help, maybe).


 I suppose you are looking for the memcpy to be folded into an
 assignment?  Which ao_ref_from_ptr_and_size is critical for
 the transform - it doesn't seem to be the one in memory op folding.


 No, I only used memcpy as an example, my goal is for the compiler to notice
 that the call (memcpy here, possibly other functions with some new attribute
 later) doesn't clobber the counter (i) and thus the counter value is the
 same after the call as it was before.

Ah, so you are looking at call_may_clobber_ref_p_1 and its pointer handling
when special-casing builtins?  Note that fields can only be disambiguated
if the size of the access is known (not sure what fancy attribute you
are going to invent here ...).  Generally the simple alias machinery
is written to be cheap, walking use-def chains isn't.  Users do not have
to expect use-def chains to be walked (we can change that rule of course,
but for example the RTL alias machinery also shares the core of the
gimple alias machinery.

But I can see that in this particular case the patch makes sense.  Still,

+  if (TREE_CODE (ptr) == SSA_NAME)
+{
+  gimple stmt = SSA_NAME_DEF_STMT (ptr);
+  if (gimple_assign_single_p (stmt)
+  !gimple_has_volatile_ops (stmt)
+  gimple_assign_rhs_code (stmt) == ADDR_EXPR)
+   ptr = gimple_assign_rhs1 (stmt);
+}

no need to look at gimple_has_volatile_ops (stmt).  Also you want to
handle

 p_2 = p_1 + CST;
 foo (p_2);

which has a related canonical form,

 p_2 = MEM[p_1, CST];

The patch is ok with the volatile check removed, you can followup
with handling POINTER_PLUS_EXPR if you like.

Thanks,
Richard.

 If memcpy gets turned to an assignment (which alignment should prevent), my
 testcase becomes useless.

 --
 Marc Glisse


Re: Aliasing: look through pointer's def stmt

2013-10-25 Thread Marc Glisse

On Fri, 25 Oct 2013, Richard Biener wrote:


Ah, so you are looking at call_may_clobber_ref_p_1 and its pointer handling
when special-casing builtins?


Yes.


Note that fields can only be disambiguated
if the size of the access is known


TBAA could also help sometimes.


(not sure what fancy attribute you are going to invent here ...).


That will certainly require quite a bit of discussion...


Generally the simple alias machinery is written to be cheap,


I wouldn't mind an expensive version ;-)


walking use-def chains isn't.


Peeking at the defining statement shouldn't be very costly, as long as you 
don't do it recursively.



no need to look at gimple_has_volatile_ops (stmt).


Ok.


Also you want to handle

p_2 = p_1 + CST;
foo (p_2);

which has a related canonical form,

p_2 = MEM[p_1, CST];


This testcase seems relevant, I'll see if I can handle it:

void f (const char *c, int *i)
{
  *i = 42;
  __builtin_memcpy (i + 1, c, sizeof (int));
  if (*i != 42) __builtin_abort();
}


The patch is ok with the volatile check removed, you can followup
with handling POINTER_PLUS_EXPR if you like.


Thanks.

--
Marc Glisse


Re: Aliasing: look through pointer's def stmt

2013-10-25 Thread Tobias Burnus

Marc Glisse wrote:

I noticed that in some cases we were failing to find aliasing
information because we were only looking at an SSA_NAME variable,
missing the fact that it was really an ADDR_EXPR. The attached patch
passes bootstrap+testsuite, does it make sense? (I am a bit afraid of
losing some type information for instance)

I didn't investigate the 2 tests where I had to remove dg-bogus, because
removing dg-bogus sounds like a bonus...


I wonder why you are seeing failures with those test cases. I tired your 
patch (w/o modifying the test cases) and they passed.


The idea of those test cases is to ensure that there is no output like
note: loop versioned for vectorization because of possible aliasing
Because that output would be a hint that #pragma GCC ivdep doesn't work.

What kind of output do you see? Seemingly not the one above as you kept 
the version dg-bogus.


Tobias


Re: Aliasing: look through pointer's def stmt

2013-10-25 Thread Marc Glisse

On Fri, 25 Oct 2013, Tobias Burnus wrote:


Marc Glisse wrote:

I noticed that in some cases we were failing to find aliasing
information because we were only looking at an SSA_NAME variable,
missing the fact that it was really an ADDR_EXPR. The attached patch
passes bootstrap+testsuite, does it make sense? (I am a bit afraid of
losing some type information for instance)

I didn't investigate the 2 tests where I had to remove dg-bogus, because
removing dg-bogus sounds like a bonus...


I wonder why you are seeing failures with those test cases. I tired your 
patch (w/o modifying the test cases) and they passed.


The idea of those test cases is to ensure that there is no output like
note: loop versioned for vectorization because of possible aliasing
Because that output would be a hint that #pragma GCC ivdep doesn't work.

What kind of output do you see? Seemingly not the one above as you kept the 
version dg-bogus.


beats self on the head
My source directory includes the word alias in it. So I'll leave those 2 
dg-bogus alone, but please tighten the regexp a bit, include at least a 
space or something...


--
Marc Glisse


Re: Aliasing: look through pointer's def stmt

2013-10-25 Thread Jeff Law

On 10/25/13 03:29, Marc Glisse wrote:



It doesn't get lost, but this is the missed optimization that malloc
results still alias with global pointers (here a function argument).
Taking into account the offset shouldn't change anything.


I think this is a different issue (but if you say improving malloc would
help, maybe).
Has the patent on Steensgaard's techniques expired yet?  IIRC it would 
handle this kind of stuff quite well.


Jeff


Re: Aliasing: look through pointer's def stmt

2013-10-25 Thread Marc Glisse

On Fri, 25 Oct 2013, Richard Biener wrote:


you can followup with handling POINTER_PLUS_EXPR if you like.


Like this? (bootstrap+testsuite on x86_64-unknown-linux-gnu)

2013-10-26  Marc Glisse  marc.gli...@inria.fr

gcc/
* tree-ssa-alias.c (ao_ref_init_from_ptr_and_size): Look for a
POINTER_PLUS_EXPR in the defining statement.

gcc/testsuite/
* gcc.dg/tree-ssa/alias-24.c: New file.

--
Marc GlisseIndex: gcc/testsuite/gcc.dg/tree-ssa/alias-24.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/alias-24.c(revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/alias-24.c(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options -O2 -fdump-tree-optimized } */
+
+void f (const char *c, int *i)
+{
+  *i = 42;
+  __builtin_memcpy (i + 1, c, sizeof (int));
+  if (*i != 42) __builtin_abort();
+}
+
+/* { dg-final { scan-tree-dump-not abort optimized } } */
+/* { dg-final { cleanup-tree-dump optimized } } */
+

Property changes on: gcc/testsuite/gcc.dg/tree-ssa/alias-24.c
___
Added: svn:keywords
## -0,0 +1 ##
+Author Date Id Revision URL
\ No newline at end of property
Added: svn:eol-style
## -0,0 +1 ##
+native
\ No newline at end of property
Index: gcc/tree-ssa-alias.c
===
--- gcc/tree-ssa-alias.c(revision 204071)
+++ gcc/tree-ssa-alias.c(working copy)
@@ -567,20 +567,28 @@ void
 ao_ref_init_from_ptr_and_size (ao_ref *ref, tree ptr, tree size)
 {
   HOST_WIDE_INT t1, t2;
   ref-ref = NULL_TREE;
   if (TREE_CODE (ptr) == SSA_NAME)
 {
   gimple stmt = SSA_NAME_DEF_STMT (ptr);
   if (gimple_assign_single_p (stmt)
   gimple_assign_rhs_code (stmt) == ADDR_EXPR)
ptr = gimple_assign_rhs1 (stmt);
+  else if (is_gimple_assign (stmt)
+   gimple_assign_rhs_code (stmt) == POINTER_PLUS_EXPR
+   host_integerp (gimple_assign_rhs2 (stmt), 1))
+   {
+ ao_ref_init_from_ptr_and_size (ref, gimple_assign_rhs1 (stmt), size);
+ ref-offset += 8 * TREE_INT_CST_LOW (gimple_assign_rhs2 (stmt));
+ return;
+   }
 }
 
   if (TREE_CODE (ptr) == ADDR_EXPR)
 ref-base = get_ref_base_and_extent (TREE_OPERAND (ptr, 0),
 ref-offset, t1, t2);
   else
 {
   ref-base = build2 (MEM_REF, char_type_node,
  ptr, null_pointer_node);
   ref-offset = 0;


Re: Aliasing: look through pointer's def stmt

2013-10-24 Thread Jeff Law

On 10/24/13 23:23, Marc Glisse wrote:

Hello,

I noticed that in some cases we were failing to find aliasing
information because we were only looking at an SSA_NAME variable,
missing the fact that it was really an ADDR_EXPR. The attached patch
passes bootstrap+testsuite, does it make sense? (I am a bit afraid of
losing some type information for instance)

I didn't investigate the 2 tests where I had to remove dg-bogus, because
removing dg-bogus sounds like a bonus...

2013-10-25  Marc Glisse  marc.gli...@inria.fr

gcc/
 * tree-ssa-alias.c (ao_ref_init_from_ptr_and_size): Look for an
 ADDR_EXPR in the defining statement.

Shouldn't the ADDR_EXPR have been propagated into the use?

Jeff



Re: aliasing warnings

2010-04-23 Thread Ian Lance Taylor
Eduard Hasenleithner edu...@hasenleithner.at writes:

 Since gcc 4.4 I get considerably more
 dereferencing type-punned pointer will break strict-aliasing rules
 warnings with my code. The code ist used for message en- and de-capsulation.

This question is not appropriate for the mailing list gcc@gcc.gnu.org,
which is for gcc developers.  It would be appropriate for
gcc-h...@gcc.gnu.org.  Please take any followups to gcc-help.  Thanks.


This warning will reliably report cases where your code may be
miscompiled.  It does not reliably report all cases where the C/C++
aliasing rules are broken.  We can only report warnings that we can
find, and the warnings are limited by the quality of the alias
analysis done in the compiler.  As the alias analysis gets more
sophisticated, the warnings detects more invalid code.


 struct msg {
   uint8_t msg_type;
   uint8_t msg_len;
   uint8_t msg_data[0];
 };

 struct msg *msg = receive();
 if (msg-msg_type==5  msg-msg_len==8) {
   printf(a: %d\n, *(uint32_t*)(msg-msg_data+0));
   printf(b: %d\n, *(uint32_t*)(msg-msg_data+4));
 }

This looks to me like a pretty clear aliasing violation.  You are
accessing data defined in one type using a different type.  The
difference between the two cases is admittedly rather subtle.
msg-msg_data+0 is accessing the object msg-msg_data which has type
uint8_t[0].  msg-msg_data+4 is already wandering off into unspecified
behaviour, as you are accessing beyond the end of the array (the C99
syntax for a flexible array is msg_data[], not msg_data[0]; I can't
recall whether C++ permits this).  The type of that is uint8_t*.  This
is still an aliasing violation, but evidently one that the compiler
does not detect.

Ian


Re: Aliasing bug

2009-07-02 Thread Richard Guenther
On Thu, Jul 2, 2009 at 3:21 PM, Andrew Stubbsa...@codesourcery.com wrote:
 Hi all,

 I'm fairly sure I have found an aliasing bug in GCC, although I could be
 wrong. I've reproduced it in both 4.4 and mainline.

 Consider this testcase, aliasing.c:

    extern void *foo;

    extern inline short **
    f1 (void)
    {
      union
        {
          void **v;
          short **s;
        } u;

      u.v = (foo);
      if (*u.s == 0) *u.s = (short *)42;
      return u.s;
    }

    const short *a, *b;

    int
    f ()
    {
      a = *f1();
      b = *f1();
    }

 The (not very useful) testcase initialises foo to 42, if necessary, and then
 sets both 'a' and 'b' to equal foo. There should be no way that 'a' and 'b'
 can ever be set to zero.

 Compile the code as follows:

  sh-linux-gnu-gcc -c aliasing.c -O2 -fdump-tree-all

 The dump file aliasing.c.133t.optimized (the last tree dump) then contains:

    f ()
    {
      void * foo.3;
      short int * D.1982;
      short int * * D.1973;

    bb 2:
      foo.3_10 = foo;
      D.1982_26 = (short int *) foo.3_10;
      if (D.1982_26 == 0B)
        goto bb 3;
      else
        goto bb 5;

    bb 3:
      D.1973_13 = (short int * *) foo;
      *D.1973_13 = 42B;
      a = 0B;

    bb 4:
      b = D.1982_26;
      return;

    bb 5:
      a = D.1982_26;
      goto bb 4;

    }

 This is the state of the code after the tree optimisations. Both 'a' and 'b'
 are set to the initial value of foo, before it was initialised. Not only
 that, but 'a' is explicitly set to zero.

 This problem goes away if -fno-strict-aliasing is used.

 Is this a compiler bug? Or have I got something wrong in my code?

You are writing to memory of type void * via an lvalue of type short *.

Richard.

 Thanks

 Andrew



Re: Aliasing bug

2009-07-02 Thread Andrew Stubbs

On 02/07/09 14:26, Richard Guenther wrote:

You are writing to memory of type void * via an lvalue of type short *.


Yes, there is type punning there, but that should work, shouldn't it?

This code is distilled from some glibc code I'm having trouble with.

Andrew


Re: Aliasing bug

2009-07-02 Thread Richard Guenther
On Thu, Jul 2, 2009 at 3:29 PM, Andrew Stubbsa...@codesourcery.com wrote:
 On 02/07/09 14:26, Richard Guenther wrote:

 You are writing to memory of type void * via an lvalue of type short *.

 Yes, there is type punning there, but that should work, shouldn't it?

No, that's invalid.  You would have to do

extern union {
  void *foo;
  short *bar;
};

using the union for the double-indirect pointer doesn't help.  Or
simply use memcpy to store to foo.

Richard.


Re: Aliasing bug

2009-07-02 Thread Andrew Stubbs

On 02/07/09 14:34, Richard Guenther wrote:

No, that's invalid.  You would have to do

extern union {
   void *foo;
   short *bar;
};

using the union for the double-indirect pointer doesn't help.  Or
simply use memcpy to store to foo.


Ah, I did not know that. I still don't understand how a reference to a 
memory location that happens to contain a pointer is different to one 
what contains other data?


Anyway, I see that the glibc code has, in fact, already been fixed here: 
http://sourceware.org/ml/libc-alpha/2008-11/msg4.html


Thank you.

Andrew


Re: Aliasing bug

2009-07-02 Thread Andrew Haley
Andrew Stubbs wrote:
 On 02/07/09 14:34, Richard Guenther wrote:
 No, that's invalid.  You would have to do

 extern union {
void *foo;
short *bar;
 };

 using the union for the double-indirect pointer doesn't help.  Or
 simply use memcpy to store to foo.
 
 Ah, I did not know that. I still don't understand how a reference to a
 memory location that happens to contain a pointer is different to one
 what contains other data?

It's easy enough to find out: just look at the C standard.  6.3.2.3, Pointers.

Andrew.


Re: Aliasing bug

2009-07-02 Thread Richard Guenther
On Thu, Jul 2, 2009 at 3:46 PM, Andrew Stubbsa...@codesourcery.com wrote:
 On 02/07/09 14:34, Richard Guenther wrote:

 No, that's invalid.  You would have to do

 extern union {
   void *foo;
   short *bar;
 };

 using the union for the double-indirect pointer doesn't help.  Or
 simply use memcpy to store to foo.

 Ah, I did not know that. I still don't understand how a reference to a
 memory location that happens to contain a pointer is different to one what
 contains other data?

It is not different.

 Anyway, I see that the glibc code has, in fact, already been fixed here:
 http://sourceware.org/ml/libc-alpha/2008-11/msg4.html

Great.

Richard.

 Thank you.

 Andrew



Re: Aliasing error in man page?

2007-01-10 Thread Ian Lance Taylor
Andrew Haley [EMAIL PROTECTED] writes:

 ---
 *(void **) (cosine) = dlsym(handle, cos);
 ---

 That is a strict-aliasing error, is it?

Yes, and a seemingly pointless one at that.

Ian


Re: Aliasing: reliable code or not?

2006-11-29 Thread Jakub Jelinek
On Tue, Nov 28, 2006 at 11:36:19PM -0800, Ian Lance Taylor wrote:
 Or you can use constructor expressions to make this slightly more
 elegant, though I retain the assumptions about type sizes:
 
 char *foo1(char* buf){
   memcpy(buf, (char[]) { 42 }, 1);
   buf += 1;
   memcpy(buf, (short[]) { 0xfeed }, 2);
   buf += 2;
   memcpy(buf, (int[]) { 0x12345678 }, 4);
   buf += 4;
   memcpy(buf, (int[]) { 0x12345678 }, 4);
   buf += 4;
   return buf;
 }

Or even use mempcpy to make it even more compact:

char *
foo1 (char *buf)
{
  buf = mempcpy (buf, (char[]) { 42 }, 1);
  buf = mempcpy (buf, (short[]) { 0xfeed }, 2);
  buf = mempcpy (buf, (int[]) { 0x12345678 }, 4);
  buf = mempcpy (buf, (int[]) { 0x12345678 }, 4);
  return buf;
}

Jakub


Re: Aliasing: reliable code or not?

2006-11-28 Thread Andrew Pinski
 
 I have code that goes something like this:
 
 char *foo(char *buf){
 *buf++ = 42;
 *((short*)buf) = 0xfeed;
 buf += 2;
 *((int*)buf) = 0x12345678;
 buf += 4;
 *((int*)buf) = 0x12345678;
 buf += 4;
 return buf;
 }

This does violate C aliasing rules.

Here is how I would write it so you can get the best results:

char *foo(char *buf)
{
  short temp;
  int temp1;
  *buf++=42;
  temp = 0xfeed;
  memcpy(buf, temp, sizeof(temp));
  buf+=sizeof(temp);
  temp1 = 0x12345678;
  memcpy(buf, temp1, sizeof(temp1));
  buf+=sizeof(temp1);
  temp1 = 0x12345678;
  memcpy(buf, temp1, sizeof(temp1));
  buf+=sizeof(temp1);
  return buf;
}

As using memcpy does not cause a violation of the aliasing rules.

And does the correct thing:
lis 11,0x1234
li 0,42
li 9,-275
ori 11,11,22136
stb 0,0(3)
sth 9,1(3)
stw 11,7(3)
stw 11,3(3)
addi 3,3,11


-- Pinski


Re: Aliasing: reliable code or not?

2006-11-28 Thread Ian Lance Taylor
Andrew Pinski [EMAIL PROTECTED] writes:

 Here is how I would write it so you can get the best results:
 
 char *foo(char *buf)
 {
   short temp;
   int temp1;
   *buf++=42;
   temp = 0xfeed;
   memcpy(buf, temp, sizeof(temp));
   buf+=sizeof(temp);
   temp1 = 0x12345678;
   memcpy(buf, temp1, sizeof(temp1));
   buf+=sizeof(temp1);
   temp1 = 0x12345678;
   memcpy(buf, temp1, sizeof(temp1));
   buf+=sizeof(temp1);
   return buf;
 }

Or you can use constructor expressions to make this slightly more
elegant, though I retain the assumptions about type sizes:

char *foo1(char* buf){
  memcpy(buf, (char[]) { 42 }, 1);
  buf += 1;
  memcpy(buf, (short[]) { 0xfeed }, 2);
  buf += 2;
  memcpy(buf, (int[]) { 0x12345678 }, 4);
  buf += 4;
  memcpy(buf, (int[]) { 0x12345678 }, 4);
  buf += 4;
  return buf;
}

Ian


Re: Aliasing sets on Arrays Types

2006-03-21 Thread Richard Guenther
On 3/21/06, Andrew Pinski [EMAIL PROTECTED] wrote:
 Take the following C code:
 typedef long atype[];
 typedef long atype1[];

 int NumSift (atype *a, atype1 *a1)
 {
(*a)[0] = 0;
(*a1)[0] = 1;
return (*a)[0];
 }
 Shouldn't the aliasing set for the type atype be the same as atype1?

Im not entirely sure, but the C99 standard does at least not suggest
otherwise, instead it says (6.7.7/3) A typedef declaration does not introduce
a new type, only a synonym for the type so specified.  And continues
with an example

After the declarations
typedef struct s1 { int x; } t1, *tp1;
typedef struct s2 { int x; } t2, *tp2;
type t1 and the type pointed to by tp1 are compatible. Type t1 is also
compatible with type struct
s1, but not compatible with the types struct s2, t2, the type pointed
to by tp2, or int.

where it explicitly says sth about structures, but not array types.

Richard.


Re: Aliasing sets on Arrays Types

2006-03-21 Thread Andrew Haley
Richard Guenther writes:
  On 3/21/06, Andrew Pinski [EMAIL PROTECTED] wrote:
   Take the following C code:
   typedef long atype[];
   typedef long atype1[];
  
   int NumSift (atype *a, atype1 *a1)
   {
  (*a)[0] = 0;
  (*a1)[0] = 1;
  return (*a)[0];
   }
   Shouldn't the aliasing set for the type atype be the same as atype1?
  
  Im not entirely sure, but the C99 standard does at least not suggest
  otherwise, instead it says (6.7.7/3) A typedef declaration does not 
  introduce
  a new type, only a synonym for the type so specified.

atype and atype1 are compatible bacsue of 6.7.5.2, Array declarators:

6   For two array types to be compatible, both shall have compatible
element types, and if both size specifiers are present, and are
integer constant expressions, then both size specifiers shall have
the same constant value. If the two array types are used in a
context which requires them to be compatible, it is undefined
behavior if the two size specifiers evaluate to unequal values.

I assume that compatible types should be in the same alias set.

Andrew.



Re: Aliasing violation generated by fold_builtin_memcmp?

2005-09-30 Thread Olivier Hainque
Richard Henderson wrote:
 Try
 
   cst_uchar_ptr_node
 = build_pointer_type_for_mode (cst_uchar_node, ptr_mode, true);
 
 which is apparently in use by the Ada front end, but only if a
 certain pragma is given.  Dunno how reliably that's likely to work.

 We are seeing regressions in our local testsuite on cases exercising
 that pragma.

 Passing 'true' as CAN_ALIAS_ALL sets TYPE_REF_CAN_ALIAS_ALL (t), but
 this apparently has no influence on what tree-ssa-alias computes.

 Out of a preliminary look into this code (new to me), a possible place
 to address that appears to be 'get_tmt_for', which presumably should assign
 a zero alias set to tags for pointer types with that bit set.

 The current code doesn't do that:

   tree tag_type = TREE_TYPE (TREE_TYPE (ptr));
   HOST_WIDE_INT tag_set = get_alias_set (tag_type);


 I'd be happy to work-on and submit a patch to deal with this the proper
 way. I'd welcome hints or directions on what the proper way should be, as
 I don't yet have a global view of the complete alias analysis circuitry.

 The 'tag alias set should be zero if ...' idea above seems logical to me.
 I still may well not yet see a number of other options.





 




 
 


Re: Aliasing violation generated by fold_builtin_memcmp?

2005-09-30 Thread Daniel Berlin
On Fri, 2005-09-30 at 11:23 +0200, Olivier Hainque wrote:
 Richard Henderson wrote:
  Try
  
cst_uchar_ptr_node
  = build_pointer_type_for_mode (cst_uchar_node, ptr_mode, true);
  
  which is apparently in use by the Ada front end, but only if a
  certain pragma is given.  Dunno how reliably that's likely to work.
 
  We are seeing regressions in our local testsuite on cases exercising
  that pragma.
 
  Passing 'true' as CAN_ALIAS_ALL sets TYPE_REF_CAN_ALIAS_ALL (t), but
  this apparently has no influence on what tree-ssa-alias computes.
 
  Out of a preliminary look into this code (new to me), a possible place
  to address that appears to be 'get_tmt_for', which presumably should assign
  a zero alias set to tags for pointer types with that bit set.

Actually, you just want it to assign the same tag, not change the alias
set of every tag it assigns.




Re: Aliasing violation generated by fold_builtin_memcmp?

2005-09-30 Thread Olivier Hainque
Daniel Berlin wrote:
   Out of a preliminary look into this code (new to me), a possible
   place to address that appears to be 'get_tmt_for', which
   presumably should assign a zero alias set to tags for pointer
   types with that bit set.
 
 Actually, you just want it to assign the same tag, not change the alias
 set of every tag it assigns.

 Humm, and still have the corresponding alias set zero to ensure it conflicts
 with everything, right ?

 In which case almost all the bits are in already, since get_tmt_for
 already reuses a previously created tag if it assigned the same alias set.

 I tried this on a couple of testcases yesterday, and it worked fine.

 I am still unclear on one point: is it fine to reuse the same tag for
 possibly different designated types ?











Re: Aliasing violation generated by fold_builtin_memcmp?

2005-09-30 Thread Daniel Berlin
On Fri, 2005-09-30 at 15:54 +0200, Olivier Hainque wrote:
 Daniel Berlin wrote:
Out of a preliminary look into this code (new to me), a possible
place to address that appears to be 'get_tmt_for', which
presumably should assign a zero alias set to tags for pointer
types with that bit set.
  
  Actually, you just want it to assign the same tag, not change the alias
  set of every tag it assigns.
 
  Humm, and still have the corresponding alias set zero to ensure it conflicts
  with everything, right ?

Yes. :)

 
  In which case almost all the bits are in already, since get_tmt_for
  already reuses a previously created tag if it assigned the same alias set.
 
  I tried this on a couple of testcases yesterday, and it worked fine.
 
  I am still unclear on one point: is it fine to reuse the same tag for
  possibly different designated types ?

Yes, as long as they have the same alias set.

 
 
 
 
 
 
 
 
 



Re: Aliasing violation generated by fold_builtin_memcmp?

2005-09-30 Thread Olivier Hainque
Daniel Berlin wrote:
   I am still unclear on one point: is it fine to reuse the same tag for
   possibly different designated types ?
 
 Yes, as long as they have the same alias set.

 OK. A last detail:

 On the first tag_set 0 creation, we get into:

  if (var_ann (ptr)-type_mem_tag == NULL_TREE)
tag = create_memory_tag (tag_type, true);

 and, if doing nothing special, trip on

 /* Make sure that the type tag has the same alias set as the
pointed-to type.  */
 gcc_assert (tag_set == get_alias_set (tag));

 I've relaxed the assert expression for experimentation purposes, but
 this is probably not the best thing to do.

 How would you prefer to see this addressed ?





 


Re: Aliasing violation generated by fold_builtin_memcmp?

2005-09-30 Thread Daniel Berlin
On Fri, 2005-09-30 at 16:29 +0200, Olivier Hainque wrote:
 Daniel Berlin wrote:
I am still unclear on one point: is it fine to reuse the same tag for
possibly different designated types ?
  
  Yes, as long as they have the same alias set.
 
  OK. A last detail:
 
  On the first tag_set 0 creation, we get into:
 
   if (var_ann (ptr)-type_mem_tag == NULL_TREE)
   tag = create_memory_tag (tag_type, true);
 
  and, if doing nothing special, trip on
 
  /* Make sure that the type tag has the same alias set as the
   pointed-to type.  */
  gcc_assert (tag_set == get_alias_set (tag));

Well, doesn't the pointed-to type have set 0 because of
TYPE_REF_CAN_ALIAS_ALL (or whatever it's named :P)?
I'm a bit confused.




Re: Aliasing violation generated by fold_builtin_memcmp?

2005-09-30 Thread Olivier Hainque
Daniel Berlin wrote:
 Well, doesn't the pointed-to type have set 0 because of
 TYPE_REF_CAN_ALIAS_ALL (or whatever it's named :P)?

 Not quite: the pointer type has TYPE_REF_CAN_ALIAS_ALL, not the
 pointed-to type:

   /* Nonzero in a pointer or reference type means the data pointed to
  by this type can alias anything.  */
   #define TYPE_REF_CAN_ALIAS_ALL(NODE) \
 (PTR_OR_REF_CHECK (NODE)-common.static_flag)

 It seems to me that get_tmt does not do the right thing today
 because it assigns the tag alias set from the alias set of the
 pointed-to type, even if CAN_ALIAS_ALL is set on the pointer type.

 Here is an example input tree out of an Ada testcase:

 var_decl 0x4022f370 ksubint.2
type pointer_type 0x4025fbdc p2__integer_access
type integer_type 0x4025fb80 integer type integer_type integer
sizes-gimplified public SI
user align 32 symtab 0 alias set 3 

sizes-gimplified public static unsigned SI
^^
user align 32 symtab 0 alias set -1

 This is a pointer to integer which is declared can actually alias anything.












 


Re: Aliasing violation generated by fold_builtin_memcmp?

2005-09-30 Thread Daniel Berlin
On Fri, 2005-09-30 at 16:53 +0200, Olivier Hainque wrote:
 Daniel Berlin wrote:
  Well, doesn't the pointed-to type have set 0 because of
  TYPE_REF_CAN_ALIAS_ALL (or whatever it's named :P)?
 
  Not quite: the pointer type has TYPE_REF_CAN_ALIAS_ALL, not the
  pointed-to type:
 
/* Nonzero in a pointer or reference type means the data pointed to
   by this type can alias anything.  */
#define TYPE_REF_CAN_ALIAS_ALL(NODE) \
  (PTR_OR_REF_CHECK (NODE)-common.static_flag)
 
  It seems to me that get_tmt does not do the right thing today
  because it assigns the tag alias set from the alias set of the
  pointed-to type, even if CAN_ALIAS_ALL is set on the pointer type.

Uh, CAN_ALIAS_ALL seems like a very bad hack then.
You should simply be creating a pointed-to type that aliases set 0, and
using that for the pointed to type.
That is, after all, what alias set 0 is for.




Re: Aliasing violation generated by fold_builtin_memcmp?

2005-09-30 Thread Richard Kenner
Uh, CAN_ALIAS_ALL seems like a very bad hack then.
You should simply be creating a pointed-to type that aliases set 0, and
using that for the pointed to type.
That is, after all, what alias set 0 is for.

That's easy enough for integer types, but creating multiple record types
means duplicating lots of fields and layouts and is, in general, a mess.
What's special here is the pointer type, not the underlying type.


Re: Aliasing violation generated by fold_builtin_memcmp?

2005-09-30 Thread Daniel Berlin
On Fri, 2005-09-30 at 11:20 -0400, Richard Kenner wrote:
 Uh, CAN_ALIAS_ALL seems like a very bad hack then.
 You should simply be creating a pointed-to type that aliases set 0, and
 using that for the pointed to type.
 That is, after all, what alias set 0 is for.
 
 That's easy enough for integer types, but creating multiple record types
 means duplicating lots of fields and layouts and is, in general, a mess.

But of course, it's the right thing to do when you've got one type that
can be aliased to anything through a pointer, and the other cannot.

 What's special here is the pointer type, not the underlying type.

Which means that everywhere that handles pointer types and aliasing has
to be modified to check this magic.

I still claim it is a hack, and a very bad one.

At the absolute least, it should be encapsulated in a function, maybe
get_alias_set_of_pointed_to_type or something.

--Dan



Re: Aliasing violation generated by fold_builtin_memcmp?

2005-09-30 Thread Richard Kenner
 But of course, it's the right thing to do when you've got one type that
 can be aliased to anything through a pointer, and the other cannot.

Sure, if that's what were going on, but it's not.

What we have are two *pointer types*, both pointing at the same
underlying type.  Access via one of those pointer types is assumed to
be aliasing anything and the other is not.

The case in Ada is

type r1 is record ... end record;
type ar1 is access all r1;
type r2 is record ... end record;
type ar2 is access all r2;

type ar2p is access all r2;
function uc is new unchecked_conversion (ar1, ar2p);

In this case, both ar2.all and ar2p.all are accessing r2.  But accesses
through ar2p must be presumed to access anything since we know that the
underlying object might be an ar1.

  What's special here is the pointer type, not the underlying type.

 Which means that everywhere that handles pointer types and aliasing has
 to be modified to check this magic.

At the RTL level, there's just one place that computes the alias set of an
expression and it knows about that flag.  It was the only place that had
to be modified to support it, as I recall.  One would hope that knowlege
of this semantics can be similarly localized at the tree level.  Why can't it?


Re: Aliasing violation generated by fold_builtin_memcmp?

2005-09-30 Thread Daniel Berlin
On Fri, 2005-09-30 at 12:45 -0400, Richard Kenner wrote:

   What's special here is the pointer type, not the underlying type.
 
  Which means that everywhere that handles pointer types and aliasing has
  to be modified to check this magic.
 
 At the RTL level, there's just one place that computes the alias set of an
 expression and it knows about that flag.  It was the only place that had
 to be modified to support it, as I recall.  One would hope that knowlege
 of this semantics can be similarly localized at the tree level.  Why can't it?

Because the RTL level only supports type based aliasing, and very simple
TBAA at that.

You are saying that an access through this pointer can point to
anything, regardless of what we think it points to.

This leaves 3 or 4 places at the tree level that need to be changed, and
possibly more later as more aliasing techniques are added.




Re: Aliasing violation generated by fold_builtin_memcmp?

2005-09-30 Thread Richard Kenner
 Because the RTL level only supports type based aliasing, and very simple
 TBAA at that.

But we're just talking about type-based aliasing.

 You are saying that an access through this pointer can point to
 anything, regardless of what we think it points to.

No.  If you know what it points to by value-based information, that can
safely be used.  The only thing is that you must use alias set 0 for
any type-based information.

 This leaves 3 or 4 places at the tree level that need to be changed,
 and possibly more later as more aliasing techniques are added.

I don't see why.  Why is there more than one place that computes the alias
set of what a pointer might point to?


Re: Aliasing violation generated by fold_builtin_memcmp?

2005-09-30 Thread Andrew Pinski


On Sep 29, 2005, at 8:32 PM, Daniel Berlin wrote:




Any suggestions how to fix this?


The easiest thing is to store a version of unsigned_char_type_node
somewhere that has it's TYPE_ALIAS_SET set to 0, and use it there.
Whether this is the best solution, i'll leave to others :)



Something like this will fix the issue with TYPE_REF_CAN_ALIAS_ALL
by removing the use.  Maybe we could move may_alias to a bit in the
types and move TYPE_REF_CAN_ALIAS_ALL from the pointer type to the
type which is being accessed instead of this mess.

-- Pinski


Index: c-common.c
===
RCS file: /cvs/gcc/gcc/gcc/c-common.c,v
retrieving revision 1.653
diff -u -p -r1.653 c-common.c
--- c-common.c  27 Sep 2005 16:04:06 -  1.653
+++ c-common.c  30 Sep 2005 17:46:42 -
@@ -2668,9 +2668,6 @@ c_common_get_alias_set (tree t)
   || t == unsigned_char_type_node)
 return 0;
 
-  /* If it has the may_alias attribute, it can alias anything.  */
-  if (lookup_attribute (may_alias, TYPE_ATTRIBUTES (t)))
-return 0;
 
   /* The C standard specifically allows aliasing between signed and
  unsigned variants of the same type.  We treat the signed
Index: alias.c
===
RCS file: /cvs/gcc/gcc/gcc/alias.c,v
retrieving revision 1.254
diff -u -p -r1.254 alias.c
--- alias.c 21 Jul 2005 22:34:33 -  1.254
+++ alias.c 30 Sep 2005 17:46:42 -
@@ -587,6 +587,10 @@ get_alias_set (tree t)
   /* Now all we care about is the type.  */
   t = TREE_TYPE (t);
 }
+  
+  /* If it has the may_alias attribute, it can alias anything.  */
+  if (lookup_attribute (may_alias, TYPE_ATTRIBUTES (t)))
+return 0;
 
   /* Variant qualifiers don't affect the alias set, so get the main
  variant. If this is a type with a known alias set, return it.  */
Index: tree.c
===
RCS file: /cvs/gcc/gcc/gcc/tree.c,v
retrieving revision 1.505
diff -u -p -r1.505 tree.c
--- tree.c  14 Sep 2005 15:04:56 -  1.505
+++ tree.c  30 Sep 2005 17:46:42 -
@@ -4705,6 +4705,17 @@ build_pointer_type_for_mode (tree to_typ
 
   if (to_type == error_mark_node)
 return error_mark_node;
+  
+  if (can_alias_all)
+{
+  to_type =
+   build_type_attribute_variant (ttype,
+ merge_attributes
+   (TYPE_ATTRIBUTES (to_type), 
+tree_cons (get_identifier 
(may_alias),
+   NULL_TREE, NULL_TREE)));
+
+}
 
   /* In some cases, languages will have things that aren't a POINTER_TYPE
  (such as a RECORD_TYPE for fat pointers in Ada) as TYPE_POINTER_TO.
@@ -4721,14 +4732,13 @@ build_pointer_type_for_mode (tree to_typ
   /* First, if we already have a type for pointers to TO_TYPE and it's
  the proper mode, use it.  */
   for (t = TYPE_POINTER_TO (to_type); t; t = TYPE_NEXT_PTR_TO (t))
-if (TYPE_MODE (t) == mode  TYPE_REF_CAN_ALIAS_ALL (t) == can_alias_all)
+if (TYPE_MODE (t) == mode)
   return t;
 
   t = make_node (POINTER_TYPE);
 
   TREE_TYPE (t) = to_type;
   TYPE_MODE (t) = mode;
-  TYPE_REF_CAN_ALIAS_ALL (t) = can_alias_all;
   TYPE_NEXT_PTR_TO (t) = TYPE_POINTER_TO (to_type);
   TYPE_POINTER_TO (to_type) = t;
 
@@ -4754,6 +4764,18 @@ build_reference_type_for_mode (tree to_t
   bool can_alias_all)
 {
   tree t;
+  
+
+  if (can_alias_all)
+{
+  to_type =
+   build_type_attribute_variant (ttype,
+ merge_attributes
+   (TYPE_ATTRIBUTES (to_type), 
+tree_cons (get_identifier 
(may_alias),
+   NULL_TREE, NULL_TREE)));
+
+}
 
   /* In some cases, languages will have things that aren't a REFERENCE_TYPE
  (such as a RECORD_TYPE for fat pointers in Ada) as TYPE_REFERENCE_TO.
@@ -4770,14 +4792,13 @@ build_reference_type_for_mode (tree to_t
   /* First, if we already have a type for pointers to TO_TYPE and it's
  the proper mode, use it.  */
   for (t = TYPE_REFERENCE_TO (to_type); t; t = TYPE_NEXT_REF_TO (t))
-if (TYPE_MODE (t) == mode  TYPE_REF_CAN_ALIAS_ALL (t) == can_alias_all)
+if (TYPE_MODE (t) == mode)
   return t;
 
   t = make_node (REFERENCE_TYPE);
 
   TREE_TYPE (t) = to_type;
   TYPE_MODE (t) = mode;
-  TYPE_REF_CAN_ALIAS_ALL (t) = can_alias_all;
   TYPE_NEXT_REF_TO (t) = TYPE_REFERENCE_TO (to_type);
   TYPE_REFERENCE_TO (to_type) = t;
 
@@ -4809,12 +4830,12 @@ build_type_no_quals (tree t)
 case POINTER_TYPE:
   return build_pointer_type_for_mode (build_type_no_quals (TREE_TYPE (t)),
  TYPE_MODE (t),

Re: Aliasing violation generated by fold_builtin_memcmp?

2005-09-30 Thread Richard Henderson
On Fri, Sep 30, 2005 at 11:23:23AM +0200, Olivier Hainque wrote:
  We are seeing regressions in our local testsuite on cases exercising
  that pragma.

Hmm.

  The 'tag alias set should be zero if ...' idea above seems logical to me.
  I still may well not yet see a number of other options.

Perhaps we should instead get rid of this flag and instead
represent this as a variant of the pointed-to type with the
alias set forced to zero.  It does seem odd that there are
two ways to represent this...


r~


Re: Aliasing violation generated by fold_builtin_memcmp?

2005-09-30 Thread Richard Kenner
Yeah, so?  How often do you think this feature is used, anyway?

Quite a lot in legacy Ada code.

But how often it's used doesn't matter: if it's used *at all*, we still
have to create the mechanism for duplicating record types.


Re: Aliasing violation generated by fold_builtin_memcmp?

2005-09-29 Thread Daniel Berlin

 Any suggestions how to fix this?

The easiest thing is to store a version of unsigned_char_type_node
somewhere that has it's TYPE_ALIAS_SET set to 0, and use it there.
Whether this is the best solution, i'll leave to others :)

 
 Bye,
 Ulrich
 



Re: Aliasing violation generated by fold_builtin_memcmp?

2005-09-29 Thread Richard Henderson
On Fri, Sep 30, 2005 at 02:08:18AM +0200, Ulrich Weigand wrote:
  tree cst_uchar_node = build_type_variant (unsigned_char_type_node, 1, 0);
  tree cst_uchar_ptr_node = build_pointer_type (cst_uchar_node);
...
 Any suggestions how to fix this?

Try

  cst_uchar_ptr_node
= build_pointer_type_for_mode (cst_uchar_node, ptr_mode, true);

which is apparently in use by the Ada front end, but only if a
certain pragma is given.  Dunno how reliably that's likely to work.


r~


Re: Aliasing violation generated by fold_builtin_memcmp?

2005-09-29 Thread Richard Kenner
In short, the problem appears to be this code in fold_builtin_memcmp:

  /* If len parameter is one, return an expression corresponding to
 (*(const unsigned char*)arg1 - (const unsigned char*)arg2).  */
  if (host_integerp (len, 1)  tree_low_cst (len, 1) == 1)
{
  tree cst_uchar_node = build_type_variant (unsigned_char_type_node, 1, 0);
  tree cst_uchar_ptr_node = build_pointer_type (cst_uchar_node);

Any suggestions how to fix this?

Maybe change the above to
build_pointer_type_for_mode (cst_uchar_node, ptr_mode, true)