[Bug other/56955] documentation for attribute malloc contradicts itself

2014-05-23 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56955

--- Comment #21 from Richard Biener rguenth at gcc dot gnu.org ---
Author: rguenth
Date: Fri May 23 10:11:03 2014
New Revision: 210848

URL: http://gcc.gnu.org/viewcvs?rev=210848root=gccview=rev
Log:
2014-05-22  Paul Eggert  egg...@cs.ucla.edu

PR other/56955
* doc/extend.texi (Function Attributes): Fix  __attribute__ ((malloc))
documentation; the old documentation didn't clearly state the
constraints on the contents of the pointed-to storage.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/doc/extend.texi


[Bug other/56955] documentation for attribute malloc contradicts itself

2014-05-23 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56955

Richard Biener rguenth at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #22 from Richard Biener rguenth at gcc dot gnu.org ---
Fixed.


[Bug other/56955] documentation for attribute malloc contradicts itself

2014-05-22 Thread eggert at gnu dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56955

Paul Eggert eggert at gnu dot org changed:

   What|Removed |Added

  Attachment #32832|0   |1
is obsolete||

--- Comment #20 from Paul Eggert eggert at gnu dot org ---
Created attachment 32844
  -- https://gcc.gnu.org/bugzilla/attachment.cgi?id=32844action=edit
Revised-again doc patch for __attribute__ ((malloc))

Patch revised based on further comments on gcc-patches.


[Bug other/56955] documentation for attribute malloc contradicts itself

2014-05-21 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56955

Richard Biener rguenth at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2014-05-21
 CC||rguenth at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #15 from Richard Biener rguenth at gcc dot gnu.org ---
(In reply to Dan Gohman from comment #4)
 (In reply to comment #3)
  Well, it _is_ actually about the content.  There must be no way to compute
  a valid pointer to another object from the contents of the pointed-to
  memory.  
 
 Oh wow. That's a subtlety that completely escaped me.
 
  So if you initialize the memory to {0, 1, 2, 3, 4 ...} thus
  every possible byte value is somewhere and then do
  
void *p = (void *)(mem[3]  24 | mem[58]  16 | ...);
  
  then points-to analysis assumes that from the contents of 'mem' you
  can only compute pointers to nothing (NULL).  
 
 Is that example fundamentally different than something like this:
 
 void *q = (void *)(mem[0] + 0xb1ab1ab1a);
 
 In both cases, the information of the pointer value is in the expression,
 not in the memory.

It's exactly the same.

 Is it the case that the memory must be either actually zeros or
 uninitialized? Or could it contain other data which merely transmits no
 information about pointer values?

There must be no way to compute a pointer to an object by _just_
combining bits and bytes of that memory cleverly.  So for example
initializing the memory to -1 (all bits set) would not work as you can compute
a zero from 1 ^ 1 and thus any possible pointer value.  That's not
possible for zero.  Oh wait, you can do ~0.  Hmm ... subtle ;)

Ok, we ignore pointer values computed from FP values as well, thus
I guess technically only undefined content is really really valid.
Practically memory with non-pointer values is ok unless you play evil
(or very evil) games outlined above.

  Technically for targets
  where NULL is a valid poiner to an object calloc () may not be marked
  with malloc.
  
  That is, read it in the way that the code assumes the memory _may_ be
  zero-initialized (but only zero-initialized) or uninitialized.
 
 If this is what it means, then I request that the text be updated to say
 this. I'd be willing to propose a wording, once I understand the intent, if
 that'd be helpful.
 
 What should we say about the fact that GLIBC uses the malloc attribute on
 strdup (and similar things)? strdup actually could be used to transmit
 information about pointer values.

True.  See above though.

Note that the actual implementation (as opposed to what would be allowed
by the documentation) does:

  /* If this is not a real malloc call assume the memory was
 initialized and thus may point to global memory.  All
 builtin functions with the malloc attribute behave in a sane way.  */
  if (!fndecl
  || DECL_BUILT_IN_CLASS (fndecl) != BUILT_IN_NORMAL)
make_constraint_from (vi, nonlocal_id);

and thus restricts this to known functions (not only malloc, but also
strdup which we just expect you don't use to transfer pointers ...).


As of 'realloc' - yes we can special-case that in the compiler (we don't
do that), but we can't really re-use the existing 'malloc' attribute for that.

The proposed revised documentation looks like a good improvement to me,
Paul, can you post it to gcc-patches@?


[Bug other/56955] documentation for attribute malloc contradicts itself

2014-05-21 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56955

--- Comment #16 from Richard Biener rguenth at gcc dot gnu.org ---
One reason for why realloc is hard is that there is no language that says
it is undefined to access the object via the old pointer, but there is only
language that says the old and the new pointer values may be equal.  Thus,

void foo (int *p)
{
  int *q = realloc (p, sizeof (int));
  *q = 2;
}

may I remove the store *q = 2 as dead?  The new pointer doesn't escape
and thus nothing can read from it ...


[Bug other/56955] documentation for attribute malloc contradicts itself

2014-05-21 Thread sunfish at mozilla dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56955

--- Comment #17 from Dan Gohman sunfish at mozilla dot com ---
(In reply to Richard Biener from comment #16)
 One reason for why realloc is hard is that there is no language that says
 it is undefined to access the object via the old pointer, but there is only
 language that says the old and the new pointer values may be equal.

C89 was unclear, but C99 and now C11 7.22.3.5 say realloc deallocates the old
pointer, and there is no mention of the case where the pointers happen to be
equal. The interpretation of this to mean that old and new pointers don't
alias, even when being comparison-equal, has a serious following.


[Bug other/56955] documentation for attribute malloc contradicts itself

2014-05-21 Thread eggert at gnu dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56955

--- Comment #18 from Paul Eggert eggert at gnu dot org ---
(In reply to Richard Biener from comment #16)

 void foo (int *p)
 {
   int *q = realloc (p, sizeof (int));
   *q = 2;
 }
 
 may I remove the store *q = 2 as dead?

Yes, the consensus nowadays is that you can.

I'll be happy to send the proposed change to gcc-patches but would like to be
sure it's correct first.  Has this new information about realloc changed your
opinion about whether realloc can be given the malloc attribute?


[Bug other/56955] documentation for attribute malloc contradicts itself

2014-05-21 Thread rguenther at suse dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56955

--- Comment #19 from rguenther at suse dot de rguenther at suse dot de ---
On May 21, 2014 5:14:27 PM CEST, eggert at gnu dot org
gcc-bugzi...@gcc.gnu.org wrote:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56955

--- Comment #18 from Paul Eggert eggert at gnu dot org ---
(In reply to Richard Biener from comment #16)

 void foo (int *p)
 {
   int *q = realloc (p, sizeof (int));
   *q = 2;
 }
 
 may I remove the store *q = 2 as dead?

Yes, the consensus nowadays is that you can.

I'll be happy to send the proposed change to gcc-patches but would like
to be
sure it's correct first.  Has this new information about realloc
changed your
opinion about whether realloc can be given the malloc attribute?

No, it has not.

Richard.


[Bug other/56955] documentation for attribute malloc contradicts itself

2014-05-20 Thread eggert at gnu dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56955

Paul Eggert eggert at gnu dot org changed:

   What|Removed |Added

 CC||eggert at gnu dot org

--- Comment #6 from Paul Eggert eggert at gnu dot org ---
Created attachment 32831
  -- https://gcc.gnu.org/bugzilla/attachment.cgi?id=32831action=edit
Clarify documentation for __attribute__ ((malloc)).

This topic recently came up on the glibc mailing list and there's clearly a lot
of confusion about it.  See, for example,
https://sourceware.org/ml/libc-alpha/2014-05/msg00519.html.  Attaching a
proposed patch to the documentation to try to help clear this up.


[Bug other/56955] documentation for attribute malloc contradicts itself

2014-05-20 Thread carlos at redhat dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56955

Carlos O'Donell carlos at redhat dot com changed:

   What|Removed |Added

 CC||carlos at redhat dot com

--- Comment #7 from Carlos O'Donell carlos at redhat dot com ---
(In reply to Paul Eggert from comment #6)
 Created attachment 32831 [details]
 Clarify documentation for __attribute__ ((malloc)).
 
 This topic recently came up on the glibc mailing list and there's clearly a
 lot of confusion about it.  See, for example,
 https://sourceware.org/ml/libc-alpha/2014-05/msg00519.html.  Attaching a
 proposed patch to the documentation to try to help clear this up.

s/Ussing/Using/g, otherwise the patch in attachment #32831 looks great to me.


[Bug other/56955] documentation for attribute malloc contradicts itself

2014-05-20 Thread eggert at gnu dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56955

--- Comment #8 from Paul Eggert eggert at gnu dot org ---
Comment on attachment 32831
  -- https://gcc.gnu.org/bugzilla/attachment.cgi?id=32831
Clarify documentation for __attribute__ ((malloc)).

Index: gcc/ChangeLog
===
--- gcc/ChangeLog  (revision 210629)
+++ gcc/ChangeLog  (working copy)
@@ -1,3 +1,10 @@
+2014-05-20  Paul Eggert  egg...@cs.ucla.edu
+
+  PR other/56955
+  * doc/extend.texi (Function Attributes): Fix  __attribute__ ((malloc))
+  documentation; the old documentation didn't clearly state the
+  constraints on the contents of the pointed-to storage.
+
 2014-05-19  David Wohlferd d...@limegreensocks.com
 
   * doc/extend.texi: Create Label Attributes section,
Index: gcc/doc/extend.texi
===
--- gcc/doc/extend.texi(revision 210629)
+++ gcc/doc/extend.texi(working copy)
@@ -3207,15 +3207,20 @@
 
 @item malloc
 @cindex @code{malloc} attribute
-The @code{malloc} attribute is used to tell the compiler that a function
-may be treated as if any non-@code{NULL} pointer it returns cannot
-alias any other pointer valid when the function returns and that the memory
-has undefined content.
-This often improves optimization.
-Standard functions with this property include @code{malloc} and
-@code{calloc}.  @code{realloc}-like functions do not have this
-property as the memory pointed to does not have undefined content.
+This tells the compiler that a function is @code{malloc}-like, i.e.,
+that if the function returns a non-null pointer @var{P}, then @var{P}
+cannot alias any other pointer valid when the function returns, and
+moreover the contents of any storage addressed by @var{P} cannot
+contain a pointer that aliases any other pointer valid when the
+function returns.
 
+Ussing this attribute often improves optimization.  Functions like
+@code{malloc} and @code{calloc} have this property because they return
+a pointer to uninitialized or zeroed-out storage.  However, functions
+like @code{realloc} do not have this property, as they can return a
+pointer to storage containing pointers that alias already-valid
+pointers.
+
 @item mips16/nomips16
 @cindex @code{mips16} attribute
 @cindex @code{nomips16} attribute

[Bug other/56955] documentation for attribute malloc contradicts itself

2014-05-20 Thread eggert at gnu dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56955

Paul Eggert eggert at gnu dot org changed:

   What|Removed |Added

  Attachment #32831|0   |1
is obsolete||

--- Comment #9 from Paul Eggert eggert at gnu dot org ---
Created attachment 32832
  -- https://gcc.gnu.org/bugzilla/attachment.cgi?id=32832action=edit
Revised documentation patch for __attribute__ ((malloc))

Thanks for the quick review.  Revised patch attached.  Sorry about the noise in
my previous reply, I hit Submit by accident.


[Bug other/56955] documentation for attribute malloc contradicts itself

2014-05-20 Thread bugdal at aerifal dot cx
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56955

--- Comment #10 from Rich Felker bugdal at aerifal dot cx ---
I don't see how it's at all helpful for GCC to assume that memory obtained by
__attribute__((__malloc__)) functions does not contain pointers to anything
that existed before the call. This assumption only aids optimization in the
case where a pointer residing in the obtained memory is used (e.g. dereferenced
or compared with another pointer) before anything is stored to it. But with
GCC's assumption, such use would be UB anyway and thus cannot occur in a
correct program, so there's no sense in optimizing it.

The alternative is much more reasonable: assume that a pointer residing in the
obtained memory could alias any object whose address has already escaped
(roughly, anything but automatic or static/internal-linkage objects whose
addresses were not taken and passed to code the compiler can't see). This
allows __attribute__((__malloc__)) to be applied to realloc-like functions as
well as functions in third-party libraries which allocate non-opaque structures
whose members may point to data that's also accessible via other paths. And as
far as I can tell, it doesn't preclude any optimizations that could take place
in a code path that doesn't invoke UB.


[Bug other/56955] documentation for attribute malloc contradicts itself

2014-05-20 Thread eggert at gnu dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56955

--- Comment #11 from Paul Eggert eggert at gnu dot org ---
Created attachment 32834
  -- https://gcc.gnu.org/bugzilla/attachment.cgi?id=32834action=edit
Sample illustrating GCC's optimization with __attribute__ ((malloc))


[Bug other/56955] documentation for attribute malloc contradicts itself

2014-05-20 Thread eggert at gnu dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56955

--- Comment #12 from Paul Eggert eggert at gnu dot org ---
(In reply to Rich Felker from comment #10)
 This assumption only aids
 optimization in the case where a pointer residing in the obtained memory is
 used (e.g. dereferenced or compared with another pointer) before anything is
 stored to it.

No, it also aids optimization because GCC can infer lack of aliasing elsewhere,
even if no pointer in the newly allocated memory is used-before-set.  Consider
the contrived example am.c (which I've added as an attachment to this report). 
It has two functions f and g that differ only in that f calls m which has
__attribute__ ((malloc)) whereas g calls n which does not.  With the weaker
assumption you're suggesting, GCC could not optimize away the reload from
a-next in f, because of the intervening assignment '*p = q'.

I've compiled this with both GCC 4.9.0 and Clang 3.4 on x86-64 with -O2.  Both
compile g to essentially the same 15 instructions.  Clang, which I suspect uses
the weaker assumption, compiles f to 14 instructions; GCC compiles f to 11
instructions.


[Bug other/56955] documentation for attribute malloc contradicts itself

2014-05-20 Thread sunfish at mozilla dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56955

--- Comment #13 from Dan Gohman sunfish at mozilla dot com ---
(In reply to Paul Eggert from comment #12)
 (In reply to Rich Felker from comment #10)
  This assumption only aids
  optimization in the case where a pointer residing in the obtained memory is
  used (e.g. dereferenced or compared with another pointer) before anything is
  stored to it.
 
 No, it also aids optimization because GCC can infer lack of aliasing
 elsewhere, even if no pointer in the newly allocated memory is
 used-before-set.  Consider the contrived example am.c (which I've added as
 an attachment to this report).  It has two functions f and g that differ
 only in that f calls m which has __attribute__ ((malloc)) whereas g calls n
 which does not.  With the weaker assumption you're suggesting, GCC could not
 optimize away the reload from a-next in f, because of the intervening
 assignment '*p = q'.

Actually, GCC and Clang both eliminate the reload of a-next in f (and not in
g). The weaker assumption is sufficient for that. *p can't alias a or b without
violating the weaker assumption.

What GCC is additionally doing in f is deleting the stores to a-next and
b-next as dead stores. That's really clever. However, the weaker assumption is
actually sufficient for that too: First, forward b to eliminate the load of
a-next. Then, it can be proved that a doesn't escape, and is defined by an
attribute malloc function, so the stores through it can't be loaded anywhere
else, so they're dead. Then, it can be proved that b doesn't escape either, and
is also defined by an attribute malloc function, so the stores through it are
dead too.

Consequently, the weaker assumption is still fairly strong. Further, the weaker
assumption would be usable by a much broader set of functions, so it may even
provide overall stronger alias information in practice.


[Bug other/56955] documentation for attribute malloc contradicts itself

2014-05-20 Thread eggert at gnu dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56955

--- Comment #14 from Paul Eggert eggert at gnu dot org ---
(In reply to Dan Gohman from comment #13)

 *p can't alias a or b without violating the weaker assumption.

Sorry, you've lost me there.  Pointers in realloc'ed storage can alias
already-existing pointers, and surely this aliasing can inhibit optimization;
perhaps my example doesn't show this correctly but I expect that other examples
could.

 the weaker assumption would be usable by a much broader set of functions,
 so it may even provide overall stronger alias information in practice.

From what I've seen so far I tend to agree, but there are contrary opinions,
e.g., Richard Biener's We cannot use the malloc attribute on realloc, ever.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=23383#23. It might be helpful
for Richard to chime in here and explain. Even if GCC does not change the
meaning of __attribute__ ((malloc)), perhaps there should be a new directive
(__attribute__ ((new)), say) with the weaker semantics.

Anyway, the original bug report is about GCC's documentation, and the proposed
patch does fix the documentation bug, so we could start with it.


[Bug other/56955] documentation for attribute malloc contradicts itself

2013-04-26 Thread davidxl at google dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56955



davidxl at google dot com changed:



   What|Removed |Added



 CC||davidxl at google dot com



--- Comment #5 from davidxl at google dot com 2013-04-26 19:16:31 UTC ---

The documentation gives the most strict semantic for the attribute where calloc

is not qualified if NULL is a valid pointer.



However GCC's implementation for the attribute is more relaxed but pessimizes

simple cases:



http://gcc.gnu.org/ml/gcc-patches/2010-02/msg00381.html



David





(In reply to comment #3)

 (In reply to comment #2)

  (In reply to comment #1)

   I think it is talking about the memory returned by malloc/calloc will not 
   point

   to another memory location while realloc can.

  

  I agree that's essentially what it ought to talk about, and the bug is that

  it's talking about something else -- the contents of the pointed-to memory.

 

 Well, it _is_ actually about the content.  There must be no way to compute

 a valid pointer to another object from the contents of the pointed-to

 memory.  So if you initialize the memory to {0, 1, 2, 3, 4 ...} thus

 every possible byte value is somewhere and then do

 

   void *p = (void *)(mem[3]  24 | mem[58]  16 | ...);

 

 then points-to analysis assumes that from the contents of 'mem' you

 can only compute pointers to nothing (NULL).  Technically for targets

 where NULL is a valid poiner to an object calloc () may not be marked

 with malloc.

 

 That is, read it in the way that the code assumes the memory _may_ be

 zero-initialized (but only zero-initialized) or uninitialized.


[Bug other/56955] documentation for attribute malloc contradicts itself

2013-04-15 Thread rguenth at gcc dot gnu.org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56955



--- Comment #3 from Richard Biener rguenth at gcc dot gnu.org 2013-04-15 
10:19:22 UTC ---

(In reply to comment #2)

 (In reply to comment #1)

  I think it is talking about the memory returned by malloc/calloc will not 
  point

  to another memory location while realloc can.

 

 I agree that's essentially what it ought to talk about, and the bug is that

 it's talking about something else -- the contents of the pointed-to memory.



Well, it _is_ actually about the content.  There must be no way to compute

a valid pointer to another object from the contents of the pointed-to

memory.  So if you initialize the memory to {0, 1, 2, 3, 4 ...} thus

every possible byte value is somewhere and then do



  void *p = (void *)(mem[3]  24 | mem[58]  16 | ...);



then points-to analysis assumes that from the contents of 'mem' you

can only compute pointers to nothing (NULL).  Technically for targets

where NULL is a valid poiner to an object calloc () may not be marked

with malloc.



That is, read it in the way that the code assumes the memory _may_ be

zero-initialized (but only zero-initialized) or uninitialized.


[Bug other/56955] documentation for attribute malloc contradicts itself

2013-04-15 Thread sunfish at google dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56955



--- Comment #4 from Dan Gohman sunfish at google dot com 2013-04-15 14:53:06 
UTC ---

(In reply to comment #3)

 Well, it _is_ actually about the content.  There must be no way to compute

 a valid pointer to another object from the contents of the pointed-to

 memory.  



Oh wow. That's a subtlety that completely escaped me.



 So if you initialize the memory to {0, 1, 2, 3, 4 ...} thus

 every possible byte value is somewhere and then do

 

   void *p = (void *)(mem[3]  24 | mem[58]  16 | ...);

 

 then points-to analysis assumes that from the contents of 'mem' you

 can only compute pointers to nothing (NULL).  



Is that example fundamentally different than something like this:



void *q = (void *)(mem[0] + 0xb1ab1ab1a);



In both cases, the information of the pointer value is in the expression, not

in the memory.



Is it the case that the memory must be either actually zeros or uninitialized?

Or could it contain other data which merely transmits no information about

pointer values?



 Technically for targets

 where NULL is a valid poiner to an object calloc () may not be marked

 with malloc.

 

 That is, read it in the way that the code assumes the memory _may_ be

 zero-initialized (but only zero-initialized) or uninitialized.



If this is what it means, then I request that the text be updated to say this.

I'd be willing to propose a wording, once I understand the intent, if that'd be

helpful.



What should we say about the fact that GLIBC uses the malloc attribute on

strdup (and similar things)? strdup actually could be used to transmit

information about pointer values.


[Bug other/56955] documentation for attribute malloc contradicts itself

2013-04-14 Thread pinskia at gcc dot gnu.org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56955



--- Comment #1 from Andrew Pinski pinskia at gcc dot gnu.org 2013-04-14 
19:05:00 UTC ---

I think it is talking about the memory returned by malloc/calloc will not point

to another memory location while realloc can.


[Bug other/56955] documentation for attribute malloc contradicts itself

2013-04-14 Thread sunfish at google dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56955



--- Comment #2 from Dan Gohman sunfish at google dot com 2013-04-14 19:47:42 
UTC ---

(In reply to comment #1)

 I think it is talking about the memory returned by malloc/calloc will not 
 point

 to another memory location while realloc can.



I agree that's essentially what it ought to talk about, and the bug is that

it's talking about something else -- the contents of the pointed-to memory.