[Bug c++/54770] sibling call optimization is not applied where it ought to be

2012-11-20 Thread andy.m.jost at gmail dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54770



--- Comment #8 from Andy Jost andy.m.jost at gmail dot com 2012-11-20 
17:14:18 UTC ---

The workaround I mentioned in comment #5 does seem to do the trick. 

Specifically, something like this works in all the cases I've tried:



  struct MainNode : Node

  {

virtual void H()

{

  helper()

  this-H();

}

__attribute__((noinline)) void helper()

{

  NodePtr tmp(arg);

  this-~Node();

  new(this) MainNode(tmp);

}

  };


[Bug c++/54770] sibling call optimization is not applied where it ought to be

2012-10-17 Thread jakub at gcc dot gnu.org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54770



--- Comment #7 from Jakub Jelinek jakub at gcc dot gnu.org 2012-10-17 
12:27:15 UTC ---

It is just that though, a workaround, it doesn't workaround say:

void bar (char *);

int baz ();



int

foo ()

{

  {

char buf[64];

bar (buf);

  }

  return baz ();

}


[Bug c++/54770] sibling call optimization is not applied where it ought to be

2012-10-16 Thread paolo.carlini at oracle dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54770



--- Comment #6 from Paolo Carlini paolo.carlini at oracle dot com 2012-10-16 
23:06:16 UTC ---

In terms of workarounds, what about using -std=c++11?


[Bug c++/54770] sibling call optimization is not applied where it ought to be

2012-10-02 Thread rguenth at gcc dot gnu.org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54770



Richard Guenther rguenth at gcc dot gnu.org changed:



   What|Removed |Added



 Status|UNCONFIRMED |WAITING

   Last reconfirmed||2012-10-02

 Ever Confirmed|0   |1


[Bug c++/54770] sibling call optimization is not applied where it ought to be

2012-10-02 Thread steven at gcc dot gnu.org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54770



Steven Bosscher steven at gcc dot gnu.org changed:



   What|Removed |Added



 Status|WAITING |NEW



--- Comment #2 from Steven Bosscher steven at gcc dot gnu.org 2012-10-02 
11:44:41 UTC ---

Confirmed, complete test case:



-- 8 --

typedef long unsigned int size_t;



namespace std

{

using ::size_t;



  class exception

{

  public:

exception() throw() { }

virtual ~exception() throw();

};



  class bad_alloc : public exception

{

  public:

bad_alloc() throw() { }

virtual ~bad_alloc() throw();

};



  struct nothrow_t { };

  extern const nothrow_t nothrow;

}



inline void* operator new(std::size_t, void* __p) throw() { return __p; }

inline void operator delete (void*, void*) throw() { }



templatetypename T struct Intrusive

{

  Intrusive(T * p = 0) : px(p) { if(px) incref(px); }

  ~Intrusive() { if(px) decref(px); }

  T * operator-() const { return px; }

  T * px;

};

struct Node;

typedef IntrusiveNode NodePtr;

struct Node

{

  Node() : refcnt(0) {}

  virtual ~Node() {}

  virtual void H() {}

  unsigned short refcnt;

  friend void incref(Node * node) { ++node-refcnt; }

  friend void decref(Node * node) { if(--node-refcnt == 0) delete node; }

};

struct MainNode : Node

{

  MainNode(NodePtr const  arg_) : Node(), arg(arg_) {}

  virtual ~MainNode() {}

  NodePtr arg;

  virtual void H()

{

{

  NodePtr tmp(arg);

  this-~Node();

  new(this) MainNode(tmp);

}

  this-H();

}

};

int main()

{

  NodePtr arg = NodePtr();

  NodePtr root(new MainNode(arg));

  root-H();

  return 0;

}

-- 8 --



At trunk r191835, compiling with:

./xgcc -B. -S -O3 -fdump-tree-optimized-lineno t.C



Gives the call at line 58:



L6:

  tmp ={v} {CLOBBER};

  [t.C : 58:15] _13 = MEM[(int (*__vtbl_ptr_type) () *)prephitmp_29+16B];

  [t.C : 58:16] OBJ_TYPE_REF(_13;this_3(D)-2) (this_3(D));

  [t.C : 59:5] return;


[Bug c++/54770] sibling call optimization is not applied where it ought to be

2012-10-02 Thread jakub at gcc dot gnu.org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54770



Jakub Jelinek jakub at gcc dot gnu.org changed:



   What|Removed |Added



 CC||jakub at gcc dot gnu.org



--- Comment #3 from Jakub Jelinek jakub at gcc dot gnu.org 2012-10-02 
12:52:39 UTC ---

It is not tail call optimized (unless -std=c++11, which generates different

code and is tail call optimized btw), because points-to analysis thinks that

the this-H call in MainNode::H may use the tmp variable:

  # .MEM_25 = PHI .MEM_27(6), .MEM_30(4)

  # PT = nonlocal escaped

  # prephitmp_29 = PHI pretmp_31(6), MEM[(voidD.45 *)_ZTV8MainNodeD.2389 +

16B](4)

L6:

  # .MEM_11 = VDEF .MEM_25

  tmpD.2422 ={v} {CLOBBER};

  # VUSE .MEM_11

  # PT = nonlocal escaped

  _13 = MEM[(intD.9 (*__vtbl_ptr_typeD.2134) () *)prephitmp_29 + 16B];

  # .MEM_14 = VDEF .MEM_11

  # USE = nonlocal null { D.2320 D.2389 D.2422 } (glob)

  # CLB = nonlocal null { D.2320 D.2389 D.2422 } (glob)

  OBJ_TYPE_REF(_13;this_3(D)-2) (this_3(D));

  # VUSE .MEM_14

  return;



It is true that tmpD.2422 is addressable:

;;pred:   2 (EH,EXECUTABLE)

L4: [LP 1]

  [MNT 5] # .MEM_15 = VDEF .MEM_8

  # USE = nonlocal null { D.2320 D.2389 D.2422 } (glob)

  # CLB = nonlocal null { D.2320 D.2389 D.2422 } (glob)

  _ZN9IntrusiveI4NodeED1EvD.2361 (tmpD.2422);

  resx 2

;;succ:

but here it is dominated by a clobber stmt for that variable.  I wonder if PTA

could be thought something about TREE_CLOBBER_Ps, the question is what exactly.

Here the call is dominated by the clobber stmt and there is no stmt mentioning

that var in any way in between the clobber and the call.  Only saying that a

var can't escape to calls dominated by a clobber stmt wouldn't be enough, if

e.g. a loop is unrolled, we could have

  bar ();

  foo (tmp);

  bar ();

  tmp ={v} {CLOBBER};

  bar ();

  foo (tmp);

  bar ();

  tmp ={v} {CLOBBER};

While it is ok to assume (I think) that tmp can't be used in the 3rd bar call,

in the 4th bar call it can be used.


[Bug c++/54770] sibling call optimization is not applied where it ought to be

2012-10-02 Thread jakub at gcc dot gnu.org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54770



--- Comment #4 from Jakub Jelinek jakub at gcc dot gnu.org 2012-10-02 
13:31:24 UTC ---

Right now we only have control flow insensitive points to ESCAPED set, so I'm

afraid this is unfixable, unless we change that.  Only with control flow

sensitive ESCAPED we'd be able to clear the tmp var out of the escaped set at

the CLOBBER point (but it would need to be kept in other points to sets, as we

could e.g. CSE pure/const calls that return the var's address).



Another example where control flow insensitive ESCAPED prevents some

optimizations:

struct S { char p[40]; };

void bar (struct S *);

void foo (void)

{

  struct S a, b, c, d, e;

  a.p[0] = 1;

  b.p[0] = 1;

  c.p[0] = 1;

  d.p[0] = 1;

  e.p[0] = 1;

  a.p[0] = 2;

  bar (a);

  b.p[0] = 2;

  bar (b);

  c.p[0] = 2;

  bar (c);

  d.p[0] = 2;

  bar (d);

  e.p[0] = 2;

  bar (e);

}



As we add all the vars into the ESCAPED set and consider it being used even by

the a call then, DSE can't optimize away the = 1 stores with the exception of

a.p[0] = 1;.


[Bug c++/54770] sibling call optimization is not applied where it ought to be

2012-10-02 Thread andy.m.jost at gmail dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54770



--- Comment #5 from Andy Jost andy.m.jost at gmail dot com 2012-10-02 
15:25:46 UTC ---

Most of the technical discussion is over my head, I'm afraid.  Is there a

simple workaround?  What if the body of H contains only a call to a

non-inlineable function to do its work and then the recursive call back to H?



For those of us using GCC to compile a functional language, tail recursion is a

requirement.  We either need to make sure the compiler uses it every time, or

resort to more manual methods. The option to specify a required sibling call

would be ideal (see http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52067).


[Bug c++/54770] sibling call optimization is not applied where it ought to be

2012-10-01 Thread redi at gcc dot gnu.org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54770



--- Comment #1 from Jonathan Wakely redi at gcc dot gnu.org 2012-10-02 
00:37:30 UTC ---

4.5.2 is no longer maintained so won't be fixed in any case - could you check

whether the problem still exists in a current release, i.e. 4.6+, preferably

checking the trunk (i.e 4.8)?