Re: pr45605.C devirtualize call failure in ia64-hp-hpux?

2012-08-08 Thread Richard Henderson
On 08/07/2012 08:29 AM, Martin Jambor wrote:
 So I did the testing and unfortunately this only works for the first
 virtual function of a class, the sequence calling any other virtual
 function has one more statement which adds VMT offset to the
 typecasted pointer and after folding we end up calling stuff like
 MEM[f1+16B]() instead of f2() in g++.dg/template/ptrmem18.C.

Yes.  The ia64 vtable format has the function descriptors directly
in the table.  See TARGET_VTABLE_USES_DESCRIPTORS.  It is unique in
this feature so far (although if another new ABI decides to use
function descriptors at all, I would recommend this as well).

Thus vtable[index] is the function pointer that is passed to the
backend for expansion in the call sequence.
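
As a rough illustration of the layout difference described above, here is a minimal C++ sketch.  It is only a model, not GCC or ABI code: FuncDesc, plain_vtable, desc_vtable, call_plain and call_desc are hypothetical names, and the gp field merely stands in for the second word of a real descriptor.  In the conventional layout the slot holds a pointer that has to be loaded; in the descriptor layout the address of the slot is itself the value handed to the call sequence.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model, not GCC or ABI code.
using Fn = int (*)();

// A function descriptor pairs a code entry point with a global-pointer value.
struct FuncDesc {
    Fn entry;   // code address
    void *gp;   // stand-in for the callee's global pointer (unused here)
};

int f1() { return 1; }
int f2() { return 2; }

const std::size_t kSlots = 2;

// Conventional layout: each vtable slot holds a pointer to the function.
const Fn plain_vtable[kSlots] = { f1, f2 };

// Descriptor layout: each vtable slot *is* a descriptor.
const FuncDesc desc_vtable[kSlots] = { { f1, nullptr }, { f2, nullptr } };

// Conventional dispatch: load the slot's value, then call through it.
int call_plain(std::size_t index) {
    assert(index < kSlots);
    Fn target = plain_vtable[index];               // a load from the table
    return target();
}

// Descriptor dispatch: only address arithmetic happens before the call;
// the call sequence itself reads the entry point out of the descriptor.
int call_desc(std::size_t index) {
    assert(index < kSlots);
    const FuncDesc *target = &desc_vtable[index];  // no load of a pointer
    return target->entry();
}
```

Calling call_plain(1) and call_desc(1) both reach f2, but only the first involves a dereference of the table slot before the call, which is the step the folder normally keys on.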


r~


Re: pr45605.C devirtualize call failure in ia64-hp-hpux?

2012-08-07 Thread Richard Guenther
On Mon, Aug 6, 2012 at 8:21 PM, Martin Jambor mjam...@suse.cz wrote:
 Hi,

 I've had this flagged to look at later for quite a long time now...

 On Mon, Apr 30, 2012 at 07:34:24AM +, Mailaripillai, Kannan Jeganathan 
 wrote:
 Hi,

 This is related to pr45605.C test.

  Reduced testcase

 struct B {
   virtual void Run(){};
 };

 struct D : public B {
   virtual void Run() { };
 };

 int main() {
   D d;
   static_cast<B&>(d).Run();
 }

 On x86_64 Linux, the call to Run through object d is devirtualized,
 whereas on ia64 HP-UX it apparently is not.

 -fdump-tree-fre1 output for both:

  x86_64 linux:

   MEM[(struct B *)&d]._vptr.B = &MEM[(void *)&_ZTV1B + 16B];
   d.D.2197._vptr.B = &MEM[(void *)&_ZTV1D + 16B];
   D.2248_1 = &MEM[(void *)&_ZTV1D + 16B];
   D.2249_2 = Run;
   D::Run (&d.D.2197);
   d ={v} {CLOBBER};
   return 0;

  ia64 hp-ux:

   MEM[(struct B *)&d]._vptr.B = &MEM[(void *)&_ZTV1B + 16B];
   d.D.1878._vptr.B = &MEM[(void *)&_ZTV1D + 16B];
   D.1929_1 = &MEM[(void *)&_ZTV1D + 16B];
   D.1930_2 = (int (*__vtbl_ptr_type) ()) &MEM[(void *)&_ZTV1D + 16B];
   OBJ_TYPE_REF(D.1930_2;&d.D.1878->0) (&d.D.1878);

 Is it a bug (unexpected with -O1 compilation) that it is not optimized to a
 direct call?


 There are two important and related differences.  The first one is
 that virtual method tables on ia64 consist of FDESC_EXPRs rather than
 mere ADDR_EXPRs.  The second one can be seen in the dumps just before
 fre1 (i.e. esra):

 i686:
   d.D.1854._vptr.B = &MEM[(void *)&_ZTV1D + 8B];
   D.1961_4 = d.D.1854._vptr.B;
   D.1962_5 = *D.1961_4;
   OBJ_TYPE_REF(D.1962_5;&d.D.1854->0) (&d.D.1854);

 ia64:
   d.D.1883._vptr.B = &MEM[(void *)&_ZTV1D + 16B];
   D.1991_4 = d.D.1883._vptr.B;
   D.1992_5 = (int (*__vtbl_ptr_type) ()) D.1991_4;
   OBJ_TYPE_REF(D.1992_5;&d.D.1883->0) (&d.D.1883);

 The main difference is not the type cast in the third assignment but
 the fact that there is no dereference there, which means that the gimple
 folder has to deal with it at a different place.

 I played with it a bit this afternoon and came up with the following
 untested patch to fix the pr45605.C testcase.  I can bootstrap and
 test it on ia64 if we do not mind this special casing of FDESC_EXPRs
 in the middle end (I hope that all platforms that use it use it in the
 same way, I only know ia64...)

 Thanks,

 Martin


 2012-08-06  Martin Jambor  mjam...@suse.cz

 * gimple-fold.c (gimple_fold_stmt_to_constant_1): Also fold
 assignments of V_C_Es of addresses of FDESC_EXPRs.


 *** gcc/gimple-fold.c   Mon Aug  6 14:36:37 2012
 --- /tmp/FcpIKb_gimple-fold.c   Mon Aug  6 20:17:26 2012
 *************** gimple_fold_stmt_to_constant_1 (gimple s
 *** 2542,2548 ****
          == TYPE_ADDR_SPACE (TREE_TYPE (op0))
       && TYPE_MODE (TREE_TYPE (lhs))
          == TYPE_MODE (TREE_TYPE (op0)))
 !   return op0;

   return
     fold_unary_ignore_overflow_loc (loc, subcode,
 --- 2542,2556 ----
          == TYPE_ADDR_SPACE (TREE_TYPE (op0))
       && TYPE_MODE (TREE_TYPE (lhs))
          == TYPE_MODE (TREE_TYPE (op0)))
 !   {
 !     tree t;
 !     if (TREE_CODE (op0) != ADDR_EXPR)
 !       return op0;
 !     t = fold_const_aggregate_ref_1 (TREE_OPERAND (op0, 0), valueize);
 !     if (t && TREE_CODE (t) == FDESC_EXPR)
 !       return build_fold_addr_expr_loc (loc, TREE_OPERAND (t, 0));

FDESC_EXPR has two operands ... is it really ok to ignore the 2nd?

/* Operand0 is a function constant; result is part N of a function
   descriptor of type ptr_mode.  */
DEFTREECODE (FDESC_EXPR, fdesc_expr, tcc_expression, 2)

I suppose yes, from what I see in the uses in the C++ frontend.  In fact
_all_ users of FDESC_EXPR seem to ignore the 2nd operand ...!?
(I would have expected users in machine-specific code ...)

Thus,

Ok!

Thanks,
Richard.


 !     return op0;
 !   }

   return
     fold_unary_ignore_overflow_loc (loc, subcode,


Re: pr45605.C devirtualize call failure in ia64-hp-hpux?

2012-08-07 Thread Martin Jambor
Hi,


On Tue, Aug 07, 2012 at 03:14:21PM +0200, Richard Guenther wrote:
 On Mon, Aug 6, 2012 at 8:21 PM, Martin Jambor mjam...@suse.cz wrote:
  I've had this flagged to look at later for quite a long time now...
 
  On Mon, Apr 30, 2012 at 07:34:24AM +, Mailaripillai, Kannan Jeganathan 
  wrote:
  Hi,
 
  This is related to pr45605.C test.
 
   Reduced testcase
 
  struct B {
    virtual void Run(){};
  };
 
  struct D : public B {
    virtual void Run() { };
  };
 
  int main() {
    D d;
    static_cast<B&>(d).Run();
  }
 
  On x86_64 Linux, the call to Run through object d is devirtualized,
  whereas on ia64 HP-UX it apparently is not.
 
  -fdump-tree-fre1 output for both:
 
   x86_64 linux:
 
    MEM[(struct B *)&d]._vptr.B = &MEM[(void *)&_ZTV1B + 16B];
    d.D.2197._vptr.B = &MEM[(void *)&_ZTV1D + 16B];
    D.2248_1 = &MEM[(void *)&_ZTV1D + 16B];
    D.2249_2 = Run;
    D::Run (&d.D.2197);
    d ={v} {CLOBBER};
    return 0;
 
   ia64 hp-ux:
 
    MEM[(struct B *)&d]._vptr.B = &MEM[(void *)&_ZTV1B + 16B];
    d.D.1878._vptr.B = &MEM[(void *)&_ZTV1D + 16B];
    D.1929_1 = &MEM[(void *)&_ZTV1D + 16B];
    D.1930_2 = (int (*__vtbl_ptr_type) ()) &MEM[(void *)&_ZTV1D + 16B];
    OBJ_TYPE_REF(D.1930_2;&d.D.1878->0) (&d.D.1878);
 
  Is it a bug (unexpected with -O1 compilation) that it is not optimized to a
  direct call?
 
 
  There are two important and related differences.  The first one is
  that virtual method tables on ia64 consist of FDESC_EXPRs rather than
  mere ADDR_EXPRs.  The second one can be seen in the dumps just before
  fre1 (i.e. esra):
 
  i686:
    d.D.1854._vptr.B = &MEM[(void *)&_ZTV1D + 8B];
    D.1961_4 = d.D.1854._vptr.B;
    D.1962_5 = *D.1961_4;
    OBJ_TYPE_REF(D.1962_5;&d.D.1854->0) (&d.D.1854);
 
  ia64:
    d.D.1883._vptr.B = &MEM[(void *)&_ZTV1D + 16B];
    D.1991_4 = d.D.1883._vptr.B;
    D.1992_5 = (int (*__vtbl_ptr_type) ()) D.1991_4;
    OBJ_TYPE_REF(D.1992_5;&d.D.1883->0) (&d.D.1883);
 
  The main difference is not the type cast in the third assignment but
  the fact that there is no dereference there, which means that the gimple
  folder has to deal with it at a different place.
 
  I played with it a bit this afternoon and came up with the following
  untested patch to fix the pr45605.C testcase.  I can bootstrap and
  test it on ia64 if we do not mind this special casing of FDESC_EXPRs
  in the middle end (I hope that all platforms that use it use it in the
  same way, I only know ia64...)
 
  Thanks,
 
  Martin
 
 
  2012-08-06  Martin Jambor  mjam...@suse.cz
 
  * gimple-fold.c (gimple_fold_stmt_to_constant_1): Also fold
  assignments of V_C_Es of addresses of FDESC_EXPRs.
 
 
  *** gcc/gimple-fold.c   Mon Aug  6 14:36:37 2012
  --- /tmp/FcpIKb_gimple-fold.c   Mon Aug  6 20:17:26 2012
  *************** gimple_fold_stmt_to_constant_1 (gimple s
  *** 2542,2548 ****
           == TYPE_ADDR_SPACE (TREE_TYPE (op0))
        && TYPE_MODE (TREE_TYPE (lhs))
           == TYPE_MODE (TREE_TYPE (op0)))
  !   return op0;
 
    return
      fold_unary_ignore_overflow_loc (loc, subcode,
  --- 2542,2556 ----
           == TYPE_ADDR_SPACE (TREE_TYPE (op0))
        && TYPE_MODE (TREE_TYPE (lhs))
           == TYPE_MODE (TREE_TYPE (op0)))
  !   {
  !     tree t;
  !     if (TREE_CODE (op0) != ADDR_EXPR)
  !       return op0;
  !     t = fold_const_aggregate_ref_1 (TREE_OPERAND (op0, 0), valueize);
  !     if (t && TREE_CODE (t) == FDESC_EXPR)
  !       return build_fold_addr_expr_loc (loc, TREE_OPERAND (t, 0));
 
 FDESC_EXPR has two operands ... is it really ok to ignore the 2nd?
 
 /* Operand0 is a function constant; result is part N of a function
descriptor of type ptr_mode.  */
 DEFTREECODE (FDESC_EXPR, fdesc_expr, tcc_expression, 2)
 
 I suppose yes, from what I see in the uses in the C++ frontend.  In fact
 _all_ users of FDESC_EXPR seem to ignore the 2nd operand ...!?
 (I would have expected users in machine-specific code ...)
 
 Thus,
 
 Ok!

So I did the testing and unfortunately this only works for the first
virtual function of a class, the sequence calling any other virtual
function has one more statement which adds VMT offset to the
typecasted pointer and after folding we end up calling stuff like
MEM[f1+16B]() instead of f2() in g++.dg/template/ptrmem18.C.

I'll have another look at this when I have some spare time; I think
we'll need to look for these cases from the call statement upwards.
BTW, an interesting thing about this testcase is that neither on ia64
nor on i686 are the virtual calls enclosed in an OBJ_TYPE_REF...
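
To make the failure mode above concrete, here is a small C++ model of it as I read the description.  It is a sketch only, not GCC code: FuncDesc, vmt, resolve, fold_cast_first and call_with_offset are hypothetical names.  Folding the cast of the table base to the first function before the slot offset is applied leaves something like "f1 + offset" behind, whereas applying the offset first and then resolving the slot reaches the intended function.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model, not GCC code.
using Fn = int (*)();

struct FuncDesc {
    Fn entry;   // code address
    void *gp;   // stand-in for the second word of a descriptor
};

int f1() { return 1; }
int f2() { return 2; }

// A virtual method table whose slots are descriptors.
const FuncDesc vmt[] = { { f1, nullptr }, { f2, nullptr } };
const std::size_t kSlots = sizeof(vmt) / sizeof(vmt[0]);

// Resolve a byte offset into the table to the function whose descriptor
// starts there; nullptr if the offset does not name a slot.
Fn resolve(std::size_t byte_offset) {
    if (byte_offset % sizeof(FuncDesc) != 0)
        return nullptr;
    std::size_t index = byte_offset / sizeof(FuncDesc);
    return index < kSlots ? vmt[index].entry : nullptr;
}

// Premature fold: the cast of the table base is replaced by the first
// function, and the slot offset is left over as "f1 + offset".
struct PrematureFold {
    Fn base;
    std::size_t leftover;
};

PrematureFold fold_cast_first(std::size_t byte_offset) {
    return { resolve(0), byte_offset };
}

// Offset-aware fold: apply the offset first, then resolve the slot.
int call_with_offset(std::size_t byte_offset) {
    Fn target = resolve(byte_offset);
    assert(target != nullptr);
    return target();
}
```

With byte_offset equal to sizeof(FuncDesc), fold_cast_first yields f1 plus a dangling offset, while call_with_offset reaches f2.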

Thanks anyway,

Martin


Re: pr45605.C devirtualize call failure in ia64-hp-hpux?

2012-08-06 Thread Martin Jambor
Hi,

I've had this flagged to look at later for quite a long time now...

On Mon, Apr 30, 2012 at 07:34:24AM +, Mailaripillai, Kannan Jeganathan 
wrote:
 Hi,
 
 This is related to pr45605.C test.
 
  Reduced testcase
 
 struct B {
   virtual void Run(){};
 };
 
 struct D : public B {
   virtual void Run() { };
 };
 
 int main() {
   D d;
   static_cast<B&>(d).Run();
 }
 
 On x86_64 Linux, the call to Run through object d is devirtualized,
 whereas on ia64 HP-UX it apparently is not.
 
 -fdump-tree-fre1 output for both:
 
  x86_64 linux:
 
   MEM[(struct B *)&d]._vptr.B = &MEM[(void *)&_ZTV1B + 16B];
   d.D.2197._vptr.B = &MEM[(void *)&_ZTV1D + 16B];
   D.2248_1 = &MEM[(void *)&_ZTV1D + 16B];
   D.2249_2 = Run;
   D::Run (&d.D.2197);
   d ={v} {CLOBBER};
   return 0;
 
  ia64 hp-ux:
 
   MEM[(struct B *)&d]._vptr.B = &MEM[(void *)&_ZTV1B + 16B];
   d.D.1878._vptr.B = &MEM[(void *)&_ZTV1D + 16B];
   D.1929_1 = &MEM[(void *)&_ZTV1D + 16B];
   D.1930_2 = (int (*__vtbl_ptr_type) ()) &MEM[(void *)&_ZTV1D + 16B];
   OBJ_TYPE_REF(D.1930_2;&d.D.1878->0) (&d.D.1878);
 
 Is it a bug (unexpected with -O1 compilation) that it is not optimized to a
 direct call?


There are two important and related differences.  The first one is
that virtual method tables on ia64 consist of FDESC_EXPRs rather than
mere ADDR_EXPRs.  The second one can be seen in the dumps just before
fre1 (i.e. esra):

i686:
  d.D.1854._vptr.B = &MEM[(void *)&_ZTV1D + 8B];
  D.1961_4 = d.D.1854._vptr.B;
  D.1962_5 = *D.1961_4;
  OBJ_TYPE_REF(D.1962_5;&d.D.1854->0) (&d.D.1854);

ia64:
  d.D.1883._vptr.B = &MEM[(void *)&_ZTV1D + 16B];
  D.1991_4 = d.D.1883._vptr.B;
  D.1992_5 = (int (*__vtbl_ptr_type) ()) D.1991_4;
  OBJ_TYPE_REF(D.1992_5;&d.D.1883->0) (&d.D.1883);

The main difference is not the type cast in the third assignment but
the fact that there is no dereference there, which means that the gimple
folder has to deal with it at a different place.

I played with it a bit this afternoon and came up with the following
untested patch to fix the pr45605.C testcase.  I can bootstrap and
test it on ia64 if we do not mind this special casing of FDESC_EXPRs
in the middle end (I hope that all platforms that use it use it in the
same way, I only know ia64...)

Thanks,

Martin


2012-08-06  Martin Jambor  mjam...@suse.cz

* gimple-fold.c (gimple_fold_stmt_to_constant_1): Also fold
assignments of V_C_Es of addresses of FDESC_EXPRs.


*** gcc/gimple-fold.c   Mon Aug  6 14:36:37 2012
--- /tmp/FcpIKb_gimple-fold.c   Mon Aug  6 20:17:26 2012
*************** gimple_fold_stmt_to_constant_1 (gimple s
*** 2542,2548 ****
         == TYPE_ADDR_SPACE (TREE_TYPE (op0))
      && TYPE_MODE (TREE_TYPE (lhs))
         == TYPE_MODE (TREE_TYPE (op0)))
!   return op0;
  
  return
    fold_unary_ignore_overflow_loc (loc, subcode,
--- 2542,2556 ----
         == TYPE_ADDR_SPACE (TREE_TYPE (op0))
      && TYPE_MODE (TREE_TYPE (lhs))
         == TYPE_MODE (TREE_TYPE (op0)))
!   {
!     tree t;
!     if (TREE_CODE (op0) != ADDR_EXPR)
!       return op0;
!     t = fold_const_aggregate_ref_1 (TREE_OPERAND (op0, 0), valueize);
!     if (t && TREE_CODE (t) == FDESC_EXPR)
!       return build_fold_addr_expr_loc (loc, TREE_OPERAND (t, 0));
!     return op0;
!   }
  
  return
    fold_unary_ignore_overflow_loc (loc, subcode,


Re: pr45605.C devirtualize call failure in ia64-hp-hpux?

2012-04-30 Thread Eric Botcazou
 On x86_64 Linux, the call to Run through object d is devirtualized,
 whereas on ia64 HP-UX it apparently is not.

 -fdump-tree-fre1 output for both:

  x86_64 linux:

   MEM[(struct B *)&d]._vptr.B = &MEM[(void *)&_ZTV1B + 16B];
   d.D.2197._vptr.B = &MEM[(void *)&_ZTV1D + 16B];
   D.2248_1 = &MEM[(void *)&_ZTV1D + 16B];
   D.2249_2 = Run;
   D::Run (&d.D.2197);
   d ={v} {CLOBBER};
   return 0;

  ia64 hp-ux:

   MEM[(struct B *)&d]._vptr.B = &MEM[(void *)&_ZTV1B + 16B];
   d.D.1878._vptr.B = &MEM[(void *)&_ZTV1D + 16B];
   D.1929_1 = &MEM[(void *)&_ZTV1D + 16B];
   D.1930_2 = (int (*__vtbl_ptr_type) ()) &MEM[(void *)&_ZTV1D + 16B];
   OBJ_TYPE_REF(D.1930_2;&d.D.1878->0) (&d.D.1878);

 Is it a bug (unexpected with -O1 compilation) that it is not optimized to a
 direct call?

I'd think so.  IA-64 uses a specific construct for its vtables because of its
ABI requirements.

-- 
Eric Botcazou