Re: Apple's Objective-C 2.0 extensions

2007-03-07 Thread Fariborz Jahanian


On Mar 7, 2007, at 9:13 AM, Eric Christopher wrote:


Hi Michael,



Two questions about Apple's Objective-C 2.0 work:

1) Does anyone know when the syntax extensions will be available & working
in the gcc compiler?


It is a work in progress. For current status, you can check out Apple's
4.0 branch. We will push this to FSF mainline when the features are
frozen and adopted by Apple's internal developers. Note that the features
rely heavily on Leopard frameworks, so you may not get very far
using them on Tiger, etc.




2) Will their garbage collection & accelerated message dispatch
mechanisms also be supported?


Yes. In Leopard.

If you want more information on the features, please ask and, after
management approval, I can forward it to you.


- Fariborz





Fariborz is working on them. I imagine that it won't be until
they're done in Leopard, but I'll let him give more information.


-eric





Re: Apple's Objective-C 2.0 extensions

2007-03-07 Thread Fariborz Jahanian


On Mar 7, 2007, at 11:16 AM, Mike Stump wrote:




Does -fobjc-gc work for you now?  It's been on mainline for a while  
now.  As for accelerated message dispatch, I'm not exactly certain  
which feature you're


The option may be recognized, but it depends entirely on the Leopard
runtime for support.


- Fariborz







Use of compound_literal_expr in c vs target_expr in c++ for compound literals

2006-07-24 Thread Fariborz Jahanian


gcc generates two separate trees for compound literals in c and c++.  
As in this test case:


struct S {
int i,j;
};
void foo (struct S);

int main ()
{
foo((struct S){1,1});
}


In c it generates compound_literal_expr and in c++ it generates
target_expr. But the gimplifier treats them differently in the following
areas:


1) In routine mostly_copy_tree_r we don't copy target_expr but we do
copy compound_literal_expr. I see the following comment there:


/* Similar to copy_tree_r() but do not copy SAVE_EXPR or TARGET_EXPR nodes.
   These nodes model computations that should only be done once.  If we
   were to unshare something like SAVE_EXPR(i++), the gimplification
   process would create wrong code.  */

Shouldn't compound_literal_expr be treated the same as target_expr here?

2) gimplify_target_expr can be called more than once on the same
target_expr node because the first time around its TARGET_EXPR_INITIAL is
set to NULL. This works as a guard and prevents its temporary from being
added to the temporary list more than once (when the call is made to
gimple_add_tmp_var).


On the other hand, no such guard exists for a
compound_literal_expr, and when gimple_add_tmp_var is called, it
asserts. So, I added a check for !DECL_SEEN_IN_BIND_EXPR_P (decl) in
gimplify_compound_literal_expr before the call to gimple_add_tmp_var is
made, as in the following diff:


% svn diff c-gimplify.c
Index: c-gimplify.c
===================================================================
--- c-gimplify.c        (revision 116462)
+++ c-gimplify.c        (working copy)
@@ -538,7 +538,7 @@
   /* This decl isn't mentioned in the enclosing block, so add it to the
      list of temps.  FIXME it seems a bit of a kludge to say that
      anonymous artificial vars aren't pushed, but everything else is.  */
-  if (DECL_NAME (decl) == NULL_TREE)
+  if (DECL_NAME (decl) == NULL_TREE && !DECL_SEEN_IN_BIND_EXPR_P (decl))
     gimple_add_tmp_var (decl);

This fixes the problem I am encountering. Is this the right approach in
situations where compound_literal_expr is used to represent a compound
literal in c and the expression is referenced in multiple places (by
hanging off a save_expr call_expr tree)?
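
The guard discussed above can be modeled in isolation. What follows is a minimal sketch in plain C, not GCC source: toy_decl, toy_add_tmp_var and toy_gimplify_compound_literal are hypothetical stand-ins for a decl node, gimple_add_tmp_var and gimplify_compound_literal_expr. It assumes that registering a temporary marks it as seen and asserts if it was already marked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a decl node; not GCC's tree type.  */
struct toy_decl
{
  const char *name;        /* NULL models DECL_NAME (decl) == NULL_TREE.  */
  bool seen_in_bind_expr;  /* Models DECL_SEEN_IN_BIND_EXPR_P (decl).  */
};

/* Number of temporaries registered so far.  */
int toy_tmp_count = 0;

/* Toy stand-in for gimple_add_tmp_var: registering the same decl twice
   is treated as an error, which models the assertion described above.  */
void
toy_add_tmp_var (struct toy_decl *decl)
{
  assert (!decl->seen_in_bind_expr);
  decl->seen_in_bind_expr = true;
  toy_tmp_count++;
}

/* Toy stand-in for the patched check: only anonymous decls that have
   not been registered yet are added, so a second visit is a no-op.  */
void
toy_gimplify_compound_literal (struct toy_decl *decl)
{
  if (decl->name == NULL && !decl->seen_in_bind_expr)
    toy_add_tmp_var (decl);
}
```

Calling toy_gimplify_compound_literal twice on the same anonymous decl leaves toy_tmp_count at 1. Without the second condition, the second call would trip the assert in toy_add_tmp_var.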


- Thanks, Fariborz ([EMAIL PROTECTED])






Re: Use of compound_literal_expr in c vs target_expr in c++ for compound literals

2006-07-24 Thread Fariborz Jahanian


On Jul 24, 2006, at 3:07 PM, Andrew Pinski wrote:




gcc generates two separate trees for compound literals in c and c++.
As in this test case:

struct S {
 int i,j;
};
void foo (struct S);

int main ()
{
 foo((struct S){1,1});
}



 On the other hand, such a guard does not exist for a
compound_literal_expr and when gimple_add_tmp_var is called, it
asserts. So, I added check for
 !DECL_SEEN_IN_BIND_EXPR_P (decl) in
gimplify_compound_literal_expr before call to gimple_add_tmp_var is
made. As in the following diff:


I think you are trying to fix PR 28418 which is an ICE in
gimple_add_tmp_var with compound literals in C.


Yes, it looks similar to my problem.

- Thanks, Fariborz



Thanks,
Andrew Pinski




Re: Use of compound_literal_expr in c vs target_expr in c++ for compound literals

2006-07-24 Thread Fariborz Jahanian


On Jul 24, 2006, at 3:07 PM, Andrew Pinski wrote:




gcc generates two separate trees for compound literals in c and c++.
As in this test case:

struct S {
 int i,j;
};
void foo (struct S);

int main ()
{
 foo((struct S){1,1});
}



 On the other hand, such a guard does not exist for a
compound_literal_expr and when gimple_add_tmp_var is called, it
asserts. So, I added check for
 !DECL_SEEN_IN_BIND_EXPR_P (decl) in
gimplify_compound_literal_expr before call to gimple_add_tmp_var is
made. As in the following diff:


I think you are trying to fix PR 28418 which is an ICE in
gimple_add_tmp_var with compound literals in C.


My patch fixes the test case in PR 28418 as well. There are really
two issues here. First, should we gimplify a compound_literal_expr
twice? Second, regardless of the first issue, how do we avoid calling
gimple_add_tmp_var on the same variable? My patch addresses the latter.


- Fariborz



Thanks,
Andrew Pinski




Re: [RFC] patch to fix an ICE involving sign-extract of mmx expression

2005-09-26 Thread Fariborz Jahanian


On Sep 23, 2005, at 12:41 PM, Richard Henderson wrote:


On Thu, Sep 22, 2005 at 01:21:06PM -0700, Fariborz Jahanian wrote:


  /* Avoid creating invalid subregs, for example when
 simplifying (x32)255.  */
! if (final_word >= GET_MODE_SIZE (inner_mode)
! || (final_word % GET_MODE_SIZE (tmode)) != 0)
return NULL_RTX;



I think you should just call validate_subreg.  Ok with that change.


This is the patch I am checking in.

- fariborz ([EMAIL PROTECTED])

ChangeLog:

2005-09-26  Fariborz Jahanian  [EMAIL PROTECTED]

* combine.c (make_extraction): Check for valid use of subreg.

Index: combine.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/combine.c,v
retrieving revision 1.503
diff -c -p -r1.503 combine.c
*** combine.c   26 Aug 2005 21:52:23 -  1.503
--- combine.c   26 Sep 2005 16:01:23 -
*** make_extraction (enum machine_mode mode,
*** 6314,6320 
  
  /* Avoid creating invalid subregs, for example when
 simplifying (x32)255.  */
! if (final_word >= GET_MODE_SIZE (inner_mode))
return NULL_RTX;
  
  new = gen_rtx_SUBREG (tmode, inner, final_word);
--- 6314,6320 
  
  /* Avoid creating invalid subregs, for example when
 simplifying (x32)255.  */
! if (!validate_subreg (tmode, inner_mode, inner, final_word))
return NULL_RTX;
  
  new = gen_rtx_SUBREG (tmode, inner, final_word);






r~





Re: [RFC] patch to fix an ICE involving sign-extract of mmx expression

2005-09-23 Thread Fariborz Jahanian


On Sep 23, 2005, at 12:41 PM, Richard Henderson wrote:


On Thu, Sep 22, 2005 at 01:21:06PM -0700, Fariborz Jahanian wrote:


  /* Avoid creating invalid subregs, for example when
 simplifying (x32)255.  */
! if (final_word >= GET_MODE_SIZE (inner_mode)
! || (final_word % GET_MODE_SIZE (tmode)) != 0)
return NULL_RTX;



I think you should just call validate_subreg.  Ok with that change.


Yes. Will do so.

- fj




r~





[RFC] patch to fix an ICE involving sign-extract of mmx expression

2005-09-22 Thread Fariborz Jahanian
In a given test case with 128 bit mmx intrinsics, routine
make_compound_operation (in combine.c) attempts to do a sign-extract
of the middle 64 bits of the 128 bit (TImode) register. The pattern we
have is:


(lshiftrt:TI (ashift:TI (subreg:TI (reg/v:V2DI 75 [ vu16YPrediction3 ]) 0)
        (const_int 32 [0x20]))
    (const_int 64 [0x40]))

And here is the code which attempts to do this:

    case LSHIFTRT:


      /* ... fall through ...  */

    case ASHIFTRT:
      lhs = XEXP (x, 0);
      rhs = XEXP (x, 1);

      /* If we have (ashiftrt (ashift foo C1) C2) with C2 >= C1,
         this is a SIGN_EXTRACT.  */
      if (GET_CODE (rhs) == CONST_INT
          && GET_CODE (lhs) == ASHIFT
          && GET_CODE (XEXP (lhs, 1)) == CONST_INT
          && INTVAL (rhs) >= INTVAL (XEXP (lhs, 1)))
        {
          new = make_compound_operation (XEXP (lhs, 0), next_code);
          new = make_extraction (mode, new,
                                 INTVAL (rhs) - INTVAL (XEXP (lhs, 1)),
                                 NULL_RTX, mode_width - INTVAL (rhs),
                                 code == LSHIFTRT, 0, in_code == COMPARE);




This results in gen_rtx_SUBREG asserting. We can't really do this
extraction when the extraction mode (DImode in this case) is not
properly aligned within its original mode. In other words,
gen_rtx_SUBREG attempts to generate illegal rtl, such as:


(subreg:DI (reg/v:V2DI 75 [ vu16YPrediction3 ]) 4)

and asserts. The following patch avoids this problem. If this is OK, I
will submit a patch when fsf mainline is unfrozen.
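
The alignment condition described above can be tried out on plain byte counts. This is only a sketch under stated assumptions, not combine.c code: toy_subreg_offset_ok is a hypothetical helper, and the sizes are passed in as numbers (8 bytes for DImode, 16 bytes for V2DI) instead of being read from machine modes. It combines the existing range check with the modulo check that the patch adds.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not a GCC function: returns true when a value of
   OUTER_SIZE bytes can be taken at BYTE_OFFSET inside a value of
   INNER_SIZE bytes under the two conditions the patch tests.  */
bool
toy_subreg_offset_ok (unsigned outer_size, unsigned inner_size,
                      unsigned byte_offset)
{
  assert (outer_size > 0);

  /* Existing check: the offset must start inside the inner value.  */
  if (byte_offset >= inner_size)
    return false;

  /* Added check: the offset must be a multiple of the outer size.  */
  if (byte_offset % outer_size != 0)
    return false;

  return true;
}
```

With these numbers, toy_subreg_offset_ok (8, 16, 4) is false, which corresponds to the rejected (subreg:DI ... 4) above, while offsets 0 and 8 are accepted.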


- fariborz ([EMAIL PROTECTED])



Index: combine.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/combine.c,v
retrieving revision 1.475.2.5
diff -c -p -r1.475.2.5 combine.c
*** combine.c   26 Aug 2005 22:36:52 -  1.475.2.5
--- combine.c   22 Sep 2005 19:52:02 -
*** make_extraction (enum machine_mode mode,
*** 6197,6203 

  /* Avoid creating invalid subregs, for example when
 simplifying (x32)255.  */
! if (final_word >= GET_MODE_SIZE (inner_mode))
return NULL_RTX;

  new = gen_rtx_SUBREG (tmode, inner, final_word);
--- 6197,6204 

  /* Avoid creating invalid subregs, for example when
 simplifying (x32)255.  */
! if (final_word >= GET_MODE_SIZE (inner_mode)
! || (final_word % GET_MODE_SIZE (tmode)) != 0)
return NULL_RTX;

  new = gen_rtx_SUBREG (tmode, inner, final_word);



Can we have a symbol_ref node of a declared symbol without having its flags set?

2005-09-15 Thread Fariborz Jahanian
I ran into a problem when chasing down an -mfix-and-continue (an  
apple specialty :) code-gen problem.


In a test case, ivopts creates a symbol_ref via a call to  
produce_memory_decl_rtl; as in:


if (TREE_STATIC (obj) || DECL_EXTERNAL (obj))
  {
    const char *name = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (obj));

    x = gen_rtx_SYMBOL_REF (Pmode, name);
  }
...

But it does not set the flags for this symbol. This causes code gen
problems in certain cases, such as in apple-ppc-darwin PIC generation
code, which relies on these flags. An obvious fix that comes to mind is
to set the flags when the symbol_ref is created, as in this patch. But
a more general question is: should we always set the flags for a
symbol_ref whenever such a node is created for a declared symbol?



--- 2376,2404 
  static rtx
  produce_memory_decl_rtl (tree obj, int *regno)
  {
!   rtx x, ret;
    if (!obj)
      abort ();
    if (TREE_STATIC (obj) || DECL_EXTERNAL (obj))
      {
        const char *name = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (obj));

        x = gen_rtx_SYMBOL_REF (Pmode, name);
+       ret = gen_rtx_MEM (DECL_MODE (obj), x);
+       SET_DECL_RTL (obj, ret);
+       targetm.encode_section_info (obj, DECL_RTL (obj), true);
      }
    else
!     {
!       x = gen_raw_REG (Pmode, (*regno)++);
!       ret = gen_rtx_MEM (DECL_MODE (obj), x);
!     }

!   return ret;
  }

Thanks, fariborz ([EMAIL PROTECTED])



RFC - COST of const_double for x86 prevents constant copy propagation in cse

2005-08-25 Thread Fariborz Jahanian
(Note! I am starting a new thread of an old thread because of old  
thread's corruption which prevented me from responding).


Following test case:

struct S {
    double d1, d2, d3;
};

struct S ms()
{
    struct S s = {0,0,0};
    return s;
}

Compiled with -O1 -mdynamic-no-pic -march=pentium4 produces:

        pxor    %xmm0, %xmm0
        movsd   %xmm0, 16(%eax)
        movsd   %xmm0, 8(%eax)
        movsd   %xmm0, (%eax)

But the following code results in a 7% performance gain in eon, as
reported by one of Apple's performance people:


        movl    $0, 16(%eax)
        movl    $0, 20(%eax)
        movl    $0, 8(%eax)
        movl    $0, 12(%eax)
        movl    $0, (%eax)
        movl    $0, 4(%eax)

We do not get the second sequence because cse does not do the constant
propagation in this rtl (note that cse is capable of grabbing a constant
from a REG_EQUAL note):


(insn 12 7 13 0 (set (reg:DF 59)
        (mem/u/i:DF (symbol_ref/u:SI ("*LC0") [flags 0x2]) [0 S8 A64])) 64 {*movdf_nointeger} (nil)
    (expr_list:REG_EQUAL (const_double:DF 0.0 [0x0.0p+0])
        (nil)))

(insn 13 12 15 0 (set (mem/s/j:DF (plus:SI (reg/f:SI 58 [ D.1470 ])
                (const_int 16 [0x10])) [0 result.d3+0 S8 A32])
        (reg:DF 59)) 64 {*movdf_nointeger} (nil)
    (nil))

(insn 15 13 17 0 (set (mem/s/j:DF (plus:SI (reg/f:SI 58 [ D.1470 ])
                (const_int 8 [0x8])) [0 result.d2+0 S8 A32])
        (reg:DF 59)) 64 {*movdf_nointeger} (nil)
    (nil))

(insn 17 15 20 0 (set (mem/s/j:DF (reg/f:SI 58 [ D.1470 ]) [0 result.d1+0 S8 A32])
        (reg:DF 59)) 64 {*movdf_nointeger} (nil)
    (nil))

And the reason that it is not doing it is the definition of the COST
macro, which returns a higher cost for a const_double than when the
constant is available in a register. For the x86 platform, this cost is
evaluated in the call to ix86_rtx_costs. It returns 1 or 2. I had a
lengthy conversation with Ian Lance Taylor. He suggested lowering the
const_double cost to 0. And indeed, this lowers the cost so that the
const_double constant wins. But the careful selection of this cost in
ix86_rtx_costs makes me cautious that this may break performance on
some other flavors of x86 architecture and/or on some other
benchmarks. Any comments from those familiar with this cost function
(or any other way for cse to do its job, such as a special new cost
function) are appreciated.
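
The cost comparison described above can be illustrated with a very small model. This is a sketch only, not cse.c code: toy_prefer_constant is a hypothetical function, and it rests on two assumptions that the message implies but does not spell out, namely that a register operand costs 0 and that a tie goes to the constant.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the choice described above, not cse.c code:
   returns true when the constant should replace the register copy.
   ASSUMPTION: a tie between the two costs goes to the constant.  */
bool
toy_prefer_constant (int const_cost, int reg_cost)
{
  assert (const_cost >= 0 && reg_cost >= 0);
  return const_cost <= reg_cost;
}
```

Under these assumptions, a const_double cost of 1 or 2 loses against a register cost of 0, so the register copy is kept. A const_double cost of 0 ties with the register and the constant is propagated.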


- Thanks, fariborz ([EMAIL PROTECTED]).






Re: RFC - COST of const_double for x86 prevents constant copy propagation in cse

2005-08-25 Thread Fariborz Jahanian

Forgot to attach the patch:

Index: i386.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/i386/i386.c,v
retrieving revision 1.795.4.33
diff -c -p -r1.795.4.33 i386.c
*** i386.c  15 Aug 2005 23:36:10 -  1.795.4.33
--- i386.c  25 Aug 2005 17:08:33 -
*** ix86_rtx_costs (rtx x, int code, int out
*** 15730,15740 
else
switch (standard_80387_constant_p (x))
  {
! case 1: /* 0.0 */
!   *total = 1;
!   break;
! default: /* Other constants */
!   *total = 2;
break;
  case 0:
  case -1:
--- 15730,15737 
else
switch (standard_80387_constant_p (x))
  {
! default: /* All constants */
!   *total = 0;
break;
  case 0:
  case -1:









Re: RFC - COST of const_double for x86 prevents constant copy propagation in cse

2005-08-25 Thread Fariborz Jahanian


On Aug 25, 2005, at 12:47 PM, H. J. Lu wrote:


On Thu, Aug 25, 2005 at 12:37:32PM -0700, Ian Lance Taylor wrote:


Fariborz Jahanian [EMAIL PROTECTED] writes:



Forgot to attach the patch:

Index: i386.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/i386/i386.c,v
retrieving revision 1.795.4.33
diff -c -p -r1.795.4.33 i386.c
*** i386.c  15 Aug 2005 23:36:10 -  1.795.4.33
--- i386.c  25 Aug 2005 17:08:33 -
*** ix86_rtx_costs (rtx x, int code, int out
*** 15730,15740 
 else
 switch (standard_80387_constant_p (x))
   {
! case 1: /* 0.0 */
!   *total = 1;
!   break;
! default: /* Other constants */
!   *total = 2;
 break;
   case 0:
   case -1:
--- 15730,15737 
 else
 switch (standard_80387_constant_p (x))
   {
! default: /* All constants */
!   *total = 0;
 break;
   case 0:
   case -1:



For what it's worth, as I told Fariborz, I suspect that returning 0 is
correct for SFmode, but I'm somewhat doubtful for DFmode.  And his
test case is odd since the resulting code has more instructions and is
larger.  I know little about x86 instruction timings, but it seems
surprising that the new sequence is faster.  Maybe the problem is in
using %xmm0 instead of one of the 80387 registers--or, since this is
after all merely a constant--one of the general registers.

And in any case this type of thing should be controlled by an entry in
the i386 processor_costs structure.



I think the problem may be somewhere else. I got the same xmm0 code
sequence on Linux/ia32 with -msse3 -mfpmath=sse. However, I got

        xorl    %eax, %eax
        movq    %rax, 16(%rdi)
        movq    %rax, 8(%rdi)
        movq    %rax, (%rdi)


Can you try this with -march=pentium4?

- fariborz



on Linux/x86-64.


H.J.





bootstrap of gcc mainline on apple-x86-darwin is broken

2005-07-19 Thread Fariborz Jahanian
Today's checkout and bootstrap on  apple-x86-darwin resulted in the  
following:


stage1/xgcc -Bstage1/ -B/usr/local/i686-apple-darwin8.1.0/bin/   -O2 -g -fomit-frame-pointer -DIN_GCC   -W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -pedantic -Wno-long-long -Wno-variadic-macros -Wold-style-definition -Werror -fno-common   -DHAVE_CONFIG_H -DGENERATOR_FILE  -o build/genattrtab \
 build/genattrtab.o build/genautomata.o \
 build/rtl.o build/read-rtl.o build/ggc-none.o build/min-insn-modes.o build/gensupport.o build/insn-conditions.o build/print-rtl.o build/errors.o \
 build/varray.o ../build-i686-apple-darwin8.1.0/libiberty/libiberty.a -lm
build/genattrtab ../../gcc-mainline/gcc/config/i386/i386.md > tmp-attrtab.c

make[2]: *** [s-attrtab] Error 139
make[1]: *** [stage2_build] Error 2
make: *** [bootstrap] Error 2

Is this known?

- Thanks, fariborz



x86 build is broken

2005-07-08 Thread Fariborz Jahanian
Tried building fsf mainline on x86-darwin. Syntax error compiling
c-common.c. The preprocessed file shows the following:



if ("__builtin_" "acosf" && 1)
  { tree decl;
    ((void)(!((!1 && !1) || !strncmp ("__builtin_" "acosf",
      "__builtin_", strlen ("__builtin_"))) ? fancy_abort (
      "../../gcc-mainline/gcc/builtins.def", 162, __FUNCTION__), 0 : 0));

    if (!1)
      decl = lang_hooks.builtin_function ("__builtin_" "acosf",
        builtin_types[BT_FN_FLOAT_FLOAT], BUILT_IN_ACOSF, BUILT_IN_NORMAL,
        (1 ? ("__builtin_" "acosf" + strlen ("__builtin_")) : ((void *)0)),
        built_in_attributes[(int) (flag_errno_math ? ATTR_NOTHROW_LIST :
        (flag_unsafe_math_optimizations ? ATTR_CONST_NOTHROW_LIST :
        ATTR_PURE_NOTHROW_NOVOPS_LIST))]);

    else
      decl = builtin_function_2 ("__builtin_" "acosf", "__builtin_"
        "acosf" + strlen ("__builtin_"), builtin_types[BT_FN_FLOAT_FLOAT],
        builtin_types[BT_FN_FLOAT_FLOAT], BUILT_IN_ACOSF, BUILT_IN_NORMAL,
        1, !flag_isoc99, built_in_attributes[(int) (flag_errno_math ?
        ATTR_NOTHROW_LIST : (flag_unsafe_math_optimizations ?
        ATTR_CONST_NOTHROW_LIST : ATTR_PURE_NOTHROW_NOVOPS_LIST))]);

    built_in_decls[(int) BUILT_IN_ACOSF] = decl;
    if ()
       ^^
      implicit_built_in_decls[(int) BUILT_IN_ACOSF] = decl;
  }


Which is the result of macro expansion of:


DEF_C99_C90RES_BUILTIN (BUILT_IN_ACOSF, "acosf", BT_FN_FLOAT_FLOAT,
ATTR_MATHFN_FPROUNDING_ERRNO)


in builtins.def.

- fariborz



Re: x86 build is broken

2005-07-08 Thread Fariborz Jahanian


On Jul 8, 2005, at 5:36 PM, Daniel Berlin wrote:


On Fri, 2005-07-08 at 17:13 -0700, Fariborz Jahanian wrote:


Tried building fsf mainline on x86-darwin. Syntax error compiling
c-common.c. The preprocessed file shows the following:




as of when?

I bootstrapped and tested x86_64-unknown-linux-gnu and x86-linux-gnu and
powerpc-linux-gnu in the 2.5 hours before committing my patch, so i'm
pretty sure it wasn't me :)


I did a fresh update just to be sure; still broken for x86-darwin. I  
don't think it is related to your change. It could be darwin specific.


- fariborz









Re: x86 build is broken

2005-07-08 Thread Fariborz Jahanian


On Jul 8, 2005, at 5:41 PM, Andrew Pinski wrote:



On Jul 8, 2005, at 8:13 PM, Fariborz Jahanian wrote:


Tried building fsf mainline on x86-darwin. Syntax error compiling  
c-common.c. The preprocessed file shows the following:




This is a darwin specific bug and was introduced by Geoff K.'s  
patch today.

I committed this as obvious to fix the bug.

Thanks,
Andrew Pinski

ChangeLog:

* config/darwin.h (TARGET_C99_FUNCTIONS): Define to 1.


Yes. This should fix it. Thanks.

- fariborz









Re: [RFH] - Less than optimal code compiling 252.eon -O2 for x86

2005-06-30 Thread Fariborz Jahanian


On Jun 30, 2005, at 11:23 AM, Jeffrey A Law wrote:


On Thu, 2005-06-30 at 20:12 +0200, Bernd Schmidt wrote:


Jeffrey A Law wrote:


I'd tend to agree.  I'd rather see the option go away than linger on
if the option is no longer useful.



I wouldn't mind that, but I'd also like to point out that there are
Makefiles out there which hard-code things like -fforce-mem.  Do we want
to keep the option as a stub to avoid breaking them?


Excellent point.  I believe in other cases we've kept the option
around for a release, then killed it.


I would also like to keep this option around for a while. It is
possible that having this option set under -O2/-O3 has masked some
optimization bugs, in which case adding -fforce-mem would be a
temporary workaround.


- fariborz



jeff








Re: [RFH] - Less than optimal code compiling 252.eon -O2 for x86

2005-06-30 Thread Fariborz Jahanian


On Jun 30, 2005, at 12:47 PM, Steven Bosscher wrote:



Well, maybe so, but it would be a pretty lame workaround.  Why are you
so worried about bugs?  This flag was always disabled at -O1, and we
have never seen any bug reports that got fixed with -fforced-mem.  And
besides, it is better to fix bugs than to work around them.

Making the option a nop, issuing a warning in 4.1 and removing the
option completely for gcc 4.2 looks like a very reasonable approach to
me.



OK. This seems to be the consensus and I will prepare a patch based on
that.


- Thanks, fariborz


Gr.
Steven






Re: [RFH] - Less than optimal code compiling 252.eon -O2 for x86

2005-06-27 Thread Fariborz Jahanian


On Jun 27, 2005, at 12:56 PM, Richard Henderson wrote:


Hmm.  I would suspect this is obsolete now.  We'll have forced
everything into registers (or something equivalent that we
can work with) during tree optimization.  Any CSEs that can be
made should have been made.



I will do a sanity check followed by SPEC runs (x86 and ppc darwin)
and see if behavior changes by obsoleting -fforce-mem in -O2 (or
higher).


- Thanks, fariborz



r~





[RFH] - Less than optimal code compiling 252.eon -O2 for x86

2005-06-24 Thread Fariborz Jahanian
A source file, mrSurfaceList.cc of 252.eon, produces less efficient
code initializing instance objects to 0 at -O2 than at -O1. The behavior
is erratic: it does not happen on all x86 platforms, and making the
test smaller makes the problem go away. But here is what I found to be
the cause.


When the source is compiled with -O1 -march=pentium4, the 'cse' phase
sees the following pattern initializing a 'double' with 0.


(insn 18 13 19 0 (set (reg:SF 109)
        (mem/u/i:SF (symbol_ref/u:SI ("*LC11") [flags 0x2]) [0 S4 A32])) -1 (nil)
    (nil))

(insn 19 18 20 0 (set (mem/s/j:DF (plus:SI (reg/f:SI 20 frame)
                (const_int -32 [0xffffffe0])) [0 objectBox.pmin.e+16 S8 A128])
        (float_extend:DF (reg:SF 109))) 86 {*extendsfdf2_sse} (nil)
    (nil))

Then the fold_rtx routine converts it into its reduced form, resulting
in optimum code:


(insn 19 13 21 0 (set (mem/s/j:DF (plus:SI (reg/f:SI 20 frame)
                (const_int -32 [0xffffffe0])) [0 objectBox.pmin.e+16 S8 A128])
        (const_double:DF 0.0 [0x0.0p+0])) 64 {*movdf_nointeger} (nil)
    (nil))


But when the same source is compiled with -O2 -march=pentium4, the 'cse'
phase sees a slightly different pattern (note that float_extend:DF
has moved):


(insn 18 13 19 0 (set (reg:DF 109)
        (float_extend:DF (mem/u/i:SF (symbol_ref/u:SI ("*LC13") [flags 0x2]) [0 S4 A32]))) -1 (nil)
    (nil))

(insn 19 18 20 0 (set (mem/s/j:DF (plus:SI (reg/f:SI 20 frame)
                (const_int -32 [0xffffffe0])) [0 objectBox.pmin.e+16 S8 A128])
        (reg:DF 109)) 64 {*movdf_nointeger} (nil)
    (nil))

This cannot be simplified by fold_rtx, resulting in less efficient code.

The change in pattern is most likely because of additional tree
optimization phases running at -O2. If so, should cse be taught to
simplify the new rtl pattern? Or does the tree optimizer phase
responsible for the less than optimal tree need to be tweaked to generate
the same tree as with -O1?


Thanks, fariborz



[RFC] Problem with altivec_vmrghb pattern in altivec.md

2005-05-06 Thread Fariborz Jahanian
One of our internal apps fails due to a problem in folding of vec_mergeh
of unsigned char vectors of zeros and ones. It produces a new vector of
zeros followed by ones. I traced the problem to the 3rd operand of the
altivec_vmrghb pattern defined in the altivec.md file. It is 255 (0xff).
I think it should be 21845 (0x5555) for unsigned chars. With this change,
customer code passes and the merged pattern looks OK.

So far so good. But I tried the following test case (with -O2) and,
curiously enough, it works OK with or *without* my change! So, I am
wondering if I approached this problem correctly.

From the code in simplify-rtx.c where the value of a merge of two
constants in a VEC_MERGE rtl is computed, it seems that the correct value
for element selection should be 0x5555. But I am curious why changing
this value did not make a difference in the following test case
(compiled with -O2).

- Thanks, fariborz ([EMAIL PROTECTED])

#include <stdio.h>
int main (int argc, const char * argv[]) {
    vector unsigned char v_zero;
    vector unsigned char v_c1;
    v_zero = (vector unsigned char)
        ('a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p');
    v_c1 = (vector unsigned char)
        ('1','2','3','4','5','6','7','8','1','2','3','4','5','6','7','8');

    vector unsigned char vResult = vec_mergeh(v_zero, v_c1);
    printf ("\t%vc\n", vResult);
    return 0;
}
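
The effect of the two mask values discussed above can be seen in a toy model of element selection. This is a sketch in portable C, not GCC or AltiVec code: toy_vec_merge is a hypothetical function, the inputs are assumed to be already arranged the way the pattern's inner operands deliver them, and it assumes that bit N of the mask, counting from the low-order bit, picks result element N from the first input when set and from the second when clear.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NELTS 16

/* Toy model of a 16-element VEC_MERGE, not GCC code.  ASSUMPTION: bit N
   of MASK (counting from the low-order bit) set means result element N
   comes from A, clear means it comes from B.  */
void
toy_vec_merge (const unsigned char *a, const unsigned char *b,
               unsigned mask, unsigned char *out)
{
  assert (a != NULL && b != NULL && out != NULL);
  for (size_t i = 0; i < TOY_NELTS; i++)
    out[i] = ((mask >> i) & 1u) ? a[i] : b[i];
}
```

With a vector of zeros as A and a vector of ones as B, a mask of 0xff yields eight zeros followed by eight ones, which matches the failure described in the message, while 0x5555 yields alternating zeros and ones.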


bootstrap fails for apple-ppc-darwin

2005-03-31 Thread Fariborz Jahanian
Today, I tried bootstrapping gcc mainline on/for apple-ppc-darwin. It 
fails in stage1.
Is this known?

- Thanks, fariborz
./xgcc -B./ -B/usr/local/powerpc-apple-darwin8.0.0/bin/ -isystem 
/usr/local/powerpc-apple-darwin8.0.0/include -isystem 
/usr/local/powerpc-apple-darwin8.0.0/sys-include 
-L/Volumes/sandbox/gcc-mainline-bootstrap.obj/gcc/../ld -O2  -DIN_GCC   
 -W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes 
-Wold-style-definition  -isystem ./include  -Wa,-force_cpusubtype_ALL 
-g -DHAVE_GTHR_DEFAULT -DIN_LIBGCC2 -D__GCC_FLOAT_NOT_NEEDED  
-dynamiclib -nodefaultlibs 
-Wl,-install_name,/usr/local/lib/libgcc_s.1.0.dylib -Wl,-flat_namespace 
-o ppc64/libgcc_s.1.0.dylib.tmp 
-Wl,-exported_symbols_list,libgcc/ppc64/libgcc.map 
-compatibility_version 1 -current_version 1.0  -m64 
libgcc/ppc64/_muldi3_s.o libgcc/ppc64/_negdi2_s.o 
libgcc/ppc64/_lshrdi3_s.o libgcc/ppc64/_ashldi3_s.o 
libgcc/ppc64/_ashrdi3_s.o libgcc/ppc64/_cmpdi2_s.o 
libgcc/ppc64/_ucmpdi2_s.o libgcc/ppc64/_floatdidf_s.o 
libgcc/ppc64/_floatdisf_s.o libgcc/ppc64/_fixunsdfsi_s.o 
libgcc/ppc64/_fixunssfsi_s.o libgcc/ppc64/_fixunsdfdi_s.o 
libgcc/ppc64/_fixdfdi_s.o libgcc/ppc64/_fixunssfdi_s.o 
libgcc/ppc64/_fixsfdi_s.o libgcc/ppc64/_fixxfdi_s.o 
libgcc/ppc64/_fixunsxfdi_s.o libgcc/ppc64/_floatdixf_s.o 
libgcc/ppc64/_fixunsxfsi_s.o libgcc/ppc64/_fixtfdi_s.o 
libgcc/ppc64/_fixunstfdi_s.o libgcc/ppc64/_floatditf_s.o 
libgcc/ppc64/_clear_cache_s.o libgcc/ppc64/_enable_execute_stack_s.o 
libgcc/ppc64/_trampoline_s.o libgcc/ppc64/__main_s.o 
libgcc/ppc64/_absvsi2_s.o libgcc/ppc64/_absvdi2_s.o 
libgcc/ppc64/_addvsi3_s.o libgcc/ppc64/_addvdi3_s.o 
libgcc/ppc64/_subvsi3_s.o libgcc/ppc64/_subvdi3_s.o 
libgcc/ppc64/_mulvsi3_s.o libgcc/ppc64/_mulvdi3_s.o 
libgcc/ppc64/_negvsi2_s.o libgcc/ppc64/_negvdi2_s.o 
libgcc/ppc64/_ctors_s.o libgcc/ppc64/_ffssi2_s.o 
libgcc/ppc64/_ffsdi2_s.o libgcc/ppc64/_clz_s.o libgcc/ppc64/_clzsi2_s.o 
libgcc/ppc64/_clzdi2_s.o libgcc/ppc64/_ctzsi2_s.o 
libgcc/ppc64/_ctzdi2_s.o libgcc/ppc64/_popcount_tab_s.o 
libgcc/ppc64/_popcountsi2_s.o libgcc/ppc64/_popcountdi2_s.o 
libgcc/ppc64/_paritysi2_s.o libgcc/ppc64/_paritydi2_s.o 
libgcc/ppc64/_powisf2_s.o libgcc/ppc64/_powidf2_s.o 
libgcc/ppc64/_powixf2_s.o libgcc/ppc64/_powitf2_s.o 
libgcc/ppc64/_mulsc3_s.o libgcc/ppc64/_muldc3_s.o 
libgcc/ppc64/_mulxc3_s.o libgcc/ppc64/_multc3_s.o 
libgcc/ppc64/_divsc3_s.o libgcc/ppc64/_divdc3_s.o 
libgcc/ppc64/_divxc3_s.o libgcc/ppc64/_divtc3_s.o 
libgcc/ppc64/_divdi3_s.o libgcc/ppc64/_moddi3_s.o 
libgcc/ppc64/_udivdi3_s.o libgcc/ppc64/_umoddi3_s.o 
libgcc/ppc64/_udiv_w_sdiv_s.o libgcc/ppc64/_udivmoddi4_s.o 
libgcc/ppc64/darwin-tramp_s.o libgcc/ppc64/darwin-ldouble_s.o 
libgcc/ppc64/unwind-dw2_s.o libgcc/ppc64/unwind-dw2-fde-darwin_s.o 
libgcc/ppc64/unwind-sjlj_s.o libgcc/ppc64/unwind-c_s.o 
libgcc/ppc64/darwin-fallback_s.o -lc && rm -f ppc64/libgcc_s.dylib && 
if [ -f ppc64/libgcc_s.1.0.dylib ]; then mv -f ppc64/libgcc_s.1.0.dylib 
ppc64/libgcc_s.1.0.dylib.backup; else true; fi && mv 
ppc64/libgcc_s.1.0.dylib.tmp ppc64/libgcc_s.1.0.dylib && ln -s 
libgcc_s.1.0.dylib ppc64/libgcc_s.dylib
/usr/bin/libtool: fatal error in ld64
make[3]: *** [ppc64/libgcc_s.dylib] Error 1
make[2]: *** [libgcc.a] Error 2
make[1]: *** [stage1_build] Error 2
make: *** [bootstrap] Error 2



C++ [RFC] taking address of a static const data member

2005-03-11 Thread Fariborz Jahanian
Section 9.4.2 of the c++ standard, "Static data members", does not
directly address this issue. But there is a dejagnu c++ test case which
explicitly disallows (by issuing a link-time error) taking the address of
a static const data member. The test case is const2.C.
This question has come up because g++-4.0 (ppc-darwin target) issues
the same link error for the following test case (which requires taking
the address of Foo::foo).

#include <map>
struct Foo { static const int foo = 0x3ab; };
int main()
{
    std::map<int, int> m;
    m[Foo::foo];
}

And here is const2.C for easy reference:

// { dg-do link }
// This test should get a linker error for the reference to A<int>::i.
// { dg-error "i" "" { target *-*-* } 0 }
template <class T> struct B { static const int i = 3; };
template <class T> struct A { static const int i = B<T>::i; };
const int *p = &A<int>::i;
int main(){}

So, is g++ correct in rejecting this seemingly good user code?

- Thanks, fariborz ([EMAIL PROTECTED])


Re: C++ [RFC] taking address of a static const data member

2005-03-11 Thread Fariborz Jahanian
Thanks Andrew. Yes, the standard actually mentions this; I missed it.
- fariborz

On Mar 11, 2005, at 11:25 AM, Andrew Pinski wrote:

On Mar 11, 2005, at 2:16 PM, Fariborz Jahanian wrote:

So, is g++ correct in rejecting this seemingly good user code?

Yes you need a place to store the data.
So for an example in your original testcase, you need:
const int Foo::foo;
Which fixes the problem and yes 9.4.2 explains this (I cannot find it
right now but I know there has been multiple bugs about this in the past).

-- Pinski



Re: patch [RFC] Simple loop runs out of stack at -O1

2005-02-25 Thread Fariborz Jahanian
On Feb 25, 2005, at 1:16 PM, Joe Buck wrote:

I duplicated this on a i686-pc-linux-gnu system: the compiler is built
from last night's trunk.

% /usr/localdisk/gcc-cvs/trunk/bin/gcc -c -O1 bad.c
gcc: Internal error: Segmentation fault (program cc1)
Please submit a full bug report.
See <URL:http://gcc.gnu.org/bugs.html> for instructions.

Could you please file a PR and attach the proposed patch?

I will shortly.

- fariborz