[Bug c/28418] [4.0/4.1 regression] ICE incrementing compound literal expression

2006-08-25 Thread fjahanian at apple dot com


--- Comment #8 from fjahanian at apple dot com  2006-08-25 21:36 ---
I was about to submit the patch. Thank you for this patch.

- Fariborz

> Subject: Bug 28418
> 
> Author: jsm28
> Date: Fri Aug 25 21:14:24 2006
> New Revision: 116436
> 
> URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=116436
> Log:
> 2006-08-25  Fariborz Jahanian  <[EMAIL PROTECTED]>
> 
> PR c/28418
> * c-gimplify.c (gimplify_compound_literal_expr): Don't add
> variable again if DECL_SEEN_IN_BIND_EXPR_P.
> 
> 2006-08-25  Joseph S. Myers  <[EMAIL PROTECTED]>
> 
> * gcc.c-torture/compile/compound-literal-1.c: New test.
> 
> Added:
> trunk/gcc/testsuite/gcc.c-torture/compile/compound-literal-1.c
> Modified:
> trunk/gcc/ChangeLog
> trunk/gcc/c-gimplify.c
> trunk/gcc/testsuite/ChangeLog
> 


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28418



[Bug c++/28554] New: Use of __attribute__ ((constructor)) on functions issues confusing error

2006-07-31 Thread fjahanian at apple dot com
In this test case, g++ issues a confusing diagnostic. If using
__attribute__ ((constructor)) on a function with an argument list other than
'void' is illegal, the compiler should say so:

__attribute__ ((constructor))
static void Initialize(int argc, char *argv[], char *envp[]) {
}

% g++ -c ctor.C
ctor.C: In function '(static initializers for ctor.C)':
ctor.C:2: error: too few arguments to function 'void Initialize(int, char**, char**)'
ctor.C:3: error: at this point in file
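
For comparison, a constructor function declared with a (void) parameter list is accepted, which matches the no-argument call the diagnostic above complains about. A minimal sketch in C, assuming GCC or a compatible compiler; the names initialized, initialize and was_initialized are hypothetical and not part of the report:

```c
#include <assert.h>

/* Hypothetical flag recording that the constructor ran. */
static int initialized = 0;

/* A constructor runs before main() and is invoked without
   arguments here, so a (void) parameter list matches the call. */
__attribute__ ((constructor))
static void initialize (void)
{
  initialized = 1;
}

/* Helper so callers can check whether the constructor ran. */
static int was_initialized (void)
{
  assert (initialized == 0 || initialized == 1);
  return initialized;
}
```

Calling was_initialized () from main should return 1, since the constructor has already run by then.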


-- 
   Summary: Use of __attribute__ ((constructor)) on functions issues
confusing error
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
AssignedTo: unassigned at gcc dot gnu dot org
    ReportedBy: fjahanian at apple dot com
 GCC build triplet: apple-ppc-darwin
  GCC host triplet: apple-ppc-darwin
GCC target triplet: apple-ppc-darwin


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28554



[Bug c/28418] [4.0/4.1/4.2 regression] ICE incrementing compound literal expression

2006-07-24 Thread fjahanian at apple dot com


--- Comment #3 from fjahanian at apple dot com  2006-07-24 23:16 ---
gcc generates different trees for compound literals in C and C++, as in this
test case:

struct S {
int i,j;
};
void foo (struct S);

int main ()
{
foo((struct S){1,1});
}


In C it generates a compound_literal_expr and in C++ it generates a target_expr.
But the gimplifier treats them differently in the following areas:

1) In routine mostly_copy_tree_r we don't copy target_expr but we do copy
compound_literal_expr. I see the following comment there:

/* Similar to copy_tree_r() but do not copy SAVE_EXPR or TARGET_EXPR nodes.
   These nodes model computations that should only be done once.  If we
   were to unshare something like SAVE_EXPR(i++), the gimplification
   process would create wrong code.  */

Shouldn't compound_literal_expr be treated the same as target_expr here?

2) gimplify_target_expr can be called more than once on the same target_expr
node because the first time around its TARGET_EXPR_INITIAL is set to NULL.
This works as a guard and prevents its temporary from being added to the
temporary list more than once (when the call to gimple_add_tmp_var is made).

On the other hand, no such guard exists for a compound_literal_expr,
and when gimple_add_tmp_var is called again, it asserts. So I added a check for
!DECL_SEEN_IN_BIND_EXPR_P (decl) in gimplify_compound_literal_expr before
the call to gimple_add_tmp_var is made, as in the following diff:

% svn diff c-gimplify.c
Index: c-gimplify.c
===
--- c-gimplify.c        (revision 116462)
+++ c-gimplify.c        (working copy)
@@ -538,7 +538,7 @@
   /* This decl isn't mentioned in the enclosing block, so add it to the
  list of temps.  FIXME it seems a bit of a kludge to say that
  anonymous artificial vars aren't pushed, but everything else is.  */
-  if (DECL_NAME (decl) == NULL_TREE)
+  if (DECL_NAME (decl) == NULL_TREE && !DECL_SEEN_IN_BIND_EXPR_P (decl))
 gimple_add_tmp_var (decl);

This fixes the problem I am encountering as well as the test case in this PR.
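
The effect of the added check can be illustrated with a toy model in plain C. This is only a sketch: struct decl, add_tmp_var and gimplify_literal are hypothetical stand-ins, not GCC's data structures or functions, and the seen flag plays the role of DECL_SEEN_IN_BIND_EXPR_P.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a declaration node. */
struct decl {
  const char *name;   /* NULL for an anonymous temporary */
  bool seen;          /* models DECL_SEEN_IN_BIND_EXPR_P */
};

static int tmp_count = 0;   /* number of temporaries registered */

/* Models gimple_add_tmp_var: registering the same decl twice
   is treated as a bug and trips the assertion. */
static void add_tmp_var (struct decl *d)
{
  assert (!d->seen);
  d->seen = true;
  tmp_count++;
}

/* Models the patched check: only anonymous decls that have not
   been registered yet are added, so visiting the same node a
   second time is harmless. */
static void gimplify_literal (struct decl *d)
{
  if (d->name == NULL && !d->seen)
    add_tmp_var (d);
}
```

Without the !d->seen test in gimplify_literal, a second call on the same node would reach the assertion in add_tmp_var, which is the analogue of the ICE described above.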


-- 

fjahanian at apple dot com changed:

   What|Removed |Added

 CC|            |fjahanian at apple dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28418



[Bug c++/24260] [4.0/4.1 Regression] stdcall attribute is ignored at static member template functions

2005-10-19 Thread fjahanian at apple dot com


--- Comment #6 from fjahanian at apple dot com  2005-10-19 17:11 ---
(In reply to comment #5)
> And did fjahanian take a look at this already to see if he
> really is to blame for causing this bug?
> 

I am miffed as to why my name was in ChangeLog-2004. PR/13989 and PR/9844 were
fixed by Ziemwit Laski (no longer at Apple). Andrew Pinski may know more about
this as he commented and pointed to Ziem's patch in that radar. annotate on
ChangeLog-2004 did not reveal any useful info.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=24260



[Bug target/14552] compiled trivial vector intrinsic code is ineffiencent

2005-09-13 Thread fjahanian at apple dot com

--- Additional Comments From fjahanian at apple dot com  2005-09-13 21:09 ---
Hello,

What is the status of Uros's patches in:

http://gcc.gnu.org/ml/gcc-patches/2005-07/msg01128.html

Looks like they did not make it to FSF mainline? Are there remaining issues 
with them?



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14552


[Bug target/22152] Poor loop optimization when using mmx builtins

2005-09-12 Thread fjahanian at apple dot com

--- Additional Comments From fjahanian at apple dot com  2005-09-13 00:52 ---
Has there been any progress toward fixing the problems addressed by these PRs?
- thanks.

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22152


[Bug middle-end/21894] [4.0/4.1 Regression] Invalid operand to binary operator with nested function

2005-08-08 Thread fjahanian at apple dot com

--- Additional Comments From fjahanian at apple dot com  2005-08-08 17:36 ---
Thanks. Test case should say PR 21894.
> Fixed.

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21894


[Bug rtl-optimization/22152] New: Poor loop optimization when using sse2 builtins - regression from 3.3

2005-06-22 Thread fjahanian at apple dot com
In the following trivial test case, gcc-4.1 produces very inefficient code for
the loop. gcc-3.3 produces much better code.

typedef int __m64 __attribute__ ((__vector_size__ (8)));

__m64 unsigned_add3( const __m64 *a, const __m64 *b, unsigned long count )
{
    __m64 sum;
    unsigned int i;

    for( i = 1; i < count; i++ )
    {
        sum = (__m64) __builtin_ia32_paddq ((long long)a[i], (long long)b[i]);
    }
    return sum;
}

1) Loop when compiled with gcc-4.1 -O2 -msse2 (note in particular the extra 
movq to memory):

L4:
        movl    12(%ebp), %esi
        movq    (%eax,%edx,8), %mm0
        paddq   (%esi,%edx,8), %mm0
        incl    %edx
        cmpl    %edx, %ecx
        movq    %mm0, -16(%ebp)
        movl    -16(%ebp), %esi
        movl    -12(%ebp), %edi
        jne     L4

2) Loop using gcc-3.3 compiled with -O2 -msse2:

L6:
        movq    (%esi,%edx,8), %mm0
        paddq   (%eax,%edx,8), %mm0
        addl    $1, %edx
        cmpl    %ecx, %edx
        jb      L6

AFAICT, the culprit is reload, which generates an extra store and load of %mm0:

(insn 62 30 63 2 (set (mem:V2SI (plus:SI (reg/f:SI 6 bp)
(const_int -16 [0xfff0])) [0 S8 A8])
(reg:V2SI 29 mm0)) 736 {*movv2si_internal} (nil)
(nil))

(insn 63 62 32 2 (set (reg/v:V2SI 4 si [orig:61 sum ] [61])
(mem:V2SI (plus:SI (reg/f:SI 6 bp)
(const_int -16 [0xfff0])) [0 S8 A8])) 736 {*movv2si_internal} (nil)
(nil))

Here is the larger test case from which the above test was extracted:

#include 

__m64 unsigned_add3( const __m64 *a, const __m64 *b, __m64 *result, unsigned long count )
{
    __m64 carry, temp, sum, one, onesCarry, _a, _b;
    unsigned int i;

    if( count > 0 )
    {
        _a = a[0];
        _b = b[0];

        one = _mm_cmpeq_pi8( _a, _a );  //-1
        one = _mm_sub_si64( _mm_xor_si64( one, one ), one );    //1
        sum = _mm_add_si64( _a, _b );

        onesCarry = _mm_and_si64( _a, _b ); //the 1's bit is set only if the 1's bit add generates a carry
        onesCarry = _mm_and_si64( onesCarry, one ); //onesCarry &= 1

        //Trim off the one's bit on both vA and vB to make room for a carry bit at the top after the add
        _a = _mm_srli_si64( _a, 1 );    //vA >>= 1
        _b = _mm_srli_si64( _b, 1 );    //vB >>= 1

        //Add vA to vB and add the carry bit
        carry = _mm_add_si64( _a, _b );
        carry = _mm_add_si64( carry, onesCarry );

        //right shift by 63 bits to get the carry bit for the high 64 bit quantity
        carry = _mm_srli_si64( carry, 63 );

        for( i = 1; i < count; i++ )
        {
            result[i-1] = sum;
            _a = a[i];
            _b = b[i];
            onesCarry = _mm_and_si64( _a, _b );
            onesCarry = _mm_and_si64( onesCarry, one );
            sum = _mm_add_si64( _a, _b );
            _a = _mm_add_si64( _a, onesCarry );
            onesCarry = _mm_and_si64( carry, _a );  //find low bit carry
            sum = _mm_add_si64( sum, carry );   //add in carry bit to low word sum
            carry = _mm_add_si64( _a, onesCarry );  //add in low bit carry to high result
        }

        result[i-1] = sum;
    }

    return carry;
}

Again, gcc-3.3 produces much better code for this loop.
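
The carry computation at the top of the larger test case can be checked with scalar code. The sketch below is not part of the report: carry_out is a hypothetical helper that mirrors the shift-and-add sequence on plain 64-bit integers, and carry_out_ref is the usual wraparound test it is compared against.

```c
#include <assert.h>
#include <stdint.h>

/* Carry out of a 64-bit add, computed without a wider type:
   shift both operands right by one to make room for the carry,
   add back the carry generated by the two low bits, and read
   the carry from bit 63 of the result. */
static uint64_t carry_out (uint64_t a, uint64_t b)
{
  uint64_t ones_carry = a & b & 1;   /* set only if both low bits are set */
  uint64_t sum = (a >> 1) + (b >> 1) + ones_carry;
  return sum >> 63;
}

/* Reference: unsigned addition wraps around, so a carry occurred
   exactly when the wrapped sum is smaller than an operand. */
static uint64_t carry_out_ref (uint64_t a, uint64_t b)
{
  return (uint64_t) ((a + b) < a);
}
```

The intermediate sum cannot overflow, because each shifted operand is below 2^63 and at most 1 is added.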

-- 
   Summary: Poor loop optimization when using sse2 builtins -
regression from 3.3
   Product: gcc
   Version: 4.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P2
 Component: rtl-optimization
AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: fjahanian at apple dot com
CC: gcc-bugs at gcc dot gnu dot org
 GCC build triplet: apple-x86-darwin
  GCC host triplet: apple-x86-darwin
GCC target triplet: apple-x86-darwin


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22152


[Bug c++/22009] New: Friend declaration of a private member function produces error in g++-4.0

2005-06-10 Thread fjahanian at apple dot com
I think the following test case is correct, but g++-4.0 produces a diagnostic.
We should be able to declare a private member function of a class as a friend
of another class so that the member function is able to access private members
of the befriended class.

class FriendTestTo;

class FriendTestFrom
{
private:
void reallySetIt (FriendTestTo* PF);
};

class FriendTestTo
{
private:
  int i;
  friend void FriendTestFrom::reallySetIt (FriendTestTo*);
};

void FriendTestFrom::reallySetIt (FriendTestTo* PF){ PF->i = 1; };
% g++ -c test.cc
test.cc:6: error: 'void FriendTestFrom::reallySetIt(FriendTestTo*)' is private
test.cc:13: error: within this context

Workaround is to declare class FriendTestFrom as friend of class FriendTestTo.

-- 
   Summary: Friend declaration of a private member function produces
error in g++-4.0
   Product: gcc
   Version: 4.0.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P2
 Component: c++
AssignedTo: unassigned at gcc dot gnu dot org
    ReportedBy: fjahanian at apple dot com
CC: gcc-bugs at gcc dot gnu dot org
 GCC build triplet: apple-ppc-darwin
  GCC host triplet: apple-ppc-darwin
GCC target triplet: apple-ppc-darwin


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22009


[Bug tree-optimization/21894] New: gcc crashes with -O1 on a call to nested function

2005-06-02 Thread fjahanian at apple dot com
Following test ICEs with gcc mainline when compiled with -O1. Test was done on 
apple-ppc-darwin.
% gcc -c -O1 bad.c
bad.c: In function 'CheckFile':
bad.c:2: internal compiler error: Bus error
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://developer.apple.com/bugreporter> for instructions.

/* TEST */
typedef unsigned char uchar;
static void CheckFile () {
 uchar *p;
 uchar tagname[10]; uchar * a = tagname;

  void validate(uchar const * pp, uchar const * q){
uchar const * p = pp;
if (a == tagname+4)
  {
 uchar const * x = p;
  }
  }

  while(1){
if(a == tagname)
  goto slip; 
  if (*p == '\"') 
{
  uchar const * const q = ++p; 
  validate(q, p++);
}
}
  slip:
;
}

-- 
   Summary: gcc crashes with -O1 on a call to nested function
   Product: gcc
   Version: tree-ssa
Status: UNCONFIRMED
  Severity: normal
  Priority: P2
 Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: fjahanian at apple dot com
CC: gcc-bugs at gcc dot gnu dot org
 GCC build triplet: apple-ppc-darwin
  GCC host triplet: apple-ppc-darwin
GCC target triplet: apple-ppc-darwin


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21894


[Bug tree-optimization/20256] -ftree-loop-linear doesn't work right in small loop

2005-02-28 Thread fjahanian at apple dot com


-- 
   What|Removed |Added

 CC||dberlin at gcc dot gnu dot
   ||org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20256


[Bug tree-optimization/20256] -ftree-loop-linear doesn't work right in small loop

2005-02-28 Thread fjahanian at apple dot com


-- 
   What|Removed |Added

 CC||dalej at apple dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20256


[Bug tree-optimization/20256] New: -ftree-loop-linear doesn't work right in small loop

2005-02-28 Thread fjahanian at apple dot com
This is a small extract from a benchmark. It shows that -O1 -ftree-loop-linear
generates a couple of empty loops, and incorrect behavior.

/* main.c */
#include 

extern void init();

double mid_wts[8][35];
double  in_pats[1][35];

double  do_mid_forward(int patt)
{
double  sum;
int neurode, i;

for (neurode=0;neurode<8; neurode++)
{
sum = 0.0;
for (i=0; i<35; i++)
{ 
sum += mid_wts[neurode][i]*in_pats[patt][i];
}
sum = 1.0/(1.0+sum);
}
return sum;
}


double value;

main()
{
  init();
  printf(" %e\n", do_mid_forward (0));
}

/* init.c */
extern double mid_wts[8][35];
extern double  in_pats[1][35];

double value;

void init()
{
int i;
int neurode;

value=(double)1.0 - (double) 0.5;

for (neurode = 0; neurode<8; neurode++)
   for (i=0; i<35; i++)
  mid_wts[neurode][i] = value;

for (i=0; i<35; i++)
   in_pats[0][i] = 1.234;
}

% cc -c -O0 init.c
% cc -O1 -ftree-loop-linear main.c init.o
% ./a.out
 -2.384238e+11

Assembly file for ppc-darwin shows a couple of do-nothing empty loops. 
Remove -ftree-loop-linear and program behaves correctly.
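
For reference, the expected result can be worked out from init.c: every weight is 0.5 and every input is 1.234, so each inner loop sums to 35 * 0.5 * 1.234 = 21.595 and the function should return 1/(1 + 21.595), roughly 4.43e-02. A small self-contained sketch of that arithmetic follows; expected_result is a hypothetical helper and not part of the test case.

```c
#include <assert.h>

/* Recomputes what do_mid_forward (0) should return for the data
   set up by init(): 35 products of 0.5 * 1.234, followed by the
   1/(1+sum) step of the last outer iteration. */
static double expected_result (void)
{
  double sum = 0.0;
  int i;

  for (i = 0; i < 35; i++)
    sum += 0.5 * 1.234;
  assert (sum > 0.0);
  return 1.0 / (1.0 + sum);
}
```

A small positive value like this is nowhere near the -2.384238e+11 printed above.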

-- 
   Summary: -ftree-loop-linear  doesn't work right in small loop
   Product: gcc
   Version: 4.0.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P2
 Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
    ReportedBy: fjahanian at apple dot com
CC: gcc-bugs at gcc dot gnu dot org
 GCC build triplet: powerpc-apple-darwin
  GCC host triplet: powerpc-apple-darwin
GCC target triplet: powerpc-apple-darwin


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20256


[Bug tree-optimization/20216] [4.0/4.1 Regression] Simple loop runs out of stack at -O1

2005-02-26 Thread fjahanian at apple dot com

--- Additional Comments From fjahanian at apple dot com  2005-02-27 00:51 ---
(In reply to comment #6)
> The first part of the patch seems fine.
> We should make tree_fold_binomial non-recursive.
You meant tree_fold_factorial? tree_fold_binomial is not recursive as is.

> Note, however, that once you do that, the other part of the patch isn't
> actually doing anything (the change to chrec_apply).
I agree. Checking for 1024 is arbitrary and I did not propose it as a final
solution. I think a better solution would be to compute the factorial of the
array upper bound, as is currently done. If it cannot be evaluated, due to
overflow, chrec_evaluate, which depends on the computation of
tree_fold_binomial, returns chrec_dont_know. In other words, we do this
optimization only when the factorial can be computed. This avoids setting an
arbitrary limit and lets the implementation's limitations decide the
feasibility of this optimization. What do you think of a patch along these
lines?
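
The idea of letting overflow decide feasibility could look roughly like the sketch below. It is written against plain 64-bit integers rather than GCC trees; checked_factorial and checked_binomial are hypothetical names, and a false return stands in for chrec_dont_know.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Iterative n!; returns false instead of overflowing. */
static bool checked_factorial (unsigned int n, uint64_t *out)
{
  uint64_t result = 1;
  unsigned int i;

  for (i = 2; i <= n; i++)
    {
      if (result > UINT64_MAX / i)
        return false;           /* the next multiply would overflow */
      result *= i;
    }
  *out = result;
  return true;
}

/* C(n, k) = n! / (k! * (n-k)!), computed only when n! is
   representable; otherwise the caller is told "don't know". */
static bool checked_binomial (unsigned int n, unsigned int k, uint64_t *out)
{
  uint64_t fn, fk, fnk;

  if (k > n
      || !checked_factorial (n, &fn)
      || !checked_factorial (k, &fk)
      || !checked_factorial (n - k, &fnk))
    return false;
  *out = fn / (fk * fnk);       /* k! * (n-k)! divides n!, so no overflow */
  return true;
}
```

With this shape no explicit limit such as 1024 is needed: a request like checked_binomial (160, 2, ...) simply reports failure.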

> 
> Then all the memory usage comes from fold (all 600 meg of memory usage, i
> mean) creating new trees.
> It also doesn't recurse in that case.
> 
> In any case, limiting the input to chrec_apply to <1024 is uh, wrong, as it's
> not really fixing anything.
> 

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20216


[Bug tree-optimization/20216] [4.0/4.1 Regression] Simple loop runs out of stack at -O1

2005-02-25 Thread fjahanian at apple dot com

--- Additional Comments From fjahanian at apple dot com  2005-02-25 21:32 ---
Created an attachment (id=8286)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=8286&action=view)
A proposed patch to fix this

Note that the patch I attached is against the apple-ppc-branch. So it may not
apply to the mainline as is.

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20216


[Bug tree-optimization/20216] New: Simple loop runs out of stack at -O1

2005-02-25 Thread fjahanian at apple dot com
The following test case runs out of stack space when gcc tries to compute
factorial (159). I have a patch which does two simple things: 1) it rewrites
the tree_fold_factorial function into a non-recursive version, and 2) it sets
a limit before deciding to call chrec_evaluate. This limit is arbitrary in
this patch. The author of the algorithm may want to decide when to stop
evaluating the feasibility of this optimization. Note that even with this
limit, the computed factorial overflows. So an even smaller limit is needed
if this value is significant.
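
To put the overflow remark in perspective, here is a short sketch that finds the largest n whose factorial still fits below a given maximum. The helper name largest_exact_factorial is hypothetical, and plain unsigned integers stand in for whatever representation GCC uses internally.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the largest n such that n! <= max_value.
   Assumes max_value >= 1.  The loop is iterative, so its stack
   use does not grow with n. */
static unsigned int largest_exact_factorial (uint64_t max_value)
{
  uint64_t result = 1;          /* holds n! for the current n */
  unsigned int n = 1;

  assert (max_value >= 1);
  /* Stop as soon as multiplying by n + 1 would exceed max_value. */
  while (result <= max_value / (n + 1))
    {
      n++;
      result *= n;
    }
  return n;
}
```

For an unsigned 32-bit range this stops at 12 and for an unsigned 64-bit range at 20, both far below the 159 needed by the test case.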

/* bad.c */
static unsigned int *buffer;

void FUNC (void)
{
 unsigned int *base;
 int i, j;

 for (i = 0; i < 4; i++)
  for (j = 0; j < 160; j++)
   *base++ = buffer[j];
}

% mygccm5 -c -O1 bad.c
Out of stack space.
Try running 'limit stacksize unlimited' in the shell to raise its limit.

-- 
   Summary: Simple loop runs out of stack at -O1
   Product: gcc
   Version: 4.0.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P2
 Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: fjahanian at apple dot com
CC: gcc-bugs at gcc dot gnu dot org
 GCC build triplet: powerpc-apple-darwin
  GCC host triplet: powerpc-apple-darwin
GCC target triplet: powerpc-apple-darwin


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20216


[Bug target/18118] bad code gen for -mcpu=G5 and unsigned long long to double

2005-01-17 Thread fjahanian at apple dot com

--- Additional Comments From fjahanian at apple dot com  2005-01-17 16:49 ---
On apple-ppc-branch, -mcpu=G5 is all you need to reproduce the problem. But I
noticed that this bug is no longer reproducible with the FSF mainline. So this
bug has been fixed as far as I am concerned. I just need to investigate which
patch fixed this in mainline.

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18118


[Bug target/18916] [4.0 Regression] mis-aligned vector code with copy memory (-maltivec)

2004-12-29 Thread fjahanian at apple dot com

--- Additional Comments From fjahanian at apple dot com  2004-12-29 17:34 ---

(In reply to comment #8)
> Why can't we make sure that temporaries which should be aligned to 128 bits
> are actually aligned to 128 bits?  Surely failing to do so will cause other
> problems.

Yes, this is the best way of fixing this problem, as long as it does not break
ABI conformance in some obscure way. My last posted patch took the approach of
setting the alignment of the stack temporaries to what they really were. This
worked, but it also turned off the vector move insns for such temporaries. I
will look at forcing the 128-bit alignment next year.

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18916


[Bug target/18916] [4.0 Regression] mis-aligned vector code with copy memory (-maltivec)

2004-12-20 Thread fjahanian at apple dot com

--- Additional Comments From fjahanian at apple dot com  2004-12-21 01:25 ---
My last patch also had problems, in that it changed the alignment of local
vector variables on the stack. This alignment cannot be changed because AltiVec
intrinsics expect 128-bit alignment. So I conclude that only temporaries with
an expected alignment of 128 bits or more are not aligned properly. The safest
fix would be to simply change the alignment at the rtl level when temporaries
of 128-bit alignment need to be generated. This requires a change to the middle
end. The following patch shows the concept and is not an FSF-ready patch (which
would require a target hook or some such). This patch essentially says that if
128-bit alignment of local temporaries on the stack cannot be guaranteed (or
changed), then set the alignment value in the rtl to what can be guaranteed.
With this patch, emit_block_move will not generate lvx/stvx for these cases.

Index: expr.c
===

RCS file: /cvs/gcc/gcc/gcc/expr.c,v
retrieving revision 1.761
diff -c -p -r1.761 expr.c
*** expr.c  18 Dec 2004 14:38:31 -  1.761
--- expr.c  21 Dec 2004 01:23:27 -
*** emit_push_insn (rtx x, enum machine_mode
*** 3457,3463 
 to record the alignment of the stack slot.  */
  /* ALIGN may well be better aligned than TYPE, e.g. due to
 PARM_BOUNDARY.  Assume the caller isn't lying.  */
! set_mem_align (target, align);
  
  emit_block_move (target, xinner, size, BLOCK_OP_CALL_PARM);
}
--- 3457,3469 
 to record the alignment of the stack slot.  */
  /* ALIGN may well be better aligned than TYPE, e.g. due to
 PARM_BOUNDARY.  Assume the caller isn't lying.  */
!   /* powerpc-darwin currently does not enforce 128 bit alignment of 
!  temporaries on the stack. To do so, requires changes which will break
!  ABI compatibility. On the other hand, Leaving this unchanged generates 
!  incorrect code in cases where block move is implemented using
!  AltiVec instructions whose src and dest must be 128 bit aligned
!  (expand_block_move implementation in rs6000.c). */ 
!   set_mem_align (target, align >= 128 ? PARM_BOUNDARY : align);
  
  emit_block_move (target, xinner, size, BLOCK_OP_CALL_PARM);
}
*** store_expr (tree exp, rtx target, int ca
*** 4206,4214 
emit_group_load (target, temp, TREE_TYPE (exp),
 int_size_in_bytes (TREE_TYPE (exp)));
else if (GET_MODE (temp) == BLKmode)
!   emit_block_move (target, temp, expr_size (exp),
!(call_param_p
! ? BLOCK_OP_CALL_PARM : BLOCK_OP_NORMAL));
else
{
  temp = force_operand (temp, target);
--- 4212,4224 
emit_group_load (target, temp, TREE_TYPE (exp),
 int_size_in_bytes (TREE_TYPE (exp)));
else if (GET_MODE (temp) == BLKmode)
! {
!   /* See previous comment. */
!   set_mem_align (temp, MEM_ALIGN (temp) >= 128 ? PARM_BOUNDARY : MEM_ALIGN (temp));
!   emit_block_move (target, temp, expr_size (exp),
!(call_param_p
! ? BLOCK_OP_CALL_PARM : BLOCK_OP_NORMAL));
! }
else
{
  temp = force_operand (temp, target);




(In reply to comment #6)
> And this is the patch that I had in mind. Can this break ABI compatibility? My
> limited testing shows that it does not.
> 
> Index: rs6000.c
> 
===
> 
> RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.c,v
> retrieving revision 1.332.2.46.2.84
> diff -c -p -r1.332.2.46.2.84 rs6000.c
> *** rs6000.c16 Dec 2004 03:23:30 -  1.332.2.46.2.84
> --- rs6000.c18 Dec 2004 01:44:28 -
> *** function_arg_boundary (enum machine_mode
> *** 5190,5195 
> --- 5190,5201 
>|| (type && TREE_CODE (type) == VECTOR_TYPE
>&& int_size_in_bytes (type) >= 16))
>   return 128;
> +   else if (DEFAULT_ABI == ABI_DARWIN && mode == BLKmode
> +  && TYPE_ALIGN (type) >= 128)
> + {
> +   TYPE_ALIGN (type) = PARM_BOUNDARY;
> +   return PARM_BOUNDARY;
> + }
> else
>   return PARM_BOUNDARY;
>   }
> 

-- 
   What|Removed |Added

 CC||dalej at apple dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18916


[Bug target/18916] [4.0 Regression] mis-aligned vector code with copy memory (-maltivec)

2004-12-17 Thread fjahanian at apple dot com

--- Additional Comments From fjahanian at apple dot com  2004-12-18 01:46 ---
And this is the patch that I had in mind. Can this break ABI compatibility? My
limited testing shows that it does not.

Index: rs6000.c
===

RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.c,v
retrieving revision 1.332.2.46.2.84
diff -c -p -r1.332.2.46.2.84 rs6000.c
*** rs6000.c16 Dec 2004 03:23:30 -  1.332.2.46.2.84
--- rs6000.c18 Dec 2004 01:44:28 -
*** function_arg_boundary (enum machine_mode
*** 5190,5195 
--- 5190,5201 
   || (type && TREE_CODE (type) == VECTOR_TYPE
   && int_size_in_bytes (type) >= 16))
  return 128;
+   else if (DEFAULT_ABI == ABI_DARWIN && mode == BLKmode
+  && TYPE_ALIGN (type) >= 128)
+ {
+   TYPE_ALIGN (type) = PARM_BOUNDARY;
+   return PARM_BOUNDARY;
+ }
else
  return PARM_BOUNDARY;
  }

(In reply to comment #5)
> The following patch fixes the alignment problem. But it cannot be applied
> because it breaks ABI compatibility.
> 
> A possible solution is to relax the alignment of the type in question (with
> alignment of 128) to that of PARM_BOUNDARY (32). This will not (should not?)
> break ABI compatibility (because it is currently on PARM_BOUNDARY). But it
> will prevent vector code from being generated (which is the cause of the abort).
> Comments are most welcome.
> 
> 
> Index: rs6000.c
> 
===
> 
> RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.c,v
> retrieving revision 1.332.2.46.2.84
> diff -c -p -r1.332.2.46.2.84 rs6000.c
> *** rs6000.c16 Dec 2004 03:23:30 -  1.332.2.46.2.84
> --- rs6000.c18 Dec 2004 00:20:54 -
> *** function_arg_boundary (enum machine_mode
> *** 5190,5195 
> --- 5190,5197 
>|| (type && TREE_CODE (type) == VECTOR_TYPE
>&& int_size_in_bytes (type) >= 16))
>   return 128;
> +   else if (DEFAULT_ABI == ABI_DARWIN && mode == BLKmode)
> + return MAX (TYPE_ALIGN (type), PARM_BOUNDARY);
> else
>   return PARM_BOUNDARY;
>   }



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18916


[Bug target/18916] [4.0 Regression] mis-aligned vector code with copy memory (-maltivec)

2004-12-17 Thread fjahanian at apple dot com

--- Additional Comments From fjahanian at apple dot com  2004-12-18 00:43 ---
The following patch fixes the alignment problem. But it cannot be applied
because it breaks ABI compatibility.

A possible solution is to relax the alignment of the type in question (with
alignment of 128) to that of PARM_BOUNDARY (32). This will not (should not?)
break ABI compatibility (because it is currently on PARM_BOUNDARY). But it
will prevent vector code from being generated (which is the cause of the abort).
Comments are most welcome.


Index: rs6000.c
===

RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.c,v
retrieving revision 1.332.2.46.2.84
diff -c -p -r1.332.2.46.2.84 rs6000.c
*** rs6000.c16 Dec 2004 03:23:30 -  1.332.2.46.2.84
--- rs6000.c18 Dec 2004 00:20:54 -
*** function_arg_boundary (enum machine_mode
*** 5190,5195 
--- 5190,5197 
   || (type && TREE_CODE (type) == VECTOR_TYPE
   && int_size_in_bytes (type) >= 16))
  return 128;
+   else if (DEFAULT_ABI == ABI_DARWIN && mode == BLKmode)
+ return MAX (TYPE_ALIGN (type), PARM_BOUNDARY);
else
  return PARM_BOUNDARY;
  }

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18916


[Bug tree-optimization/18792] ICE with -O1 -ftree-loop-linear on small test case

2004-12-17 Thread fjahanian at apple dot com

--- Additional Comments From fjahanian at apple dot com  2004-12-17 19:40 ---
Why hasn't there been a resolution of this PR? It seems that all issues,
including elimination of loop numbers, etc., have been taken care of. Thanks.

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18792


[Bug target/18916] vector code is generated to copy data to mis-aligned memory (-mcpu=G5)

2004-12-09 Thread fjahanian at apple dot com

--- Additional Comments From fjahanian at apple dot com  2004-12-10 01:42 ---
AFAICT, the gcc middle-end cannot force correct parameter alignment when the
alignment is stricter than PARM_BOUNDARY. There is no code to do so (I am
looking at store_one_arg, which is the one responsible for determining the
alignment). It does set the MEM_ALIGN field to 128 in this case, but there is
no extra padding to move the target address to the next 128-bit boundary.
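
The missing padding step would amount to rounding the slot address up to the next 16-byte (128-bit) boundary. A minimal sketch on plain integers; align_up and is_aligned are hypothetical helpers, not GCC functions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Round addr up to the next multiple of align.
   align must be a nonzero power of two. */
static uintptr_t align_up (uintptr_t addr, uintptr_t align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  return (addr + align - 1) & ~(align - 1);
}

/* True if addr is a multiple of align (a power of two). */
static bool is_aligned (uintptr_t addr, uintptr_t align)
{
  return (addr & (align - 1)) == 0;
}
```

A slot at 0x1004 is aligned to 4 bytes but not to 16; align_up (0x1004, 16) gives 0x1010, which costs 12 bytes of padding.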

 

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18916


[Bug target/18916] vector code is generated to copy data to mis-aligned memory (-mcpu=G5)

2004-12-09 Thread fjahanian at apple dot com


-- 
   What|Removed |Added

 CC||dje at watson dot ibm dot
   ||com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18916


[Bug target/18916] New: vector code is generated to copy data to mis-aligned memory (-mcpu=G5)

2004-12-09 Thread fjahanian at apple dot com
The following test case, compiled with -mcpu=G5, aborts.
It aborts because, in passing the 32-byte argument (g1sScld1) to the
testvaScld1 routine, gcc allocates a temporary on the stack for the purpose of
storing g1sScld1 and then loading it into GPRs. Recently, rs6000.c was
modified in routine expand_block_move to do lvx/stvx when the alignment of src
and destination is 128 bits. But in the case of temporaries allocated on the
stack, the target alignment is not correct. It is true that we set the
MEM_ALIGN of the target temporary to 128 bits, but that comes from the
alignment of the source, which is a user variable and has 128-bit alignment.

So, in the given test case, routine expand_block_move generates an stvx to a
temporary stack location which is misaligned, and bad things happen.

extern void abort (void);

typedef __builtin_va_list __gnuc_va_list;
typedef __gnuc_va_list va_list;

typedef struct { _Complex long double a; } Scld1;

void testvaScld1 (int n, ...)
{   
  va_list ap;

   __builtin_va_start(ap,n);

   Scld1 t = __builtin_va_arg(ap,Scld1);

   if (t.a != (_Complex long double)1)
 abort();

   __builtin_va_end(ap);
}

int main ()
{
  Scld1 g1sScld1;
  g1sScld1.a = (_Complex long double)1;
  testvaScld1 (1, g1sScld1);
  return 0;
}
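
As background on why a misaligned slot is harmful here: the AltiVec lvx/stvx instructions ignore the low four bits of the effective address, so the access silently goes to the enclosing 16-byte block rather than to the requested address. A sketch of that address computation on plain integers; vector_effective_address is a hypothetical name used only for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Models how lvx/stvx form their address: the low four bits
   are dropped, so the access always starts on a 16-byte boundary. */
static uintptr_t vector_effective_address (uintptr_t addr)
{
  uintptr_t ea = addr & ~(uintptr_t) 15;

  assert ((ea & 15) == 0);
  return ea;
}
```

A store aimed at a temporary at 0x1008 would therefore write the 16 bytes starting at 0x1000, which is not where the later scalar loads read from.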

-- 
   Summary: vector code is generated to copy data to mis-aligned
memory (-mcpu=G5)
   Product: gcc
   Version: 4.0.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P2
 Component: target
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: fjahanian at apple dot com
CC: gcc-bugs at gcc dot gnu dot org
 GCC build triplet: apple-ppc-darwin
  GCC host triplet: apple-ppc-darwin
GCC target triplet: apple-ppc-darwin


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18916


[Bug tree-optimization/18792] ICE with -O1 -ftree-loop-linear on small test case

2004-12-07 Thread fjahanian at apple dot com

--- Additional Comments From fjahanian at apple dot com  2004-12-07 23:04 ---
I agree that the bug is before the linear loop transform. Make a slight,
non-CFG change to the test case and the loop_nbr values come out different
(and sequential in the nesting). Somehow, changing the first loop condition
makes a big difference!

void put_atoms_in_triclinic_unitcell(int i, float x[1][3])
{
  int d;

  while (i < 0)
   for (d=0; d<=3; d++)
  x[i][d] = 0;

  while (x[i][3] >= 0)
   for (d=0; d<=3; d++)
  x[i][d] = 0;

}



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18792


[Bug tree-optimization/18792] ICE with -O1 -ftree-loop-linear on small test case

2004-12-07 Thread fjahanian at apple dot com

--- Additional Comments From fjahanian at apple dot com  2004-12-07 22:37 ---
Zdenek,

Could you take a look at this? 

-- 
   What|Removed |Added

 CC||rakdver at atrey dot karlin
   ||dot mff dot cuni dot cz


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18792


[Bug middle-end/18641] [4.0 Regression] Another ICE caused by reload of a psuedo reg into f0 for a DImode expr

2004-12-06 Thread fjahanian at apple dot com

--- Additional Comments From fjahanian at apple dot com  2004-12-06 23:32 ---
David's patch (including the darwin.h patch attached here) was successfully
bootstrapped and dejagnu tested on apple-ppc-darwin. Please apply the patch to
mainline.

Index: darwin.h
===

RCS file: /cvs/gcc/gcc/gcc/config/rs6000/darwin.h,v
retrieving revision 1.72
diff -c -p -r1.72 darwin.h
*** darwin.h27 Nov 2004 22:45:22 -  1.72
--- darwin.h6 Dec 2004 17:56:34 -
*** do {
\
*** 344,351 
  
  #undef PREFERRED_RELOAD_CLASS
  #define PREFERRED_RELOAD_CLASS(X,CLASS)   \
!   ((GET_CODE (X) == CONST_DOUBLE  \
! && GET_MODE_CLASS (GET_MODE (X)) == MODE_FLOAT)   \
 ? NO_REGS  \
 : ((GET_CODE (X) == SYMBOL_REF || GET_CODE (X) == HIGH)\
&& reg_class_subset_p (BASE_REGS, (CLASS))) \
--- 344,351 
  
  #undef PREFERRED_RELOAD_CLASS
  #define PREFERRED_RELOAD_CLASS(X,CLASS)   \
!   ((CONSTANT_P (X)\
!   && reg_classes_intersect_p ((CLASS), FLOAT_REGS))\
 ? NO_REGS  \
 : ((GET_CODE (X) == SYMBOL_REF || GET_CODE (X) == HIGH)\
&& reg_class_subset_p (BASE_REGS, (CLASS))) \

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18641


[Bug middle-end/18641] [4.0 Regression] Another ICE caused by reload of a psuedo reg into f0 for a DImode expr

2004-12-06 Thread fjahanian at apple dot com

--- Additional Comments From fjahanian at apple dot com  2004-12-06 17:55 ---
I applied the patch to fsf-mainline (including darwin.h) and it worked for me.
I will do the bootstrap, dejagnu testing and let you know how it went. - Thanks.

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18641


[Bug tree-optimization/18792] New: ICE with -O1 -ftree-loop-linear on small test case

2004-12-02 Thread fjahanian at apple dot com
The following small test case, extracted from a SPEC2004 benchmark, ICEs
when compiled with gcc-4.0 and -O1 -ftree-loop-linear:

/* test */
void put_atoms_in_triclinic_unitcell(float x[][3])
{
  int i=0,d;

  while (x[i][3] < 0)
   for (d=0; d<=3; d++)
  x[i][d] = 0;

  while (x[i][3] >= 0)
   for (d=0; d<=3; d++)
  x[i][d] = 0;

}


% gcc-4.0 -c bad.c -O1 -ftree-loop-linear
bad.c: In function 'put_atoms_in_triclinic_unitcell':
bad.c:2: internal compiler error: in build_classic_dist_vector, at tree-data-ref.c:1871
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://gcc.gnu.org/bugs.html> for instructions.
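
A side observation on the test case itself: the inner dimension of x is 3,
yet both loops run d up to 3 and both loop conditions read x[i][3], which is
strictly past the end of row i. In terms of plain address arithmetic that
element coincides with x[i+1][0]. Whether this has anything to do with the
ICE is not established in this report; the sketch below only illustrates the
index arithmetic, and flat_offset and ROW_LEN are names made up for it.

```c
#include <stddef.h>

enum { ROW_LEN = 3 };  /* inner dimension of "float x[][3]" (illustrative) */

/* Flat element offset used when addressing x[i][d] in row-major layout.
   This is a hypothetical helper, not part of the test case. */
static size_t flat_offset(size_t i, size_t d)
{
    return i * ROW_LEN + d;
}
```

With this helper, flat_offset(0, 3) and flat_offset(1, 0) are the same
offset, which is why an index of 3 in the inner dimension overlaps the next
row.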

-- 
   Summary: ICE with -O1 -ftree-loop-linear on small test case
   Product: gcc
   Version: 4.0.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P2
 Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
    ReportedBy: fjahanian at apple dot com
CC: gcc-bugs at gcc dot gnu dot org
 GCC build triplet: apple-ppc-darwin
  GCC host triplet: apple-ppc-darwin
GCC target triplet: apple-ppc-darwin


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18792


[Bug middle-end/18641] [4.0 Regression] Another ICE caused by reload of a psuedo reg into f0 for a DImode expr

2004-12-01 Thread fjahanian at apple dot com

--- Additional Comments From fjahanian at apple dot com  2004-12-01 22:07 ---
Regardless of how we fix this specific problem (by reverting Ulrich's patch
to find_reloads_address, making the small change he proposed in find_reloads,
or something else), the problem remains each time a 64-bit integer constant
is loaded into an FPR. This is a chronic problem which we need to address.
Now is as good a time as ever. Being a newbie in this area, please bear with
me.

I see a couple of solutions:

1) Do not use an FPR for 64-bit integer constants. This is indeed what
happens when reverting Ulrich's patch. What benefit do we gain by allowing
use of an FPR for these cases? Don't we always need to eventually load the
constant into a pair of GPRs, going through memory first? What are the cases
where using an FPR is beneficial? (Reducing register pressure is one answer,
but then we still need to go to memory and back to GPRs for any useful
operation.)

2) Handle this special case in the splitter which is used. But this requires
going to memory. Can this be done in the splitter? This seems to be the
better solution if the use of FPRs described in 1) cannot be disallowed.
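
To make the "pair of GPRs" in 1) concrete: on a 32-bit target a 64-bit
integer constant is held as two 32-bit words. The sketch below is plain C for
illustration only, not GCC code; gpr_pair and split_di_constant are
hypothetical names.

```c
#include <stdint.h>

/* Hypothetical model of a 64-bit value held in two 32-bit registers. */
struct gpr_pair {
    uint32_t hi;  /* most significant word */
    uint32_t lo;  /* least significant word */
};

/* Split a 64-bit constant into its high and low 32-bit halves. */
static struct gpr_pair split_di_constant(uint64_t value)
{
    struct gpr_pair p;
    p.hi = (uint32_t)(value >> 32);
    p.lo = (uint32_t)(value & UINT32_C(0xffffffff));
    return p;
}
```

For a constant such as 4294967295 the high word is 0 and the low word is
0xffffffff, so no floating-point register is needed to materialize it.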

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18641


[Bug target/18118] bad code gen for -mcpu=G5 and unsigned long long to double

2004-11-29 Thread fjahanian at apple dot com

--- Additional Comments From fjahanian at apple dot com  2004-11-29 17:15 ---
This patch doesn't fix the problem I reported on apple-ppc-darwin.

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18118


[Bug target/18641] New: Another ICE caused by reload of a psuedo reg into f0 for a DImode expr

2004-11-23 Thread fjahanian at apple dot com
This is similar to PR/152866. In the following test case compiled with -O0,
gcc-4.0 produces the following pattern in the reload phase:

(insn 68 47 67 7 (set (reg:DI 32 f0)
(const_int 4294967295 [0xffffffff])) 354 {*movdi_internal32} (nil)
(nil))

This pattern causes an ICE in gen_reg_rtx.

This is the usual problem. Reload decides to use a float register for a
'long long' expression (a constant in this case) because this is legitimate
for powerpc. But the ppc patterns cannot handle it.

/* Test case */
void crc()
{
int  toread;
long long nleft;
unsigned char buf[(128 * 1024)];

nleft = 0;
while (toread = (nleft < (2147483647 * 2U + 1U)) ? nleft: (2147483647 * 2U + 1U) )
;
}
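
For readers wondering where the 64-bit constant comes from: with a 32-bit
int, (2147483647 * 2U + 1U) evaluates to 4294967295 as an unsigned int, and
because nleft is 'long long' the comparison and the conditional expression
are carried out in 'long long'. The sketch below restates the expression as a
standalone function; it assumes 32-bit int and 64-bit long long, and
clamp_to_bound is a made-up name.

```c
/* Same expression as in the test case, returned instead of assigned.
   Assumes int is 32 bits wide, so the bound is 4294967295. */
static long long clamp_to_bound(long long nleft)
{
    return (nleft < (2147483647 * 2U + 1U)) ? nleft : (2147483647 * 2U + 1U);
}
```

Under those assumptions, values below the bound are returned unchanged and
anything larger yields 4294967295, which does not fit in a 32-bit signed int.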

-- 
   Summary: Another ICE caused by reload of a psuedo reg into f0 for
a DImode expr
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P2
 Component: target
AssignedTo: uweigand at de dot ibm dot com
    ReportedBy: fjahanian at apple dot com
CC: dje at gcc dot gnu dot org,gcc-bugs at gcc dot gnu dot
org
 GCC build triplet: powerpc-apple-darwin7.0.0
  GCC host triplet: powerpc-apple-darwin7.0.0
GCC target triplet: powerpc-apple-darwin7.0.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18641


[Bug middle-end/16266] [4.0 regression] gcc.dg/c99-intconst-1.c compilation is very slow

2004-11-17 Thread fjahanian at apple dot com

--- Additional Comments From fjahanian at apple dot com  2004-11-17 18:02 ---
The following patch has broken many dejagnu tests on apple-ppc-darwin with
-mcpu=G5.

http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/function.c.diff?cvsroot=gcc&r1=1.581&r2=1.582

  FAIL: gcc.c-torture/execute/20041011-1.c compilation,  -O1
  FAIL: gcc.c-torture/execute/20041011-1.c compilation,  -O2
  FAIL: gcc.c-torture/execute/950612-1.c compilation,  -O1
  FAIL: gcc.c-torture/execute/950612-1.c compilation,  -O2
  FAIL: gcc.c-torture/execute/950612-1.c compilation,  -Os
  FAIL: gcc.c-torture/execute/ashldi-1.c compilation,  -O1
  FAIL: gcc.c-torture/execute/ashrdi-1.c compilation,  -O1
  FAIL: gcc.c-torture/execute/ashrdi-1.c compilation,  -O2
  FAIL: gcc.c-torture/execute/ashrdi-1.c compilation,  -O3 -fomit-frame-pointer
  FAIL: gcc.c-torture/execute/ashrdi-1.c compilation,  -O3 -fomit-frame-pointer -funroll-loops
  FAIL: gcc.c-torture/execute/ashrdi-1.c compilation,  -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions
  FAIL: gcc.c-torture/execute/ashrdi-1.c compilation,  -O3 -g
  FAIL: gcc.c-torture/execute/ashrdi-1.c compilation,  -Os
  FAIL: gcc.c-torture/execute/lshrdi-1.c compilation,  -O1

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16266


[Bug target/15286] ICE cause by reload

2004-10-26 Thread fjahanian at apple dot com

--- Additional Comments From fjahanian at apple dot com  2004-10-26 15:17 ---
I tested the patch on apple-ppc-darwin; bootstrapped and dejagnu tested (with
and without -mcpu=G5). There were no regressions. This is an important bug
for us. We have had four separate reports of this bug. It also happens in
SPEC2004.

- Thanks.

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15286


[Bug target/15286] ICE cause by reload

2004-10-25 Thread fjahanian at apple dot com

--- Additional Comments From fjahanian at apple dot com  2004-10-25 23:58 ---
I tried the last patch, and for the following statements built with -O2
-mcpu=G5 (Apple's mixed mode) I get the following instruction sequences. They
look OK to me. But David's case might be different from what I am looking at:


clock_start=(((double)clock())/((double)(100)));


bl L_clock$stub
rldicl r3,r3,0,32
lha r0,232(r29)
addis r2,r31,ha16(LC40-"L008$pb")
std r3,456(r1)
lfd f12,lo16(LC40-"L008$pb")(r2)
cmpwi cr7,r0,0
nop
lfd f13,456(r1)
fcfid f0,f13
fdiv f0,f0,f12
fctidz f0,f0
stfd f0,528(r1)
nop
nop
nop
ld r19,528(r1)
ble cr7,L147
 ...

ti+=((((double)clock())/((double)(100)))-clock_start);

L186:
bl L_clock$stub
rldicl r3,r3,0,32
rldicl r2,r19,0,32
std r3,464(r1)
std r2,472(r1)
addis r2,r31,ha16(LC40-"L008$pb")
lfd f0,464(r1)
lfd f13,472(r1)
lwz r0,816(r30)
cmpwi cr7,r0,0
fcfid f12,f0
lfd f0,lo16(LC40-"L008$pb")(r2)
fcfid f11,f13
addis r2,r31,ha16(LC39-"L008$pb")
fdiv f12,f12,f0
lfd f0,lo16(LC39-"L008$pb")(r2)
fsub f12,f12,f11
...



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15286


[Bug target/15286] ICE cause by reload

2004-10-25 Thread fjahanian at apple dot com

--- Additional Comments From fjahanian at apple dot com  2004-10-25 21:14 ---
By mistake, I applied the test for !reload_completed to your earlier patch
(which was wrong). In any case, after correcting the patch and with your
latest patch, all my test cases passed. Now I need to do a complete bootstrap
with -mcpu=G5 on apple-ppc-darwin and will let you know how it goes. Thanks.

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15286


[Bug target/15286] ICE cause by reload

2004-10-25 Thread fjahanian at apple dot com

--- Additional Comments From fjahanian at apple dot com  2004-10-25 20:58 ---
You need to replace GET_MODE_SIZE (x) with GET_MODE_SIZE (GET_MODE (x)), etc.
for a clean compile. But as I mentioned in my last comment, I still get the
ICE with or without this patch (along with the previous patch) in all the
test cases that I tried.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15286


[Bug target/15286] ICE cause by reload

2004-10-25 Thread fjahanian at apple dot com

--- Additional Comments From fjahanian at apple dot com  2004-10-25 19:12 ---
You referred to them as 'both patches' in comment #21.

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15286


[Bug target/15286] ICE cause by reload

2004-10-25 Thread fjahanian at apple dot com

--- Additional Comments From fjahanian at apple dot com  2004-10-25 18:39 ---
I applied the last two patches, but it didn't help:

% mygccf -O2 -mcpu=G5 -c loader_obj.i
loader_obj.c: In function 'load_obj':
loader_obj.c:92: error: unrecognizable insn:
(insn 1395 601 1396 50 (set (subreg:DI (mem:SI (plus:SI (reg/f:SI 1 r1)
(const_int 716 [0x2cc])) [0 allocednf+0 S4 A8]) 0)
(reg:DI 32 f0)) -1 (nil)
(nil))
loader_obj.c:92: internal compiler error: in extract_insn, at recog.c:2034
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://developer.apple.com/bugreporter> for instructions.

Just to be clear, this is the patch I applied.

Index: simplify-rtx.c
===

RCS file: /cvs/gcc/gcc/gcc/simplify-rtx.c,v
retrieving revision 1.107.2.31.2.9
diff -c -p -r1.107.2.31.2.9 simplify-rtx.c
*** simplify-rtx.c  16 Oct 2004 00:06:42 -  1.107.2.31.2.9
--- simplify-rtx.c  25 Oct 2004 18:38:20 -
*** simplify_gen_subreg (enum machine_mode o
*** 3800,3806 
if (newx)
  return newx;
  
!   if (GET_CODE (op) == SUBREG || GET_MODE (op) == VOIDmode)
  return NULL_RTX;
  
return gen_rtx_SUBREG (outermode, op, byte);
--- 3800,3808 
if (newx)
  return newx;
  
!   if ((GET_CODE (op) == SUBREG || GET_MODE (op) == VOIDmode
!|| (REG_P (op) && REGNO (op) < FIRST_PSEUDO_REGISTER))
!  && !reload_completed)
  return NULL_RTX;
  
return gen_rtx_SUBREG (outermode, op, byte);

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15286


[Bug target/18118] New: bad code gen for -mcpu=G5

2004-10-22 Thread fjahanian at apple dot com
The following test case, extracted from rbug.c of dejagnu, fails on
apple-ppc-darwin when -mcpu=G5 is specified.

double s (unsigned long long k)
{
  return (float)k;
}

extern void abort();

main ()
{
  unsigned long long int k;
  double x;

  k = 0x82345081ULL;
  x = s (k);
  k = (unsigned long long) x;
  if (k != 0x82345100ULL)
    abort();

  return 0;
}
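
The expected value 0x82345100 follows from the (float) cast in s(): assuming
float is IEEE 754 single precision with a 24-bit significand and
round-to-nearest-even, 0x82345081 must be rounded to 24 significant bits. The
sketch below reproduces that rounding with integer arithmetic only, so it
does not depend on the code generation under test. round_to_24_bits is a
hypothetical helper, and values that round up to 2^64 are not handled.

```c
#include <stdint.h>

/* Round v to 24 significant bits, ties to even, using integers only.
   Illustrative helper; inputs that would round up to 2^64 overflow. */
static uint64_t round_to_24_bits(uint64_t v)
{
    int msb = -1;
    for (int b = 63; b >= 0; b--) {
        if ((v >> b) & 1) {
            msb = b;
            break;
        }
    }
    if (msb < 24)
        return v;                     /* already exactly representable */

    int shift = msb - 23;             /* number of low bits to discard */
    uint64_t kept = v >> shift;
    uint64_t rem = v & ((UINT64_C(1) << shift) - 1);
    uint64_t half = UINT64_C(1) << (shift - 1);

    if (rem > half || (rem == half && (kept & 1)))
        kept++;                       /* round up, or break a tie to even */
    return kept << shift;
}
```

For 0x82345081 the discarded low byte is 0x81, which is above the halfway
point 0x80, so the kept bits 0x823450 are incremented and the result is
0x82345100, matching the constant the test case compares against.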

-- 
   Summary: bad code gen for -mcpu=G5
   Product: gcc
   Version: 4.0.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P2
 Component: target
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: fjahanian at apple dot com
CC: gcc-bugs at gcc dot gnu dot org
 GCC build triplet: apple-ppc-darwin
  GCC host triplet: apple-ppc-darwin
GCC target triplet: apple-ppc-darwin


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18118


[Bug tree-optimization/17892] [4.0 Regression] gcc-4.0 should not reassociate floating point add or multiplication

2004-10-12 Thread fjahanian at apple dot com

--- Additional Comments From fjahanian at apple dot com  2004-10-12 20:57 ---
tree-outof-ssa.c is not part of this patch. I accidentally checked it in. I
have since backed it out.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17892


[Bug tree-optimization/17955] New: Perform associative optimization when it is safe

2004-10-12 Thread fjahanian at apple dot com
PR/17892 was filed because gcc-4.0 performs an unsafe optimization of (X*C)*C
into X*(C*C). The fix for that PR also prevents certain safe transformations,
such as X*2.0*2.0->X*4.0, from taking place. This PR is to track this
enhancement.

-- 
   Summary: Perform associative optimization when it is safe
   Product: gcc
   Version: 4.0.0
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: tree-optimization
AssignedTo: roger at eyesopen dot com
ReportedBy: fjahanian at apple dot com
CC: gcc-bugs at gcc dot gnu dot org
 GCC build triplet: apple-ppc-darwin
  GCC host triplet: apple-ppc-darwin
GCC target triplet: apple-ppc-darwin


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17955


[Bug tree-optimization/17884] [4.0 Regression] asm 'volatile' is not honored as documented

2004-10-08 Thread fjahanian at apple dot com

--- Additional Comments From fjahanian at apple dot com  2004-10-08 16:23 ---
But this is a regression from gcc-3.3. Also, without this patch, there is no
other place which checks for the volatility of an 'asm' statement. Then why
not just say in the documentation that 'volatile' has no effect on an 'asm'?
BTW, thanks for preparing the patch for me.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17884


[Bug tree-optimization/17892] New: gcc-4.0 should not reassociate floating point add or multiplication

2004-10-08 Thread fjahanian at apple dot com
In the following code the repeated multiplication is folded into a single
operation (multiplication by Infinity). For different values of "x" this
leads to undeserved or absent floating point exceptions, and breaks some of
the elementary math functions in Libm. It occurs at optimization level -O1
and higher.

static const double C = 0x1.0p1023;

double foo(double x)
{
return ( ( (x * C) * C ) * C );
}
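
To see the difference numerically, the sketch below evaluates the product for
one factor pair both ways: left to right as written, and with the constants
multiplied first. It assumes IEEE 754 doubles with subnormal support; the
function names are made up, and volatile is used only to keep the compiler
from rearranging the operations.

```c
static const double C = 0x1.0p1023;

/* (x * C) * C, evaluated strictly in source order. */
static double mul_as_written(double x)
{
    volatile double t = x * C;
    t = t * C;
    return t;
}

/* x * (C * C): the constant product overflows to +Infinity first. */
static double mul_reassociated(double x)
{
    volatile double cc = C * C;
    volatile double r = x * cc;
    return r;
}
```

For x = 0x1.0p-1023 the first form gives the finite value 0x1.0p1023, while
the second gives Infinity, which is the kind of discrepancy described above.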

-- 
   Summary: gcc-4.0 should not reassociate floating point add or
multiplication
   Product: gcc
   Version: 4.0.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P1
 Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: fjahanian at apple dot com
CC: gcc-bugs at gcc dot gnu dot org,roger at eyesopen dot
com
 GCC build triplet: powerpc-apple-darwin
  GCC host triplet: powerpc-apple-darwin
GCC target triplet: powerpc-apple-darwin


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17892