Yara status on PPC (powerpc-darwin)

2006-04-17 Thread Andrew_Pinski
I decided to look into the Yara branch to see whether it could even be
bootstrapped on PPC (with Yara turned on by default).


I ran into an ICE while compiling libgcc2.c for __muldi3.
The ICE was in emit_secondary_memory_move.
The preprocessed source is:
typedef int SItype __attribute__ ((mode (SI)));
typedef int DItype __attribute__ ((mode (DI)));
struct DWstruct {SItype high, low;};
typedef union
{
struct DWstruct s;
DItype ll;
} DWunion;
DItype __muldi3 (DItype u)
{
DWunion w;
w.ll = 0;
w.s.high = u;
return w.ll;
}

---

And then I decided just to look into code generation:
int  f(void)
{
   return 0;
}

---
With the above code, I noticed that GCC saved and restored
the link register which is not needed because this is a leaf function.

It was being saved/restored because current_function_is_leaf was not
being set at all with Yara on.
Previously it was set in local-alloc.c.
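
For contrast, here is a small sketch of a leaf and a non-leaf function.
The names helper and non_leaf are made up for illustration, and the
noinline attribute is a GCC extension used only to keep the call from
being inlined. On PowerPC a call (bl) overwrites the link register, so
only the non-leaf function should need the mflr/stw ... lwz/mtlr pair.

```c
/* Leaf: calls nothing, so the return address can simply stay in LR.  */
int leaf(void)
{
    return 0;
}

/* Hypothetical helper; noinline keeps the call below a real call.  */
__attribute__((noinline)) int helper(int x)
{
    return x + 1;
}

/* Non-leaf: the call to helper clobbers LR, so the prologue has to
   save it and the epilogue has to restore it.  The multiply after the
   call keeps this from being turned into a tail call.  */
int non_leaf(int x)
{
    int r = helper(x);
    return r * 2;
}
```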

The next code generation issue is related to the ICE above: both seem to
be caused by long long variables always being spilled to the stack.

It also looks like it might be producing wrong code, as one half is
not zeroed out.
Testcase:

typedef int SItype __attribute__ ((mode (SI)));
typedef int DItype __attribute__ ((mode (DI)));
DItype __muldi3 (DItype u)
{
DItype ll;
ll = 0;
ll = (ll &~0x) | (u&0x);
   return ll;
}
---
Asm produced WITHOUT Yara turned on:
.machine ppc
.text
.align 2
.globl ___muldi3
___muldi3:
li r3,0
blr
.subsections_via_symbols

---
Asm produced WITH Yara turned on:
.machine ppc
.text
.align 2
.globl ___muldi3
___muldi3:
mflr r0
stw r0,8(r1)
stwu r1,-48(r1)
lwz r0,24(r1)
stw r4,20(r1)
stw r0,12(r1)
lfd f0,8(r1)
stfd f0,16(r1)
lwz r3,16(r1)
lwz r4,20(r1)
addi r1,r1,48
lwz r0,8(r1)
mtlr r0
blr
.subsections_via_symbols
---

Hopefully this helps the progress of Yara some more.

Thanks,
Andrew Pinski




Re: Yara status on PPC (powerpc-darwin)

2006-04-18 Thread Andrew_Pinski
Just to follow up on this email since I looked into some of the
issues a little more last night.



Andrew Pinski/R&D/SCEA wrote on 04/17/2006 08:21:01 AM:

> I decided to look into the Yara branch to see if it could even be 
> bootstrap on PPC (with Yara turned on by default).
> 
> I ran into an ICE while compiling libgcc2.c for __muldi3.
> The ICE was in emit_secondary_memory_move.
> The preprocessed source is:
> typedef int SItype __attribute__ ((mode (SI)));
> typedef int DItype __attribute__ ((mode (DI)));
> struct DWstruct {SItype high, low;};
> typedef union
> {
> struct DWstruct s;
> DItype ll;
> } DWunion;
> DItype __muldi3 (DItype u)
> {
> DWunion w;
> w.ll = 0;
> w.s.high = u;
> return w.ll;
> }

The problem here is that the class translation table is broken.
Here is what we get on powerpc-darwin:
Class cover:
 FLOAT_REGS NON_FLOAT_REGS
Class translation:
 NO_REGS -> NO_REGS
 BASE_REGS -> NO_REGS
 GENERAL_REGS -> NO_REGS
 FLOAT_REGS -> FLOAT_REGS
 ALTIVEC_REGS -> NO_REGS
 VRSAVE_REGS -> NO_REGS
 VSCR_REGS -> NO_REGS
 SPE_ACC_REGS -> NO_REGS
 SPEFSCR_REGS -> NO_REGS
 NON_SPECIAL_REGS -> NO_REGS
 MQ_REGS -> NON_FLOAT_REGS
 LINK_REGS -> NON_FLOAT_REGS
 CTR_REGS -> NON_FLOAT_REGS
 LINK_OR_CTR_REGS -> NON_FLOAT_REGS
 SPECIAL_REGS -> NO_REGS
 SPEC_OR_GEN_REGS -> NO_REGS
 CR0_REGS -> NON_FLOAT_REGS
 CR_REGS -> NON_FLOAT_REGS
 NON_FLOAT_REGS -> NON_FLOAT_REGS
 XER_REGS -> NO_REGS
 ALL_REGS -> NO_REGS 


GENERAL_REGS translates to NO_REGS, which means we always spill :(.
This is in fact the same issue as the third problem.
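
As a rough illustration of what one would expect from such a table,
here is a toy model, not Yara's actual algorithm: each class is
translated to the first cover class that contains all of its
registers, and to NO_REGS only if no cover class does. The class
names, register masks and function below are all invented for this
sketch and do not match the real rs6000 definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy register classes; the masks are invented for illustration.  */
enum toy_class {
    TOY_NO_REGS,
    TOY_GENERAL_REGS,
    TOY_FLOAT_REGS,
    TOY_LINK_REGS,
    TOY_NON_FLOAT_REGS,
    TOY_N_CLASSES
};

static const uint32_t toy_class_regs[TOY_N_CLASSES] = {
    [TOY_NO_REGS]        = 0x00000,
    [TOY_GENERAL_REGS]   = 0x000ff,  /* pretend GPRs are bits 0-7 */
    [TOY_FLOAT_REGS]     = 0x0ff00,  /* pretend FPRs are bits 8-15 */
    [TOY_LINK_REGS]      = 0x10000,  /* pretend LR is bit 16 */
    [TOY_NON_FLOAT_REGS] = 0x100ff   /* GPRs plus LR */
};

/* The class cover, in the same order as in the dump above.  */
static const enum toy_class toy_cover[] = {
    TOY_FLOAT_REGS, TOY_NON_FLOAT_REGS
};

/* Translate CLS to the first cover class that is a superset of it.  */
enum toy_class toy_translate(enum toy_class cls)
{
    uint32_t regs = toy_class_regs[cls];
    size_t i;

    if (regs == 0)
        return TOY_NO_REGS;
    for (i = 0; i < sizeof toy_cover / sizeof toy_cover[0]; i++)
        if ((regs & ~toy_class_regs[toy_cover[i]]) == 0)
            return toy_cover[i];
    return TOY_NO_REGS;
}
```

Under this model the general registers end up in the non-float cover
class rather than in NO_REGS, which is what the table above would need
for them to be allocatable at all.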


> ---
> 
> And then I decided just to look into code generation:
> int  f(void)
> {
>    return 0;
> }
> 
> ---
> With the above code, I noticed that GCC saved and restored
> the link register which is not needed because this is a leaf function.

The saving and restoring of the link register can be fixed by the
attached patch, which should also help x86 code generation (I have not
bootstrapped it there yet).

-- Pinski




Index: yara.c
===
--- yara.c  (revision 112997)
+++ yara.c  (working copy)
@@ -449,6 +449,12 @@ gate_yara (void)
 static unsigned int
 rest_of_handle_yara (void)
 {
+
+  /* Determine if the current function is a leaf before running reload
+ since this can impact optimizations done by the prologue and
+ epilogue thus changing register elimination offsets.  */
+  current_function_is_leaf = leaf_function_p ();
+
   compact_blocks ();
 
   /* Allocate the reg_renumber array.  */

Status of the pointer_plus branch

2007-05-12 Thread Andrew_Pinski
Hi,
  I am not asking to merge this branch into the mainline right now; I
think the code needs more eyes on it first.  But here is the current
status of the branch.

It bootstraps and tests on i686-linux-gnu with two regressions.
It builds/tests for spu-elf with two regressions (the same as the x86 
regressions).
I am working on fixing up the bootstrap/tests for 
powerpc64-linux-gnu/powerpc-linux-gnu right now (well the patch was in 
testing but the machine crashed because of the heat in the office).

For i686-linux-gnu, this includes Ada, Java, Fortran, C++, Objective-C,
Objective-C++ and C.  I did not test treelang yet, but a quick look
through the source shows that this language should just work.


The two regressions which exist are:

FAIL: gcc.dg/vect/vect-102.c scan-tree-dump-times possible dependence 
between data-refs 1
FAIL: gcc.dg/vect/vect-104.c scan-tree-dump-times possible dependence 
between data-refs 1


This is because we get "a p+ (b+1)*4" while on the mainline we get
"a + (((int*)b)*4B + 4B)".
So we don't fold "(b+1)*4" to "b*4 + 4", but I have not looked into why
yet.
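
To make the two shapes concrete, here is a plain C sketch of the same
address computed both ways. The function names are invented for
illustration, sizeof (int) stands in for the constant 4, and b is
taken to be an unsigned index. Both forms yield the same address; they
differ only in how the offset expression is structured, which is what
the dump scan in the two tests is sensitive to.

```c
#include <stddef.h>

/* Shape seen on the branch: a p+ (b + 1) * 4  */
int *addr_unfolded(int *a, size_t b)
{
    return (int *) ((char *) a + (b + 1) * sizeof (int));
}

/* Shape seen on the mainline: a + (b * 4 + 4)  */
int *addr_distributed(int *a, size_t b)
{
    return (int *) ((char *) a + (b * sizeof (int) + sizeof (int)));
}
```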

Next week I will be asking Mark Mitchell and the SC about merging this
branch into the mainline; I should have some benchmark results in that
time frame as well.

To people who are going to test the branch on targets other than the
ones I mentioned already:
Every target's gimplify_va_arg needs to be fixed, and likewise each
target's va_start.  The changes are usually
s/PLUS_EXPR/POINTER_PLUS_EXPR/ plus making sure the second operand is
converted to sizetype (most already just use sizetype anyway).
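
As a source-level analogue only (this is not GCC internals code), the
shape being asked for is "pointer plus an unsigned byte offset". The
toy va_arg below uses a made-up toy_va_list type and function name;
the point is just that the size is converted to size_t before it is
added to the pointer, mirroring the sizetype second operand. It
assumes align is a power of two.

```c
#include <stddef.h>

/* Hypothetical stand-in for a target's va_list: a byte pointer into
   the argument save area.  */
typedef char *toy_va_list;

/* Return the current argument slot and bump *AP past it.  SIZE is
   rounded up to ALIGN, and the offset is converted to size_t before
   the pointer addition.  */
void *toy_va_arg(toy_va_list *ap, int size, int align)
{
    size_t mask = (size_t) align - 1;
    size_t rounded = ((size_t) size + mask) & ~mask;
    char *cur = *ap;

    *ap = cur + rounded;   /* pointer p+ unsigned offset */
    return cur;
}
```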


Thanks,
Andrew Pinski


Re: PTR-PLUS merge into the mainline

2007-06-28 Thread Andrew_Pinski
Roman Zippel <[EMAIL PROTECTED]> wrote on 06/28/2007 07:54:43 PM:

> Hi,
> Notice that it generates the (i + 1) * 4 instead of (i * 4) + 4 as with 
> the other cases. While I tried to debug this I narrowed it down to the 
> changes in fold_binary(), but I don't really know how to fix this, so 
> I could use some help here.

The main thing is that this is really PR 32120.  The problem is only
related to the merge because of the way fold_binary works.

Thanks,
Andrew Pinski