Re: Generating optimized PIR?

2004-01-06 Thread Leopold Toetsch
Dan Sugalski wrote:

Oh, yeah, lots of spilling. In a hacked up version of the PIR I see 50 
spills in the main routine. 
Brrr. It would definitely help, if you can use lexicals or gloabals, so 
that you don't have that much long-ranged variables - and of course 
splitting the beast if possible.

leo







Re: Generating optimized PIR?

2004-01-06 Thread Dan Sugalski
At 12:30 PM +0100 1/6/04, Leopold Toetsch wrote:
Dan Sugalski <[EMAIL PROTECTED]> wrote:
 Optimized for speed, at least.

 I'm finding that large subs seem to give imcc headaches. I'm not sure
 if it's O(n^2) or O(2^n) headaches, but definitely issues.
Live analysis and register allocation are the main problems. The former
is O(n_lines * n_vars). The latter goes horribly slow in the case of
spilling. Please run your program with the "-v" switch, you should see
some statistics including spilled variables count.
Oh, yeah, lots of spilling. In a hacked up version of the PIR I see 
50 spills in the main routine. Not a good thing, I think, and it 
takes forever even with Parrot built with optimizations. (Which does 
make quite a difference, though not enough)
--
Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: Generating optimized PIR?

2004-01-06 Thread Leopold Toetsch
Dan Sugalski <[EMAIL PROTECTED]> wrote:
> Optimized for speed, at least.

> I'm finding that large subs seem to give imcc headaches. I'm not sure
> if it's O(n^2) or O(2^n) headaches, but definitely issues.

Live analysis and register allocation are the main problems. The former
is O(n_lines * n_vars). The latter goes horribly slow in the case of
spilling. Please run your program with the "-v" switch, you should see
some statistics including spilled variables count.

leo


Re: Generating optimized PIR?

2004-01-05 Thread Melvin Smith
At 10:59 AM 1/5/2004 -0500, Dan Sugalski wrote:
Optimized for speed, at least.

I'm finding that large subs seem to give imcc headaches. I'm not sure if 
it's O(n^2) or O(2^n) headaches, but definitely issues. At the moment I've 
got parrot churning on some pir code and it's taking quite a while. Final 
time tally:

real41m46.978s
user21m24.300s
sys 0m41.080s
Ack.

I expected this would happen. Most probably it is register-coloring/spilling.
I'm a little rusty on the register colorer; I do know the first version I wrote
was not very fast for large numbers of registers, but I believe Leo had 
improved
on it a bit.

I'd really like to see the piece of code so I can do some profiling.


Ended with a missing symbol error, of course, just to rub it in a bit. 
vm_stat reports a lot of zero-fill page allocation (about 1600 4K 
pages/sec) though the memory footprint isn't growing to match, so that 
might indicate at least something of the problem. (For all I know there's 
some sort of degenerate GC issue somewhere, depending on how imcc does its 
memory allocation and management)
That could be IMCC repeatedly allocating/freeing its interference graphs
for each basic block, but I'm not positive.
I know IMCC's being redone, and we're nowhere near close to optimized, but 
I think it'd be worth it to get a handle on what sorts of things are 
likely to trigger off exponential time compiles when fed to IMCC. I'm 
assuming that it's due to a combination of sub size (about 61K of source 
in one sub) and locals needing coloring (132) but it'd be nice to know for 
sure.
There are several tests I can think of that we should include in IMCC.

1) A large number of locals with a very long code segment, without
branches or labels. This would tests large graph coloring without
lots of basic blocks.
2) A large number of locals _with_ the normal amount of branches and
labels. This would test the life analysis on a large number of basic blocks.
3) A small number of locals with variants of 1 & 2 above for branching/labels.

Any chance of getting the code sample?

-Melvin




Re: Generating optimized PIR?

2004-01-05 Thread Pete Lomax
On Mon, 5 Jan 2004 10:59:18 -0500, Dan Sugalski <[EMAIL PROTECTED]> wrote:

>I know IMCC's being redone, and we're nowhere near close to 
>optimized,
That was my guess
> but I think it'd be worth it to get a handle on what sorts 
>of things are likely to trigger off exponential time compiles when 
>fed to IMCC. I'm assuming that it's due to a combination of sub size 
>(about 61K of source in one sub) and locals needing coloring (132) 
>but it'd be nice to know for sure.
My experience was as follows (don't take these times too literally
since this is a very old, very slow box; the relative times are what
count). This was creating a single .sub:

compile source and write a 2,500 line imc file: 0.2 seconds
The first line of the imc file included a hand-crafted 1000 line file.
parrot -o t.pasm t.imc: 7 seconds
load the final pasm file (now 3,500 lines) and run it: 0.3 seconds

Editing the hand-crafted 1000-line include file to replace symbolic
registers (ie $I1 etc) with real registers (I1) cut imc time down to
3.5 seconds.
Changing the code emitter to re-use symbolic registers no longer in
use (ie not local variables, and not still on the parse stack) cut the
time down to 1.39 seconds.

Just letting you know what I found, I shall let you draw your own
conclusions.

Lastly, while I know that -O2 is not complete, I don't know by how
much it is incomplete. You may want to check the times on that too.

Regards,
Pete


Generating optimized PIR?

2004-01-05 Thread Dan Sugalski
Optimized for speed, at least.

I'm finding that large subs seem to give imcc headaches. I'm not sure 
if it's O(n^2) or O(2^n) headaches, but definitely issues. At the 
moment I've got parrot churning on some pir code and it's taking 
quite a while. Final time tally:

real41m46.978s
user21m24.300s
sys 0m41.080s
Ended with a missing symbol error, of course, just to rub it in a 
bit. vm_stat reports a lot of zero-fill page allocation (about 1600 
4K pages/sec) though the memory footprint isn't growing to match, so 
that might indicate at least something of the problem. (For all I 
know there's some sort of degenerate GC issue somewhere, depending on 
how imcc does its memory allocation and management)

I know IMCC's being redone, and we're nowhere near close to 
optimized, but I think it'd be worth it to get a handle on what sorts 
of things are likely to trigger off exponential time compiles when 
fed to IMCC. I'm assuming that it's due to a combination of sub size 
(about 61K of source in one sub) and locals needing coloring (132) 
but it'd be nice to know for sure.

For me, reducing the size of the sub's untenable, but I can switch 
from using locals to using pad or global variables that get fetched 
as need be if that means less compile time.
--
Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk