Re: IMCC temp optimizations...

2004-04-22 Thread Leopold Toetsch
Dan Sugalski wrote:

 I think it may be a handy thing if someone'd go through and draw up a
 set of rules for the use of temps, and things that'll cause the
 register coloring algorithm to go mad. (I'd like to avoid 30 minute
 compile sessions--it's a tad tedious :)

Are you still sticking everything in one big _main?

leo


Re: IMCC temp optimizations...

2004-04-22 Thread Leopold Toetsch
Dan Sugalski wrote:
 ... (I'd like to avoid 30 minute compile
 sessions--it's a tad tedious :)

Should be faster now by some factor.

How many symbols are in the biggest compilation unit (the "registers
in .imc" line that parrot -v prints)?

leo




Re: IMCC temp optimizations...

2004-04-22 Thread Dan Sugalski
At 1:22 PM +0200 4/22/04, Leopold Toetsch wrote:
 Dan Sugalski wrote:
  ... (I'd like to avoid 30 minute compile sessions--it's a tad tedious :)

 Should be faster now by some factor.

Cool, thanks. I've an optimized build of parrot going now, and I'll
see what things look like when it's done.

 How many symbols are in the biggest compilation unit (parrot -v
 registers in .imc)?

Dunno what parrot thinks--it's not done yet. grep says 569 .locals
and 473 temp PMC registers ($Px). I think that can reasonably be
considered A Lot. I'm rejigging the compiler to cut down on the
number of .local declarations, but that'll increase the temp PMC
usage, at least with the relatively simple temp system I've got now.
(I can throw dummy labels in to create fake basic blocks if that'll
help the register coloring code.)
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: IMCC temp optimizations...

2004-04-22 Thread Dan Sugalski
At 7:55 AM +0200 4/22/04, Leopold Toetsch wrote:
 Dan Sugalski wrote:

  I think it may be a handy thing if someone'd go through and draw up a
  set of rules for the use of temps, and things that'll cause the
  register coloring algorithm to go mad. (I'd like to avoid 30 minute
  compile sessions--it's a tad tedious :)

 Are you still sticking everything in one big _main?

Except for the library code, yup. No way around that.
--
Dan


Re: IMCC temp optimizations...

2004-04-22 Thread Leopold Toetsch
Dan Sugalski wrote:
 As I sit here and wait for parrot to churn on the output from compiling
 a relatively small program

I've put in another change, worth a factor of ~2.5 for a unit with 2000 temps.

leo



Re: IMCC temp optimizations...

2004-04-22 Thread Leopold Toetsch
Dan Sugalski wrote:

 Dunno what parrot thinks--it's not done yet. grep says 569 .locals
 and 473 temp PMC registers.

I've now enabled some more code that speeds up temp allocation further
(by a factor of ~2.5 for 2000 temps in a unit). It requires that each
temp's usage range be limited to 10 lines. If the code from a compiler
looks like the output of the program below, this works fine.
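
A minimal sketch of how a compiler author might check generated code against the 10-line limit mentioned above. It is written in Python for illustration and is not taken from imcc. The temp pattern, the helper name long_ranged_temps, and the reading of "usage range" as last mention minus first mention are all assumptions.

```python
import re

# IMCC-style temps such as $P12 or $I3 (assumed pattern)
TEMP = re.compile(r"\$[INSP]\d+")

def long_ranged_temps(pir_text, max_span=10):
    """Return {temp: span} for temps whose first and last mention
    are more than max_span lines apart."""
    first, last = {}, {}
    for lineno, line in enumerate(pir_text.splitlines()):
        for temp in TEMP.findall(line):
            first.setdefault(temp, lineno)  # remember the first mention only
            last[temp] = lineno             # keep overwriting with the latest
    return {t: last[t] - first[t]
            for t in first
            if last[t] - first[t] > max_span}
```

Running it over a compiler's output lists the temps that would fall outside the fast path, under the assumptions stated above.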

 ... I'm rejigging the compiler to cut down on the
 number of .local declarations, but that'll increase the temp pmc
 usage, at least with the relatively simple temp system I've got now.

Temps are fine. .locals whose range runs from the top of the program
(not counting the declaration) all the way down do hurt. Many small,
short-ranged temps are much better than long-ranged vars, because the
long-ranged vars interfere with one another far more.
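
To make the interference argument concrete, here is a small Python sketch that counts overlapping pairs of live ranges. It is an illustration only, not imcc's algorithm: the function name interference_edges and the inclusive (first_line, last_line) representation are assumptions.

```python
from itertools import combinations

def interference_edges(ranges):
    """Count pairs of live ranges that overlap.

    ranges maps a name to an inclusive (first_line, last_line) pair.
    """
    edges = 0
    for (_, (a0, a1)), (_, (b0, b1)) in combinations(ranges.items(), 2):
        if a0 <= b1 and b0 <= a1:  # the two intervals share a line
            edges += 1
    return edges

n = 100
# n variables that are all live across the same 1000 lines
long_vars = {f"v{i}": (0, 999) for i in range(n)}
# n temps, each live for three lines and overlapping only its neighbour
short_temps = {f"$P{i}": (2 * i, 2 * i + 2) for i in range(n)}

print(interference_edges(long_vars))    # 4950
print(interference_edges(short_temps))  # 99
```

With the same number of symbols, the long-ranged set produces a quadratic number of interferences while the chain of short temps stays linear.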

 (I can throw dummy labels in to create fake basic blocks if that'll
 help the register coloring code)

That makes it worse. More blocks, more life ranges to compare.

If you can emit code similar to gen.pl it could take advantage of my
last change.

leo

$ cat gen.pl
#!/usr/bin/perl
use strict;

#  a = 1 + 1 + 1 ... + 1

my ($i, $t, $N);
$N = $ARGV[0] || 299;
print ".sub _main [EMAIL PROTECTED]";
$t++;
print "\t\$P$t = new PerlInt\n";
print "\t\$P$t = 1\n";
for $i (0..$N) {
    $t++;
    print "\t\$P$t = new PerlInt\n";
    print "\t\$P$t = 1\n";
    $t++;
    print "\t\$P$t = new PerlInt\n";
    print "\t\$P$t = [EMAIL PROTECTED] + [EMAIL PROTECTED]";
}
print "\tprint \$P$t\n";
print "\tprint \"\\n\"\n";
print ".end\n";
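
Parts of gen.pl are obscured in the archive (the end of the .sub line and both operands of the sum). The Python sketch below shows the shape of code such a generator would emit, under two labeled assumptions: the header is reduced to a bare ".sub _main", and each sum adds the two most recently defined temps. The function name gen_pir is hypothetical.

```python
def gen_pir(n=299):
    """Return IMCC-style text computing 1 + 1 + ... + 1,
    using a fresh temp for every intermediate value."""
    lines = [".sub _main"]  # assumption: rest of the header is unknown
    t = 1
    lines += [f"\t$P{t} = new PerlInt", f"\t$P{t} = 1"]
    for _ in range(n + 1):  # Perl's 0..$N is inclusive
        t += 1
        lines += [f"\t$P{t} = new PerlInt", f"\t$P{t} = 1"]
        t += 1
        # assumption: the sum uses the previous two temps
        lines += [f"\t$P{t} = new PerlInt",
                  f"\t$P{t} = $P{t - 2} + $P{t - 1}"]
    lines.append(f"\tprint $P{t}")
    lines.append('\tprint "\\n"')
    lines.append(".end")
    return "\n".join(lines) + "\n"
```

Every temp is defined and last used within a few lines, which is the property the speedup relies on.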



Re: IMCC temp optimizations...

2004-04-22 Thread Angel Faus

A good register allocation algorithm will always have problems with
big subs; there is nothing we can do about it.

I think that what real compilers do to solve this problem is 
implement two different algorithms: one for normal subs which tries 
to generate optimal code, and a naive one for very large subs with 
many virtual registers. 

That makes compilation much faster, and the execution penalty doesn't 
hurt too much.
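
A rough Python sketch of the two-algorithm idea, for illustration only. The cut-over point BIG_UNIT, the register count NUM_REGS, and all function names are assumptions, and the "careful" pass is a simple greedy stand-in rather than a real graph-coloring allocator.

```python
NUM_REGS = 32    # registers per class (assumed)
BIG_UNIT = 500   # hypothetical cut-over point, in symbols

def color_greedy(ranges):
    """Range-aware pass: reuse a register once its holder's range has ended."""
    assignment, busy_until = {}, [-1] * NUM_REGS
    # visit symbols in order of where their range starts
    for name, (start, end) in sorted(ranges.items(), key=lambda kv: kv[1]):
        for reg in range(NUM_REGS):
            if busy_until[reg] < start:
                assignment[name] = reg
                busy_until[reg] = end
                break
        else:
            assignment[name] = None  # no free register: spill
    return assignment

def assign_naive(ranges):
    """Cheap pass: the first NUM_REGS symbols get a register, the rest spill."""
    return {name: (i if i < NUM_REGS else None)
            for i, name in enumerate(ranges)}

def allocate(ranges, big_unit=BIG_UNIT):
    """Pick the cheap pass for very large units, the careful one otherwise."""
    if len(ranges) > big_unit:
        return assign_naive(ranges)
    return color_greedy(ranges)
```

The naive pass never looks at live ranges, so its cost grows only with the number of symbols; the price is more spills at run time.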

Actually, it's (for me) an open question whether the good register
allocation should be the default one. Perl (and Python and ...) users
expect blazing compilation times, so maybe we should reserve it for
higher -O levels.

But then, we won't know how bad our compilation times are until there
are real programs written in perl6/parrot.

-angel


Leopold Toetsch wrote:
 Dan Sugalski wrote:
  Dunno what parrot thinks--it's not done yet. grep says 569
  .locals and 473 temp PMC registers.

 I've now enabled some more code that speeds up temp allocation further
 (by a factor of ~2.5 for 2000 temps in a unit). It requires that each
 temp's usage range be limited to 10 lines. If the code from a compiler
 looks like the output of the program below, this works fine.



Re: IMCC temp optimizations...

2004-04-22 Thread Dan Sugalski
At 4:03 PM +0200 4/22/04, Leopold Toetsch wrote:
 Dan Sugalski wrote:

  Dunno what parrot thinks--it's not done yet. grep says 569 .locals
  and 473 temp PMC registers.

 I've now enabled some more code that speeds up temp allocation further
 (by a factor of ~2.5 for 2000 temps in a unit). It requires that each
 temp's usage range be limited to 10 lines. If the code from a compiler
 looks like the output of the program below, this works fine.

This sped it up a lot. The output is:

Starting parse...
sub _MAIN:
registers in .imc:   I34, N0, S7, P1014
0 labels, 0 lines deleted, 0 if_branch, 0 branch_branch
0 used once deleted
0 invariants_moved
registers needed:    I43, N0, S12, P3327
registers in .pasm:  I25, N0, S20, P32 - 464 spilled
2007 basic_blocks, 2079 edges
13691 lines compiled.
Writing sample.pbc
packed code 348208 bytes
sample.pbc written.

Which actually came out in reasonable time, rather than me giving up
after 45 minutes. :) Still takes ages, so I've a lot of work to do
here.

  ... I'm rejigging the compiler to cut down on the
  number of .local declarations, but that'll increase the temp pmc
  usage, at least with the relatively simple temp system I've got now.

 Temps are fine. .locals whose range runs from the top of the program
 (not counting the declaration) all the way down do hurt. Many small,
 short-ranged temps are much better than long-ranged vars, because the
 long-ranged vars interfere with one another far more.

Hrm. Does the code currently consider something like:

  $P0 = foo

to start a new lifetime for $P0?

--
Dan


Re: IMCC temp optimizations...

2004-04-22 Thread Leopold Toetsch
Dan Sugalski wrote:
  registers needed:    I43, N0, S12, P3327
  registers in .pasm:  I25, N0, S20, P32 - 464 spilled
  2007 basic_blocks, 2079 edges

Ouch. Register allocation is spending a huge amount of time on spilling.
Something is definitely wrong with your code - wrong in the sense that it
doesn't play well with what imcc expects ;)

You must have too many pseudo-globals in that unit, spanning a huge
range and interfering with one another. Do you use lexical vars or
Parrot globals?

 Hrm. Does the code currently consider something like:

   $P0 = foo

 to start a new lifetime for $P0?

If you don't use $P0 beyond that point, yes. Do you name all
temps $P0 or some such, or are you giving them unique names? You should
do the latter.
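
One simple way for a code generator to follow that advice is a per-type counter that never hands out the same name twice. The Python sketch below is illustrative only; the class name TempNames and its method are hypothetical.

```python
import itertools

class TempNames:
    """Hand out a fresh IMCC-style temp name on every request."""

    def __init__(self):
        self._counters = {}  # one counter per register type

    def fresh(self, kind="P"):
        counter = self._counters.setdefault(kind, itertools.count(1))
        return f"${kind}{next(counter)}"

temps = TempNames()
a = temps.fresh()     # $P1
b = temps.fresh()     # $P2
s = temps.fresh("S")  # $S1
```

Because each value gets its own name, the allocator sees many short, separate ranges instead of one name that appears to be used throughout the unit.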

leo


Re: IMCC temp optimizations...

2004-04-22 Thread Dan Sugalski
At 6:04 PM +0200 4/22/04, Leopold Toetsch wrote:
 Dan Sugalski wrote:
   registers needed:    I43, N0, S12, P3327
   registers in .pasm:  I25, N0, S20, P32 - 464 spilled
   2007 basic_blocks, 2079 edges

 Ouch. Register allocation is spending a huge amount of time on spilling.
 Something is definitely wrong with your code - wrong in the sense that it
 doesn't play well with what imcc expects ;)

Yep, the 45 minute compile times were the hint that maybe imcc and I
were fighting a bit. :)

 You must have too many pseudo-globals in that unit, spanning a huge
 range and interfering with one another. Do you use lexical vars or
 Parrot globals?

I was using .locals for the actual variables in the source program,
and $Px for all the temps the compiler generated. I've been migrating
a lot of the code to use a few .local-defined hashes and indexing
into them, and it looks like that's the way to go. (This'd be easier
if this language had, y'know, actual subroutines and stuff...)

  Hrm. Does the code currently consider something like:

    $P0 = foo

  to start a new lifetime for $P0?

 If you don't use $P0 beyond that point, yes. Do you name all
 temps $P0 or some such, or are you giving them unique names? You should
 do the latter.

I've a lot of 'constant' temps named $P0. I'll go fix that and see where we go.
--
Dan


Re: IMCC temp optimizations...

2004-04-22 Thread Leopold Toetsch
Dan Sugalski wrote:

 I was using .locals for the actual variables in the source program,

Well, you know it: .locals aren't vars.

 and $Px for all the temps the compiler generated. I've been migrating
 a lot of the code to use a few .local-defined hashes and indexing
 into them, and it looks like that's the way to go.

If you use a hash anyway, just use globals - this seems to match the
target language's POV:

  $Px = global foo  # fetch
  global foo = $Py  # store

If spilling is still needed, the spill code can be improved when the
variable already has a backing store (like a lexical or a global). It
then only needs the life range cut down and a refetch at the other places
the variable is used. This saves the store and simplifies spilling.

But if you treat all your variables like that, there will be no spilling
at all.
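
A sketch of what a code generator following the fetch/store pattern above might emit for a statement like c = a + b. It is in Python for illustration; the function name emit_add is hypothetical, and quoting the global names is an assumption about the IMCC syntax.

```python
def emit_add(dest, left, right, first_temp=1):
    """Emit IMCC-style lines for `dest = left + right`, where all three
    are source-language variables kept in Parrot globals."""
    # three consecutive temps, each live for only a few lines
    l, r, d = (f"$P{first_temp + k}" for k in range(3))
    return [
        f'{l} = global "{left}"',   # fetch
        f'{r} = global "{right}"',  # fetch
        f"{d} = new PerlInt",
        f"{d} = {l} + {r}",
        f'global "{dest}" = {d}',   # store
    ]

for line in emit_add("c", "a", "b"):
    print("\t" + line)
```

Every temp is born and dies within the one statement, so no register has to stay reserved for a source-level variable across the rest of the unit.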

 ... (This'd be easier
 if this language had, y'know, actual subroutines and stuff...)

Yep.

leo


IMCC temp optimizations...

2004-04-21 Thread Dan Sugalski
As I sit here and wait for parrot to churn on the output from 
compiling a relatively small program, I'm reminded again that imcc's 
got some degenerate behaviour when it comes to register coloring and 
.locals.

I think it may be a handy thing if someone'd go through and draw up a 
set of rules for the use of temps, and things that'll cause the 
register coloring algorithm to go mad. (I'd like to avoid 30 minute 
compile sessions--it's a tad tedious :)
--
Dan
