Re: Register Allocator

2008-07-08 Thread chromatic
On Tuesday 15 August 2006 12:23:34 Vishal Soni wrote:

 I should be able to hack up a vanilla register allocation scheme that does
 no live variable analysis to find overlapping registers to reduce the
 number of registers. Currently I have implemented this scheme in Byte Code
 Generator (compilers/bcg/src/bcg_reg_alloc_vanilla.c). All this does is it
 makes sure each variable gets a unique register. The quality of the generated
 code might not be good, but it should be very fast in compile-time performance
 compared to the Graph Based allocator. This should be simple to do
 and we can still keep the graph allocator but only run it when the
 optimization flag -Ox is used.

 Let me know your thoughts.  I'll try to catch you guys later on IRC.

Rakudo is nearly in a state where this could come in handy.

-- c


IMCC Musings from Register Allocator Change Experience

2008-06-19 Thread chromatic
Per some investigation and discussion at the hackathon, I've looked in to 
refactoring the register allocator.  IMCC keeps track of all symbolic 
registers within a compilation unit in a hash of SymReg structures.

When building PBC, IMCC needs to decide how to map symbolic registers ($P1, 
$N0, $S3, lexicals, etc) to Parrot registers (P0, N0, S0, etc).

To do so, it performs several types of analysis -- much of this is non-trivial 
anyway -- but ends up walking this hash several times per compilation unit 
per register type.
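
As a rough illustration only (the real SymReg and its hash live in 
compilers/imcc and differ in detail), the shape of that traversal is one full 
walk over the symbol hash per register set, repeated for each analysis pass:

    #include <stddef.h>

    /* Illustrative stand-in for IMCC's symbol hash; not the real layout. */
    struct sym_reg {
        char           *name;   /* "$P1", "$N0", "$S3", a lexical, ... */
        char            set;    /* 'I', 'N', 'S', or 'P' */
        int             color;  /* assigned Parrot register, -1 if none */
        struct sym_reg *next;   /* hash-bucket chain */
    };

    struct sym_hash {
        struct sym_reg *bucket[256];
    };

    /* One analysis pass visits every symbol of one register set; doing this
     * once per pass, per set, per compilation unit means walking the whole
     * hash over and over. */
    void walk_one_set(struct sym_hash *h, char set,
                      void (*visit)(struct sym_reg *))
    {
        for (size_t i = 0; i < 256; i++)
            for (struct sym_reg *r = h->bucket[i]; r; r = r->next)
                if (r->set == set)
                    visit(r);
    }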

Algorithmically speaking, even though coloring and allocating registers is 
non-trivial, I believe even a mathematician would refer to this as sort of a 
disaster.

The more I dive into IMCC's allocation and context flow and optimization code, 
the more I realize that refactoring it into something more amenable to how 
PIR and PBC actually work will take a lot of time -- time that perhaps I could 
spend on implementing new things, not just improving existing things that 
mostly work (they're proofs of concept with bugs, sure, but known bugs).

Much of the problem, unfortunately, is that so many things are interconnected.  
That means that changing something in one place ripples throughout the code.

I'm starting to convince myself that the entire PBC-modeling structure that 
IMCC takes is fundamentally wrong -- but the only likely replacement, pirc, 
has blocked for months awaiting better access to PBC.

We can probably reach Parrot 1.0 without replacing IMCC (and that's probably 
the right decision), but IMCC as it exists now will be an increasing 
maintenance and performance and evolutionary drawback over time.

Fortunately, we can implement stage one of refactoring by cleaning up the code 
slightly and performing small changes -- but we really need to rethink its 
internal data structures dramatically.  That makes me wonder if we're better 
off investing resources elsewhere and encouraging a likely replacement.

-- c


Register Allocator

2006-08-15 Thread Vishal Soni

Hi,

I read the #parrotsketch log from today. I cannot join the IRC now. The
Graph based Register allocation is good for statically compiled languages
like C. The real value of Graph Based allocation comes when you have limited
number of registers and have to spill some of the variables to memory. Which
variable goes to memory is decided by a cost function. The other issue might
be Def-Use (DU) chains. In a 9000-line sub these might be getting too
large. Going to Static Single Assignment (SSA) form eliminates the need for DU
chains. I plan on implementing SSA in the Byte Code Generator. But that is
future. I need some decisions to be made on POST nodes to continue my work
with BCG.

In Parrot we do not have to worry about spilling, as we can assume we have an
infinite number of registers of I, P, N, and S type. Linear Scan allocation is
good but would require some time to implement, as it depends on the data
collected from live variable analysis.
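
For reference, a hedged sketch of what linear scan boils down to once live
intervals are available (illustrative C only, not BCG code): sort intervals by
start point, expire the ones that have ended, and hand out the lowest free
register -- which is exactly why it needs live variable analysis first.

    #include <stdio.h>

    /* Hedged sketch of linear scan over precomputed live intervals
     * [start, end], sorted by increasing start point. */
    struct interval { int start, end, reg; };

    #define NREGS 4   /* pretend only 4 registers of this type exist */

    void linear_scan(struct interval *iv, int n)
    {
        struct interval *active[NREGS];
        int nactive = 0;

        for (int i = 0; i < n; i++) {
            /* expire intervals that ended before this one starts */
            int keep = 0;
            for (int j = 0; j < nactive; j++)
                if (active[j]->end >= iv[i].start)
                    active[keep++] = active[j];
            nactive = keep;

            if (nactive < NREGS) {
                /* pick the lowest register not used by an active interval */
                int used[NREGS] = {0};
                for (int j = 0; j < nactive; j++)
                    used[active[j]->reg] = 1;
                for (int r = 0; r < NREGS; r++)
                    if (!used[r]) { iv[i].reg = r; break; }
                active[nactive++] = &iv[i];
            } else {
                iv[i].reg = -1;   /* would spill; moot with Parrot's
                                     effectively unlimited registers */
            }
        }
    }

    int main(void)
    {
        struct interval iv[] = { {0, 4, -1}, {2, 9, -1}, {5, 7, -1} };
        linear_scan(iv, 3);
        for (int i = 0; i < 3; i++)
            printf("[%d,%d] -> r%d\n", iv[i].start, iv[i].end, iv[i].reg);
        return 0;
    }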

The simpler solution is to not do any optimization for reducing the number
of registers. This is fine for now as we are in a development phase. IMHO
getting things working is more important than optimization.

I should be able to hack up a vanilla register allocation scheme that does
no live variable analysis to find overlapping registers to reduce the number
of registers. Currently I have implemented this scheme in Byte Code
Generator (compilers/bcg/src/bcg_reg_alloc_vanilla.c). All this does is it
makes sure each variable gets a unique register. The quality of the generated
code might not be good, but it should be very fast in compile-time performance
compared to the Graph Based allocator. This should be simple to do
and we can still keep the graph allocator but only run it when the
optimization flag -Ox is used.
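
In outline, such a vanilla scheme is tiny. This is an illustrative sketch only
(the actual implementation is in compilers/bcg/src/bcg_reg_alloc_vanilla.c):
hand every symbolic register the next unused register of its type and never
reuse anything.

    #include <stdio.h>
    #include <string.h>

    /* Hypothetical sketch of a "vanilla" allocator: each symbolic register
     * gets the next unused register of its type, so no two variables ever
     * share a register and no liveness analysis is needed. */
    enum reg_type { REG_I, REG_N, REG_S, REG_P, REG_TYPES };

    struct vanilla_alloc {
        int next_free[REG_TYPES];   /* next unused register per type */
    };

    int vanilla_assign(struct vanilla_alloc *a, enum reg_type type)
    {
        return a->next_free[type]++;
    }

    int main(void)
    {
        struct vanilla_alloc a;
        memset(&a, 0, sizeof a);

        /* $P0, $P1, $P2 each get their own P register, even if their live
         * ranges never overlap. */
        printf("$P0 -> P%d\n", vanilla_assign(&a, REG_P));
        printf("$P1 -> P%d\n", vanilla_assign(&a, REG_P));
        printf("$P2 -> P%d\n", vanilla_assign(&a, REG_P));
        return 0;
    }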

Let me know your thoughts.  I'll try to catch you guys later on IRC.

--
Thanks,
Vishal


[perl #36585] [PATCH]Refactoring the Register Allocator

2005-07-18 Thread via RT
# New Ticket Created by  Curtis Rawls 
# Please include the string:  [perl #36585]
# in the subject line of all future correspondence about this issue. 
# URL: https://rt.perl.org/rt3/Ticket/Display.html?id=36585 


This patch pulls the graph coloring portion of imc_reg_alloc() out and
puts it in its own function, graph_coloring_reg_alloc().  This will
allow alternate register allocators to be implemented.

-Curtis


reg_alloc.patch


Re: [perl #36585] [PATCH]Refactoring the Register Allocator

2005-07-18 Thread Leopold Toetsch

Curtis Rawls (via RT) wrote:


This patch pulls the graph coloring portion of imc_reg_alloc() out and
puts it in its own function, graph_coloring_reg_alloc().  This will
allow alternate register allocators to be implemented.


Thanks, applied - r8648
leo



Re: Attack of the fifty foot register allocator vs. the undead continuation monster

2005-06-13 Thread Nicholas Clark
On Sun, Jun 12, 2005 at 03:03:24PM -0400, Matt Fowles wrote:

 3) Chip is right, Piers is right. The two of you are working from
 a different base set of definitions/axioms or misunderstood each other
 in some other way.

Historically, (pre Perl 6 actually) I think that this scenario was the
biggest cause of arguments on perl5-porters.

It's an important one for people to be able to recognise.

Nicholas Clark





Re: Attack of the fifty foot register allocator vs. the undead continuation monster

2005-06-13 Thread Chip Salzenberg
On Sun, Jun 12, 2005 at 09:41:05PM -0700, Bill Coffman wrote:
 Continuations can be taken from within any sub, and possibly even
 when appending to a list, if you're using lazy list eval.

Oh no ... it's even worse than you think.  Almost *any* opcode that
operates on a PMC can trigger a continuation.  And I only need two
words to prove it:

Tied variables.

Ouch ouch ouch.

Sounds to me like register allocation is effectively dead, and all
that's left is register renumbering.

The only exception (ahem) I can think of is if we start supporting,
and compilers start emitting, continuation resets (i.e. protected
areas that guarantee that continuations won't propagate out; kind of
like the effect of C<eval BLOCK> on Perl's die).  But even then, it
might not be worth the effort just to get a little tiny bit of reuse
out of an effectively infinite register set.

 There are also some issues about changing variables through continuations, 
 like you can only change PMC integers, which point to some fixed location, 
 containing the int, i.e. you can only use references. These parts make less 
 sense to me, but the issue is basically, some registers are restored, and 
 some are kept safe? Well, that issue hasn't been quite resolved either.

I think that's been answered: We don't have to save & restore
anything, because it's all automatically saved by virtue of being in
memory as the state of a *virtual* machine.
-- 
Chip Salzenberg [EMAIL PROTECTED]


Re: Attack of the fifty foot register allocator vs. the undead continuation monster

2005-06-13 Thread Brent 'Dax' Royal-Gordon
On 6/13/05, Chip Salzenberg [EMAIL PROTECTED] wrote:
 Oh no ... it's even worse than you think.  Almost *any* opcode that
 operates on a PMC can trigger a continuation.  And I only need two
 words to prove it:
 
 Tied variables.

Isn't this *exactly* why Perl 6 is requiring you to mark tied
variables when they're declared?

In this thread and the other one, I'm seeing a lot of people getting
scared of the degenerate case, which is the most dynamic behavior
possible.  But the degenerate case is rarely the common one, and we
always knew it would come at a speed penalty.  More importantly, the
degenerate case can be ruled out pretty well by a fairly simple
lexical analysis.

In the common case of no tied variables, I think we can assume that
PMC code isn't going to do zany things like invoking some random
continuation.  If there are tied variables present which might do
strange things like that, the compiler should emit a PIR directive
saying anything goes in this section.  Perhaps some languages will
always do that, but that's the price of working in those languages.

-- 
Brent 'Dax' Royal-Gordon [EMAIL PROTECTED]
Perl and Parrot hacker


Re: Attack of the fifty foot register allocator vs. the undead continuation monster

2005-06-13 Thread Autrijus Tang
On Mon, Jun 13, 2005 at 09:21:00AM -0700, Brent 'Dax' Royal-Gordon wrote:
 On 6/13/05, Chip Salzenberg [EMAIL PROTECTED] wrote:
  Oh no ... it's even worse than you think.  Almost *any* opcode that
  operates on a PMC can trigger a continuation.  And I only need two
  words to prove it:
  
  Tied variables.
 
 Isn't this *exactly* why Perl 6 is requiring you to mark tied
 variables when they're declared?

Yes.   And tied vars invoking full continuations is something that
we can just document out of existence.

Thanks,
/Autrijus/




Re: Attack of the fifty foot register allocator vs. the undead continuation monster

2005-06-13 Thread Chip Salzenberg
On Mon, Jun 13, 2005 at 09:21:00AM -0700, Brent 'Dax' Royal-Gordon wrote:
 On 6/13/05, Chip Salzenberg [EMAIL PROTECTED] wrote:
  Oh no ... it's even worse than you think.  Almost *any* opcode that
  operates on a PMC can trigger a continuation.  And I only need two
  words to prove it:
  
  Tied variables.
 
 Isn't this *exactly* why Perl 6 is requiring you to mark tied
 variables when they're declared?

Well, that doesn't help you in and of itself, unless every reference
that could point to a tied variable was also marked.  And I don't see
it happening.

That said, I was just informed on IRC that it's already established
that once you descend into a PMC vtable entry or anything else in
Parrot that's written in C (or could be), you're not allowed to jump
sideways with a full continuation, but only upward in the style of
an escape continuation (e.g. throwing an exception).

So consider me unpanicked.  About PMCs vs. register allocation,
anyway.  I'm still a bit on edge about this Perl 6 thingy.
-- 
Chip Salzenberg [EMAIL PROTECTED]


Re: Attack of the fifty foot register allocator vs. the undead continuation monster

2005-06-13 Thread Chip Salzenberg
On Tue, Jun 14, 2005 at 12:37:52AM +0800, Autrijus Tang wrote:
 On Mon, Jun 13, 2005 at 09:21:00AM -0700, Brent 'Dax' Royal-Gordon wrote:
  On 6/13/05, Chip Salzenberg [EMAIL PROTECTED] wrote:
   Oh no ... it's even worse than you think.  Almost *any* opcode that
   operates on a PMC can trigger a continuation.  And I only need two
   words to prove it:
   
   Tied variables.
  
  Isn't this *exactly* why Perl 6 is requiring you to mark tied
  variables when they're declared?
 
 Yes.

Um:

   my $x is tied;
   tied $x, SomePackage;
   unsuspecting_victim(\$x);   # ???

-- 
Chip Salzenberg [EMAIL PROTECTED]


Re: Attack of the fifty foot register allocator vs. the undead continuation monster

2005-06-13 Thread Autrijus Tang
On Mon, Jun 13, 2005 at 06:52:35PM +0200, Chip Salzenberg wrote:
   Isn't this *exactly* why Perl 6 is requiring you to mark tied
   variables when they're declared?
  
  Yes.
 
 Um:
 
my $x is tied;
tied $x, SomePackage;
unsuspecting_victim(\$x);   # ???

Hmm, you can't say is tied; it must be something that actively
implements the Tieable role (see S06):

my $x is DB_File;

but regardless, the DB_File thing can't invoke a full CC, so this is
orthogonal to the problem we're talking about; sorry.

Thanks,
/Autrijus/




Attack of the fifty foot register allocator vs. the undead continuation monster

2005-06-12 Thread Chip Salzenberg
On Wed, Jun 08, 2005 at 10:26:59PM +0100, The Perl 6 Summarizer wrote:
   Loop Improvements
 Oh no! It's the register allocator problems again. One of these days I
 swear I'm going to swot up on this stuff properly, work out whether it's
 really the case that full continuations break any conceivable register
 allocator and summarize all the issues for everyone in a nice white
 paper/summary.

It's not really that complicated.  It just takes a long time to explain.
   -- Dr. Howland Owll, on SubGenius doctrine

Consider this code:

sub example1 {
   my $a = foo();
   print $a;
   my $b = bar();
   print $b;
}

It's obvious that $a and $b can be allocated the same register because
once $b has been set, $a is never used again.

(Please ignore that $a and $b are variables that can't be stored
purely in registers.  The real world case that I'm illustrating deals
with temporaries and other values that lack user-visible names.  But
the issue is best illustrated this way.)

Now consider this:

sub example2 {
   my $a = foo();
   do {
   print $a;
   $b = bar();
   print $b;
   } while ($b);
}

You can see that it's now *not* OK to allocate $a and $b to the same
register, because the flow of control can jump back to the print $a
even after $b is assigned.

Look at the first function again, and consider what happens if foo
captures its return continuation _and_bar_invokes_it_.  It would
effectively amount to the same issue as example2:

sub foo {
   $a = 1;
   foo();
 _implicit_label_for_return_continuation:
   print $a;
   $b = bar();
   print $b;
}

bar() {
   if rand() < 0.5 { goto _implicit_label_for_return_continuation }
   return lucky;
}

Therefore, register allocation must allow for implicit flow of control
from *every* function call to *every* function return ... or, more
precisely, to where *every* continuation is taken, including function
return continuations.
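
In register-allocator terms, one conservative way to honor that requirement (a
sketch over assumed, simplified structures, not Parrot's actual CFG code) is to
give every call site a successor edge to every point where a continuation can
resume, so liveness naturally stretches across calls:

    #include <stdbool.h>
    #include <stdio.h>

    #define MAX_NODES 16

    /* Toy CFG: node i may branch to node j when edge[i][j] is set. */
    static bool edge[MAX_NODES][MAX_NODES];

    /* Conservatively wire every call site to every continuation resume
     * point (every post-call label), per the rule above. */
    void add_continuation_edges(const int *calls, int ncalls,
                                const int *resumes, int nresumes)
    {
        for (int i = 0; i < ncalls; i++)
            for (int j = 0; j < nresumes; j++)
                edge[calls[i]][resumes[j]] = true;
    }

    int main(void)
    {
        /* example1: node 1 = call foo(), node 3 = call bar(),
         * nodes 2 and 4 = the code after each call */
        int calls[]   = { 1, 3 };
        int resumes[] = { 2, 4 };
        add_continuation_edges(calls, 2, resumes, 2);

        /* The bar() call now has an edge back to the code after foo(),
         * so $a is live across bar() and cannot share $b's register. */
        printf("edge 3 -> 2: %s\n", edge[3][2] ? "yes" : "no");
        return 0;
    }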
-- 
Chip Salzenberg [EMAIL PROTECTED]


Re: Attack of the fifty foot register allocator vs. the undead continuation monster

2005-06-12 Thread Piers Cawley
Chip Salzenberg [EMAIL PROTECTED] writes:

 On Wed, Jun 08, 2005 at 10:26:59PM +0100, The Perl 6 Summarizer wrote:
   Loop Improvements
 Oh no! It's the register allocator problems again. One of these days I
 swear I'm going to swot up on this stuff properly, work out whether it's
 really the case that full continuations break any conceivable register
 allocator and summarize all the issues for everyone in a nice white
 paper/summary.

 It's not really that complicated.  It just takes a long time to explain.
-- Dr. Howland Owll, on SubGenius doctrine

 Consider this code:

 sub example1 {
my $a = foo();
print $a;
my $b = bar();
print $b;
 }

 It's obvious that $a and $b can be allocated the same register because
 once $b has been set, $a is never used again.

 (Please ignore that $a and $b are variables that can't be stored
 purely in registers.  The real world case that I'm illustrating deals
 with temporaries and other values that lack user-visible names.  But
 the issue is best illustrated this way.)

 Now consider this:

 sub example2 {
my $a = foo();
do {
print $a;
$b = bar();
print $b;
} while ($b);
 }

 You can see that it's now *not* OK to allocate $a and $b to the same
 register, because the flow of control can jump back to the print $a
 even after $b is assigned.

 Look at the first function again, and consider what happens if foo
 captures its return continuation _and_bar_invokes_it_.  It would
 effectively amount to the same issue as example2:

 sub foo {
$a = 1;
foo();
  _implicit_label_for_return_continuation:
print $a;
$b = bar();
print $b;
 }

 bar() {
 if rand() < 0.5 { goto _implicit_label_for_return_continuation }
return lucky;
 }

 Therefore, register allocation must allow for implicit flow of control
 from *every* function call to *every* function return ... or, more
 precisely, to where *every* continuation is taken, including function
 return continuations.

But if you follow the calling conventions, that looks like:

   sub foo {
 $a = 1.
 $c = 10;
 print $c
 
save_dollar_a_and_only_dollar_a_because_im_going_to_use_it_after_this_function_call
 foo()
_implicit_label_for_return_continuation:
 restore_dollar_a
_ooh_i_dont_have_to_save_anything
 $b = bar()
_nor_do_i_have_to_restore_anything
print $b
   }

That's what caller saves means. You only have to save everything that you're
going to care about after the function returns. You don't have to save the
world, because if it was important it's already been saved further up the call
chain.

This means, of course, that the continuation needs to save the state of the
restore stack, but I thought we already knew that.

Of course, if you're going to actually use GOTO to get to some label that you
should only get to via a continuation (as you do in the code example) then you
deserve to get everything you've got coming to you. A continuation must contain
everything needed to restore the user registers to the correct state no matter
how many times they were taken.

I really don't see how this affects register allocation; the call to bar
doesn't need to save $a because it's not referred to (lexically) after the call
returns. So what if the call to bar might take a continuation. Taking that
continuation should be exactly equivalent to returning from the call to
foo. You really have to stop thinking of continuations as gotos.


Re: Attack of the fifty foot register allocator vs. the undead continuation monster

2005-06-12 Thread Chip Salzenberg
On Sun, Jun 12, 2005 at 03:15:22PM +0100, Piers Cawley wrote:
 But if you follow the calling conventions, that looks like:
 
sub foo {
  $a = 1.
  $c = 10;
  print $c
  
 save_dollar_a_and_only_dollar_a_because_im_going_to_use_it_after_this_function_call
  foo()
 _implicit_label_for_return_continuation:
  restore_dollar_a
 _ooh_i_dont_have_to_save_anything
  $b = bar()
 _nor_do_i_have_to_restore_anything
 print $b
}

You have greatly misunderstood.  We're talking about how foo manages
its callee-saves registers.  The registers involved, the ones that I'm
calling $a and $b, are P16-P31.

 Of course, if you're going to actually use GOTO to get to some label
 that you should only get to via a continuation ...

For purposes of allocating the callee-saves registers, a continuation
may as well _be_ a goto.

Don't feel bad, though.  I thought the same thing the first time *I*
heard about this problem.
-- 
Chip Salzenberg [EMAIL PROTECTED]


Re: Attack of the fifty foot register allocator vs. the undead continuation monster

2005-06-12 Thread Piers Cawley
Chip Salzenberg [EMAIL PROTECTED] writes:

 On Sun, Jun 12, 2005 at 03:15:22PM +0100, Piers Cawley wrote:
 But if you follow the calling conventions, that looks like:
 
sub foo {
  $a = 1.
  $c = 10;
  print $c
  
 save_dollar_a_and_only_dollar_a_because_im_going_to_use_it_after_this_function_call
  foo()
 _implicit_label_for_return_continuation:
  restore_dollar_a
 _ooh_i_dont_have_to_save_anything
  $b = bar()
 _nor_do_i_have_to_restore_anything
 print $b
}

 You have greatly misunderstood.  We're talking about how foo manages
 its callee-saves registers.  The registers involved, the ones that I'm
 calling $a and $b, are P16-P31.

 Of course, if you're going to actually use GOTO to get to some label
 that you should only get to via a continuation ...

 For purposes of allocating the callee-saves registers, a continuation
 may as well _be_ a goto.

No it's not. A continuation should carry all the information required to
restore the registers to the correct state when it is taken. A goto
doesn't. For the purposes of allocating the registers in foo you can allocate
$a to P16, and $b to P16, because when the call to bar takes the continuation
back to bar, the 'restore' phase should grab $a from the continuation and bung
it back on P16. The continuation doesn't even need to know where to restore $a
to, because the 'caller restores' code should take care of that.

 Don't feel bad, though.  I thought the same thing the first time *I*
 heard about this problem.

I think you should have held that thought.


Re: Attack of the fifty foot register allocator vs. the undead continuation monster

2005-06-12 Thread Chip Salzenberg
I'd like to note for other readers and the p6i archives that 
Piers has failed to grasp the problem, so the solution seems pointless
to him.  I'm sorry that's the case, but I've already explained enough.
-- 
Chip Salzenberg [EMAIL PROTECTED]


Re: Attack of the fifty foot register allocator vs. the undead continuation monster

2005-06-12 Thread MrJoltCola

At 07:15 AM 6/12/2005, Chip Salzenberg wrote:


Therefore, register allocation must allow for implicit flow of control
from *every* function call to *every* function return ... or, more
precisely, to where *every* continuation is taken, including function
return continuations.


Yes.

But for casual readers, you might want to qualify that with:

For a given basic block...


Just a couple of thoughts to add:

1) As far as variable lifetime, the brute-force method would assume
lifetime windows (du-chains) from the first definition of each variable
to the last function call in a basic block. Horrible for optimization.

2) For languages without eval() it is possible simply to do compile
time analysis of routines and associate an AST attribute with
whether the routine captures or invokes a continuation, and for
languages with Continuation Passing Style calls, you can analyze
whether the function does anything but use Return Continuations.
This would give the register allocator enough information to properly
break down the basic blocks into correct DU-chains and allocate
with confidence.

However, given Parrot's intended target languages, I feel that (2) is
going to be rarely applicable.

-Melvin



Re: Attack of the fifty foot register allocator vs. the undead continuation monster

2005-06-12 Thread MrJoltCola

At 01:16 PM 6/12/2005, MrJoltCola wrote:

At 07:15 AM 6/12/2005, Chip Salzenberg wrote:
1) As far as variable lifetime, the brute-force method would assume
lifetime windows (du-chains) from the first definition of each variable
to the last function call in a basic block. Horrible for optimization.


I said that wrong. If a standard du-chain intersects the window between
first function call and last function call in a basic block, you probably
have to extend the du-chain to the last call in the block.
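
Concretely, that adjustment might look something like the following (a hedged
sketch with made-up structures, not the real reg_alloc.c code): if a variable's
def-use range overlaps the window between the first and last call in the block,
stretch it to cover the last call.

    #include <stdio.h>

    /* Hedged sketch of the du-chain adjustment described above: if a
     * variable's range overlaps the window between the first and last call
     * in a basic block, stretch it to cover the last call, since any of
     * those calls may resume a continuation that re-enters the block. */
    struct du_range { int def, last_use; };

    void extend_across_calls(struct du_range *r, int first_call, int last_call)
    {
        int overlaps = r->def <= last_call && r->last_use >= first_call;
        if (overlaps && r->last_use < last_call)
            r->last_use = last_call;
    }

    int main(void)
    {
        struct du_range a = { 2, 5 };    /* defined at 2, last used at 5 */
        extend_across_calls(&a, 4, 9);   /* calls at positions 4 and 9 */
        printf("range is now [%d, %d]\n", a.def, a.last_use);   /* [2, 9] */
        return 0;
    }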

-Melvin



Re: Attack of the fifty foot register allocator vs. the undead continuation monster

2005-06-12 Thread Matt Fowles
Chip~

On 6/12/05, Chip Salzenberg [EMAIL PROTECTED] wrote:
 I'd like to note for other readers and the p6i archives that
 Piers has failed to grasp the problem, so the solution seems pointless
 to him.  I'm sorry that's the case, but I've already explained enough.

This response worries me firstly because of its rudeness and second
because of the problem itself.  As I see it there are four
possibilities:

1) Chip is right, Piers is wrong.  This is a complex problem and
refusing to explain it means that others will doubtless also
misunderstand it, which you have a chance to preempt here.

2) Chip is wrong, Piers is right.  This is a complex problem and
refusing discussion on it would be a costly mistake.

3) Chip is right, Piers is right. The two of you are working from
a different base set of definitions/axioms or misunderstood each other
in some other way.

4) Chip is wrong, Piers is wrong.  Shutting down open conversation so
forcefully and caustically will prevent discussion in the future and
this problem will continue to haunt parrot as no viable solution has
been seen.

Regardless of which of these possibilities is true, I see a need for
more discussion of this issue.  Preferably a discussion that does not
degrade into backhanded insults.  I have my own ideas about this
problem, but I will save that for another response.

Matt
-- 
Computer Science is merely the post-Turing Decline of Formal Systems Theory.
-???


Re: Attack of the fifty foot register allocator vs. the undead continuation monster

2005-06-12 Thread Piers Cawley
Matt Fowles [EMAIL PROTECTED] writes:

 Chip~

 On 6/12/05, Chip Salzenberg [EMAIL PROTECTED] wrote:
 I'd like to note for other readers and the p6i archives that
 Piers has failed to grasp the problem, so the solution seems pointless
 to him.  I'm sorry that's the case, but I've already explained enough.

 This response worries me firstly because of its rudeness and second
 because of the problem itself.  As I see it there are four
 possibilities:

 1) Chip is right, Piers is wrong.  This is a complex problem and
 refusing to explain it means that others will doubtless also
 misunderstand it, which you have a chance to preempt here.

 2) Chip is wrong, Piers is right.  This is a complex problem and
 refusing discussion on it would be a costly mistake.

 3) Chip is right, Piers is right. The two of you are working from
 a different base set of definitions/axioms or misunderstood each other
 in some other way.

 4) Chip is wrong, Piers is wrong.  Shutting down open conversation so
 forcefully and caustically will prevent discussion in the future and
 this problem will continue to haunt parrot as no viable solution has
 been seen.

 Regardless of which of these possibilities is true, I see a need for
 more discussion of this issue.  Preferably a discussion that does not
 degrade into backhanded insults.  I have my own ideas about this
 problem, but I will save that for another response.

Don't worry Matt, we're still talking. It takes more than sarcasm to stop me.


Re: Attack of the fifty foot register allocator vs. the undead continuation monster

2005-06-12 Thread Curtis Rawls
On 6/12/05, Piers Cawley [EMAIL PROTECTED] wrote:
 Chip Salzenberg [EMAIL PROTECTED] writes:
 
  On Sun, Jun 12, 2005 at 03:15:22PM +0100, Piers Cawley wrote:
  But if you follow the calling conventions, that looks like:
 
 sub foo {
   $a = 1.
   $c = 10;
   print $c
   
  save_dollar_a_and_only_dollar_a_because_im_going_to_use_it_after_this_function_call
   foo()
  _implicit_label_for_return_continuation:
   restore_dollar_a
  _ooh_i_dont_have_to_save_anything
   $b = bar()
  _nor_do_i_have_to_restore_anything
  print $b
 }
 
  You have greatly misunderstood.  We're talking about how foo manages
  its callee-saves registers.  The registers involved, the ones that I'm
  calling $a and $b, are P16-P31.
 
  Of course, if you're going to actually use GOTO to get to some label
  that you should only get to via a continuation ...
 
  For purposes of allocating the callee-saves registers, a continuation
  may as well _be_ a goto.
 
 No it's not. A continuation should carry all the information required to
 restore the registers to the correct state when it is taken. A goto
 doesn't. For the purposes of allocating the registers in foo you can allocate
 $a to P16, and $b to P16, because when the call to bar takes the continuation
 back to bar, the 'restore' phase should grab $a from the continuation and bung
 it back on P16. The continuation doesn't even need to know where to restore $a
 to, because the 'caller restores' code should take care of that.
 
  Don't feel bad, though.  I thought the same thing the first time *I*
  heard about this problem.
 
 I think you should have held that thought.
 

I also do not see the wisdom of reducing continuations to just a GOTO.

Continuations should have all the abilities of a function, with the
additional abilities of returning its state, in the form of its
activation record, and being reinvoked with said activation record. 
The activation record stores all info about the state of the function
(saved registers, calling function, PC, etc).  Maybe think of it as a
snapshot of the function's state.

So, in the example, bar() reinvokes the continuation, but this all
takes place in a new call to foo(), using the returned activation
record.  The registers and PC return to the continuation's previous
state.  So, just like a recursive function, you are running the same
function in a different context, and the registers do not overlap.

Callee-save registers must be saved after the function gains control,
and restored before it passes off control.  So anywhere a function
returns a continuation, it must restore the callee-save registers
before and save them after.
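
As a rough mental model only (Parrot's real context structure differs), the
snapshot view above amounts to something like:

    #include <stddef.h>

    /* Rough illustration, not Parrot's real context structure: a
     * continuation is little more than a handle on a frozen activation
     * record. */
    struct activation_record {
        long   int_regs[32];               /* I registers at capture time */
        double num_regs[32];               /* N registers */
        void  *str_regs[32];               /* S registers */
        void  *pmc_regs[32];               /* P registers */
        size_t pc;                         /* where to resume */
        struct activation_record *caller;  /* dynamic chain */
    };

    struct continuation {
        struct activation_record *frame;   /* invoking it resumes here */
    };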

It might also be helpful to take a look at other systems that also
implement continuations:
 -Stackless Python (http://www.stackless.com/spcpaper.htm)
 -Standard ML (http://www.smlnj.org/doc/features.html)
 -Formalizing Implementation Strategies
(http://citeseer.ist.psu.edu/danvy00formalizing.html)
 -Others (http://c2.com/cgi/wiki?ContinuationImplementation)

-Curtis Rawls


Re: Attack of the fifty foot register allocator vs. the undead continuation monster

2005-06-12 Thread Chip Salzenberg
On Sun, Jun 12, 2005 at 03:40:55PM -0600, Curtis Rawls wrote:
 I also do not see the wisdom of reducing continuations to just a GOTO.

It really isn't just a goto, at least, not a hardware-CPU-style goto.
By virtue of Parrot's registers living in memory attached to the
activation record, switching control to a saved activation record
automatically restores all the registers.  Automagically, even.

I've talked this over extensively with Piers (on IRC), Autrijus, and
Leo, and I'm now convinced that switching to a saved activation
record, along with the register contents still in it, is enough.

Granted, the register allocator[*] will still need to learn that it
can't use any given register for incompatible purposes across
subroutine call boundaries (or across labels).  But given the new
larger cheeze-its^Wregister frame, that's no problem.

 Continuations should have all the abilities of a function, with the
 additional abilities of returning its state, in the form of its
 activation record, and being reinvoked with said activation record. 
 The activation record stores all info about the state of the function
 (saved registers, calling function, PC, etc).  Maybe think of it as a
 snapshot of the function's state.

Quite.  My glorified goto description neglected a key difference
between the Parrot VM and a real CPU: a goto in a real CPU does not
restore registers, while switching activation records in Parrot does.

[*] the PIR register allocator, that is, which may under certain
circumstances realize that $P30 and $P35 have disjoint lifetimes
and are never used across a function call, and therefore can be
assigned to the same actual P register (e.g. P20)
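
The test in that footnote reduces to something like the following assumed,
simplified sketch (not the IMCC implementation): two symbolic registers may
share one real register only when their live ranges are disjoint and neither
is live across a call or label.

    #include <stdbool.h>
    #include <stdio.h>

    /* Sketch of the coalescing test from the footnote: two symbolic
     * registers may share one real register only if their live ranges do
     * not overlap and neither is live across a call or label. */
    struct lifetime { int start, end; bool live_across_call; };

    bool can_share(const struct lifetime *a, const struct lifetime *b)
    {
        bool disjoint = a->end < b->start || b->end < a->start;
        return disjoint && !a->live_across_call && !b->live_across_call;
    }

    int main(void)
    {
        struct lifetime p30 = { 0, 3, false };
        struct lifetime p35 = { 5, 8, false };
        printf("$P30 and $P35 may share a register: %s\n",
               can_share(&p30, &p35) ? "yes" : "no");
        return 0;
    }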
-- 
Chip Salzenberg [EMAIL PROTECTED]


Re: Attack of the fifty foot register allocator vs. the undead continuation monster

2005-06-12 Thread Bill Coffman
On 6/12/05, Curtis Rawls [EMAIL PROTECTED] wrote:
 
 [snip]
 It might also be helpful to take a look at other systems that also
 implement continuations:
 -Stackless Python (http://www.stackless.com/spcpaper.htm)
 -Standard ML (http://www.smlnj.org/doc/features.html)
 -Formalizing Implementation Strategies
 (http://citeseer.ist.psu.edu/danvy00formalizing.html)
 -Others (http://c2.com/cgi/wiki?ContinuationImplementation)
 
 -Curtis Rawls
 

I found this link helpful when trying to understand continuations. The code 
they use is not secure. It is basically using buffer overflow attacks as a 
programming technique? Well, you'll have to read to understand what I mean.

http://homepage.mac.com/sigfpe/Computing/continuations.html

The part that I understand about continuations, is that they wreak havoc in 
the control flow graph, as Chip initially said:
 Therefore, register allocation must allow for implicit flow of control
 from *every* function call to *every* function return ... or, more
 precisely, to where *every* continuation is taken, including function
 return continuations.

Although I think that is actually sugar coating it a bit. Continuations can 
be taken from within any sub, and possibly even when appending to a list, if 
you're using lazy list eval.

This, I found out when my changes to reg_alloc.c broke the continuations 
tests. The changes were not the problem, as Leo confirmed; it was the way 
the allocator and the continuations interacted. 

There are also some issues about changing variables through continuations, 
like you can only change PMC integers, which point to some fixed location, 
containing the int, i.e. you can only use references. These parts make less 
sense to me, but the issue is basically, some registers are restored, and 
some are kept safe? Well, that issue hasn't been quite resolved either.

If I thought the register allocator had a chance of working, I would 
probably find the time to start working on it again. But probably there are 
plenty of people who would be happy to do so, if this issue could be 
resolved.

-- 
-Bill Coffman


Re: calling conventions, tracebacks, and register allocator

2004-11-09 Thread Jeff Clites
On Nov 8, 2004, at 11:15 AM, Matt Fowles wrote:
Dan~
On Mon, 8 Nov 2004 13:45:08 -0500, Dan Sugalski [EMAIL PROTECTED] wrote:
The calling conventions and code surrounding them will *not* change
now. When all the sub stuff, and the things that depend on it, are
fully specified and implemented... *then* we can consider changes.
Until then, things stand the way they are.
I misunderstood.  I thought you were saying that what is currently in
is final and will *never* be changed.  Thanks for the clarification.
Nevertheless, this is a legitimate topic for discussion, and the issues 
are fresh in people's minds. That's independent of any impediments that 
might block implementing changes at the current time.

Jeff


Re: calling conventions, tracebacks, and register allocator

2004-11-09 Thread Miroslav Silovic
[EMAIL PROTECTED] wrote:
On Nov 8, 2004, at 11:15 AM, Matt Fowles wrote:
Dan~
On Mon, 8 Nov 2004 13:45:08 -0500, Dan Sugalski [EMAIL PROTECTED] wrote:
The calling conventions and code surrounding them will *not* change
now. When all the sub stuff, and the things that depend on it, are
fully specified and implemented... *then* we can consider changes.
Until then, things stand the way they are.

I misunderstood.  I thought you were saying that what is currently in
is final and will *never* be changed.  Thanks for the clarification.

Nevertheless, this is a legitimate topic for discussion, and the 
issues are fresh in people's minds. That's independent of any 
impediments that might block implementing changes at the current time.

I think it'd be a good idea to at least agree on a good TODO list, and 
commit that to the bugtracker. Because it may turn out that some changes 
are fine to delay, while some must be implemented now or never (because, 
for example, a large ammount of compiler code will get broken because of 
them).

   Miro


Re: calling conventions, tracebacks, and register allocator

2004-11-09 Thread Dan Sugalski
At 12:58 AM -0800 11/9/04, Jeff Clites wrote:
On Nov 8, 2004, at 11:15 AM, Matt Fowles wrote:
Dan~
On Mon, 8 Nov 2004 13:45:08 -0500, Dan Sugalski [EMAIL PROTECTED] wrote:
The calling conventions and code surrounding them will *not* change
now. When all the sub stuff, and the things that depend on it, are
fully specified and implemented... *then* we can consider changes.
Until then, things stand the way they are.
I misunderstood.  I thought you were saying that what is currently in
is final and will *never* be changed.  Thanks for the clarification.
Nevertheless, this is a legitimate topic for discussion, and the 
issues are fresh in people's minds. That's independent of any 
impediments that might block implementing changes at the current 
time.
While I won't stop any discussion, neither will I pay any attention 
to it, so I'm not sure there's much point to it.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: calling conventions, tracebacks, and register allocator

2004-11-08 Thread Dan Sugalski
At 1:11 PM +0100 11/6/04, Leopold Toetsch wrote:
[calling convention change snippage]
I've already said no changes to the calling conventions, quite a 
while ago. I don't see inconvenience in the register allocation code 
as a reason to change it. Got a better reason?
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: calling conventions, tracebacks, and register allocator

2004-11-08 Thread Leopold Toetsch
Dan Sugalski [EMAIL PROTECTED] wrote:
 At 1:11 PM +0100 11/6/04, Leopold Toetsch wrote:

 [calling convention change snippage]

 ... Got a better reason?

And there is of course:

4) invoke's (and friends') register usage is asymmetrical and ugly. It's
like defining:

   set 5  # set I0, 5

Ad 1) again a note: Imcc's instructions have *r[16] attached, a NULL-
terminated list of used registers/operands. This list doesn't have the
length of the opcode's argument list because of opcodes like invoke. The
argument count of a plain C<invoke> is 0, but in r[0] a reference to P0
is created to track the implicit usage of P0.

This all complicates code generation for no good.
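
Schematically, and with purely illustrative declarations (the real definitions
live in imcc's headers), the shape being described is roughly:

    #include <stddef.h>

    /* Simplified, illustrative shape only; the real definitions are in
     * compilers/imcc.  Each instruction carries a NULL-terminated list of
     * the symbolic registers it touches, and that list can be longer than
     * the opcode's visible argument list. */
    struct symreg;   /* a symbolic register, as in the SymReg hash */

    struct instruction {
        const char    *opname;
        int            n_args;   /* operands visible in the PIR source */
        struct symreg *r[16];    /* registers actually used, NULL-terminated */
    };

    /* A bare "invoke" has zero visible arguments, yet r[0] must still record
     * the implicit use of P0 so the allocator keeps it live. */
    void note_bare_invoke(struct instruction *ins, struct symreg *p0)
    {
        ins->opname = "invoke";
        ins->n_args = 0;     /* nothing written in the source ... */
        ins->r[0]   = p0;    /* ... but P0 is used anyway */
        ins->r[1]   = NULL;
    }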

These are really reasons enough.

leo


Re: calling conventions, tracebacks, and register allocator

2004-11-08 Thread Dan Sugalski
At 7:17 PM +0100 11/8/04, Leopold Toetsch wrote:
Dan Sugalski [EMAIL PROTECTED] wrote:
 At 1:11 PM +0100 11/6/04, Leopold Toetsch wrote:

 [calling convention change snippage]

 ... Got a better reason?
And there is of course:
4) invoke's (and friends') register usage is asymmetrical and ugly.
[snip]
Ad 1) again a note: Imcc's instructions have *r[16] attached, a NULL
terminated list of used registers/operands. This list doesn't have the
length of the opcode's arguments because of such opcodes like invoke. The
argument count of a plain C<invoke> is 0, but in r[0] a reference to P0
is created to track the implicit usage of P0.
Okay, aesthetics and making up for a flaw in the implementation of 
how IMCC tracks opcodes and registers.

Neither of those are sufficient, individually or together.
--
Dan
--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: calling conventions, tracebacks, and register allocator

2004-11-08 Thread Matt Fowles
Dan~

On Mon, 8 Nov 2004 13:23:36 -0500, Dan Sugalski [EMAIL PROTECTED] wrote:
 Okay, aesthetics and making up for a flaw in the implementation of
 how IMCC tracks opcodes and registers.
 
 Neither of those are sufficient, individually or together.

It feels to me like you are dismissing Leo's arguments a little too
lightly here.  I find his logic fairly convincing...

Matt
-- 
Computer Science is merely the post-Turing Decline of Formal Systems Theory.
-???


Re: calling conventions, tracebacks, and register allocator

2004-11-08 Thread Leopold Toetsch
Dan Sugalski [EMAIL PROTECTED] wrote:
 At 1:11 PM +0100 11/6/04, Leopold Toetsch wrote:

 [calling convention change snippage]

 I've already said no changes to the calling conventions, quite a
 while ago.

This doesn't really change the calling conventions; it changes call opcodes.
It makes register usage explicit.

 ... I don't see inconvenience in the register allocation code
 as a reason to change it. Got a better reason?

1) please grep -w invoke in imcc. There is quite an amount of code that
tracks register usage of this opcode.

2) I've outlined that the usage of especially P0 - P2 blocks registers
because we have to allocate them as non-volatiles too. This reduces
the usable register count by 3 for methods.

Current produced code looks like this:

   set P16, P1  # preserve our return continuation
   set P17, P2  # preserve self
   set S0, meth   # this method unavailable
   set P2, obj
   callmethodcc  # call the meth  P0 gets overwritten
   set P1, P16   # restore P1
   set P2, P17   # restore self

3) We have these variables in the context. Having them in registers too
just duplicates argument copying. It's not needed.

The return continuation works similarly to the link register on PPC. There
are 2 opcodes to access this register. You can use it for branches...
Why should we have it additionally in P1, where it clutters the caller's
usage of P1 to be able to return?

leo


Re: calling conventions, tracebacks, and register allocator

2004-11-08 Thread Dan Sugalski
At 1:38 PM -0500 11/8/04, Matt Fowles wrote:
Dan~
On Mon, 8 Nov 2004 13:23:36 -0500, Dan Sugalski [EMAIL PROTECTED] wrote:
 Okay, aesthetics and making up for a flaw in the implementation of
 how IMCC tracks opcodes and registers.
 Neither of those are sufficient, individually or together.
It feels to me like you are dismissing Leo's arguments a little too
lightly here.  I find his logic fairly convincing...
No, I'm not dismissing this stuff lightly. Leo's point to me earlier 
is dead-on correct -- screwing around with the design before 
everything is specified and we have a working implementation is 
premature optimization. And screwing around with stuff that works 
when there's stuff that doesn't is misdirected effort.

The calling conventions and code surrounding them will *not* change 
now. When all the sub stuff, and the things that depend on it, are 
fully specified and implemented... *then* we can consider changes. 
Until then, things stand the way they are.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: calling conventions, tracebacks, and register allocator

2004-11-08 Thread Matt Fowles
Dan~

On Mon, 8 Nov 2004 13:45:08 -0500, Dan Sugalski [EMAIL PROTECTED] wrote:
 The calling conventions and code surrounding them will *not* change
 now. When all the sub stuff, and the things that depend on it, are
 fully specified and implemented... *then* we can consider changes.
 Until then, things stand the way they are.

I misunderstood.  I thought you were saying that what is currently in
is final and will *never* be changed.  Thanks for the clarification.

Matt
-- 
Computer Science is merely the post-Turing Decline of Formal Systems Theory.
-???


Re: calling conventions, tracebacks, and register allocator

2004-11-08 Thread Dan Sugalski
At 2:15 PM -0500 11/8/04, Matt Fowles wrote:
Dan~
On Mon, 8 Nov 2004 13:45:08 -0500, Dan Sugalski [EMAIL PROTECTED] wrote:
 The calling conventions and code surrounding them will *not* change
 now. When all the sub stuff, and the things that depend on it, are
 fully specified and implemented... *then* we can consider changes.
 Until then, things stand the way they are.
I misunderstood.  I thought you were saying that what is currently in
is final and will *never* be changed.  Thanks for the clarification.
Ah, OK. I was unclear, then. Sorry for the confusion.
--
Dan
--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: calling conventions, tracebacks, and register allocator

2004-11-08 Thread Leopold Toetsch
Dan Sugalski [EMAIL PROTECTED] wrote:

 Okay, aesthetics and making up for a flaw in the implementation of
 how IMCC tracks opcodes and registers.

That flaw is caused by the asymmetry of opcodes, or by indirect
register usage of opcodes like bare C<invoke>.
But as that shall not be fixed now, let's keep all these ugly hacks.

 Neither of those are sufficient, individually or together.

Your code is spilling ... ;)

leo


Re: calling conventions, tracebacks, and register allocator

2004-11-08 Thread Dan Sugalski
At 9:39 PM +0100 11/8/04, Leopold Toetsch wrote:
Dan Sugalski [EMAIL PROTECTED] wrote:
 Okay, aesthetics and making up for a flaw in the implementation of
 how IMCC tracks opcodes and registers.
That flaw is caused by the asymmetry of opcodes, or by indirect
register usage of opcodes like bare C<invoke>.
But as that shall not be fixed now, let's keep all these ugly hacks.
It's so nice to see that any design element you don't approve of is 
referred to in such glowing terms...

Regardless, now that this matter's closed it's time to get other 
stuff going. Tail calls, for one.

  Neither of those are sufficient, individually or together.
Your code is spilling ... ;)
Well, duh. I've got a single sub with over a meg of source and 20k+ 
explicit temps. Of *course* it's going to make the register coloring 
algorithm scream. While the coloring and spilling code needs help, 
feeding it naively generated code is part of the problem.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


calling conventions, tracebacks, and register allocator

2004-11-06 Thread Leopold Toetsch
We now have dedicated PMC* pointers in the context that hold 
current_cont, current_sub, and current_object. This is necessary to 
create traceback information. But subroutine and return opcodes are not 
adapted yet.

We have e.g.:
   invoke # implicitly uses P0 and P1 for return
  # could also be a method call
   invoke P1  # likely a return
The implicit usage of P0 in function and method calls is a PITA for the 
register allocator; all this implicit usage causes special handling in 
code, so that P0 gets properly tracked.

Additionally, P0 and especially P1 are first created in the caller's 
registers. This implies that the old contents of P1 have to be preserved 
by copying to another register, which increases pressure on the register 
allocator. The same is done for P2 in method calls. This scheme just 
reduces the usable register count by 3 in methods and by 2 in functions.

To streamline that and take advantage of the current_* pointers in the 
context, I'd like to adapt call/return ops slightly:

  invoke PSub, PRet   # replaces implicit P0, P1 handling
  invokecc PSub   # replaces implicit invokecc [P0]
  callmethod Pobj, Smeth, Pret  # analog, no implicit S0
  callmethodcc Pobj, Smeth
  ret_cc  # replaces invoke P1
This gets rid of all hidden register semantics of these opcodes and 
makes it explicit for the register allocator (and when reading the 
code), which registers are used. As a minor improvement it also avoids 
the set P0, PSub and such opcodes, to move the needed registers in place.

In the called subroutine P2 remains self. But P0 and P1 (current sub 
and continuation) are available only on demand, currently with the 
interpinfo opcode. Maybe two opcodes for that are in order too:

   get_sub Px   # simplifies coroutines
   get_cc  Px   # simplifies call/cc
leo


Re: [COMMIT] Register allocator for the JIT

2002-08-09 Thread Daniel Grunblatt


On Wed, 7 Aug 2002, Nicholas Clark wrote:

   I'd not thought of this. You're right, but I think the trade off is the
   cumulative time you save each time the JITted routine is called, versus
   the time you took to track usage. I guess inside frequently called functions
   it could become worthwhile.
 
  The JITted routine is called only once, right?

 Er, yes, D'oh. Good point. I was thinking in terms of the JIT being run
 per subroutine, but actually it's run per block of bytecode.

 Except that

 1: If I understand the plan, loading any sort of perl module (use Foo.pm;)
could cause parrot to load pre-compiled bytecode for that module, rather
than the module. In which case, parrot is going to have multiple
discontinuous blocks of bytecode in memory, with calls between them.
So I assume that the JIT will be JITting the perl script first, then as
and when each module is loaded, that gets JITted.

 2: BEGIN blocks.
(As in, they need to run as soon as the closing } is seen, which means that
 certainly bytecode, if not JITted bytecode, is going to get run in chunks
 in a script that uses a BEGIN block)

 3: regexps are going to compile to bytecode, and I was assuming that in turn
that bytecode was going to get JITted, and called into once per regexp
match
Right.


 4: We may want to create fat bytecode files, with pre-JITted (ie compiled)
code next to the bytecode for CPU(s) of the machine the file is mounted to.

That's an idea, we can also make native executables.

   It would be useful if the bytecode had a way for the bytecode generator to
   give a hint about how hard any sort of runtime optimiser should try to
   optimise a function. So we'd only do this sort of thing for (say) repeatedly
   run regexps.
 
  In some time we can try to do some profiling on the fly to decide where to
  optimise.

 If I follow the recent work on imcc correctly, then it is already trying to
 determine where loops are to avoid spilling registers inside loops (by
 spilling them outside. Is this the waterbed theory of graph colouring?)
 If so, imcc has already done some work about where slightly more JIT effort
 might pay off, so it seems a shame to throw that out.

Yes, that's why we are not going to do any kind of optimization that is
not required to be done at runtime; that is, we expect the most optimal
bytecode.

Daniel Grunblatt.