Re: IMCC temp optimizations...
Dan Sugalski wrote:
> I think it may be a handy thing if someone'd go through and draw up a
> set of rules for the use of temps, and things that'll cause the
> register coloring algorithm to go mad. (I'd like to avoid 30 minute
> compile sessions--it's a tad tedious :)

Are you still sticking everything in one big _main?

leo
Re: IMCC temp optimizations...
Dan Sugalski wrote:
> ... (I'd like to avoid 30 minute compile sessions--it's a tad tedious :)

Should be faster now by some factor.

How many symbols are in the biggest compilation unit (the "registers in .imc" line from parrot -v)?

leo
Re: IMCC temp optimizations...
At 1:22 PM +0200 4/22/04, Leopold Toetsch wrote:
> Dan Sugalski wrote:
>> ... (I'd like to avoid 30 minute compile sessions--it's a tad tedious :)
>
> Should be faster now by some factor.

Cool, thanks. I've an optimized build of parrot going now, and I'll see what things look like when it's done.

> How many symbols are in the biggest compilation unit (the "registers in .imc" line from parrot -v)?

Dunno what parrot thinks--it's not done yet. grep says 569 .locals and 473 temp PMC registers ($Px). I think that can reasonably be considered A Lot.

I'm rejigging the compiler to cut down on the number of .local declarations, but that'll increase the temp PMC usage, at least with the relatively simple temp system I've got now.

(I can throw dummy labels in to create fake basic blocks if that'll help the register coloring code)

--
Dan

--it's like this---
Dan Sugalski    even samurai
                have teddy bears and even
                teddy bears get drunk
Re: IMCC temp optimizations...
At 7:55 AM +0200 4/22/04, Leopold Toetsch wrote:
> Dan Sugalski wrote:
>> I think it may be a handy thing if someone'd go through and draw up a
>> set of rules for the use of temps, and things that'll cause the
>> register coloring algorithm to go mad. (I'd like to avoid 30 minute
>> compile sessions--it's a tad tedious :)
>
> Are you still sticking everything in one big _main?

Except for the library code, yup. No way around that.

--
Dan
Re: IMCC temp optimizations...
Dan Sugalski wrote:
> As I sit here and wait for parrot to churn on the output from compiling
> a relatively small program

I've put in another change: a speedup of roughly a factor of 2.5 for a unit with 2000 temps.

leo
Re: IMCC temp optimizations...
Dan Sugalski wrote:
> Dunno what parrot thinks--it's not done yet. grep says 569 .locals and
> 473 temp PMC registers.

I've now enabled some more code that speeds up temp allocation further (a factor of ~2.5 for 2000 temps in a unit). It requires that each temp's usage range is limited to 10 lines. If the code from a compiler looks like the output of the program below, this works fine.

> ... I'm rejigging the compiler to cut down on the number of .local
> declarations, but that'll increase the temp pmc usage, at least with
> the relatively simple temp system I've got now.

Temps are fine. .locals that range from the top of the program (not counting the declaration) all the way down do hurt. Many small, short-ranged temps are much better than long-ranged vars, because long-ranged vars interfere with many more other symbols.

> (I can throw dummy labels in to create fake basic blocks if that'll
> help the register coloring code)

That makes it worse. More blocks, more life ranges to compare.

If you can emit code similar to gen.pl's output, it could take advantage of my last change.

leo

$ cat gen.pl
#!/usr/bin/perl
use strict;
# a = 1 + 1 + 1 ... + 1
my ($i, $t, $N);
$N = $ARGV[0] || 299;
# NOTE: the list archive mangled the text after "_main"; a bare .sub line is assumed
print ".sub _main\n";
$t++;
print "\t\$P$t = new PerlInt\n";
print "\t\$P$t = 1\n";
for $i (0..$N) {
    $t++;
    print "\t\$P$t = new PerlInt\n";
    print "\t\$P$t = 1\n";
    $t++;
    print "\t\$P$t = new PerlInt\n";
    # NOTE: the archive mangled both operands; summing the two previous
    # temps is assumed, which matches the "a = 1 + 1 + ..." comment above
    print "\t\$P$t = \$P", $t - 2, " + \$P", $t - 1, "\n";
}
print "\tprint \$P$t\n";
print "\tprint \"\\n\"\n";
print ".end\n";
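Leo's point about short-ranged temps versus long-ranged .locals can be illustrated with a toy model. The sketch below is not imcc's algorithm; it only assumes that a symbol's life range is a (first line, last line) pair and that two symbols interfere when their ranges overlap. The function names and the example sizes are made up for illustration.

```python
from itertools import combinations


def short_temps(n, span=3):
    """n temps, each defined at line 2*i and last used span-1 lines later."""
    return [(2 * i, 2 * i + span - 1) for i in range(n)]


def long_locals(n, program_length):
    """n locals, each first assigned near the top and last used near the bottom."""
    return [(i, program_length - 1 - i) for i in range(n)]


def count_interferences(ranges):
    """Count pairs of symbols whose (inclusive) life ranges overlap."""
    return sum(
        1
        for (a_start, a_end), (b_start, b_end) in combinations(ranges, 2)
        if a_start <= b_end and b_start <= a_end
    )


if __name__ == "__main__":
    # 100 short temps only ever overlap their immediate neighbour.
    print("short temps:", count_interferences(short_temps(100)))
    # 100 long-ranged locals all overlap each other.
    print("long locals:", count_interferences(long_locals(100, 600)))
```

In this model the number of interfering pairs grows linearly with short-ranged temps but quadratically with long-ranged locals, which is the shape of the problem described above.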
Re: IMCC temp optimizations...
A good register allocation algorithm will always have problems with big subs; there is nothing that we can do about it. I think that what real compilers do to solve this problem is implement two different algorithms: one for normal subs, which tries to generate optimal code, and a naive one for very large subs with many virtual registers. That makes compilation much faster, and the execution penalty doesn't hurt too much.

Actually, it's (for me) an open question whether the good register allocation should be the default one. Perl (and python and..) users expect blazing compilation times, so maybe we should reserve it for higher -O levels. But then, we won't know how bad our compilation times are until there are real programs written in perl6/parrot.

-angel

Leopold Toetsch wrote:
> Dan Sugalski wrote:
>> Dunno what parrot thinks--it's not done yet. grep says 569 .locals and
>> 473 temp PMC registers.
>
> I've now enabled some more code that speeds up temp allocation more
> (~2.5 for 2000 temps in a unit). This needs that the usage range is
> limited to 10 lines. If the code from a compiler looks like the output
> from the program below, this works fine.
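A minimal sketch of the two-algorithm idea, under stated assumptions: angel does not name either algorithm, so a greedy colouring of the full interference graph stands in for the "good" allocator and a simplified linear scan stands in for the "naive" one. The `big_unit` threshold, the function names, and the (start, end) range representation are all hypothetical, and none of this reflects how imcc is implemented.

```python
def overlaps(a, b):
    """True when two inclusive (start, end) life ranges share a line."""
    return a[0] <= b[1] and b[0] <= a[1]


def greedy_color(ranges, num_regs):
    """Slower path: build the whole interference graph, then colour it."""
    n = len(ranges)
    neighbours = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):          # O(n^2) pairwise comparison
            if overlaps(ranges[i], ranges[j]):
                neighbours[i].add(j)
                neighbours[j].add(i)
    assignment = {}
    # Colour the most constrained symbols first.
    for i in sorted(range(n), key=lambda k: -len(neighbours[k])):
        taken = {assignment[j] for j in neighbours[i]
                 if assignment.get(j) is not None}
        free = [r for r in range(num_regs) if r not in taken]
        assignment[i] = free[0] if free else None   # None marks a spill
    return assignment


def linear_scan(ranges, num_regs):
    """Fast path: a single pass over the ranges sorted by start line."""
    assignment = {}
    active = []                    # (end, index) of ranges holding a register
    free = list(range(num_regs))
    for i in sorted(range(len(ranges)), key=lambda k: ranges[k][0]):
        start, end = ranges[i]
        still_active = []
        for e, j in active:
            if e < start:
                free.append(assignment[j])   # range ended, release register
            else:
                still_active.append((e, j))
        active = still_active
        if free:
            assignment[i] = free.pop()
            active.append((end, i))
        else:
            assignment[i] = None             # spill the newcomer
    return assignment


def allocate(ranges, num_regs=32, big_unit=500):
    """Pick the cheap allocator for very large units, the careful one otherwise."""
    if len(ranges) > big_unit:
        return linear_scan(ranges, num_regs)
    return greedy_color(ranges, num_regs)
```

The trade-off is the one described above: the fast path may spill more than necessary, but it avoids the pairwise comparison that dominates on very large units.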
Re: IMCC temp optimizations...
At 4:03 PM +0200 4/22/04, Leopold Toetsch wrote:
> Dan Sugalski wrote:
>> Dunno what parrot thinks--it's not done yet. grep says 569 .locals and
>> 473 temp PMC registers.
>
> I've now enabled some more code that speeds up temp allocation more
> (~2.5 for 2000 temps in a unit). This needs that the usage range is
> limited to 10 lines. If the code from a compiler looks like the output
> from the program below, this works fine.

This sped it up a lot. The output is:

Starting parse...
sub _MAIN:
registers in .imc: I34, N0, S7, P1014
0 labels, 0 lines deleted, 0 if_branch, 0 branch_branch
0 used once deleted
0 invariants_moved
registers needed:I43, N0, S12, P3327
registers in .pasm: I25, N0, S20, P32 - 464 spilled
2007 basic_blocks, 2079 edges
13691 lines compiled.
Writing sample.pbc
packed code 348208 bytes
sample.pbc written.

Which actually came out in reasonable time, rather than me giving up after 45 minutes. :) Still takes ages, so I've a lot of work to do here.

>> ... I'm rejigging the compiler to cut down on the number of .local
>> declarations, but that'll increase the temp pmc usage, at least with
>> the relatively simple temp system I've got now.
>
> Temps are fine. .locals ranging from top of the program (not counting
> the declaration) down do hurt. Many small short ranged temps are much
> better then long ranged vars because they have more interferences on
> each other.

Hrm. Does the code currently consider something like:

$P0 = foo

to start a new lifetime for $P0?

--
Dan
Re: IMCC temp optimizations...
Dan Sugalski wrote:
> registers needed:I43, N0, S12, P3327
> registers in .pasm: I25, N0, S20, P32 - 464 spilled
> 2007 basic_blocks, 2079 edges

Ouch. Register allocation is spending a huge amount of time on spilling. Something is definitely wrong with your code - wrong in the sense of: it doesn't play well with what imcc expects ;)

You must have too many pseudo-globals in that unit, spanning a huge range and interfering with one another. Do you use lexical vars or Parrot globals?

> Hrm. Does the code currently consider something like:
>
> $P0 = foo
>
> to start a new lifetime for $P0?

If you don't use $P0 beyond that point, yes. Do you name all temps $P0 or some such? Or are you giving them unique names? You should do the latter.

leo
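One way a compiler front end might follow the "unique names" advice is a simple counter-based namer, sketched below. The class and function names are hypothetical; the only things taken from the thread are the `$Pn` temp naming and the `new PerlInt` / assignment lines that gen.pl emits.

```python
import itertools


class TempNamer:
    """Hands out a fresh $Pn name on every request, never reusing one."""

    def __init__(self, prefix="$P"):
        self._prefix = prefix
        self._counter = itertools.count(1)

    def fresh(self):
        return f"{self._prefix}{next(self._counter)}"


def emit_constant(namer, value):
    """Emit code for an integer constant into its own, uniquely named temp."""
    temp = namer.fresh()
    return temp, [f"{temp} = new PerlInt", f"{temp} = {value}"]


if __name__ == "__main__":
    namer = TempNamer()
    for value in (1, 2):
        _, lines = emit_constant(namer, value)
        print("\n".join(lines))
```

Each constant gets its own short-lived temp instead of every constant sharing `$P0`, so no single name ends up with a life range stretching across the whole unit.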
Re: IMCC temp optimizations...
At 6:04 PM +0200 4/22/04, Leopold Toetsch wrote:
> Dan Sugalski wrote:
>> registers needed:I43, N0, S12, P3327
>> registers in .pasm: I25, N0, S20, P32 - 464 spilled
>> 2007 basic_blocks, 2079 edges
>
> Ouch. Register allocation is spending huge times during spilling.
> Something is definitely wrong with your code - wrong in the sense of:
> it doesn't play with, what imcc expects ;)

Yep, the 45 minute compile times were the hint that maybe imcc and I were fighting a bit. :)

> You must have too much pseudo-globals in that unit, spanning a huge
> range and interfering with one another. Do you use lexical vars or
> Parrot globals?

I was using .locals for the actual variables in the source program, and $Px for all the temps the compiler generated. I've been migrating a lot of the code to use a few .local-defined hashes and indexing into them, and it looks like that's the way to go. (This'd be easier if this language had, y'know, actual subroutines and stuff...)

>> Hrm. Does the code currently consider something like:
>>
>> $P0 = foo
>>
>> to start a new lifetime for $P0?
>
> If you don't use $P0 beyond that point yes. Do you name all temps $P0
> or some such? Or are you giving them unique names. You should do the
> latter.

I've a lot of 'constant' temps named $P0. I'll go fix that and see where we go.

--
Dan
Re: IMCC temp optimizations...
Dan Sugalski wrote:
> I was using .locals for the actual variables in the source program,

Well, you know it: .locals aren't vars.

> and $Px for all the temps the compiler generated. I've been migrating a
> lot of the code to use a few .local-defined hashes and indexing into
> them, and it looks like that's the way to go.

If you use a hash anyway, just use globals - this seems to match the target language's POV:

$Px = global foo   # fetch
global foo = $Py   # store

If spilling is still needed, the spill code can be improved when the variable already has a store (like a lexical or a global). It only needs the life range cut down and a refetch at the other places the variable is used. This saves the store and simplifies spilling. But if you treat all your variables like that, there will be no spilling at all.

> ... (This'd be easier if this language had, y'know, actual subroutines
> and stuff...)

Yep.

leo
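A rough sketch of a code generator that keeps source variables in globals and only ever holds them in short-lived temps, following the fetch/store pattern Leo shows. Everything here is hypothetical scaffolding: the class and method names are invented, and writing the global's name as a quoted string is an assumption, since the archive shows the name bare.

```python
class GlobalBackedVars:
    """Emits a fetch before each use and a store after each assignment."""

    def __init__(self):
        self._next_temp = 0
        self.lines = []

    def temp(self):
        """Return a fresh, uniquely named PMC temp."""
        self._next_temp += 1
        return f"$P{self._next_temp}"

    def load(self, var):
        """Fetch a source variable from its global into a new temp."""
        t = self.temp()
        self.lines.append(f'{t} = global "{var}"')   # quoting is an assumption
        return t

    def store(self, var, temp):
        """Write a temp back to the source variable's global."""
        self.lines.append(f'global "{var}" = {temp}')


def assign_sum(gen, dest, left, right):
    """Emit code for `dest = left + right` on source-level variables."""
    l_temp = gen.load(left)
    r_temp = gen.load(right)
    result = gen.temp()
    gen.lines.append(f"{result} = new PerlInt")
    gen.lines.append(f"{result} = {l_temp} + {r_temp}")
    gen.store(dest, result)


if __name__ == "__main__":
    gen = GlobalBackedVars()
    assign_sum(gen, "a", "a", "b")
    print("\n".join(gen.lines))
```

Every temp in the emitted code lives for only a few lines, so the source program's variables never appear as long-ranged symbols in the unit.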
IMCC temp optimizations...
As I sit here and wait for parrot to churn on the output from compiling a relatively small program, I'm reminded again that imcc's got some degenerate behaviour when it comes to register coloring and .locals.

I think it may be a handy thing if someone'd go through and draw up a set of rules for the use of temps, and things that'll cause the register coloring algorithm to go mad. (I'd like to avoid 30 minute compile sessions--it's a tad tedious :)

--
Dan