Leopold Toetsch wrote:
Phil Hassey wrote:
But with a processor with 16 registers (do such things exist?).
Parrot would be overflowing registers that it could have been using
in the JIT.
RISC processors have a lot of them. But before there are unused
processor registers, we will allocate P
Nicholas Clark [EMAIL PROTECTED] writes:
On Wed, Feb 26, 2003 at 02:21:32AM +0100, Angel Faus wrote:
[snip lots of good stuff]
All this is obviously machine dependent: the code generated should
only run in the machine it was compiled for. So we should always keep
the original imc code in
This concludes for now this experiment. It works, but to do it right, it
should go in the direction Angel Faus has mentioned. Also calling
conventions have to be done before, to get the data flow right.
With the -Oj option a minimal CFG section is created in the packfile,
which is used by
On Tue, Feb 25, 2003 at 11:58:41PM +0100, Leopold Toetsch wrote:
Nicholas Clark wrote:
[thanks for the explanation]
And is this all premature optimisation, given that we haven't got objects,
exceptions, IO or a Z-code interpreter yet?
And yes: We don't have exceptions and threads yet. The
[snip]
Maybe we're starting to get to the point of having imcc deliver parrot
bytecode if you want to be portable, and something approaching native
machine code if you want speed.
IMHO yes, the normal options produce a plain PBC file, more or less
optimized at PASM level, the -Oj option is
Phil Hassey wrote:
... The current bytecode from parrot already has potential
for slowing things down, and that's what worries me here.
I don't see that.
My understanding is that PBC has a limit of 16 (32?) integer registers. When
a code block needs more than 16 registers, they are overflowed
[ you seem to be living some hours ahead in time ]
Yep, sorry about that.
The problem stays the same: spilling processors to parrot's or
parrots to array.
Thinking a bit more about it, now I believe that the best way to do it
would be:
(1) First, do a register allocation for machine
Phil Hassey wrote:
[snip]
Although it might be nice if IMC were binary at this stage (for some
feel-good-reason?).
You mean, that a HL like perl6 should produce a binary equivalent to
the current .imc file? Yep - this was discussed already, albeit there
was no discussion, how this
Angel Faus wrote:
(1) First, do a register allocation for machine registers, assuming
that there are N machine registers and infinite parrot registers.
This uses equally the top N used registers for processor regs. The
spilling for (1) is loading/moving them to parrot registers/temp
Although it might be nice if IMC were binary at this stage (for some
feel-good-reason?).
You mean, that a HL like perl6 should produce a binary equivalent to
the current .imc file? Yep - this was discussed already, albeit there
was no discussion, how this should look like. And the lexer
Leopold Toetsch wrote:
- do register allocation for JIT in imcc
- use the first N registers as MAPped processor registers
I have committed the next bunch of changes and an updated jit.pod.
- it should now be platform independent, *but* other platforms have to
define what they consider as
I explained very badly. The issue is not spilling (at the parrot
level)
The problem is: if you only pick the highest priority parrot registers
and put them in real registers you are losing opportunities where
copying the data once will save you from copying it many times. You
are, in some
Angel Faus wrote:
Saturday 22 February 2003 16:28, Leopold Toetsch wrote:
With your approach there are three levels of parrot registers:
- The first N registers, which in JIT will be mapped to physical
registers.
- The other 32 - N parrot registers, which will be in memory.
- The spilled
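The three levels listed above can be sketched as a small classifier. This is only an illustration: the names reg_level, PARROT_REGS and classify_reg are hypothetical, and it assumes that the third level means symbolic registers that do not fit in the 32-register file and are spilled.

```c
#include <assert.h>

/* Hypothetical names for the three levels described in the message. */
typedef enum { REG_MAPPED, REG_IN_MEMORY, REG_SPILLED } reg_level;

/* Assumption: 32 parrot registers per type, as in the quoted "32 - N". */
enum { PARROT_REGS = 32 };

/* Classify a register index after renumbering.
   n_mapped is the number of processor registers the JIT can map. */
static reg_level classify_reg(int index, int n_mapped)
{
    if (index < n_mapped)
        return REG_MAPPED;      /* lives in a physical register under JIT */
    if (index < PARROT_REGS)
        return REG_IN_MEMORY;   /* ordinary parrot register in memory */
    return REG_SPILLED;         /* assumed: does not fit, must be spilled */
}
```

With n_mapped set to 4, for example, indices 0 to 3 would be mapped, 4 to 31 would stay in memory, and anything from 32 upward would count as spilled.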
On Tuesday 25 February 2003 08:51, Leopold Toetsch wrote:
Angel Faus wrote:
Saturday 22 February 2003 16:28, Leopold Toetsch wrote:
With your approach there are three levels of parrot registers:
- The first N registers, which in JIT will be mapped to physical
registers.
- The
Saturday 22 February 2003 16:28, Leopold Toetsch wrote:
Gopal V wrote:
If memory serves me right, Leopold Toetsch wrote:
Ok .. well I sort of understood that the first N registers will
be the ones MAPped ?. So I thought re-ordering/sorting was the
operation performed.
Yep. Register
On Tue, Feb 25, 2003 at 07:18:11PM +0100, Angel Faus wrote:
I believe it would be smarter if we instructed IMCC to generate code
that only uses N parrot registers (where N is the number of machine
register available). This way we avoid the risk of having to copy
twice the data.
It's not
Phil Hassey wrote:
Not knowing much about virtual machine design... Here's a question --
Why do we have a set number of registers? Particularly since JITed code
ends up setting the register constraints again, I'm not sure why parrot
should set up register limit constraints first. Couldn't
Nicholas Clark wrote:
On Wed, Feb 26, 2003 at 02:21:32AM +0100, Angel Faus wrote:
[snip lots of good stuff]
All this is obviously machine dependent: the code generated should
only run in the machine it was compiled for. So we should always keep
the original imc code in case we copy the pbc
[ you seem to be living some hours ahead in time ]
Angel Faus wrote:
I explained very badly. The issue is not spilling (at the parrot
level)
The problem stays the same: spilling processors to parrot's or parrots
to array.
[ ... ]
set I3, 1
add I3, I3, 1
print I3
fast_save I3, 1
set I3, 1
Dan Sugalski wrote:
At 12:09 PM +0100 2/20/03, Leopold Toetsch wrote:
Starting from the unbearable fact, that optimized compiled C is still
faster than parrot -j (in primes.pasm), I did this experiment:
- do register allocation for JIT in imcc
- use the first N registers as MAPped processor
If memory serves me right, Dan Sugalski wrote:
This sounds pretty interesting, and I bet it could make things
faster. The one thing to be careful of is that it's easy to get
yourself into a position where you spend more time optimizing the
code you're JITting than you win in the end.
I
Gopal V wrote:
I'm assuming that the temporaries are the things being moved around here ?.
It is not so much a matter of moving things around, but a matter of
allocating (and renumbering) parrot (or for JIT) processor registers.
These are of course mainly temporaries, but even when you have
At 4:28 PM +0100 2/22/03, Leopold Toetsch wrote:
Gopal V wrote:
Direct hardware maps (like using CX for loop count etc) will need to be
platform dependent ?. Or you could have a fixed reg that can be used for
loop count (and gets mapped on hardware appropriately).
We currently don't have special
Nicholas Clark wrote in perl.perl6.internals :
r->score = r->use_count + (r->lhs_use_count << 2);
r->score += 1 << (loop_depth * 3);
[...]
I wonder how hard it would be to make a --fsummon-nasal-demons flag for gcc
that added trap code for all classes of undefined behaviour, and caused
Please don't take the following as a criticism of imcc - I'm sure I manage
to write code with things like this all the time.
On Sat, Feb 22, 2003 at 08:13:59PM +0530, Gopal V wrote:
If memory serves me right, Leopold Toetsch wrote:
r->score = r->use_count + (r->lhs_use_count << 2);
If memory serves me right, Leopold Toetsch wrote:
I'm assuming that the temporaries are the things being moved around here ?.
It is not so much a matter of moving things around, but a matter of
allocating (and renumbering) parrot (or for JIT) processor registers.
Ok .. well I sort of
Gopal V wrote:
If memory serves me right, Leopold Toetsch wrote:
Ok .. well I sort of understood that the first N registers will be the
ones MAPped ?. So I thought re-ordering/sorting was the operation performed.
Yep. Register renumbering, so that the top N used (in terms of score)
registers
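The renumbering described here (highest-scoring registers get the lowest numbers, so that the first N are the MAPped ones) could look roughly like the sketch below. It is not imcc's code: reg_info, by_score_desc and renumber are hypothetical names, and breaking ties by the old register number is an assumption.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-register record. */
typedef struct {
    int  old_num;  /* register number before renumbering */
    int  new_num;  /* register number after renumbering */
    long score;    /* usage score; higher means used more heavily */
} reg_info;

/* qsort comparator: highest score first, ties by lower old number. */
static int by_score_desc(const void *a, const void *b)
{
    const reg_info *x = (const reg_info *)a;
    const reg_info *y = (const reg_info *)b;
    if (x->score != y->score)
        return (x->score < y->score) - (x->score > y->score);
    return (x->old_num > y->old_num) - (x->old_num < y->old_num);
}

/* Sort by score and assign new numbers 0..count-1.
   Returns how many registers ended up among the first n_mapped. */
static int renumber(reg_info *regs, int count, int n_mapped)
{
    qsort(regs, (size_t)count, sizeof *regs, by_score_desc);
    for (int i = 0; i < count; i++)
        regs[i].new_num = i;
    return count < n_mapped ? count : n_mapped;
}
```

After the call, every register with new_num below n_mapped is a candidate for a processor register; the rest stay in memory.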
Nicholas Clark wrote:
r->score += 1 << (loop_depth * 3);
until variables in 11 deep loops go undefined?
Not undefined, but spilled. First *oops*, but second of course this is all
not final. I did change scoring several times from the code base AFAIK
Angel Faus did implement. And we don't
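The concern about 11-deep loops comes from the shift count: at loop_depth 11 the bonus is 1 shifted left by 33 bits, which does not fit in a 32-bit int. A minimal sketch of the same scoring with that case guarded follows. The struct reg_stats, the constant MAX_SAFE_DEPTH and the function reg_score are hypothetical, and clamping the depth is just one possible fix, not what imcc does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-register counters in the quoted lines. */
typedef struct {
    unsigned use_count;      /* total uses of the register */
    unsigned lhs_use_count;  /* uses as an assignment target */
    unsigned loop_depth;     /* loop nesting depth of the use */
} reg_stats;

/* Largest depth whose bonus 1 << (depth * 3) still fits in a signed
   32-bit int: depth 10 gives 1 << 30, depth 11 would need 1 << 33. */
enum { MAX_SAFE_DEPTH = 10 };

/* Same formula as quoted, computed in 64 bits with the depth clamped
   so the shift count never reaches the width of the operand. */
static uint64_t reg_score(const reg_stats *r)
{
    unsigned depth = r->loop_depth > MAX_SAFE_DEPTH ? MAX_SAFE_DEPTH
                                                    : r->loop_depth;
    uint64_t score = (uint64_t)r->use_count
                   + ((uint64_t)r->lhs_use_count << 2);
    score += (uint64_t)1 << (depth * 3);
    return score;
}
```

With use_count 3, lhs_use_count 1 and loop_depth 2 this gives 3 + 4 + 64 = 71; a register at depth 11 scores the same as one at depth 10 instead of overflowing.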
On Sat, Feb 22, 2003 at 08:44:12PM -, Rafael Garcia-Suarez wrote:
Nicholas Clark wrote in perl.perl6.internals :
r->score = r->use_count + (r->lhs_use_count << 2);
r->score += 1 << (loop_depth * 3);
[...]
I wonder how hard it would be to make a --fsummon-nasal-demons flag for gcc
On Sat, Feb 22, 2003 at 09:27:04PM +, nick wrote:
On Sat, Feb 22, 2003 at 08:44:12PM -, Rafael Garcia-Suarez wrote:
What undefined behaviour are you referring to exactly ? the shift
overrun ? AFAIK it's very predictable (given one int size). Cases of
Will you accept a shortcut
Leopold Toetsch wrote:
- do register allocation for JIT in imcc
- use the first N registers as MAPped processor registers
The [RFC] imcc calling conventions didn't get any response. Should I
take this fact as an implicit yep, fine?
Here is again the relevant part, which has implications on
At 12:09 PM +0100 2/20/03, Leopold Toetsch wrote:
Starting from the unbearable fact, that optimized compiled C is
still faster than parrot -j (in primes.pasm), I did this experiment:
- do register allocation for JIT in imcc
- use the first N registers as MAPped processor registers
This sounds
Dan Sugalski wrote:
At 12:09 PM +0100 2/20/03, Leopold Toetsch wrote:
Starting from the unbearable fact, that optimized compiled C is still
faster than parrot -j (in primes.pasm), I did this experiment:
- do register allocation for JIT in imcc
- use the first N registers as MAPped processor
Sean O'Rourke wrote:
On Thu, 20 Feb 2003, Leopold Toetsch wrote:
What do people think?
Cool idea -- a lot of optimization-helpers could eventually be passed on
to the jit (possibly in the metadata?). One thought -- the information
imcc computes should be platform-independent. e.g. it
Starting from the unbearable fact, that optimized compiled C is still
faster than parrot -j (in primes.pasm), I did this experiment:
- do register allocation for JIT in imcc
- use the first N registers as MAPped processor registers
Here is the JIT optimized PASM output of
$ imcc -Oj -o p.pasm
On Thursday 20 February 2003 18:14, Leopold Toetsch wrote:
Tupshin Harper wrote:
Leopold Toetsch wrote:
Starting from the unbearable fact, that optimized compiled C is still
faster than parrot -j (in primes.pasm)
Lol...what are you going to do when somebody comes along with the
Leopold Toetsch wrote:
Starting from the unbearable fact, that optimized compiled C is still
faster than parrot -j (in primes.pasm)
Lol...what are you going to do when somebody comes along with the
unbearable example of primes.s(optimized x86 assembly), and you are
forced to throw up your
Tupshin Harper wrote:
Leopold Toetsch wrote:
Starting from the unbearable fact, that optimized compiled C is still
faster than parrot -j (in primes.pasm)
Lol...what are you going to do when somebody comes along with the
unbearable example of primes.s(optimized x86 assembly), and you are