Re: [perl #21399] [PATCH]Patch to fix compiler warnings in smallobject.c

2003-02-28 Thread Leopold Toetsch
Steve Peters (via RT) wrote:


The attached patch fixes a compiler warning in smallobject.c. The #define
UNITS_PER_ALLOC_GROWTH_FACTOR has a value of 1.75, but is multiplied by a
size_t. This patch sets UNITS_PER_ALLOC_GROWTH_FACTOR to (size_t)2.


Not good. Changing values to get rid of warnings is not the way to go.
And why multiplying a size_t by a float should be a warning is beyond
me.
You could try explicit casts to get rid of the warning.
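
A minimal sketch of the explicit-cast approach, keeping the 1.75 factor. Only the macro name and its value come from the patch description; the helper grow_units is hypothetical and is not the code in smallobject.c.

```c
#include <stddef.h>

/* Growth factor as described in the patch report; value unchanged. */
#define UNITS_PER_ALLOC_GROWTH_FACTOR 1.75

/* Hypothetical helper: scale a unit count by the growth factor.
 * The multiplication is done in double; the explicit casts spell out
 * the size_t -> double -> size_t conversions, and the result is
 * truncated toward zero. */
static size_t
grow_units(size_t units)
{
    return (size_t)((double)units * UNITS_PER_ALLOC_GROWTH_FACTOR);
}
```

With this, grow_units(4) yields 7 and grow_units(3) yields 5 (5.25 truncated), so the growth behaviour stays at 1.75 instead of jumping to 2.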


Steve Peters
[EMAIL PROTECTED]


leo




Re: This week's Perl 6 Summary

2003-02-28 Thread Leopold Toetsch
Dan Sugalski wrote:


... And with that
limitation, I'd rather have a lower-overhead JIT with a win for the
shorter programs than a high-overhead one with a win for long-running
programs.

I see that limitation. But currently we have a high overhead JIT. The
problem is not so much program run time, but load time.

One example: t/op/stacks_33.pasm (8242 lines) because of macros
expands to 38955 lines, giving 4102 basic blocks and 6150 edges
connecting them.
compile/run options and timings (first 4 include running)

  plain               1.07
  -P                  1.09
  -j                  2.4
  -Oj                 2.3
  -ox.pbc / -j        1.07 + 1.3
  -ox.pbc -Oj / -j    2.1 + 0.2

So writing out a minimal CFG (blocks, branch targets) plus register
usage gives about 6 times the startup speed for this -Oj compiled PBC
file (0.2 vs. 1.3). Program run time is ~0.
BTW, running the -Oj compiled PBC with a normal core does succeed
(including correct output), although there are a lot of out-of-bounds
register accesses (which go to high integer regs):

PC=12; OP=82 (set_n_ic); ARGS=(N-2=0, 0)
PC=15; OP=82 (set_n_ic); ARGS=(N-4=0, 1024)
PC=18; OP=79 (set_n_n); ARGS=(N3=0, N-4=1024)
PC=21; OP=79 (set_n_n); ARGS=(N2=0, N-3=0)
PC=24; OP=79 (set_n_n); ARGS=(N1=0, N-2=0)
PC=27; OP=79 (set_n_n); ARGS=(N0=0, N-1=0)
PC=30; OP=678 (pushn)
I think the -b option should have a check for this.
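
A sketch of what such a check could look like, assuming 32 registers per register file numbered 0..31. The function name and the operand layout are hypothetical and do not reflect how the bounds-checking core is actually implemented.

```c
#include <stddef.h>

/* Assumption: 32 registers per register file, numbered 0..31. */
#define NUM_REGISTERS 32

/* Hypothetical check: returns 1 if every register operand index in
 * regs[0..n-1] lies within 0..NUM_REGISTERS-1, otherwise 0. */
static int
regs_in_bounds(const long *regs, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (regs[i] < 0 || regs[i] >= NUM_REGISTERS)
            return 0;
    }
    return 1;
}
```

An op such as the set_n_n at PC=18 above, with operands N3 and N-4, would be rejected because of the negative index.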

(timings from a PIII/600, imcc -O3 compiled)

leo




[CVS ci] Using imcc as JIT optimizer #3

2003-02-28 Thread Leopold Toetsch
This concludes this experiment for now. It works, but to do it right, it
should go in the direction Angel Faus has mentioned. Calling
conventions also have to be done first, to get the data flow right.
With the -Oj option a minimal CFG section is created in the packfile,
which is used by Parrot's JIT code to get sections and register
mappings. This is significantly faster than the current JIT optimizer,
which has a relatively high impact on program load times.
The JIT loader now looks at the packfile and uses either method to
generate the information needed for actually producing bytecode.

Further included:
- some CFG hacks to figure out info about subroutines
- implemented the optimization mentioned in the comment in the
  register interference code
- implemented read/write semantics of pushx/popx/clearx/saveall/restoreall
- some bugfixes WRT memory handling of SymRegs/life_info
- improved default_dump for pdump
- removed unused warnings in jit.c, all -O3 uninitialized warnings in imcc
leo

PS
$ imcc -O1j  primes.pasm
Elapsed time: 3.485836
$ ./primes  # -O3 gcc 2.95.2
Elapsed time: 3.643756
$ imcc -O1 -j  primes.pasm
Elapsed time: 3.884460
$ make test IMCC=languages/imcc/imcc -O1j
succeeds, except for t/op/interp_2, where the trace output is different 
due to inserted register load/store ops. For the nci stuff -Oj gets 
disabled internally.



Re: List datatype

2003-02-28 Thread Dan Sugalski
At 9:34 PM -0800 2/27/03, David wrote:
Is there a List datatype for Parrot? I'm looking for something along the lines
of what's in Python. Specifically, it should be able to do the following
operations:
Not yet, though we do need one. There's not much difference between a 
list and an array, but the differences that are there are pretty 
important.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: This week's Perl 6 Summary

2003-02-28 Thread Dan Sugalski
At 8:54 AM +0100 2/28/03, Leopold Toetsch wrote:
Dan Sugalski wrote:

... And with that limitation, I'd rather have a lower-overhead JIT
with a win for the shorter programs than a high-overhead one with a
win for long-running programs.

I see that limitation. But currently we have a high overhead JIT.
The problem is not so much program run time, but load time.

Damn. Okay, what sort of metadata would be appropriate to aid in
this? If it means having the assembler, IMCC, or some external
utility write a chunk that identifies the basic blocks and edges,
then I'm all for it.
--
Dan



Re: This week's Perl 6 Summary

2003-02-28 Thread Leopold Toetsch
Dan Sugalski wrote:

At 8:54 AM +0100 2/28/03, Leopold Toetsch wrote:

I see that limitation. But currently we have a high overhead JIT. The 
problem is not so much program run time, but load time.

Damn. Okay, what sort of metadata would be appropriate to aid in this? 
If it means having the assembler, IMCC, or some external utility write a 
chunk that identifies the basic blocks and edges, then I'm all for it.


gprof does indicate that the branch target calculation is the main culprit.
My -Oj hack currently writes out 6 opcode_t per BB:
- bb-begin (with highbit set for a branch target)
- bb-end
- 4 * registers_used
(end is somewhat redundant; the latter 4 could be packed into one op).
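
The 6-word record described above could be laid out roughly as follows. This is a sketch only: word_t is a 32-bit stand-in for opcode_t, the four registers_used words are assumed to be one per register type, and all names are hypothetical rather than taken from imcc or the packfile code.

```c
#include <stdint.h>

/* Stand-in for Parrot's opcode_t; a fixed 32-bit unsigned word is an
 * assumption made to keep the bit operations well defined. */
typedef uint32_t word_t;

#define BB_WORDS        6            /* words per basic-block record */
#define BB_TARGET_FLAG  0x80000000u  /* high bit: block is a branch target */

/* Hypothetical decoded form of one record. */
typedef struct {
    word_t begin;        /* offset of the first op in the block */
    word_t end;          /* offset of the last op in the block */
    int    is_target;    /* nonzero if some branch jumps here */
    word_t regs_used[4]; /* assumed: one entry per register type */
} bb_info;

/* Encode one block into BB_WORDS consecutive words. */
static void
bb_pack(const bb_info *bb, word_t out[BB_WORDS])
{
    int i;

    out[0] = bb->begin | (bb->is_target ? BB_TARGET_FLAG : 0u);
    out[1] = bb->end;
    for (i = 0; i < 4; i++)
        out[2 + i] = bb->regs_used[i];
}

/* Decode BB_WORDS consecutive words back into a bb_info. */
static void
bb_unpack(const word_t in[BB_WORDS], bb_info *bb)
{
    int i;

    bb->begin     = in[0] & ~BB_TARGET_FLAG;
    bb->is_target = (in[0] & BB_TARGET_FLAG) != 0;
    bb->end       = in[1];
    for (i = 0; i < 4; i++)
        bb->regs_used[i] = in[2 + i];
}
```

Storing the flag in the high bit of bb-begin limits offsets to 31 bits in this sketch, which is the usual price of folding a flag into an offset word.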

The plain jit optimizer would need:
- bb-begin (implying bb-end)
- (bb-end)
- bb-end-branch_target (where the -end branches to)
- flags (branch source or target per block boundary), could also be
  coded into offsets
From this info, the JIT optimizer could build its internal sections
(parts of blocks that are JITed or not). A BB is at least one section,
but could be split into more. The register usage scan and allocation
stay the same (two linear scans over all ops), followed by another run
for actual code generation.
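
For reference, the branch-target calculation that gprof points at is, in its textbook form, a linear scan that marks block leaders. The sketch below uses a toy op representation (toy_op, hypothetical) and is not the code in jit.c; it only shows the work that precomputed block info in the packfile would let the loader skip.

```c
#include <stddef.h>
#include <string.h>

/* Toy op: not Parrot's bytecode format, just enough to show the scan. */
typedef struct {
    int    is_branch; /* nonzero if this op may transfer control */
    size_t target;    /* index of the op branched to, if is_branch */
} toy_op;

/* Sets leader[i] = 1 for every op that starts a basic block and
 * returns the number of blocks. leader must have room for n entries. */
static size_t
mark_block_starts(const toy_op *ops, size_t n, unsigned char *leader)
{
    size_t i, count = 0;

    if (n == 0)
        return 0;
    memset(leader, 0, n);
    leader[0] = 1;                      /* entry point */
    for (i = 0; i < n; i++) {
        if (!ops[i].is_branch)
            continue;
        if (ops[i].target < n)
            leader[ops[i].target] = 1;  /* branch target */
        if (i + 1 < n)
            leader[i + 1] = 1;          /* op following a branch */
    }
    for (i = 0; i < n; i++)
        count += leader[i];
    return count;
}
```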
There is also room for improvement here, e.g. register usage could
as well be passed by imcc (top N first usage per block, although this
differs from the current usage calculation per section). Sean already
proposed this variant. This could save one scan through all ops.

Timing estimations WRT *big* programs are all rather vague; we just
don't have them yet. We badly need an at least moderately complete HL
implementation *with* some RL test cases for this. The Java spec suite
implemented in a supported HL would be nice to compare :) Ook.

leo





Priorities of ToDos

2003-02-28 Thread Matthias Huerlemann
Hello

Can you tell me what the priority of coroutines is? I am developing a
compiler to pasm for a language that would make use of coroutines. And I
don't know enough about the parrot engine to implement this myself.
I guess exceptions and/or objects have higher priority at the moment
(for 0.0.10 or so).

Regards,
Matthias