Re: Using imcc as JIT optimizer
Sean O'Rourke wrote:
> On Thu, 20 Feb 2003, Leopold Toetsch wrote:
>> What do people think?
>
> Cool idea -- a lot of optimization-helpers could eventually be passed on to the jit (possibly in the metadata?).
>
> One thought -- the information imcc computes should be platform-independent. e.g. it could pass a control flow graph to the JIT, but it probably shouldn't do register allocation for a specific number of registers. How much worse do you think it would be to have IMCC just rank the Parrot registers in order of decreasing spill cost, then have the JIT take the top N, where N is the number of available architectural registers?

The registers are already in that order (with -Op or -Oj), so this wouldn't be a problem.

Difficulties arise when it comes to the register load/save instructions, which get inserted by imcc in my scheme. These are definitely processor/$arch specific. They depend on the number of mappable (and non-preserved too) registers, and on the state of the op_jit function table.

Of course CFG and register life information could be passed on to the JIT, but this seems a little bit complicated, as JIT has its own sections, which match either a basic block from imcc or are a sequence of non-JITable instructions. But in the long run, it could be a way to go.

OTOH - PBC compatibility is not a big point here when JIT is involved: 99% of the time the code would run on the machine where it is generated. And it would be AFAIK easier to make some JIT crosscompiler. This would basically only need the number of mappable registers and the extcall bits from the jump table, read in from some config file.

leo
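Sean's suggestion (imcc ranks the Parrot registers by decreasing spill cost, the JIT takes the top N) can be sketched in a few lines. This is only an illustration: the function name and the spill-cost numbers are made up, and nothing here reflects how imcc actually computes spill cost.

```python
def pick_mapped_registers(spill_cost, n_hw_regs):
    """Rank Parrot registers by decreasing spill cost and keep the top N.

    spill_cost: dict of register name -> cost (hypothetical numbers).
    n_hw_regs:  number of mappable processor registers on the target.
    """
    # Ties are broken by register name so the ranking is deterministic.
    ranked = sorted(spill_cost, key=lambda reg: (-spill_cost[reg], reg))
    return ranked[:n_hw_regs]


# Made-up spill costs, for illustration only.
costs = {"I0": 40, "I1": 35, "I2": 50, "I3": 20, "I4": 5, "I5": 8, "I6": 2}

print(pick_mapped_registers(costs, 4))
print(pick_mapped_registers(costs, 2))
```

The ranking itself is platform-independent; only the cut-off N depends on the target, which is the split Sean is asking about.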
Using imcc as JIT optimizer
Starting from the unbearable fact that optimized compiled C is still faster than parrot -j (in primes.pasm), I did this experiment:

- do register allocation for JIT in imcc
- use the first N registers as MAPped processor registers

Here is the JIT optimized PASM output of

$ imcc -Oj -o p.pasm primes.pasm
$ cat p.pasm

        set ri2, 1
        set I5, 50
        set I4, 0
        print "N primes up to "
        print I5
        print " is: "
        time N1
        set rn1, N1             # load
REDO:
        set ri0, 2
        div ri3, ri2, 2
LOOP:
        cmod ri1, ri2, ri0
        if ri1, OK              # with -O1j: unless ri1, NEXT
        branch NEXT             # deleted
OK:                             # deleted
        inc ri0
        le ri0, ri3, LOOP
        inc I4
        set I6, ri2
NEXT:
        inc ri2
        le ri2, I5, REDO
        time N0
        set rn0, N0             # load
        print I4
        print "\nlast is: "
        print I6
        print "\n"
        sub rn0, rn1
        set N0, rn0             # save
        print "Elapsed time: "
        print N0
        print "\n"
        end

The ri? and rn? are processor registers. The above is for intel (4 mapped int/float regs); you can translate the ri? to (%ebx, %edi, %esi, %edx). The processor regs are represented as (-1 - parrot_reg), i.e. %ebx == -1, %edi == -2 ...

The MAP macro in jit_emit.h would then be:

# define MAP(i) ((i) >= 0 ? 0 : ...map_branch[jit_info->op_i -1-(i)])

where the mappings are directly intval_map or floatval_map. JIT wouldn't need any further calculations.

The load/save instructions get inserted by looking at op_jit[].extcall, i.e. if the instruction reads or writes a register, it gets saved/loaded before/after and the parrot register is used instead. (Only the print and time ops are external in i386).

I currently have the imcc part for some common cases, enough for the above output.

What do people think?

For reference: a similar idea: Of mops and microops

leo

PS: -O3 C 3.64s, JIT ~3.55.
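The register encoding and the load/save insertion described above can be modelled roughly as follows. This is a sketch under assumptions: the helper names (encode, decode, wrap_external) are hypothetical, the register order is the i386 one given in the post, and the real work happens inside imcc and jit_emit.h, not in anything like this.

```python
# i386 integer registers in the order given in the post.
HW_INT_REGS = ["%ebx", "%edi", "%esi", "%edx"]


def encode(n):
    """Mapped register number n (ri0, ri1, ...) -> operand value (-1 - n)."""
    return -1 - n


def decode(operand):
    """Negative operands name processor registers; others are Parrot registers."""
    if operand >= 0:
        return None
    return HW_INT_REGS[-1 - operand]


def wrap_external(op, reads, writes, mapping):
    """Surround a non-JITted (external) op with save/load instructions.

    mapping: Parrot register -> mapped register name, e.g. {"N0": "rn0"}.
    The external op itself keeps using the Parrot register.
    """
    out = []
    for reg in reads:
        if reg in mapping:
            out.append(f"set {reg}, {mapping[reg]}  # save")
    out.append(op)
    for reg in writes:
        if reg in mapping:
            out.append(f"set {mapping[reg]}, {reg}  # load")
    return out


mapping = {"N0": "rn0", "N1": "rn1"}
print(decode(encode(0)), decode(encode(1)))
print(wrap_external("time N1", reads=[], writes=["N1"], mapping=mapping))
print(wrap_external("print N0", reads=["N0"], writes=[], mapping=mapping))
```

The two wrap_external calls reproduce the "# load" after time N1 and the "# save" before print N0 from the listing.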
Re: Using imcc as JIT optimizer
On Thursday 20 February 2003 18:14, Leopold Toetsch wrote:
> Tupshin Harper wrote:
>> Leopold Toetsch wrote:
>>> Starting from the unbearable fact that optimized compiled C is still faster than parrot -j (in primes.pasm)
>>
>> Lol...what are you going to do when somebody comes along with the unbearable example of primes.s (optimized x86 assembly), and you are forced to throw up your hands in defeat? ;-)
>
> It only may be equally fast, that's it :)

Nahh, you know it can be faster... maybe in a couple of years ;-D

>> Cool idea, if I understand correctly, and I am in awe of how fast the bloody thing is already.
>
> That's integer/float only. When it comes to objects, different things matter.
>
>> -Tupshin
>
> leo
Re: Configure.pl --cgoto=0 doesn't work
On Thu, 20 Feb 2003, Nicholas Clark wrote:
> If I
>
> perl Configure.pl --cgoto=0
> make all test
>
> then the build fails with:
>
> ccache /usr/local/bin/gcc -DHAS_FPSETMASK -DHAS_FLOATINGPOINT_H -I/usr/local/include -Wall -Wstrict-prototypes -Wmissing-prototypes -Winline -Wshadow -Wpointer-arith -Wcast-qual -Wcast-align -Wwrite-strings -Waggregate-return -Winline -W -Wno-unused -Wsign-compare -Wformat-nonliteral -Wformat-security -Wpacked -Wpadded -Wdisabled-optimization -I./include -DHAS_JIT -DI386 -o jit_cpu.o -c jit_cpu.c
> In file included from jit_cpu.c:39:
> include/parrot/jit_emit.h:2302:39: parrot/oplib/core_ops_cgp.h: No such file or directory
> In file included from jit_cpu.c:39:
> include/parrot/jit_emit.h: In function `Parrot_jit_begin':
> include/parrot/jit_emit.h:2349: `cgp_core' undeclared (first use in this function)
> include/parrot/jit_emit.h:2349: (Each undeclared identifier is reported only once
> include/parrot/jit_emit.h:2349: for each function it appears in.)
> *** Error code 1

The problem is that an ifdef in jit/i386/jit_emit.h is defining JIT_CGP based on whether or not the compiler is GCC, and not on whether HAS_COMPUTED_GOTO is defined. The attached patch fixes this, but I'm not sure if the __GCC__ bit is still necessary. Leo?

Simon

--- jit/i386/jit_emit.h.old Thu Feb 20 20:59:11 2003
+++ jit/i386/jit_emit.h Thu Feb 20 20:58:52 2003
@@ -8,7 +8,7 @@
 #include <assert.h>
-#ifdef __GNUC__
+#if defined HAS_COMPUTED_GOTO && defined __GCC__
 # define JIT_CGP
 #endif
Re: Configure.pl --cgoto=0 doesn't work
On Thu, 20 Feb 2003, Simon Glover wrote:
> On Thu, 20 Feb 2003, Nicholas Clark wrote:
>> If I
>>
>> perl Configure.pl --cgoto=0
>> make all test
>>
>> then the build fails with:
>>
>> ccache /usr/local/bin/gcc -DHAS_FPSETMASK -DHAS_FLOATINGPOINT_H -I/usr/local/include -Wall -Wstrict-prototypes -Wmissing-prototypes -Winline -Wshadow -Wpointer-arith -Wcast-qual -Wcast-align -Wwrite-strings -Waggregate-return -Winline -W -Wno-unused -Wsign-compare -Wformat-nonliteral -Wformat-security -Wpacked -Wpadded -Wdisabled-optimization -I./include -DHAS_JIT -DI386 -o jit_cpu.o -c jit_cpu.c
>> In file included from jit_cpu.c:39:
>> include/parrot/jit_emit.h:2302:39: parrot/oplib/core_ops_cgp.h: No such file or directory
>> In file included from jit_cpu.c:39:
>> include/parrot/jit_emit.h: In function `Parrot_jit_begin':
>> include/parrot/jit_emit.h:2349: `cgp_core' undeclared (first use in this function)
>> include/parrot/jit_emit.h:2349: (Each undeclared identifier is reported only once
>> include/parrot/jit_emit.h:2349: for each function it appears in.)
>> *** Error code 1
>
> The problem is that an ifdef in jit/i386/jit_emit.h is defining JIT_CGP based on whether or not the compiler is GCC, and not on whether HAS_COMPUTED_GOTO is defined. The attached patch fixes this, but I'm not sure if the __GCC__ bit is still necessary. Leo?

OK, let's try this again, with the _correct_ spelling this time...

Simon

--- jit/i386/jit_emit.h.old Thu Feb 20 21:43:53 2003
+++ jit/i386/jit_emit.h Thu Feb 20 21:43:20 2003
@@ -8,7 +8,7 @@
 #include <assert.h>
-#ifdef __GNUC__
+#if defined HAVE_COMPUTED_GOTO && defined __GCC__
 # define JIT_CGP
 #endif
Re: --optimize
On Tue, 18 Feb 2003 [EMAIL PROTECTED] wrote:
> I think --optimize alone is busted.

You'd be right: the code adding the optimization flags to the list of compiler flags is only executed if 'debugging' is defined, which is only the case when the --debugging flag has been used. The patches below seem to fix this problem and get the --optimize option working as intended.

One minor niggle that remains is that when I use the --debugging option, I get two copies of '-g' appended to the ldflags, i.e.:

LINKFLAGS = -L/usr/local/lib
LDFLAGS = -L/usr/local/lib -g -g

I'm not sure what's happening here, but fortunately it seems to be harmless.

Simon

--- config/init/debug.pl.old Thu Feb 20 21:56:31 2003
+++ config/init/debug.pl Thu Feb 20 22:01:42 2003
@@ -10,11 +10,11 @@ $description="Enabling debugging...";
 sub runstep {
   if (Configure::Data->get('debugging')) {
-    my($ccflags, $linkflags, $ldflags, $optimize) =
-      Configure::Data->get(qw(ccflags linkflags ldflags optimize));
+    my($ccflags, $linkflags, $ldflags) =
+      Configure::Data->get(qw(ccflags linkflags ldflags));
     my($cc_debug, $link_debug, $ld_debug) =
       Configure::Data->get(qw(cc_debug link_debug ld_debug));
-    $ccflags .= " $cc_debug $optimize";
+    $ccflags .= " $cc_debug";
     $linkflags .= " $link_debug";
     $ldflags .= " $ld_debug";
--- /dev/null Thu Aug 30 16:30:55 2001
+++ config/init/optimize.pl Thu Feb 20 22:01:57 2003
@@ -0,0 +1,26 @@
+package Configure::Step;
+
+use strict;
+use vars qw($description @args);
+use Parrot::Configure::Step;
+
+$description="Enabling optimization...";
+
[EMAIL PROTECTED]();
+
+sub runstep {
+  if (Configure::Data->get('optimize')) {
+    my($ccflags, $optimize) =
+      Configure::Data->get(qw(ccflags optimize));
+    $ccflags .= " $optimize";
+
+    Configure::Data->set(
+      ccflags => $ccflags,
+    );
+  }
+  else {
+    print "(none requested) ";
+  }
+}
+
+1;
--- lib/Parrot/Configure/RunSteps.pm.old Thu Feb 20 21:59:48 2003
+++ lib/Parrot/Configure/RunSteps.pm Thu Feb 20 22:00:03 2003
@@ -10,6 +10,7 @@ use vars qw(@steps);
 init/miniparrot.pl
 init/hints.pl
 init/debug.pl
+init/optimize.pl
 inter/progs.pl
 inter/types.pl
 inter/ops.pl
--- MANIFEST.old Thu Feb 20 22:13:05 2003
+++ MANIFEST Thu Feb 20 22:13:17 2003
@@ -112,6 +112,7 @@ config/init/hints/os2.pl
 config/init/hints/vms.pl
 config/init/manifest.pl
 config/init/miniparrot.pl
+config/init/optimize.pl
 config/inter/exp.pl
 config/inter/ops.pl
 config/inter/pmc.pl
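The effect of the patch can be seen in a small model of the two configure steps. This is a sketch, not the Parrot configure code: a plain dict stands in for Configure::Data, the function names are invented, and the flag values (-Wall, -g, -O2) are placeholders.

```python
def old_debug_step(data):
    """Before the patch: optimize flags were only added inside the debugging branch."""
    if data.get("debugging"):
        data["ccflags"] += " " + data["cc_debug"] + " " + data.get("optimize", "")


def new_debug_step(data):
    """After the patch: the debug step only handles the debug flags."""
    if data.get("debugging"):
        data["ccflags"] += " " + data["cc_debug"]


def optimize_step(data):
    """After the patch: a separate step adds the optimize flags on its own."""
    if data.get("optimize"):
        data["ccflags"] += " " + data["optimize"]


def configure(steps, **options):
    # Placeholder defaults; the real values come from the hints files.
    data = {"ccflags": "-Wall", "cc_debug": "-g"}
    data.update(options)
    for step in steps:
        step(data)
    return data["ccflags"]


# --optimize without --debugging: the old step never sees the flag.
print(configure([old_debug_step], optimize="-O2"))
print(configure([new_debug_step, optimize_step], optimize="-O2"))
```

With the steps separated, either option works without the other, and using both still yields both sets of flags.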
Re: Using imcc as JIT optimizer
Leopold Toetsch wrote:
> Starting from the unbearable fact that optimized compiled C is still faster than parrot -j (in primes.pasm)

Lol...what are you going to do when somebody comes along with the unbearable example of primes.s (optimized x86 assembly), and you are forced to throw up your hands in defeat? ;-)

Cool idea, if I understand correctly, and I am in awe of how fast the bloody thing is already.

-Tupshin
Re: Objects, methods, attributes, properties, and other related frobnitzes
At 2:06 PM + 2/19/03, Peter Haworth wrote:
> On Fri, 14 Feb 2003 15:56:25 -0500, Dan Sugalski wrote:
>> I got clarification. The sequence is:
>>
>> 1) Search for method of the matching name in inheritance tree
>> 2) if #1 fails, search for an AUTOLOAD
>> 3) if #2 fails (or all AUTOLOADs give up) then do MM dispatch
>
> Shouldn't we be traversing the inheritance tree once, doing these three steps at each node until one works, rather than doing each step once for the whole tree? MM dispatch probably complicates this, though.

No, you have to do it multiple times. AUTOLOAD is a last-chance fallback, so it ought not be called until all other chances have failed.

> If my derived class has an autoloaded method which overrides the base class' method, I don't want the base class method to be called, just because parrot does things in a peculiar order. Well, I know it's the same order that perl5 does things, but it's still peculiar.

If you prototype the sub but AUTOLOAD the body it'll work OK.

--
Dan

--"it's like this"--
Dan Sugalski [EMAIL PROTECTED]
even samurai have teddy bears and even teddy bears get drunk
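The three-step sequence under discussion can be modelled like this. It is a sketch under assumptions: the Klass type, the depth-first walk order, and the convention that an AUTOLOAD "gives up" by returning None are all invented for illustration and are not Parrot's object model.

```python
class Klass:
    """Toy class record: a name, a method table, and a list of parents."""

    def __init__(self, name, methods=None, parents=()):
        self.name = name
        self.methods = methods or {}
        self.parents = list(parents)


def linearize(cls):
    """Depth-first, left-to-right walk of the inheritance tree (assumed order)."""
    order = [cls]
    for parent in cls.parents:
        for c in linearize(parent):
            if c not in order:
                order.append(c)
    return order


def dispatch(cls, name, mm_dispatch):
    tree = linearize(cls)
    # 1) search the whole tree for a method of the matching name
    for c in tree:
        if name in c.methods:
            return c.methods[name]
    # 2) only if that fails, search the tree again for AUTOLOADs
    for c in tree:
        autoload = c.methods.get("AUTOLOAD")
        if autoload is not None:
            found = autoload(name)
            if found is not None:  # None means this AUTOLOAD gave up
                return found
    # 3) all AUTOLOADs gave up (or there were none): MM dispatch
    return mm_dispatch(name)


base = Klass("Base", {"speak": lambda: "base speak"})
derived = Klass(
    "Derived",
    {"AUTOLOAD": lambda name: (lambda: f"autoloaded {name}")},
    parents=[base],
)
mm = lambda name: (lambda: f"mm {name}")

print(dispatch(derived, "speak", mm)())
print(dispatch(derived, "jump", mm)())
print(dispatch(base, "jump", mm)())
```

The first call shows the case Peter objects to: the base class method wins over the derived class AUTOLOAD, because step 1 covers the whole tree before any AUTOLOAD is consulted.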
Configure.pl --cgoto=0 doesn't work
If I

perl Configure.pl --cgoto=0
make all test

then the build fails with:

ccache /usr/local/bin/gcc -DHAS_FPSETMASK -DHAS_FLOATINGPOINT_H -I/usr/local/include -Wall -Wstrict-prototypes -Wmissing-prototypes -Winline -Wshadow -Wpointer-arith -Wcast-qual -Wcast-align -Wwrite-strings -Waggregate-return -Winline -W -Wno-unused -Wsign-compare -Wformat-nonliteral -Wformat-security -Wpacked -Wpadded -Wdisabled-optimization -I./include -DHAS_JIT -DI386 -o jit_cpu.o -c jit_cpu.c
In file included from jit_cpu.c:39:
include/parrot/jit_emit.h:2302:39: parrot/oplib/core_ops_cgp.h: No such file or directory
In file included from jit_cpu.c:39:
include/parrot/jit_emit.h: In function `Parrot_jit_begin':
include/parrot/jit_emit.h:2349: `cgp_core' undeclared (first use in this function)
include/parrot/jit_emit.h:2349: (Each undeclared identifier is reported only once
include/parrot/jit_emit.h:2349: for each function it appears in.)
*** Error code 1

And if I don't disable cgoto I run out of swap. :-(

I've got 96M of swap, which happens to be enough to build world, perl and gcc 3.2. I've not tried X yet :-)

Nicholas Clark
Re: Using imcc as JIT optimizer
Tupshin Harper wrote:
> Leopold Toetsch wrote:
>> Starting from the unbearable fact that optimized compiled C is still faster than parrot -j (in primes.pasm)
>
> Lol...what are you going to do when somebody comes along with the unbearable example of primes.s (optimized x86 assembly), and you are forced to throw up your hands in defeat? ;-)

It only may be equally fast, that's it :)

> Cool idea, if I understand correctly, and I am in awe of how fast the bloody thing is already.

That's integer/float only. When it comes to objects, different things matter.

> -Tupshin

leo
Re: Objects, methods, attributes, properties, and other related frobnitzes
Dan Sugalski [EMAIL PROTECTED]:
> At 2:06 PM + 2/19/03, Peter Haworth wrote:
>> On Fri, 14 Feb 2003 15:56:25 -0500, Dan Sugalski wrote:
>>> I got clarification. The sequence is:
>>>
>>> 1) Search for method of the matching name in inheritance tree
>>> 2) if #1 fails, search for an AUTOLOAD
>>> 3) if #2 fails (or all AUTOLOADs give up) then do MM dispatch
>>
>> Shouldn't we be traversing the inheritance tree once, doing these three steps at each node until one works, rather than doing each step once for the whole tree? MM dispatch probably complicates this, though.
>
> No, you have to do it multiple times. AUTOLOAD is a last-chance fallback, so it ought not be called until all other chances have failed.

Pardon me for coming in in the middle, but it seems to me that only one traversal should be necessary. The first traversal can accumulate a temporary linked list of AUTOLOAD subroutines. If the first traversal locates an appropriate method, the linked list is discarded. If no appropriate method is found, control is dispatched to the AUTOLOAD subroutine at the head of the list, if there is one; if the list is empty, the MM dispatch is tried.
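The single-traversal variant proposed here might look like the following. Again a sketch under assumptions: Klass and the depth-first walk are invented for illustration, a Python list stands in for the temporary linked list, and trying later AUTOLOADs when the head of the list gives up (returns None) is an addition taken from Dan's step 3, not something this post spells out.

```python
class Klass:
    """Toy class record: a name, a method table, and a list of parents."""

    def __init__(self, name, methods=None, parents=()):
        self.name = name
        self.methods = methods or {}
        self.parents = list(parents)


def linearize(cls):
    """Depth-first, left-to-right walk of the inheritance tree (assumed order)."""
    order = [cls]
    for parent in cls.parents:
        for c in linearize(parent):
            if c not in order:
                order.append(c)
    return order


def dispatch_single_pass(cls, name, mm_dispatch):
    pending_autoloads = []  # stands in for the temporary linked list
    for c in linearize(cls):
        if name in c.methods:
            # A real method was found: the accumulated list is simply dropped.
            return c.methods[name]
        autoload = c.methods.get("AUTOLOAD")
        if autoload is not None:
            pending_autoloads.append(autoload)
    # No method anywhere in the tree: start at the head of the list.
    for autoload in pending_autoloads:
        found = autoload(name)
        if found is not None:
            return found
    # Empty list, or every AUTOLOAD gave up: fall back to MM dispatch.
    return mm_dispatch(name)


base = Klass("Base", {"speak": lambda: "base speak"})
derived = Klass(
    "Derived",
    {"AUTOLOAD": lambda name: (lambda: f"autoloaded {name}")},
    parents=[base],
)
mm = lambda name: (lambda: f"mm {name}")

print(dispatch_single_pass(derived, "speak", mm)())
print(dispatch_single_pass(derived, "jump", mm)())
print(dispatch_single_pass(base, "jump", mm)())
```

The results match the multi-pass order Dan describes, since no AUTOLOAD is called until the whole tree has been searched; the only difference is that the tree is walked once.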