Re: segfault in parrot using pugs PIR output
On Thu, 2005-06-30 at 19:36 +0200, Leopold Toetsch wrote: > Brian Wheeler wrote: > > Its been a while since I tinkered with parrot so I thought I'd start > > playing again...but I've hit a segfault. > > Should of course not happen... But it seems that the codegen is mixing > old and new calling conventions. > > The trace: > > > 8233 set I0, 1- I0=1, > > 8236 set I1, 0- I1=0, > > 8239 set I2, 0- I2=0, > > 8242 set I3, 0- I3=0, > > 8245 set I4, 0- I4=0, > > 8248 set S1, "" - S1="", > > 8251 set P0, PMC_C[733] - P0=Sub=PMC(0x8af24b8 pc:8023), > > 8254 invokecc > > clearly shows some wrong function call, like another one I found in the > pir source: > > > .namespace ['main'] > > > main() > > This shorthand call syntax doesn't work together (yet) with the new > set_args/get_params ... opcodes. > Ok, that seems reasonable, considering the newness of the new calling conventions. > So please verify that no function shortcut syntax is used in code > generation. > > leo > Thanks! Not knowing Haskell (who said _perl_ looks like line noise?) this patch to src/Pugs/CodeGen/PIR.hs gets it working to the point I can test an optimizer I'm tinkering with: @@ -320,6 +320,7 @@ varInit ('&':_) = text $ "PerlScalar" varInit x = error $ "invalid name: " ++ x + {-| Compiles the current environment to PIR code. -} genPIR :: Eval Val genPIR = do @@ -374,11 +375,17 @@ -- XXX wrong, should be lexical , InsNew tempPMC PerlScalar , "store_global".- [lit "$_", tempPMC] -]) ++ [ StmtRaw (text (name ++ "()")) | PSub name@('_':'_':_) _ _ _ <- globPIL ] ++ -[ StmtRaw (text "main()") -, StmtIns ("exit" .- [lit0]) -] -, DeclSub "main" [SubANON] [ StmtRaw $ emit mainPIR ] + ] ++ +{- ) ++ [ StmtRaw (text (name ++ "()")) | PSub name@('_':'_':_) _ _ _ <- globPIL ] ++ -} + [ InsNew tempPMC PerlScalar + , tempPMC <-- "find_name" $ [lit "__main"] + , "set_args".- [lit "()"] + , InsNew tempPMC2 PerlScalar + , "get_results" .-[lit "(0)", tempPMC2] + , "invokecc".-[tempPMC] + , "exit".-[lit0] + ]) +, DeclSub "__main" {- [SubANON] -} [] [ StmtRaw $ emit mainPIR ] ] ] ] where style = MkEvalStyle Brian
Re: eof opcode
Fair enough. However, shouldn't the rest of the opcodes with an IO object as their parameter be methods as well? Its not a lot of ops, but it would trim down the core a bit. Brian On Thu, 2004-11-25 at 08:00 +0100, Leopold Toetsch wrote: > Brian Wheeler <[EMAIL PROTECTED]> wrote: > > I noticed a hole in the io.ops where the PIO stuff wasn't covered. This > > patch creates an eof opcode which checks for end of file. > > Please just use the "eof" method of the PIO object: > > $I0 = $P0."eof"() > > leo -- brian wheeler <[EMAIL PROTECTED]>
eof opcode
I noticed a hole in the io.ops where the PIO stuff wasn't covered. This patch creates an eof opcode which checks for end of file. Brian Wheeler [EMAIL PROTECTED] cvs diff: Diffing . cvs diff: Diffing ops Index: ops/io.ops === RCS file: /cvs/public/parrot/ops/io.ops,v retrieving revision 1.51 diff -u -r1.51 io.ops --- ops/io.ops 29 Sep 2004 15:44:29 - 1.51 +++ ops/io.ops 24 Nov 2004 21:28:57 - @@ -42,6 +42,19 @@ +=item B(out INT, in PMC) + +Check for end-of-file condition in IO object $2 and return the status in $1 + +=cut + +inline op eof(out INT, in PMC) :base_io { + $1 = PIO_eof(interpreter,$2); + goto NEXT(); +} + + + =item B(out PMC, in INT, in STR) Create ParrotIO object in $1 as a copy of file descriptor $2. -- brian wheeler <[EMAIL PROTECTED]>
Re: Problems with 0.1.1 release on x86-64
Sigh. I'll get this right sometime! Brian Index: config/auto/jit.pl === RCS file: /cvs/public/parrot/config/auto/jit.pl,v retrieving revision 1.33 diff -u -r1.33 jit.pl --- config/auto/jit.pl 8 Mar 2004 08:49:05 - 1.33 +++ config/auto/jit.pl 19 Oct 2004 18:38:41 - @@ -171,9 +171,9 @@ else { Configure::Data->set( jitarchname => 'nojit', - jitcpuarch => 'i386', - jitcpu => 'I386', - jitosname => 'nojit', + jitcpuarch => $cpuarch, + jitcpu => $cpuarch, + jitosname => $osname, jitcapable => 0, execcapable => 0, cc_hasjit => '', Index: config/auto/memalign.pl === RCS file: /cvs/public/parrot/config/auto/memalign.pl,v retrieving revision 1.10 diff -u -r1.10 memalign.pl --- config/auto/memalign.pl 13 Oct 2004 14:37:59 - 1.10 +++ config/auto/memalign.pl 19 Oct 2004 18:38:41 - @@ -42,6 +42,13 @@ Configure::Data->set('malloc_header', 'stdlib.h'); } +if (Configure::Data->get('ptrsize') == Configure::Data->get('intsize')) { + Configure::Data->set('ptrcast','int'); + } +else { + Configure::Data->set('ptrcast','long'); + } + cc_gen('config/auto/memalign/test_c.in'); eval { cc_build(); }; unless ($@ || cc_run_capture() !~ /ok/) { Index: config/auto/memalign/test_c.in === RCS file: /cvs/public/parrot/config/auto/memalign/test_c.in,v retrieving revision 1.4 diff -u -r1.4 test_c.in --- config/auto/memalign/test_c.in 13 Jul 2003 18:52:37 - 1.4 +++ config/auto/memalign/test_c.in 19 Oct 2004 18:38:41 - @@ -9,6 +9,6 @@ int main(int argc, char **argv) { void *ptr = memalign(256, 17); - puts(ptr && ((int)ptr & 0xff) == 0 ? "ok" : "nix"); + puts(ptr && ((${ptrcast})ptr & 0xff) == 0 ? "ok" : "nix"); return 0; } Index: config/auto/memalign/test_c2.in === RCS file: /cvs/public/parrot/config/auto/memalign/test_c2.in,v retrieving revision 1.3 diff -u -r1.3 test_c2.in --- config/auto/memalign/test_c2.in 13 Jul 2003 18:52:37 - 1.3 +++ config/auto/memalign/test_c2.in 19 Oct 2004 18:38:41 - @@ -20,6 +20,6 @@ * arbitrary allocation size) */ int i = posix_memalign(&p, s, 177); - puts(((int)p & 0xff) == 0 && i == 0 ? "ok" : "nix"); + puts(((${ptrcast})p & 0xff) == 0 && i == 0 ? "ok" : "nix"); return i; }
Re: Problems with 0.1.1 release on x86-64
Sorry for the delay...work interfered with my playing and I had to transfer my CVS repo to my x86-64 machine. I don't know if I'd classify it as "silence thereafter..." as in the summary, but its pretty close :) Here's the diff against the current CVS. It doesn't mess with the default class that needs the split for the return & exception. Brian Wheeler [EMAIL PROTECTED] Index: config/auto/jit.pl === RCS file: /cvs/public/parrot/config/auto/jit.pl,v retrieving revision 1.33 diff -u -r1.33 jit.pl --- config/auto/jit.pl 8 Mar 2004 08:49:05 - 1.33 +++ config/auto/jit.pl 18 Oct 2004 05:25:57 - @@ -171,9 +171,9 @@ else { Configure::Data->set( jitarchname => 'nojit', - jitcpuarch => 'i386', - jitcpu => 'I386', - jitosname => 'nojit', + jitcpuarch => $cpuarch, + jitcpu => $cpuarch, + jitosname => $osname, jitcapable => 0, execcapable => 0, cc_hasjit => '', Index: config/auto/memalign.pl === RCS file: /cvs/public/parrot/config/auto/memalign.pl,v retrieving revision 1.10 diff -u -r1.10 memalign.pl --- config/auto/memalign.pl 13 Oct 2004 14:37:59 - 1.10 +++ config/auto/memalign.pl 18 Oct 2004 05:25:57 - @@ -42,6 +42,13 @@ Configure::Data->set('malloc_header', 'stdlib.h'); } +if (Configure::Data->get('ptrsize') == Configure::Data->get('intsize')) { + Configure::Data->set('ptrcast','int'); + } +else { + Configure::Data->set('ptrcast','long'); + } + cc_gen('config/auto/mema On Thu, 2004-10-14 at 06:37, Leopold Toetsch wrote: > Brian Wheeler wrote: > > > > * cast warnings in default.pmc. Changing static int cant_do_method to > > static long cant_do_method makes it compile without warnings, but its > > not the right fix. > > Better would be to split the return statement and the exception in the > generated code. > > > Below is a patch which fixes the first 3. > > Doesn't apply. Please rediff to current CVS and attach the patch. > > Thanks, > leolign/test_c.in'); eval { cc_build(); }; unless ($@ || cc_run_capture() !~ /ok/) { Index: config/auto/memalign/test_c.in === RCS file: /cvs/public/parrot/config/auto/memalign/test_c.in,v retrieving revision 1.4 diff -u -r1.4 test_c.in --- config/auto/memalign/test_c.in 13 Jul 2003 18:52:37 - 1.4 +++ config/auto/memalign/test_c.in 18 Oct 2004 05:25:57 - @@ -9,6 +9,6 @@ int main(int argc, char **argv) { void *ptr = memalign(256, 17); - puts(ptr && ((int)ptr & 0xff) == 0 ? "ok" : "nix"); + puts(ptr && ((${ptrcast})ptr & 0xff) == 0 ? "ok" : "nix"); return 0; } Index: config/auto/memalign/test_c2.in === RCS file: /cvs/public/parrot/config/auto/memalign/test_c2.in,v retrieving revision 1.3 diff -u -r1.3 test_c2.in --- config/auto/memalign/test_c2.in 13 Jul 2003 18:52:37 - 1.3 +++ config/auto/memalign/test_c2.in 18 Oct 2004 05:25:57 - @@ -20,6 +20,6 @@ * arbitrary allocation size) */ int i = posix_memalign(&p, s, 177); - puts(((int)p & 0xff) == 0 && i == 0 ? "ok" : "nix"); + puts(((${ptrcast})p & 0xff) == 0 && i == 0 ? "ok" : "nix"); return i; }
Problems with 0.1.1 release on x86-64
* PARROT_CPU_ARCH is defined as "i386" and PARROT_OS_NAME is "nojit" when jit determination fails. It now correctly (?) reports 'x86_64' and 'linux'. * Memory alignment tests warn about a pointer size difference in cast. A new configuration setting ptrcast is either 'int' or 'long' depending on what type a pointer is safe to cast to/from. I suspect that all cases will be 'long', but you never really know. * cast warnings in default.pmc. Changing static int cant_do_method to static long cant_do_method makes it compile without warnings, but its not the right fix. * Build fails with: gcc -shared -fPIC -g \ -o runtime/parrot/dynext/libnci.so src/nci_test.o /usr/bin/ld: src/nci_test.o: relocation R_X86_64_32 can not be used when making a shared object; recompile with -fPIC src/nci_test.o: could not read symbols: Bad value collect2: ld returned 1 exit status make: *** [runtime/parrot/dynext/libnci.so] Error 1 No idea how to fix this one...adding --ccflags=-fPIC doesn't help the compile. Below is a patch which fixes the first 3. Brian Wheeler [EMAIL PROTECTED] diff -ur ../parrot-0.1.1/classes/default.pmc ./classes/default.pmc --- ../parrot-0.1.1/classes/default.pmc 2004-10-06 10:55:36.0 -0500 +++ ./classes/default.pmc 2004-10-13 23:22:01.089477209 -0500 @@ -52,7 +52,7 @@ */ -static int +static long cant_do_method(Parrot_Interp interpreter, PMC * pmc, const char *methname) { internal_exception(ILL_INHERIT, diff -ur ../parrot-0.1.1/config/auto/jit.pl ./config/auto/jit.pl --- ../parrot-0.1.1/config/auto/jit.pl 2004-03-08 04:29:29.0 -0500 +++ ./config/auto/jit.pl2004-10-13 22:59:05.551927726 -0500 @@ -171,9 +171,9 @@ else { Configure::Data->set( jitarchname => 'nojit', - jitcpuarch => 'i386', - jitcpu => 'I386', - jitosname => 'nojit', + jitcpuarch => $cpuarch, + jitcpu => $cpuarch, + jitosname => $osname, jitcapable => 0, execcapable => 0, cc_hasjit => '', diff -ur ../parrot-0.1.1/config/auto/memalign/test_c2.in ./config/auto/memalign/test_c2.in --- ../parrot-0.1.1/config/auto/memalign/test_c2.in 2003-07-13 13:52:58.0 -0500 +++ ./config/auto/memalign/test_c2.in 2004-10-13 23:09:54.87570 -0500 @@ -20,6 +20,6 @@ * arbitrary allocation size) */ int i = posix_memalign(&p, s, 177); - puts(((int)p & 0xff) == 0 && i == 0 ? "ok" : "nix"); + puts(((${ptrcast})p & 0xff) == 0 && i == 0 ? "ok" : "nix"); return i; } diff -ur ../parrot-0.1.1/config/auto/memalign/test_c.in ./config/auto/memalign/test_c.in --- ../parrot-0.1.1/config/auto/memalign/test_c.in 2003-07-13 13:52:58.0 -0500 +++ ./config/auto/memalign/test_c.in2004-10-13 23:09:59.903596577 -0500 @@ -9,6 +9,6 @@ int main(int argc, char **argv) { void *ptr = memalign(256, 17); - puts(ptr && ((int)ptr & 0xff) == 0 ? "ok" : "nix"); + puts(ptr && ((${ptrcast})ptr & 0xff) == 0 ? "ok" : "nix"); return 0; } diff -ur ../parrot-0.1.1/config/auto/memalign.pl ./config/auto/memalign.pl --- ../parrot-0.1.1/config/auto/memalign.pl 2004-03-13 13:46:24.0 -0500 +++ ./config/auto/memalign.pl 2004-10-13 23:09:48.870037359 -0500 @@ -39,6 +39,14 @@ Configure::Data->set('malloc_header', 'stdlib.h'); } +if (Configure::Data->get('ptrsize') == Configure::Data->get('intsize')) { + Configure::Data->set('ptrcast','int'); + } +else { + Configure::Data->set('ptrcast','long'); +} + + cc_gen('config/auto/memalign/test_c.in'); eval { cc_build(); }; unless ($@ || cc_run_capture() !~ /ok/) {
Re: Python bytecode notes, part one
On Mon, 2004-07-05 at 19:27, Dan Sugalski wrote: UNARY_NEGATIVE: restore $Px; new $Py, Undef; $Py = $Px * -1; save $Py | $Px = -1 * $Px Wouldn't something this do what is desired? I'm just guessing, though. Brian
Re: Current CVS broken?
Yep, that did it! Thanks! Brian On Mon, 2003-06-09 at 23:11, Paul Fisher wrote: > Brian Wheeler <[EMAIL PROTECTED]> writes: > > > Any thoughts? > > Are you running RHL 9? If so, unset LANG and try rebuilding. > > Default RHL 9 w/ LANG="en_US.UTF-8": > > $ make jit_cpu.c > /usr/bin/perl jit2h.pl i386 jit_cpu.c > jit2h: 1 (+ 177 vtable) of 1050 ops are JITed. > > and with unset LANG... > > $ rm jit_cpu.c; unset LANG; make jit_cpu.c > /usr/bin/perl jit2h.pl i386 jit_cpu.c > jit2h: 204 (+ 177 vtable) of 1050 ops are JITed. > > Details on the broken Perl in RHL 9 are at > http://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=87682>.
Re: Current CVS broken?
On Mon, 2003-06-09 at 15:23, Leopold Toetsch wrote: > Brian Wheeler <[EMAIL PROTECTED]> wrote: > > Its been a while since I've looked at parrot, so I did a "cvs update > > -d", "perl Configure.pl", "make clean", "make" and build failed: > > $ make realclean > > should help. We don't have all dependencies in generated files sorted out > that far - sorry. > Doesn't, I'm afraid. I removed everything but the CVS directory at the top level, repopulated everything and tried to build it again with the same results Any thoughts? Brian > > It can't find Parrot_end_jit. Sure enough, its not there: > > Its in jit_cpu.c generated from core.jit. > > > Brian Wheeler > > leo
Current CVS broken?
Its been a while since I've looked at parrot, so I did a "cvs update -d", "perl Configure.pl", "make clean", "make" and build failed: $ make gcc -o parrot -L/usr/local/lib test_main.o blib/lib/libparrot.a -lnsl -ldl -lm -lpthread -lcrypt -lutil blib/lib/libparrot.a(jit_cpu.o)(.text+0x2ce0): In function `Parrot_jit_restart_op': : undefined reference to `Parrot_end_jit' collect2: ld returned 1 exit status make: *** [parrot] Error 1 It can't find Parrot_end_jit. Sure enough, its not there: $ nm blib/lib/libparrot.a | grep Parrot_end_jit U Parrot_end_jit $ grep --recursive Parrot_end_jit * include/parrot/jit_emit.h:static void Parrot_end_jit(Parrot_jit_info_t *, struct Parrot_Interp * ); include/parrot/jit_emit.h:Parrot_end_jit(jit_info, interpreter); jit/i386/jit_emit.h:static void Parrot_end_jit(Parrot_jit_info_t *, struct Parrot_Interp * ); jit/i386/jit_emit.h:Parrot_end_jit(jit_info, interpreter); Did I miss something obvious? Brian Wheeler [EMAIL PROTECTED]
RE: Parrot 0.0.9
On Tue, 2002-10-29 at 11:48, Brent Dax wrote: > [EMAIL PROTECTED]: > # > Well, on thinking a bit about this, there's no reason that > # we have to > # > worry--it's perfectly OK for us to declare, unconditionally, that > # > segment 0 is always bytecode, 1 line number info, and so on, with > # > everything after position X (for some value of X) left up > # in the air. > # > A bit dodgy, true, as it means that any new known segment > # types we add > # > in will be floating, but I don't think we're going to end > # up with too > # > many performance-critical pieces in the bytecode. (Arguably > # it's just > # > the bytecode itself, the symbols, and the constants, as the > # rest are > # > looked at under exceptional circumstances or on (rare) demand) > # > # *No* > # > # This really kills extendability, or makes it at least very > # ugly. It needs to prealloc a certain number of segments. Each > # of this has a fixed semantic. Extending means consuming on of > # the preallocated fields, or using some segment beyond the > # preallocated area but then it needs a type field. In fact the > # preallocated segments also have a > # type-field: the position in the packfile. > > How about this structure: > > HEADER > SEGMENT 0 > CHUNK 0 (DIRECTORY) > SIZE: > DATA: > CHUNK 0 ENTRY > TYPE: DIRECTORY (type 0) > OFFSET: Is this really necessary? Seems like a chicken-and-egg thing: to know which chuck the directory is in, you need to read the directory. However, since you've defined that the first chunk (0) is always the directory, there's really no need to have it in the directory since you know it has to be the first chunk. Out of curiosity, would I need a separate segments if I was going to have multiple versions of the program (say, one debugging and one optimized) in the same file? It looks that way. Will the segment/chunk ids's be consistent between builds & how do I know what they will be in advance (for dynamically loading the 'debugging' version on demand) ? Brian
Re: [netlabs #757] Problem mixing labels, comments and quote-marks
On Sat, 2002-07-13 at 12:32, Tom Hughes wrote: > In message <20020703012231$[EMAIL PROTECTED]> > Simon Glover (via RT) <[EMAIL PROTECTED]> wrote: > > > This code: > > > > A:# prints "a" > > print "a" > > end > > > > doesn't assemble; the assembler dies with the error message: > > > > Use of uninitialized value in hash element at assemble.pl line 844. > > Couldn't find operator '' on line 1. > > > > If you remove the ""s from the comment, it works fine. Likewise, if > > you put the label, op and comment on the same line, ie: > > > >A: print "a" # prints "a" > > end > > > > then it assembles and runs OK. > > Here's a patch that will fix this. I havn't committed it because I'm > not sure why the assember wasn't dropping comments that included quotes > so I'm giving people who know more about the assembler than me a chance > to comment first... I believe it wasn't dropping the comments with quotes as a side effect of not wanting to break things like: print "#" which breaks with the included patch. I basically had the same patch you do, but wasn't able to figure out how to handle the above case *and* do the right thing with # prints "a" Brian > > Tom > > -- > Tom Hughes ([EMAIL PROTECTED]) > http://www.compton.nu/ > > > Index: assemble.pl > === > RCS file: /cvs/public/parrot/assemble.pl,v > retrieving revision 1.77 > diff -u -r1.77 assemble.pl > --- assemble.pl 4 Jul 2002 18:36:17 - 1.77 > +++ assemble.pl 13 Jul 2002 17:30:48 - > @@ -433,7 +433,7 @@ > >$self->{pc}++; >return if $line=~/^\s*$/ or $line=~/^\s*#/; # Filter out the comments and blank >lines > - $line=~s/#[^'"]+$//; # Remove trailing comments > + $line=~s/#.*$//; # Remove trailing comments >$line=~s/(^\s+|\s+$)//g; # Remove leading and trailing whitespace ># ># Accumulate lines that only have labels until an instruction is found..
[PATCH] 3-arg chopn
I saw this was a TODO item in core.ops. Brian --- core.ops1 Jul 2002 17:18:04 - 1.176 +++ core.ops2 Jul 2002 19:41:44 - @@ -2074,9 +2074,9 @@ =item B(inout STR, in INT) -Remove $2 characters from the end of the string in $1. +=item B(out STR, in STR, in INT) -TODO: Create a three-argument version of this? Don't force in-place modification.' +Remove $2 characters from the end of the string in $1. =cut @@ -2085,6 +2085,11 @@ goto NEXT(); } +inline op chopn(out STR, in STR, in INT) { + $1 = string_copy(interpreter, $2); + (void)string_chopn($1,$3); + goto NEXT(); +}
[Patch] typeof op
This patch implements a typeof op which returns the integer type or string type of a PMC. The test I used is: new P0,.PerlInt typeof S0,P0 eq S0,"PerlInt",OK_1 print "not " OK_1: print "ok 1\\n" typeof I0,P0 eq I0,.PerlInt,OK_2 print "not " OK_2: print "ok 2\\n" end This should allow a program to know what kind of PMC they're playing with. Brian diff -u -r1.174 core.ops --- core.ops28 Jun 2002 18:50:05 - 1.174 +++ core.ops29 Jun 2002 18:49:32 - @@ -3884,6 +3884,27 @@ goto NEXT(); } + +=item B(out STR, in PMC) + +=item B(out INT, in PMC) + +Return the type of PMC in $2. + +=cut + +inline op typeof (out STR, in PMC) { + PMC *p = $2; + $1 = (p->vtable->name(interpreter, p)); + goto NEXT(); +} + +inline op typeof (out INT, in PMC) { + PMC *p = $2; + $1 = (p->vtable->type(interpreter, p)); + goto NEXT(); +} + =item B(out INT, in STR)
Re: .include directive for new assembler
On Sat, 2002-06-22 at 20:12, Jeff wrote: > brian wheeler wrote: > > > > Its not backwards, it does the right thing. > > Okay, I believe you now :) I was thinking that the insert was done at > the beginning of the -file-, not the insertion point of the file. If you > haven't committed, feel free to do so. I shouldn't have stuck my nose in > :) > no problem! I have a tendency to not actually test my code thoroughly! Brian > > The "at the beginning" part > > is correct since the source is treated like a queue, and the first item > > is shifted off. When an .include is found, the file is inserted at the > > beginning of the queue, since all of the code prior to the .include have > > been removed (shifted) off. > > > > So this (also contrived) example works as expected: > > > > test.pasm > > -- > > print "yo!\n" > > .include "foo.pasm" > > .include "foo2.pasm" > > print "you are here\n" > > end > > -- > > > > foo.pasm > > -- > > print "hello " > > -- > > > > foo2.pasm > > -- > > print "world\n" > > -- > > > > Does that make more sense? > > > > Brian > > > > > -- > > > Jeff <[EMAIL PROTECTED]>
Re: .include directive for new assembler
On Sat, 2002-06-22 at 13:06, Jeff wrote: > Clinton A Pierce wrote: > > > > At 09:37 PM 6/21/2002 -0500, brian wheeler wrote: > > >I've implemented a .include directive for the new assembler. It > > >basically changes the preprocessor to shift through the source file, and > > >when an include is found, the included file is unshifted to the > > >beginning. > > > > To the beginning? Do we have any pre-processor directives (.constant, > > etc...) that are sensitive to where they're done? > > > > Either way, I'm fine with it. As soon as it's in, I'll fix BASIC to remove > > the hand-rolled .include stuff it has now. > > I hadn't read the 'beginning' part, unfortunately. That actually could > change a lot of things. If it's unshifted onto the beginning of the > file, then each new file will become the main as it's unshifted onto the > beginning. For example, any code outside of a macro declaration would > get run before the main file, and potentially initialize in the wrong > order. > > To be fair, this is more of a fault with the current assembler not > having built-in support for subroutine boundaries. Correct me please, > Brian, but I'm envisioning a situation like this: > > .include "foo.pasm" > .include "bar.pasm" > > MAIN: > print "hi" > end > > Expanding to: > > Initialize_Bar: > set I0,32 > branch END_OF_BAR # This is, of course, something of a bogus mechanism > to begin with, but it should make my point. > .constant IO_VECTOR S31 > # Some subroutines > END_OF_BAR: > Initialize_Foo: > set I1,42 > branch END_OF_FOO # See above comment > .constant IO_VECTOR S30 > END_OF_FOO: > > MAIN: > print "hi" > end > > The code is admittedly highly contrived, but note that initialization is > being done in the wrong order with respect to the files (foo.pasm's init > should be run then bar.pasm), and IO_VECTOR is defined to be S30 rather > than S31. The Initialize_Bar and Initialize_Foo functions have just been > called in the wrong order. > > Now, I realize that modifying that bit of the assembler to count its > position in the array of lines is non-trivial, and that's entirely my > fault. However, having it attach files to the beginning of the list > would violate the principle of least surprise, and could cause problems > with conditional macros, which I've designed but haven't seen a need to > implement yet. > > If you wouldn't mind rewriting the patch so that it substitutes files > inline at the point of .include'ing I'll be happy to revert the old > patch and put your new one in. Sorry about the seeming change of mind, > but this is what I get for affirming stuff at obscenely late hours. Its not backwards, it does the right thing. The "at the beginning" part is correct since the source is treated like a queue, and the first item is shifted off. When an .include is found, the file is inserted at the beginning of the queue, since all of the code prior to the .include have been removed (shifted) off. So this (also contrived) example works as expected: test.pasm -- print "yo!\n" ..include "foo.pasm" ..include "foo2.pasm" print "you are here\n" end -- foo.pasm -- print "hello " -- foo2.pasm -- print "world\n" -- Does that make more sense? Brian > -- > Jeff <[EMAIL PROTECTED]>
.include directive for new assembler
I've implemented a .include directive for the new assembler. It basically changes the preprocessor to shift through the source file, and when an include is found, the included file is unshifted to the beginning. Should I commit it? Brian --- assemble.pl 17 Jun 2002 03:18:17 - 1.74 +++ assemble.pl 22 Jun 2002 02:39:15 - @@ -227,7 +227,9 @@ my $reg_re = qr([INSP]\d+); my $num_re = qr([-+]?\d+(\.\d+([eE][-+]?\d+)?)?); - for(@{$self->{cur_contents}}) { + my @todo=@{$self->{cur_contents}}; + while(scalar(@todo)) { +$_=shift(@todo); $line++; # @@ -263,16 +265,19 @@ elsif(/^\.include \s+ "([^"]+)" /x) {# .include "{file}" -# if(-e $1) { -#open FOO,"< $1"; -#while() { -# chomp; -#} -#close FOO; -# } -# else { -#print STDERR "Couldn't open '$1' for inclusion at line $line: $!.\n"; -# } + if(-e $1) { +open FOO,"< $1"; +my @include; +while() { + chomp; + push(@include,$_); +} +unshift(@todo,@include); +close FOO; + } + else { +print STDERR "Couldn't open '$1' for inclusion at line $line: $!.\n"; + } } elsif(/^\.macro\s+ ($label_re) \s*
Re: quicksort in pasm
Crud, I forgot to attach the quicksort in the last one... Brian On Sat, 2002-05-25 at 21:17, brian wheeler wrote: > On Fri, 2002-05-24 at 15:10, Sean O'Rourke wrote: > > > I was starting with a very simple test to decide how to determine where the > > > memory overuse was coming from, > > > > I'm actually looking at this now as well, though with zip2.pasm instead of > > quicksort. What I've found is that because zip constructs the result with > > incremental packing, and since the 3-argument pack is implemented through > > string_concat, which allocates a new string every time, it chews through > > vast quantities of memory. At some point, these buffers (which are now > > ~45k each) stop getting freed, and Parrot grows from < 10M to about 200M > > before my laptop dies a horrible death. > > > > Sort does things much differently, but I wouldn't be surprised if it's > > hittng the same things-not-being-reclaimed bug as zip. > > > > > ["interesting" sorting results] > > > Am I missing something somewhere? > > > > You've probably got it in EBCDIC mode or something. That or it's a > > "randomized" quicksort, which is supposed to be faster. > > > > /s > > Actually I was getting the wrong results because I was using ge & le to > test the bounds when I should have used gt & lt. > > The attached quicksort fixes it. Its still _huge_ when the memory runs. > > > Looking at the profile data, it looks like Parrot_pushs is the big time > user... > % cumulative self self total > time seconds secondscalls ns/call ns/call name > 66.67 0.02 0.02 singlebyte_decode > 33.33 0.03 0.01 646 15479.88 15479.88 Parrot_pushs > > > Of course, if I understood the calling convention, I might get away with > less pushs's :) > > Brian > > > > # # quicksort.pasm # # Author: Brian Wheeler ([EMAIL PROTECTED]) # # Usage: #./parrot quicksort.pbc < /file/to/quicksort # main: new P0,PerlArray set I1,0 m1: readlineS0,0 length I0,S0 eq I0,0,m2 chopn S0,1 set_keyed P0,I1,S0 inc I1 branch m1 m2: set I0,0 set I1,P0 dec I1 bsr quicksort m3: get_keyed S0,P0,I0 print S0 print "\n" inc I0 le I0,I1,m3 end # # quicksort # params: # I0 low # I1 high # P0 array to sort # returns: # nothing quicksort: pushi print "Starting quicksort with " print I0 print " and " print I1 print "\n" set I11,I1 # High=high; set I10,I0 # Low=low; le I1,I0,q1# if(high>low) { bsr partition # pivot=partition(p,low,high); set I0,I10 # quicksort(p,Low,pivot-1) set I1,I2 dec I1 bsr quicksort set I0,I2 # quicksort(p,pivot+1,High) inc I0 set I1,I11 bsr quicksort q1: popi# } ret # # partition # params: # I0 low # I1 high # P0 array to sort # returns: # I2 pivot partition: pushi pushs get_keyed S2,P0,I0# S2 is pivot_item set I2,I0 # I2 is pivot index set I10,I0 # I10 is left set I11,I1 # I11 is right p1: ge I10,I11,p1e # while(left < right) { p2: get_keyed S0,P0,I10 # while(p[left] <= pivot_item gt S0,S2,p2e gt I10,I11,p2e # && left <= right) { inc I10 # left++; branch p2 # } p2e: p3: get_keyed S1,P0,I11 # while(p[right] > pivot_item le S1,S2,p3e lt I11,I10,p3e # && right >= left) { dec I11 # right--; branch p3 # } p3e: ge
Re: quicksort in pasm
On Fri, 2002-05-24 at 15:10, Sean O'Rourke wrote: > > I was starting with a very simple test to decide how to determine where the > > memory overuse was coming from, > > I'm actually looking at this now as well, though with zip2.pasm instead of > quicksort. What I've found is that because zip constructs the result with > incremental packing, and since the 3-argument pack is implemented through > string_concat, which allocates a new string every time, it chews through > vast quantities of memory. At some point, these buffers (which are now > ~45k each) stop getting freed, and Parrot grows from < 10M to about 200M > before my laptop dies a horrible death. > > Sort does things much differently, but I wouldn't be surprised if it's > hittng the same things-not-being-reclaimed bug as zip. > > > ["interesting" sorting results] > > Am I missing something somewhere? > > You've probably got it in EBCDIC mode or something. That or it's a > "randomized" quicksort, which is supposed to be faster. > > /s Actually I was getting the wrong results because I was using ge & le to test the bounds when I should have used gt & lt. The attached quicksort fixes it. Its still _huge_ when the memory runs. Looking at the profile data, it looks like Parrot_pushs is the big time user... % cumulative self self total time seconds secondscalls ns/call ns/call name 66.67 0.02 0.02 singlebyte_decode 33.33 0.03 0.01 646 15479.88 15479.88 Parrot_pushs Of course, if I understood the calling convention, I might get away with less pushs's :) Brian
quicksort in pasm
0.001 0.00 0.02 mem_setup_allocator [41] 0.0 0.35 0.001 0.00 0.00 mem_sys_realloc [125] 0.0 0.35 0.001 0.00 0.02 new_hash [44] 0.0 0.35 0.001 0.00 0.00 new_sized_resource_pool [60] 0.0 0.35 0.001 0.00 0.01 new_tracked_header [55] 0.0 0.35 0.001 0.00 0.00 parseflags [126] 0.0 0.35 0.001 0.00 349.75 runops [3] 0.0 0.35 0.001 0.00 349.75 runops_cgoto_core [4] 0.0 0.35 0.001 0.00 349.75 runops_generic [5] 0.0 0.35 0.001 0.00 0.00 sized_index [127] 0.0 0.35 0.001 0.00 0.00 string_grow [64] 0.0 0.35 0.001 0.00 0.00 string_init [128] I don't see anything that really stands out (unlike earlier builds where string_make was taking the most time) Any thoughts? Brian # # quicksort.pasm # # Author: Brian Wheeler ([EMAIL PROTECTED]) # # Usage: #./parrot quicksort.pbc < /file/to/quicksort # main: new P0,PerlArray set I1,0 m1: readlineS0,0 length I0,S0 eq I0,0,m2 chopn S0,1 set_keyed P0,I1,S0 inc I1 branch m1 m2: set I0,0 set I1,P0 dec I1 bsr quicksort m3: get_keyed S0,P0,I0 print S0 print "\n" inc I0 le I0,I1,m3 end # # quicksort # params: # I0 low # I1 high # P0 array to sort # returns: # nothing quicksort: pushi print "Starting quicksort with " print I0 print " and " print I1 print "\n" set I11,I1 # High=high; set I10,I0 # Low=low; le I1,I0,q1# if(high>low) { bsr partition # pivot=partition(p,low,high); set I0,I10 # quicksort(p,Low,pivot-1) set I1,I2 dec I1 bsr quicksort set I0,I2 # quicksort(p,pivot+1,High) inc I0 set I1,I11 bsr quicksort q1: popi# } ret # # partition # params: # I0 low # I1 high # P0 array to sort # returns: # I2 pivot partition: pushi pushs get_keyed S2,P0,I0# S2 is pivot_item set I2,I0 # I2 is pivot index set I10,I0 # I10 is left set I11,I1 # I11 is right p1: ge I10,I11,p1e # while(left < right) { p2: get_keyed S0,P0,I10 # while(p[left] <= pivot_item gt S0,S2,p2e ge I10,I11,p2e # && left < right) { inc I10 # left++; branch p2 # } p2e: p3: get_keyed S1,P0,I11 # while(p[right] > pivot_item le S1,S2,p3e le I11,I10,p3e # && right > left) { dec I11 # right--; branch p3 # } p3e: ge I10,I11,p4 # if(left < right) { get_keyed S3,P0,I10 # swap(p,left,right); get_keyed S4,P0,I11 set_keyed P0,I11,S3 set_keyed P0,I10,S4 p4: # } branch p1 # } p1e:get_keyed S3,P0,I11 # p[low]=p[right]; set_keyed P0,I0,S3 set_keyed P0,I11,S2 # p[right]=pivot_item; saveI11 pops popi restore I2 # return right; ret
Non-vtable functionality on PMCs?
I've been trying to catch up with parrot again (darn it, babies take more time than I thought :) and I've come up with a question... how do you do "other" things to PMCs that aren't normal ops? In particular, I was wondering about shift/unshift, push/pop on the PerlArray PMC. Am I missing something obvious? Brian
Re: Rewriting the assembler
On Wed, 2002-02-27 at 14:07, Simon Cozens wrote: > I know some people have been talking about rewriting the assembler; I've > had some more thoughts on this over the past couple of days. > > First, I think that our assembler is going to be a reference implementation > for those producing bytecode-emitting compilers. It does not need to be > fast, but it does need to be clear and easy to understand. > No arguments there. > Some people have been talking about making the assembler more OO; in my > opinion, this is a mistake. OO programming has the great advantage that you > can abstract away a lot of the tricky bits, hiding the complexity. It however > has the disadvantage that you hide away the complexity. Let's make this a > nice, open, transparent program. > > The way I've thought about doing it so far is not unlike the Unix toolset > model; small components doing a dedicated task. Assembly is essentially a > filtering process, and Unix filter techniques can be brought to bear on it. > For instance, I've just written a little component which takes the assembly > and expands the ops, turning "set" into "set_p_ic" or whatever it may be. Interesting. > > Granted, these components will share some library code, such as that to > parse out a line of assembly source, but I think that specialized elements > working on text is the way to go here. What is wrong with a simple array (of arrays) that gets passed around? > > The real advantage of this method, other than making the overall design > and process of the assembler easy to understand, is that we can slot in > optimizations as additional filters at any stage of the assembler's operation. There certainly are advantages...but there are disadvantages too. A scalar holding a honking big chunk of text is going to be difficult to identify what source line was originally used. > > I'll produce a more specific PDD about how I'd like the assembler to look > if this idea makes any sense to anyone other than me. > Here's my hat, which I will now throw into the ring :) * Yes, its OO, just so we don't have a freaking ton of variables laying around which could potientially conflict with someone who wants to embed this into a compiler or something * sort of does what Simon wants, in that there are separate phases, which could have extra ones inserted. * doesn't actually dump bytecode, but will happily parse queens.pasm and most of the other things I've thrown at it, providing it doesn't have macros or label arithmetic. Two files are here: * a completely hideous "test-newassembler.pl" which calls the module. * the 350 line NewAssembler.pm thingy itself. This is just something I'm tinkering with. Vomit on it if you must :) Brian package Parrot::NewAssembler; use Carp; use Parrot::Config; use Parrot::Op; use Parrot::OpLib::core; use Parrot::PMC qw(%pmc_types); use Parrot::PackFile; use Parrot::Types; use Symbol; sub new { my($class,%args)=@_; my $self={ # output file information 'packfile'=>new Parrot::PackFile, 'bytecode'=>'', 'listing'=>'', # assembly 'PC'=>0, 'opcodes'=>{}, 'errors'=>[], # constants 'constants'=>{}, 'constant_data'=>[], # Label/fixup information 'label'=>{}, 'last_label'=>'__start__', 'fixup'=>{}, # preprocessor data 'macro'=>{}, 'equate'=>{}, 'include_path'=>[], }; # Initialize opcode table. foreach my $op (@$Parrot::OpLib::core::ops) { $self->{'opcodes'}{$op->full_name}=$op; } return bless $self,$class; } sub error { my($self,$msg,$file,$line)=@_; push(@{$self->{errors}},"ERROR ($file,$line): $msg"); } # # preprocess: preprocess assembler source, and convert into internal # format. # sub preprocess { my($self,$scalar_code,$file,$line)=@_; my $code=[]; my $counter=$line || 1; my @lines=split(/\n/,$scalar_code); while(scalar(@lines)) { my $l=shift(@lines); # line to pre-process my $rl=$l; # raw line $l=~s/\#.*//; $l=~s/^\s+//; $l=~s/\s+$//; # do equate substitution (if any) foreach (keys(%{$self->{equate}})) { $l=~s/\b$_\b/$self->{equate}{$_}/g; } # handle include directive if($l=~m/INCLUDE\s+['"](.+)["']/) { # an include directive has been found. my($filename)=$1; if(! -e $filename) { my $found=0; foreach my $path (@{$self->{include_path}}) { if(-e "$path/$filename") { $filename="$path/$filename"; $found=1; last; } } if(!$found) { $self->error("'$filename' not found.",$file,$line); return undef; }
Re: .NET CLR and Parrot
On Sat, 2002-02-23 at 13:12, Dan Sugalski wrote: > At 12:22 PM -0500 2/23/02, Melvin Smith wrote: > >At 11:53 AM 2/23/2002 +, Simon Cozens wrote: > >>I was very lucky recently to attend a talk by Ganesh Sittampalam > >>introducing Microsoft .NET and the Common Language Runtime. A lot of > >>what CLR is trying to do is quite similar to what we're doing with > >>Parrot, so I thought it would be a good idea to briefly recap what's > >>going on with CLR. > > > >I've read several specs on CLR and ILAsm; there are some good ideas. > > > > > >I wouldn't want Parrot to look like ILA but I wish we would at least > >have more dialogue on things like pbc format, adding directives to > >the language, etc. > > Then lets start. I've not been giving the assembler much attention > mainly because I've been assuming that it'll be marginalized quickly, > but I think that assumption's a bad one. > > We also need a PDD for the bytecode file format. > > >We need notation for globals, notation for metadata about objects and > >types, etc. etc. but frankly the last time I looked at the assembler I sort > >of got lost. > > You're not alone here. The assembler gives me headaches, and I think > it's stymied Simon recently as well. > ;) me too [uh oh, I wrote bits of it!] I've been playing around with a new assembler that might be a bit better laid out. Here's what I've got done so far: * object oriented: * you can load more than one assembler at once * assembly can be done in parts, i.e. $a=new Assembler; $a->assemble("set I0,3"); $a->assemble("set I1,4"); is the same as: $a=new Assembler; $a->assemble("set I0,3\nset I1,4"); * the preprocessor is a separate routine * directives must be in upper case (per PDD) * local labels, plus the ability to use local labels from outside the scope: global_label.local_label from anywhere will find the right one. * argument types are fully determined prior to looking up signatures I've not got these things finished * label arithmetic [is this really needed?] * macros * actual bytecode output I just kind of set it aside because I wasn't sure if it was really needed. I can dig it back up if there is interest... Brian
Re: We have PMCs. Time to start work.
On Fri, 2001-11-23 at 13:41, Simon Cozens wrote: > On Fri, Nov 23, 2001 at 06:04:29PM +, Simon Cozens wrote: > > * Rewrite mops.pasm to use integer PMCs, and compare the speeds. > > I couldn't wait. :) > > % ../../test_prog mops.pbc > Iterations:1 > Estimated ops: 2 > Elapsed time: 9.948440 > M op/s:20.103654 > > % ../../test_prog mops_p.pbc > Iterations:1 > Estimated ops: 2 > done > Elapsed time: 20.994231 > M op/s:9.526427 > I don't get it. What kind of machine was that on? Here are my numbers: [bdwheele@thor parrot]$ ./test_prog mops.pbc Iterations:1 Estimated ops: 2 Elapsed time: 11.016154 M op/s:18.155156 [bdwheele@thor parrot]$ ./test_prog mops_p.pbc Iterations:1 Estimated ops: 2 done Elapsed time: 14.288639 M op/s:13.997134 My I-regs one is slower than yours, and my P-regs one is faster...this is on a P4/1.7G under Linux 2.4.15pre8 Any ideas? Brian
Re: sizeof(INTVAL), sizeof(void*), sizeof(opcode_t)
On Tue, 2001-11-20 at 12:19, Ken Fox wrote: > James Mastros wrote: > > In byteswapping the bytecode ... > > > > I propose that we make INTVAL and opcode_t the same size, and gaurrenteed > > to be able to hold a void*. > > It sounds like you want portable byte code. Is that a goal? It seems like > we can have either mmap'able byte code or portable byte code, but not both. > Personally, I'd rather have portable byte code because memory is cheap > and self-modifiying byte code opens up a lot of optimization potential. I > know others disagree. Hmm. It wouldn't necessarily be portable, though it probably would be on machines with the same size & endianness. So, on an alpha, you'd have: sizeof(INTVAL)=sizeof(opcode_t)=sizeof(void *)=64 bits whereas on x86 (and other 32 bit machines) sizeof(INTVAL)=sizeof(opcode_t)=sizeof(void *)=32 bits and on the Parrot/C-64 VM you'd have sizeof(INTVAL)=sizeof(opcode_t)=sizeof(void *)=16 bits Is that right? Brian > > Are we looking at two different byte code formats? Dan? > > - Ken
Re: sizeof(INTVAL), sizeof(void*), sizeof(opcode_t)
On Mon, 2001-11-19 at 19:59, James Mastros wrote: > Hey all. > In parellel to splitting out features (yeah, I like that better then > "platforms" too) (which is going well this time, I think (I'm being a lot > better about checking against clean checkouts, but having problems > thinking of a good generic interface for open() and friends), I'm thinking > about a new packfile format. > I'm comeing across a common problem to both of them. That is, that the > sizeof things don't come in nice easy units. > In byteswapping the bytecode, it's a Very Good Thing if you can just > treat it as an array of some specifc type, so you don't have to figure out > how man args each opcode takes (this may, in fact, be impossible, if > oplibs are dynamicly loadable during the bytecode's runtime). > In the feature-splitting, for example in file opening, some interfaces > (POSIX open(), for example) want to return an integer (the fd), and some > (win32 CreateFile(), for example) want to return a void* (the file > handle). (This is a pointer to a kernel-allocated structure that will > cause a segfault to directly access, BTW, so no GC problems here.) > > I propose that we make INTVAL and opcode_t the same size, and gaurrenteed > to be able to hold a void*. > Seems reasonable to me, since jsr and jump are slated to use an I register to jump to a host-machine-address and start interpreting there. Brian
RE: Size of integer register vs sizeof(void *)
On Mon, 2001-11-19 at 12:43, Hong Zhang wrote: > > Are there any cases where a void * cannot be placed into an integer > > register? It seems like it shouldn't happen, especially since jump and > > jsr are supposed to take an integer register and they point to a > > host-machine-address... > > What register are you talking about? The 16-bit x86 has 16-bit integer > registers, but uses segment:offset as address. So the sizeof(void*) may > be 32-bit, plus you have to deal with far, near, huge, ... (total 6 types) > of pointers. > > Hong Sorry: should be "size of parrot integer vs sizeof(void *)" Should have clarified that :) Brian
Size of integer register vs sizeof(void *)
Are there any cases where a void * cannot be placed into an integer register? It seems like it shouldn't happen, especially since jump and jsr are supposed to take an integer register and they point to a host-machine-address... Brian
Re: Opcode numbers
On Sat, 2001-11-03 at 21:40, Gregor N. Purdy wrote: > James -- > > > We're going to have to think about assigning static opcode numbers, > > instead of the current order-defined. For one thing, we're looking at > > perpetual bytecode compatablity (no?). This isn't really a Big Deal, but we > > need to: > > 1) Define an ordering on things like open(i, s|sc, i|ic, i|ic). > > 2) Define the notation for setting the opcode number in the .ops file. > > 3) Define what the correct behavor is for when you have holes in the opcode > >numbering. > > None of these are issues with the approach I've been working on / > advocating. I'm hoping we can avoid these altogether. > I think this is a cool concept, but it seems like a lot of overhead with the string lookups. How about instead of the string lookups the ops remained a constant per version. Once the core is set up, it shouldn't change any, though we'll be adding new ops later (and possibly be removing some). Given that, and since the ops are created in a predictable fashion, why not create a version number for "new" ops in the module. For example, if after the core is stable we want to add a new debugging op called "dwim". We could add it to the file (in any location) as: AUTO_OP dwim() 1.1 { } Where 1.1 is the version number. All unversioned core ops (at last count 292) would be generated first in the order they are defined. Then all versioned ops in increasing version order in the order they are defined. This way, version 1 core ops would be in the table from 1-292 and 1.1 ops might be in 293-333. We could specify ".use core 1" to disallow dwim and friendsor to make backwards compatible code. By default the whole table would be available. It would provide some sanity checking without the overhead of doing a bunch of string lookups on every load. Of course, its not perfect... Thoughts? Or am I too tired to be sending email? :) Brian
RE: Ooops, sorry for that blank log message.
On Fri, 2001-10-26 at 09:57, Sam Tregar wrote: > On Fri, 26 Oct 2001, Brent Dax wrote: > > > What if I want my compiler to be lazy? Do you have the right to punish > > me for my laziness by making me add constant folding to my optimizer (or > > perhaps making me *write* an optimizer just to do constant folding)? > > Actually, a really lazy compiler will never use constants with anything > other than "set"! See the scheme compiler posted here last week for a > good example. > I've got a dumb question, and its probably because I've not been paying attention, so I apologise in advance. How does a program access more than 32 variables simultaneously? In real CPU architectures you've got main memory storage, but here we only have registers. Brian
Re: Ooops, sorry for that blank log message.
On Fri, 2001-10-26 at 01:32, Tom Hughes wrote: > In message <[EMAIL PROTECTED]> > Brian Wheeler <[EMAIL PROTECTED]> wrote: > > > Darn it, I fat fingered the log message. > > > > This is a fix which changes the way op variants are handled. The old > > method "forgot" the last variant, so thing(i,i|ic,i|ic) would > > generate: > > thing(i,i,i) > > thing(i,i,ic) > > thing(i,ic,i) > > > > but not > > > > thing(i,ic,ic) > > It didn't forget it, it went to some considerable trouble to > ignore it on the grounds that such an opcode is pointless as > alll the operands are constant. > > I did describe the algorithm used and the logic behind it on the > list when I implemented it. > Except there was a case where it was not generating needed opcodes. I was implementing open just to see if I could. The prototype would be: AUTO_OP open (i,s|sc,i|ic,i|ic) { where $1 is the new handle, $2 is the name of the file, $3 is the flags, and $4 is the mode of the file if it has to create it. the open_i_sc_ic_ic opcode isn't pointless :) Brian
Ooops, sorry for that blank log message.
Darn it, I fat fingered the log message. This is a fix which changes the way op variants are handled. The old method "forgot" the last variant, so thing(i,i|ic,i|ic) would generate: thing(i,i,i) thing(i,i,ic) thing(i,ic,i) but not thing(i,ic,ic) The new one does. Brian
[patch] changes to trace
This patch makes trace a little more useful. It prints the constant referred to, as well as the value of the register being accessed. This string reverse program trace 1 set S0,"Hello world" set S1,"" set S2,"" length I0,S0 dec I0 $loop: substr S2,S0,I0,1 concat S1,S2 dec I0 ge I0,0,$loop set S0,S1 end produces this output (I removed some spaces on PC=16 to stop the word wrap in my mailer): PC=2; OP=9 (set_s_sc); ARGS=(S0=(null), "Hello world") PC=5; OP=9 (set_s_sc); ARGS=(S1=(null), "") PC=8; OP=9 (set_s_sc); ARGS=(S2=(null), "") PC=11; OP=115 (length_i_s); ARGS=(I0=0, S0="Hello world") PC=14; OP=66 (dec_i); ARGS=(I0=11) PC=16; OP=117 (substr_s_s_i_ic); ARGS=(S2="", S0="Hello world", I0=10,1) PC=21; OP=110 (concat_s_s); ARGS=(S1="", S2="d") PC=24; OP=66 (dec_i); ARGS=(I0=10) PC=26; OP=43 (ge_i_ic_ic); ARGS=(I0=9, 0, -10) PC=16; OP=117 (substr_s_s_i_ic); ARGS=(S2="d", S0="Hello world", I0=9,1) PC=21; OP=110 (concat_s_s); ARGS=(S1="d", S2="l") PC=24; OP=66 (dec_i); ARGS=(I0=9) PC=26; OP=43 (ge_i_ic_ic); ARGS=(I0=8, 0, -10) PC=16; OP=117 (substr_s_s_i_ic); ARGS=(S2="l", S0="Hello world", I0=8,1) PC=21; OP=110 (concat_s_s); ARGS=(S1="dl", S2="r") PC=24; OP=66 (dec_i); ARGS=(I0=8) PC=26; OP=43 (ge_i_ic_ic); ARGS=(I0=7, 0, -10) PC=16; OP=117 (substr_s_s_i_ic); ARGS=(S2="r", S0="Hello world", I0=7,1) PC=21; OP=110 (concat_s_s); ARGS=(S1="dlr", S2="o") PC=24; OP=66 (dec_i); ARGS=(I0=7) PC=26; OP=43 (ge_i_ic_ic); ARGS=(I0=6, 0, -10) PC=16; OP=117 (substr_s_s_i_ic); ARGS=(S2="o", S0="Hello world", I0=6,1) PC=21; OP=110 (concat_s_s); ARGS=(S1="dlro", S2="w") PC=24; OP=66 (dec_i); ARGS=(I0=6) PC=26; OP=43 (ge_i_ic_ic); ARGS=(I0=5, 0, -10) PC=16; OP=117 (substr_s_s_i_ic); ARGS=(S2="w", S0="Hello world", I0=5,1) PC=21; OP=110 (concat_s_s); ARGS=(S1="dlrow", S2=" ") PC=24; OP=66 (dec_i); ARGS=(I0=5) PC=26; OP=43 (ge_i_ic_ic); ARGS=(I0=4, 0, -10) PC=16; OP=117 (substr_s_s_i_ic); ARGS=(S2=" ", S0="Hello world", I0=4,1) PC=21; OP=110 (concat_s_s); ARGS=(S1="dlrow ", S2="o") PC=24; OP=66 (dec_i); ARGS=(I0=4) PC=26; OP=43 (ge_i_ic_ic); ARGS=(I0=3, 0, -10) PC=16; OP=117 (substr_s_s_i_ic); ARGS=(S2="o", S0="Hello world", I0=3 1) PC=21; OP=110 (concat_s_s); ARGS=(S1="dlrow o", S2="l") PC=24; OP=66 (dec_i); ARGS=(I0=3) PC=26; OP=43 (ge_i_ic_ic); ARGS=(I0=2, 0, -10) PC=16; OP=117 (substr_s_s_i_ic); ARGS=(S2="l", S0="Hello world", I0=2,1) PC=21; OP=110 (concat_s_s); ARGS=(S1="dlrow ol", S2="l") PC=24; OP=66 (dec_i); ARGS=(I0=2) PC=26; OP=43 (ge_i_ic_ic); ARGS=(I0=1, 0, -10) PC=16; OP=117 (substr_s_s_i_ic); ARGS=(S2="l", S0="Hello world", I0=1,1) PC=21; OP=110 (concat_s_s); ARGS=(S1="dlrow oll", S2="e") PC=24; OP=66 (dec_i); ARGS=(I0=1) PC=26; OP=43 (ge_i_ic_ic); ARGS=(I0=0, 0, -10) PC=16; OP=117 (substr_s_s_i_ic); ARGS=(S2="e", S0="Hello world", I0=0,1) PC=21; OP=110 (concat_s_s); ARGS=(S1="dlrow olle", S2="H") PC=24; OP=66 (dec_i); ARGS=(I0=0) PC=26; OP=43 (ge_i_ic_ic); ARGS=(I0=-1, 0, -10) PC=30; OP=8 (set_s_s); ARGS=(S0="Hello world", S1="dlrow olleH") PC=33; OP=0 (end) If it looks reasonable, I'll commit it. Brian Index: interpreter.c === RCS file: /home/perlcvs/parrot/interpreter.c,v retrieving revision 1.29 diff -u -r1.29 interpreter.c --- interpreter.c 2001/10/14 23:47:39 1.29 +++ interpreter.c 2001/10/17 20:15:42 @@ -81,7 +81,38 @@ fprintf(stderr, "; ARGS=("); for(i = 1; i < interpreter->opcode_info[*pc].arg_count; i++) { if (i > 1) { fprintf(stderr, ", "); } -fprintf(stderr, "%ld", (long) *(pc + i)); +switch(interpreter->opcode_info[*pc].types[i]) { +case PARROT_ARG_IC: +fprintf(stderr, "%ld", (long) *(pc + i)); +break; +case PARROT_ARG_NC: +fprintf(stderr, "%f", interpreter->code->const_table->constants[*(pc ++ i)]->number); +break; +case PARROT_ARG_PC: +/* what is a PMC constant look like? */ +fprintf(stderr, "%ld", (long) *(pc + i)); +break; +case PARROT_ARG_SC: +fprintf(stderr, "\"%s\"", +interpreter->code->const_table->constants[*(pc + i)]->string->bufstart); +break; +case PARROT_ARG_I: +fprintf(stderr, "I%ld=%ld", (long) *(pc + i), (long) +interpreter->int_reg->registers[*(pc + i)]); +break; +case PARROT_ARG_N: +fprintf(stderr, "N%ld=%f", (long) *(pc + i), +interpreter->num_reg->registers[*(pc + i)]); +break; +case PARROT_ARG_P: +/* what does a PMC constant look like? */ +fprintf(stderr, "P%ld=???", (long) *(pc + i)); +break; +case PARROT_ARG_S: +if(interpreter->string_reg->registers[*
Re: sample code
On Tue, 2001-10-16 at 15:58, James Mastros wrote: > On Tue, 16 Oct 2001, Brian Wheeler wrote: > > _chr - create a string (S0) with the ascii value of I0 > > _ord - return (in I0) the ascii value of the first character in S0 > There's /got/ to be a better way to write these. I propose making opcodes > specificly to do these. This is an inner-loop kind of thing. Yeah, probably, but not with the opcodes we have so far. They were just kind of afterthoughts and to see if its actually possible to do it with the ops we have now :) Brian
Re: RAM?
On Tue, 2001-10-16 at 15:02, James Mastros wrote: > On Tue, 16 Oct 2001, Dan Sugalski wrote: > > That's one way to do it, sure. You can always look at a string as a bounded > > byte buffer. One of the core 'string' types is "series of 8-bit bytes". We > > couldn't manage JPEG images too well without that. ;) > Hm. How do you convert the bytes into integers so you can do arithmetic > with them? (In other words, how do you write unpack('c') and friends.) I wrote a library in parrot which implemented ord() to do it. I was going to do pack/unpack but I haven't had a chance yet :) > > > Seriously, though, what are you trying to accomplish with providing generic > > access to memory? Perhaps if you had a solid application in mind it'd make > > working out what would be needed to support it easier. > I'm thinking of porting GCC, of course . However, I'm thinking that > pretty much any c-like language is going to want somthing like this. > You are sick. :) Maybe you could compile the linux kernel down into parrot. I did think about writing a kernel module for writing drivers in parrot, but considering the response that the last person got for writing a script interface module to the kernel, I put the thought away :) Brian
sample code
I've written a library of sorts which contains useful routines such as: _absi - absolute value of I0 _absn - absolute value of I0 _chomp - chomp a string (S0) with a trailing newline _chr - create a string (S0) with the ascii value of I0 _exit - terminate with a return code of I0 _hex - return in I0 the decimal value of the hex string in S0 _index - return the index (in I0) of S1 in S0 starting with I0 _lc - lowercase S0 _lcfirst - lowercase the first character in S0 _ord - return (in I0) the ascii value of the first character in S0 _reverse - reverse the string in S0 _rindex - do a reverse index of S1 in S0, returing the value in I0 _tr - transliterate S0 with S1 as the source and S2 as the dest _uc - uppercase S0 _ucfirst - uppercase the first char in S0 Here's an example which would uppercase only the last character of a string: set S0,"Hello world" bsr _lc bsr _reverse bsr _ucfirst bsr _reverse print "This should have only the 'd' uppercase: " print S0 print "\n" Do you think this library (called "utils.pasm") should be included in the examples directory? Brian # # utils.pasm # # Copyright (C) 2001 Yet Another Society. All rights reserved. # This program is free software. It is subject to the same # license as The Parrot Interpreter. # # $Id$ # # This contains a bunch of routines which can be used in other programs by # adding a line like this: # # include "utils.pasm" # # Unfortunately, since we don't have real libraries, you get the whole thing # whether you want it or not, but its a start. # # # The calling convention is register based, and starts with I0 for the first # integer argument, S0 for the first string argument, etc. Return value(s) # are placed in register 0 of whatever type is being returned. # _absi - the absolute value of I0 _absi: gt I0,0,$done mul I0,I0,-1 $done: ret # _absn - the absolute value of N0 _absn: gt N0,0,$done mul N0,N0,-1.0 $done: ret # _chomp - remove the last character of S0, if it is a newline _chomp: clones pushi length I0,S0 dec I0 substr S1,S0,I0,1 ne S1,"\n",$done substr S0,S0,0,I0 $done: saveS0 popi pops restore S0 ret # _chr - return the character S0 for the argument I0 _chr: clonei pushs set S0,"" set S1,"\x0\x1\x2\x3\x4\x5\x6\x7\x8\x9\xA\xB\xC\xD\xE\xF\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F\x20\x21\x22\x23\x24\x25\x26\x27\x28\x29\x2A\x2B\x2C\x2D\x2E\x2F\x30\x31\x32\x33\x34\x35\x36\x37\x38\x39\x3A\x3B\x3C\x3D\x3E\x3F\x40\x41\x42\x43\x44\x45\x46\x47\x48\x49\x4A\x4B\x4C\x4D\x4E\x4F\x50\x51\x52\x53\x54\x55\x56\x57\x58\x59\x5A\x5B\x5C\x5D\x5E\x5F\x6A\x6B\x6C\x6D\x6E\x6F\x70\x71\x72\x73\x74\x75\x76\x77\x78\x79\x7A\x7B\x7C\x7D\x7E\x7F\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8A\x8B\x8C\x8D\x8E\x8F\x90\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9A\x9B\x9C\x9D\x9E\x9F\xA0\xA1\xA2\xA3\xA4\xA5\xA6\xA7\xA8\xA9\xAA\xAB\xAC\xAD\xAE\xAF\xB0\xB1\xB2\xB3\xB4\xB5\xB6\xB7\xB8\xB9\xBA\xBB\xBC\xBD\xBE\xBF\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF\xD0\xD1\xD2\xD3\xD4\xD5\xD6\xD7\xD8\xD9\xDA\xDB\xDC\xDD\xDE\xD\xE0\xE1\xE2\xE3\xE4\xE5\xE6\xE7\xE8\xE9\xEA\xEB\xEC\xED\xEE\xEF\xF0\xF1\xF2\xF3\xF4\xF5\xF6\xF7\xF8\xF9\xFA\xFB\xFC\xFD\xFE\xFF" substr S0,S1,I0,1 saveS0 pops popi restore S0 ret # _exit - exit the program, indicating the status of I0 _exit: print "Program terminated with result code " print I0 print "\n" end # _hex - return the decimal value of the hex argument S0 _hex: clones pushi bsr _reverse bsr _uc length I28,S0 set S2,S0 set I29,0 set I30,0 set I31,1 set S0,"0123456789ABCDEF" set S1,"" $loop: substr S1,S2,I30,1 bsr _index mul I0,I0,I31 add I29,I29,I0 inc I30 mul I31,I31,16 lt I30,I28,$loop saveI29 popi pops restore I0 ret # _index - return the index of the substring S1 in the string S0, starting # with position I0. -1 is returned if it is not found _index: clones clonei set S2,"" length I31,S1 length I30,S0 $loop: substr S2,S0,I0,I31 eq S2,S1,$done inc I0 lt I0,I30,$loop set I0,-1 $done: saveI0 popi pops restore I0 ret # _lc - lowercase the string S0 _lc:clones set S2,"abcdefghijklmnopqrstuvwxyz" set S1,"ABCDEFGHIJKLMNOPQRSTUVWXYZ" bsr _tr saveS0 pops restore S0 ret # _lcfirst - lowercase the first character
Re: substr broken?
On Tue, 2001-10-16 at 13:04, Alex Gough wrote: > On Tue, 16 Oct 2001, Alex Gough wrote: > > On 16 Oct 2001, Brian Wheeler wrote: > > > > > I'm getting some weird results when using substr. Here's my test > > > program: > > > > It's probably something wrong with the constant table or the assembly > > phase, if the script is changed so that S1 is set to "-", say, it does > > more what I expect. > > Indeed it is. My CVS access isn't working (ask??) so here's a new test > to fail: Actually its the set s,sc operation. I've modified it to use string_copy. I also added your test and committed them a few minutes ago. Thanks! This one had been really stumping me! Brian > > Alex Gough > > Index: parrot/t/op/string.t > === > RCS file: /home/perlcvs/parrot/t/op/string.t,v > retrieving revision 1.11 > diff -u -r1.11 string.t > --- parrot/t/op/string.t 2001/10/14 00:43:50 1.11 > +++ parrot/t/op/string.t 2001/10/16 18:01:34 > @@ -1,6 +1,6 @@ > #! perl -w > > -use Parrot::Test tests => 23; > +use Parrot::Test tests => 24; > > output_is( <<'CODE', < set S4, "JAPH\n" > @@ -296,6 +296,21 @@ > ok > OUTPUT > > +output_is(<<'CODE', < + set S0, "" > + set S1, "" > + set S2, "foo" > + concat S1,S1,S2 > + print S1 > + print S0 > + print "\n" > +CODE > +foo > +OUTPUT > + > + > +# Helper subs > + > # Set all string registers to values given by &$_[0](reg num) > sub set_str_regs { >my $code = shift;
substr broken?
I'm getting some weird results when using substr. Here's my test program: set S0,"Hello world" print "Arg to Reverse: " print S0 print "\n" set S1,"" set S2,"" length I0,S0 dec I0 $loop: substr S2,S0,I0,1 print "string so far:" print S1 print ", " print S2 print "\n" concat S1,S2 dec I0 ge I0,0,$loop set S0,S1 end Here's the output: Arg to Reverse: Hello world string so far:d, d string so far:l, l string so far:r, r string so far:o, o string so far:w, w string so far: , string so far:o, o string so far:l, l string so far:l, l string so far:e, e string so far:H, H Why is S1 set on the first iteration, before the concat? Brian
Re: "missing" opcodes
Heheh, I should read all of my mail before I send new ones. I'll commit it shortly. Brian On Tue, 2001-10-16 at 10:36, Dan Sugalski wrote: > At 10:04 AM 10/16/2001 -0500, Brian Wheeler wrote: > >Thoughts? > > Go for it. This sort of thing's just fine. I know I made a "NO NEW OPCODES > WITHOUT CLEARANCE" statement a while ago, but that really needs clarification. > > The thing that's a no-no is new 'high-level' opcodes--basically anything > that'd require a new entry in parrot_assembly.pod. Opcode variants--tan > which takes a constant, say--are fine. We may, at some point, prune out the > less-used opcodes so we don't have a huge table of 'em if that turns out to > be a problem (like if there's a performance knee at 255/256 or 511/512 base > opcodes in the switch) but for the moment I'm not overly worried about the > excess of opcodes if it turns out that having special ones is worth it. > > It may, of course, turn out that: > > set N0, 12.4 > tan N1, N0 > > ends up being faster than > > tan N1, 12.4 > > in some cases (like if we almost never use tan of a constant but do use tan > of a register a lot, thus having to fault in the code for the constant tan > every time we use it) but that's something that'll have to do some > performance testing on to know for sure. > > Dan > > --"it's like this"--- > Dan Sugalski even samurai > [EMAIL PROTECTED] have teddy bears and even > teddy bears get drunk
[PATCH] "missing" opcodes
Here's a patch which adds the 'missing' opcodes from the earlier email. It also adds the 3 arg variant of concat. Dan/Simon/Anyone, if it seems ok, I'll commit it, but since it adds 52 op variants, I wasn't sure if it would be ok. Brian Ops follow +AUTO_OP add(i, i, ic) { +AUTO_OP add(n, n, nc) { +AUTO_OP cmod(i, i, ic) { +AUTO_OP cmod(i, ic, i) { +AUTO_OP cmod(n, n, nc) { +AUTO_OP cmod(n, nc, n) { +AUTO_OP div(i, i, ic) { +AUTO_OP div(i, ic, i) { +AUTO_OP div(n, n, nc) { +AUTO_OP div(n, nc, n) { +AUTO_OP mod(i, i, ic) { +AUTO_OP mod(i, ic, i) { +AUTO_OP mod(n, n, nc) { +AUTO_OP mod(n, nc, n) { +AUTO_OP mul(i, i, ic) { +AUTO_OP mul(n, n, nc) { +AUTO_OP pow(n, i, ic) { +AUTO_OP pow(n, ic, i) { +AUTO_OP pow(n, i, nc) { +AUTO_OP pow(n, ic, n) { +AUTO_OP pow(n, n, ic) { +AUTO_OP pow(n, nc, i) { +AUTO_OP pow(n, n, nc) { +AUTO_OP pow(n, nc, n) { +AUTO_OP sub(i, i, ic) { +AUTO_OP sub(i, ic, i) { +AUTO_OP sub(n, n, nc) { +AUTO_OP sub(n, nc, n) { +AUTO_OP chopn(s, i) { +AUTO_OP concat(s, sc) { +AUTO_OP concat(s, s, s) { +AUTO_OP concat(s, s, sc) { +AUTO_OP concat(s, sc, s) { +AUTO_OP substr(s, s, i, ic) { +AUTO_OP substr(s, s, ic, i) { +AUTO_OP substr(s, s, ic, ic) { +AUTO_OP substr(s, sc, i, i) { +AUTO_OP substr(s, sc, i, ic) { +AUTO_OP substr(s, sc, ic, i) { +AUTO_OP atan(n, i, ic) { +AUTO_OP atan(n, ic, i) { +AUTO_OP atan(n, i, nc) { +AUTO_OP atan(n, ic, n) { +AUTO_OP atan(n, n, ic) { +AUTO_OP atan(n, nc, i) { +AUTO_OP atan(n, n, nc) { +AUTO_OP atan(n, nc, n) { +AUTO_OP and(i, i, ic) { +AUTO_OP or(i, i, ic) { +AUTO_OP shl(i, i, i) { +AUTO_OP shr(i, i, i) { +AUTO_OP xor(i, i, ic) { Index: core.ops === RCS file: /home/perlcvs/parrot/core.ops,v retrieving revision 1.8 diff -u -r1.8 core.ops --- core.ops2001/10/16 14:12:41 1.8 +++ core.ops2001/10/16 15:35:36 @@ -661,8 +661,12 @@ =item B(i, i, i) +=item B(i, i, ic) + =item B(n, n, n) +=item B(n, n, nc) + Set $1 to the sum of $2 and $3. =cut @@ -670,16 +674,28 @@ AUTO_OP add(i, i, i) { $1 = $2 + $3; } + +AUTO_OP add(i, i, ic) { + $1 = $2 + $3; +} AUTO_OP add(n, n, n) { $1 = $2 + $3; } +AUTO_OP add(n, n, nc) { + $1 = $2 + $3; +} + =item B(i, i, i) +=item B(i, i, ic) + +=item B(i, ic, i) + NOTE: This "uncorrected mod" algorithm uses the C language's built-in mod operator (x % y), which is @@ -716,11 +732,23 @@ $1 = $2 % $3; } +AUTO_OP cmod(i, i, ic) { + $1 = $2 % $3; +} + +AUTO_OP cmod(i, ic, i) { + $1 = $2 % $3; +} + =item B(n, n, n) +=item B(n, n, nc) + +=item B(n, nc, n) + NOTE: This "uncorrected mod" algorithm uses the built-in C math library's fmod() function, which computes @@ -752,7 +780,15 @@ $1 = fmod($2, $3); } +AUTO_OP cmod(n, n, nc) { + $1 = fmod($2, $3); +} +AUTO_OP cmod(n, nc, n) { + $1 = fmod($2, $3); +} + + =item B(i) @@ -795,8 +831,16 @@ =item B(i, i, i) +=item B(i, i, ic) + +=item B(i, ic, i) + =item B(n, n, n) +=item B(n, n, nc) + +=item B(n, nc, n) + Set $1 to the quotient of $2 divided by $3. In the case of INTVAL division, the result is truncated (NOT rounded or floored). @@ -806,11 +850,27 @@ $1 = $2 / $3; } +AUTO_OP div(i, i, ic) { + $1 = $2 / $3; +} + +AUTO_OP div(i, ic, i) { + $1 = $2 / $3; +} + AUTO_OP div(n, n, n) { $1 = $2 / $3; } +AUTO_OP div(n, n, nc) { + $1 = $2 / $3; +} + +AUTO_OP div(n, nc, n) { + $1 = $2 / $3; +} + =item B(i) @@ -853,6 +913,10 @@ =item B(i, i, i) +=item B(i, i, ic) + +=item B(i, ic, i) + NOTE: This "corrected mod" algorithm is based on the C code on page 70 of [1]. Assuming correct behavior of C's built-in mod operator (%) with positive arguments, this algorithm implements a mathematically convenient @@ -899,11 +963,67 @@ $1 = r; } +AUTO_OP mod(i, i, ic) { + INTVAL y = $2; + INTVAL z = $3; + INTVAL s = 0; + INTVAL r; + if (z == 0) { +r = y; + } + else { +if (y < 0) { s += 2; y = -y; } +if (z < 0) { s += 1; z = -z; } + +r = y % z; + +switch (s) { + case 0 + 0:break; + case 0 + 1: r = r - z; break; + case 2 + 0: r = z - r; break; + case 2 + 1: r = -r;break; +} + } + + $1 = r; +} + +AUTO_OP mod(i, ic, i) { + INTVAL y = $2; + INTVAL z = $3; + INTVAL s = 0; + INTVAL r; + + if (z == 0) { +r = y; + } + else { +if (y < 0) { s += 2; y = -y; } +if (z < 0) { s += 1; z = -z; } + +r = y % z; + +switch (s) { + case 0 + 0:break; + case 0 + 1: r = r - z; break; + case 2 + 0: r = z - r; break; + case 2 + 1: r = -r;break; +} + } + + $1 = r; +} + + =item B(n, n, n) +=item B(n, n, nc) + +=item B(n, nc, n) + NOTE: This "corrected mod" algorithm is based on the formula of [1]: x
"missing" opcodes
After writing a couple of library functions, I realized that we have to do alot of data shuffling to do common tasks. Reserving a register to hold 0 or 1 and/or filling up registers with constants just takes up cpu time and could better be handled if the opcodes took constants directly as well as registers. substr was really annoying in this regard: set I0,0 set I1,1 substr S0,S1,I0,I1 pre-allocating the registers (I.e. I31=1, I30=0) helped a bit, but it took away from the size of the general purpose register pool, and obscured the code. So, with that in mind, here are some opcodes to fill in the blanks. I tried to make sure that none of the opcodes below could be computed in advance (i.e. there's no add(i,ic,ic)) and when the order of the args doesn't matter (in add, for example), the constant comes last. ** Suggested Opcodes to minimize data shuffling ** -- arithmetic operations [unification of add & inc, except in the 'inc I0' case] add(i,i,ic) add(n,n,nc) cmod(i,i,ic) cmod(i,ic,i) cmod(n,n,nc) cmod(n,nc,n) div(i,i,ic) div(i,ic,i) div(n,n,nc) div(n,nc,n) mod(i,i,ic) mod(i,ic,i) mod(n,n,nc) mod(n,n,nc) mul(i,i,ic) mul(n,n,nc) pow(n,i,ic) pow(n,ic,i) pow(n,i,nc) pow(n,ic,n) pow(n,n,ic) pow(n,nc,i) pow(n,n,nc) pow(n,nc,n) [unification of sub & dec, except in the 'dec I0' case] sub(i,i,ic) sub(n,n,nc) -- string operations chopn(s,i) concat(s,sc) substr(s,s,i,ic) substr(s,s,ic,i) substr(s,s,ic,ic) substr(s,sc,i,i) substr(s,sc,i,ic) substr(s,sc,ic,i) -- trancendental operations atan(n,i,ic) atan(n,ic,i) atan(n,i,nc) atan(n,ic,n) atan(n,n,ic) atan(n,nc,i) atan(n,n,nc) atan(n,nc,n) -- bitwise logical operations and(i,i,ic) or(i,i,ic) shl(i,i,i) shr(i,i,i) xor(i,i,ic) WHEW! That's alot of opcodes. However, it would lead to more compact (and probably faster) bytecode to be able to avoid having to load registers all the time with values you know in advance. Thoughts? Brian
Re: string weirdness
On Mon, 2001-10-15 at 21:12, Dan Sugalski wrote: > On 15 Oct 2001, Brian Wheeler wrote: > > > With the addition of clone, I started writing some generic routines > > which might be useful (index,lc,uc,reverse,abs,tr,etc)...and I came > > across some weirdness: > > > > doing: > >save S0 > >restore S1 > > > > (since there's no set S1,S0) > > > > binds the registers together, so a change to one is a change to > > both...which doesn't happen on int registers. > > Right. Save on a string register pushes the pointer to the string > structure in the register onto the stack. The same thing happens with > PMCs, or will when they're implemented. > > The assumption is that, when you push a register onto the stack, you'll > then stomp on the contents of the register. (Rather than what the register > points to...) Otherwise a push would need to create a copy of the string > structure and a copy of the string contents. > > Dan Ok, that's fair... With that in mind I've implemented "set Sx,Sy" which does a string copy. This is what I was after in the first place, but then noticed the unexpected behavior of save/restore of S registers. It makes sense now :) Brian
string weirdness
With the addition of clone, I started writing some generic routines which might be useful (index,lc,uc,reverse,abs,tr,etc)...and I came across some weirdness: doing: save S0 restore S1 (since there's no set S1,S0) binds the registers together, so a change to one is a change to both...which doesn't happen on int registers. In addition, I've got a test program which I cannot figure out why it fails...and I believe its related to the above problem. Here's the test code (sorry for the size): set S0,"Hello World" bsr _uc print "UC is: " print S0 print "\n" bsr _lc print "LC is: " print S0 print "\n" end # index - return the position (I0) of a substring (S1) within a string (S0) _index: clones pushi set I0,0 length I1,S1 length I2,S0 $loop: substr S2,S0,I0,I1 eq S1,S2,$done inc I0,1 lt I0,I2,$loop set I0,-1 branch $done $done: saveI0 popi pops restore I0 ret # tr - convert the string S0 by replacing chars in S1 with those in S2. _tr: clones pushi length I22,S2 length I21,S1 length I20,S0 set I0,0 substr S22,S2,I0,I22 # no set s,s substr S21,S1,I0,I21 substr S20,S0,I0,I20 set S28,"" # our result string. set I29,0 # zero set I28,1 # one set I27,0 $loop: substr S1,S20,I27,I28 substr S0,S21,I29,I21 bsr _index ne I0,-1,$found concat S28,S1 branch $next $found: substr S1,S22,I0,I28 concat S28,S1 $next: print "building string: " print S28 print ", I27= " print I27 print ", I20= " print I20 print "\n" inc I27,1 lt I27,I20,$loop saveS28 popi pops restore S0 ret _lc:clones set S1,"ABCDEFGHIJKLMNOPQRSTUVWXYZ" set S2,"abcdefghijklmnopqrstuvwxyz" bsr _tr saveS0 pops restore S0 ret _uc:clones set S2,"ABCDEFGHIJKLMNOPQRSTUVWXYZ" set S1,"abcdefghijklmnopqrstuvwxyz" bsr _tr saveS0 pops restore S0 ret Here's what it outputs: building string: H, I27= 0, I20= 11 building string: HE, I27= 1, I20= 11 building string: HEL, I27= 2, I20= 11 building string: HELL, I27= 3, I20= 11 building string: HELLO, I27= 4, I20= 11 building string: HELLO , I27= 5, I20= 11 building string: HELLO W, I27= 6, I20= 11 building string: HELLO WO, I27= 7, I20= 11 building string: HELLO WOR, I27= 8, I20= 11 building string: HELLO WORL, I27= 9, I20= 11 building string: HELLO WORLD, I27= 10, I20= 11 UC is: HELLO WORLD building string: dH, I27= 0, I20= 11 building string: dE, I27= 1, I20= 11 building string: dL, I27= 2, I20= 11 building string: dL, I27= 3, I20= 11 building string: dO, I27= 4, I20= 11 building string: d , I27= 5, I20= 11 building string: dW, I27= 6, I20= 11 building string: dO, I27= 7, I20= 11 building string: dR, I27= 8, I20= 11 building string: dL, I27= 9, I20= 11 building string: dD, I27= 10, I20= 11 LC is: dD S28 is somehow not growing the 2nd time a routine using _tr is called...which leads me to believe that some binding of S28 has occurred. Maybe its a dumb typo. I don't know, and I'm stumped. Anyone have any hints? Brian
clone operators
Because I was bored this evening, I implemented the clone operators. Dan? Brian # compute the factorialrecursively! # lets do it for the numbers 0 to 6 main: set I1,0 $loop: print "fact of " print I1 print " is: " set I0,I1 bsr fact print I0 print "\n" inc I1,1 eq I1,7,$done branch $loop $done: end # I0 is the number to compute fact: clonei lt I0,2,$is_one set I1,I0 dec I0,1 bsr fact mul I0,I0,I1 saveI0 branch $done $is_one: set I0,1 saveI0 $done: popi restore I0 ret ? test.pasm ? test.inc ? test.list ? fact.pasm ? interp_guts.c ? config.h ? basic_opcodes.c ? DCn.diff ? macro.pbc ? macro.list ? local_label.pbc ? test.pbc ? euclid.pbc ? macro.pasm ? test2.pbc ? test3.pbc ? mytest.pasm ? cequ.code ? assemble.pl-new_macros ? mytest.pbc ? op_info.c ? frame_test.pbc ? fact.pbc ? frame_test.pasm ? clone.diff ? include/parrot/op_info.h ? t/test.pbc ? t/test1.c ? t/test1 Index: core.ops === RCS file: /home/perlcvs/parrot/core.ops,v retrieving revision 1.4 diff -u -r1.4 core.ops --- core.ops2001/10/15 21:18:42 1.4 +++ core.ops2001/10/16 00:29:05 @@ -1678,6 +1678,35 @@ Parrot_push_p(interpreter); } + + +=item B() + +=item B() + +=item B() + +=item B() + +Save all the registers of the type indicated in the name of the operation. + +=cut + +AUTO_OP clonei() { + Parrot_clone_i(interpreter); +} + +AUTO_OP clonen() { + Parrot_clone_n(interpreter); +} + +AUTO_OP clones() { + Parrot_clone_s(interpreter); +} + +AUTO_OP clonep() { + Parrot_clone_p(interpreter); +} =back Index: register.c === RCS file: /home/perlcvs/parrot/register.c,v retrieving revision 1.10 diff -u -r1.10 register.c --- register.c 2001/10/02 14:01:30 1.10 +++ register.c 2001/10/16 00:29:06 @@ -38,6 +38,39 @@ } } +/*=for api register Parrot_clone_i + pushes a new integer register frame on the frame stack and + copies the last frame to the current frame +*/ +void +Parrot_clone_i(struct Parrot_Interp *interpreter) { +struct IRegChunk *chunk_base; + +chunk_base = CHUNK_BASE(interpreter->int_reg); +/* Do we have any slots left in the current chunk? */ +if (chunk_base->free) { +interpreter->int_reg = &chunk_base->IReg[chunk_base->used++]; +chunk_base->free--; +mem_sys_memcopy(&chunk_base->IReg[chunk_base->used-1], +&chunk_base->IReg[chunk_base->used-2], +sizeof(struct IReg)); +} +/* Nope, so plan B time. Allocate a new chunk of integer register frames */ +else { +struct IRegChunk *new_chunk; +new_chunk = mem_allocate_aligned(sizeof(struct IRegChunk)); +new_chunk->used = 1; +new_chunk->free = FRAMES_PER_INT_REG_CHUNK - 1; +new_chunk->next = NULL; +new_chunk->prev = chunk_base; +chunk_base->next = new_chunk; +mem_sys_memcopy(&new_chunk->IReg[0], +&chunk_base->IReg[chunk_base->used-1], +sizeof(struct IReg)); +interpreter->int_reg = &new_chunk->IReg[0]; +} +} + /*=for api register Parrot_pop_i pops an integer register frame off of the frame stack */ @@ -107,6 +140,42 @@ } } +/*=for api register Parrot_clone_s + pushes a new string register frame on the frame stack and + copies the last frame to the current frame +*/ +void +Parrot_clone_s(struct Parrot_Interp *interpreter) { +struct SRegChunk *chunk_base; + +chunk_base = CHUNK_BASE(interpreter->string_reg); +/* Do we have any slots left in the current chunk? */ +if (chunk_base->free) { +interpreter->string_reg = &chunk_base->SReg[chunk_base->used++]; +chunk_base->free--; +mem_sys_memcopy(&chunk_base->SReg[chunk_base->used-1], +&chunk_base->SReg[chunk_base->used-2], +sizeof(struct SReg)); +} +/* Nope, so plan B time. Allocate a new chunk of string register frames */ +else { +struct SRegChunk *new_chunk; +new_chunk = mem_allocate_aligned(sizeof(struct SRegChunk)); +new_chunk->used = 1; +new_chunk->free = FRAMES_PER_STR_REG_CHUNK - 1; +new_chunk->next = NULL; +new_chunk->prev = chunk_base; +chunk_base->next = new_chunk; +mem_sys_memcopy(&new_chunk->SReg[0], +&chunk_base->SReg[chunk_base->used-1], +sizeof(struct SReg)); +interpreter->string_reg = &new_chunk->SReg[0]; +/* Gotta NULL them out as some string + functions depend on NULL strings */ +Parrot
push* change [was: simple subs...]
Here's a small change to push* which copies the last context automatically. I was thinking about Dan's "clone" opcode, when I realized that most of the time you're going to want the values, and if you don't, you can always issue a clear. It passes all of the make tests, and it allows recursive programs to be written, such as the factorial below. Dan, what do you think? Brian # compute the factorialrecursively! # lets do it for the numbers 0 to 6 main: set I1,0 $loop: print "fact of " print I1 print " is: " set I0,I1 bsr fact print I0 print "\n" inc I1,1 eq I1,7,$done branch $loop $done: end # I0 is the number to compute fact: pushi lt I0,2,$is_one set I1,I0 dec I0,1 bsr fact mul I0,I0,I1 saveI0 branch $done $is_one: set I0,1 saveI0 $done: popi restore I0 ret Index: register.c === RCS file: /home/perlcvs/parrot/register.c,v retrieving revision 1.10 diff -u -r1.10 register.c --- register.c 2001/10/02 14:01:30 1.10 +++ register.c 2001/10/14 01:28:43 @@ -24,6 +24,9 @@ if (chunk_base->free) { interpreter->int_reg = &chunk_base->IReg[chunk_base->used++]; chunk_base->free--; +mem_sys_memcopy(&chunk_base->IReg[chunk_base->used-1], +&chunk_base->IReg[chunk_base->used-2], +sizeof(struct IReg)); } /* Nope, so plan B time. Allocate a new chunk of integer register frames */ else { @@ -34,6 +37,9 @@ new_chunk->next = NULL; new_chunk->prev = chunk_base; chunk_base->next = new_chunk; +mem_sys_memcopy(&new_chunk->IReg[0], +&chunk_base->IReg[chunk_base->used-1], +sizeof(struct IReg)); interpreter->int_reg = &new_chunk->IReg[0]; } } @@ -90,6 +96,9 @@ if (chunk_base->free) { interpreter->string_reg = &chunk_base->SReg[chunk_base->used++]; chunk_base->free--; +mem_sys_memcopy(&chunk_base->SReg[chunk_base->used-1], +&chunk_base->SReg[chunk_base->used-2], +sizeof(struct SReg)); } /* Nope, so plan B time. Allocate a new chunk of string register frames */ else { @@ -100,6 +109,9 @@ new_chunk->next = NULL; new_chunk->prev = chunk_base; chunk_base->next = new_chunk; +mem_sys_memcopy(&new_chunk->SReg[0], +&chunk_base->SReg[chunk_base->used-1], +sizeof(struct SReg)); interpreter->string_reg = &new_chunk->SReg[0]; /* Gotta NULL them out as some string functions depend on NULL strings */ @@ -159,6 +171,9 @@ if (chunk_base->free) { interpreter->num_reg = &chunk_base->NReg[chunk_base->used++]; chunk_base->free--; +mem_sys_memcopy(&chunk_base->NReg[chunk_base->used-1], +&chunk_base->NReg[chunk_base->used-2], +sizeof(struct NReg)); } /* Nope, so plan B time. Allocate a new chunk of float register frames */ else { @@ -169,6 +184,9 @@ new_chunk->next = NULL; new_chunk->prev = chunk_base; chunk_base->next = new_chunk; +mem_sys_memcopy(&new_chunk->NReg[0], +&chunk_base->NReg[chunk_base->used-1], +sizeof(struct NReg)); interpreter->num_reg = &new_chunk->NReg[0]; } } @@ -225,6 +243,9 @@ if (chunk_base->free) { interpreter->pmc_reg = &chunk_base->PReg[chunk_base->used++]; chunk_base->free--; +mem_sys_memcopy(&chunk_base->PReg[chunk_base->used-1], +&chunk_base->PReg[chunk_base->used-2], +sizeof(struct PReg)); } /* Nope, so plan B time. Allocate a new chunk of float register frames */ else { @@ -235,6 +256,9 @@ new_chunk->next = NULL; new_chunk->prev = chunk_base; chunk_base->next = new_chunk; +mem_sys_memcopy(&new_chunk->PReg[0], +&chunk_base->PReg[chunk_base->used-1], +sizeof(struct PReg)); interpreter->pmc_reg = &new_chunk->PReg[0]; /* Gotta NULL them out or we might GC Wrong things later */ Parrot_clear_p(interpreter);
Re: Simple sub support's now in!
On Fri, 2001-10-12 at 16:04, Dan Sugalski wrote: > At 04:00 PM 10/12/2001 -0500, Brian Wheeler wrote: > >On Fri, 2001-10-12 at 15:45, Dan Sugalski wrote: > > > At 03:50 PM 10/12/2001 -0500, Brian Wheeler wrote: > > > >Neat-o, but I do have a question... how do I pass parameters to > > > >recursive subroutines, and/or save registers and not clobber the > > > >caller's? > > > > > > Ah, this doesn't get into that. I'm still not sure what the calling > > > conventions will be. > > > > > > >:) hehe, looks like I'm too eager to abuse it. > > Heh. The more eager the better. > > >In any case, wouldn't calling conventions be a language-dependant thing, > >sort of like the way Pascal and C have different conventions...though > >there might be a common one for calling libraries, etc. > > No, we're going to have a standard calling convention. Various languages > are welcome to work around it if they like, and even go as far as to write > their own opcodes to implement it, but there will be a standard for > high-level subs. (Which is not to say it'll be the same as the low-level > code will use) > Fair enough. I thought more about cross-language stuff after I posted...and the pain of trying to get one language to call the other under DOS. > > > I see we don't have push-with-copy ops for the various register files. I > > > think I'll go fix that. > > > > > > >How do you do "pop, but I want to remember my return value"? > > Save to stack, pop, restore from stack, return. Assuming, of course, that > the caller and callee agree on the registers that hold the return values. > Hmm. I must be missing something. Can you show an example? Brian > Dan > > --"it's like this"--- > Dan Sugalski even samurai > [EMAIL PROTECTED] have teddy bears and even > teddy bears get drunk
Re: Simple sub support's now in!
Neat-o, but I do have a question... how do I pass parameters to recursive subroutines, and/or save registers and not clobber the caller's? Here's a factorial program I wrote: ### Compute the factorial recursively ### based on the scheme classic: ### (define fact (lambda(n) (if (= 1 n) 1 (* n (fact (- n 1)) set I0, 3 saveI0 bsr fact restore I1 print "Factorial of " print I0 print " is " print I1 print "\n" end fact: push_i restore I0 print "called fact! parameter=" print I0 print ", depth: " print I31 inc I31,1 eq I0,0,$is_one eq I0,1,$is_one set I1,I0 dec I1,1 saveI1 bsr fact restore I1 mul I0,I0,I1 saveI0 branch $done $is_one: set I0,1 saveI0 $done: pop_i ret But alas, it cannot work: since there's only a single stack, and there's no way to peek down into it (that I could find), there's no way to pass params via the stack...or not clobber another's registers. It might be interesting to have 3 stacks: * branches * interpreter contexts * data Though that might become a little tedious...hehe, but it would allow my above example to work :) Brian On Fri, 2001-10-12 at 15:00, Dan Sugalski wrote: > Okay, I finished the support for the generic stack. We can now: > >*) save and restore registers (using the save and restore opcodes) > >*) return from subs using the address on the top of the stack (using > the ret opcode) > >*) Branch to a label using the bsr opcode, and remember where we were > for later returning > > At the moment we've only got relative sub calls using the bsr (well, bsr_i > and bsr_ic opcodes, but close enough...) opcode. jsr will come as soon as > we can reasonably get absolute addresses. Soon, I think. > > Dan > > --"it's like this"--- > Dan Sugalski even samurai > [EMAIL PROTECTED] have teddy bears and even > teddy bears get drunk
Re: Fetching the PC? [RESEND]
Argh, my mailer crashed as I sent this, so I don't know if it went out. On Thu, 2001-10-11 at 21:23, Dan Sugalski wrote: > At 09:12 PM 10/11/2001 -0500, Brian Wheeler wrote: > >On Thu, 2001-10-11 at 20:49, Dan Sugalski wrote: > > > At 08:25 PM 10/11/2001 -0500, Brian Wheeler wrote: > > > >Since we're passing guilt around, there's an equate of '*' which is the > > > >current PC...and I didn't document it. You can do > > > > set I1,* > > > >and it will set I1 to the current PC. It doesn't allow any math, > > > >though. I thought about hooking up eval to various brackets but I never > > > >got the time before my job got busy... > > > > > > Absolute or relative PC? > > > >Well, its relative to the start of the bytecode...which I suppose would > >be absolute...unless multiple bytecode chunks are placed in the same > >memory block, in which case it'd be relative. Now I'm confused. :) > > Absolute addresses are, well, absolute addresses. Relative addresses are > offsets from the current location. > > I think the confusion's because the jump opcode's broken. When you say > >jump 12 > > It should jump to absolute address 12, not 12 bytes/words/opcodes from the > current position. > Ok, fair enough...but should it be jumping to: (char *)byte_code+12 or (char *)12 If its the former, it seems there will be some problems jumping to other bytecode blocks, and if its the latter, at the very least someone can do a "jump 0" and coredump the process. I could be missing stuff, because I've not had a chance to review the changes over the last few weeks > > > >Though I like Gregor's way of doing it: we know the addresses (more or > > > >less) at compile time, so we might as well not waste ops doing > > > >arithmetic that we know in advance... > > > > > > Fair enough, though we don't really know the absolute PC at assembly time, > > > as we're all position independent. Thinking further, having the getpc > > > opcode take an offset would let us do something like: > > > > > >getpc I0, FOO > > > > > > to put the absolute address of FOO into I0, suitable for jumps and jsrs. > > > > > > >This just comes out as a specialized add, right? In fact, isn't it > >this: > > set I0,* > > inc I0,FOO <-- assuming the assembler knew that this is an address > > Yup. Only set doesn't take * as a parameter--it can't, because there's no > way to know at assembly time what the real PC will be. > Yeah, I thought about that afterwards (the condition where real PC is actually a memory address) > >How are multiple bytecode chunks (i.e. libraries) going to be handled? > > They're just going to get mmapped in wherever the system puts 'em. > > >Are they going to be contiguous? > > Nope. > > >Are they going to be segmented somehow > >so there's a "far jump" which takes us out of the current block? > > Nope. Jumps and jsrs take absolute addresses, so they can go anywhere. > Branches are relative so fixing them up to bounce between segments would be > tough, but we're not going to do that. :) > > Fair enough :) Brian > Dan > > --"it's like this"--- > Dan Sugalski even samurai > [EMAIL PROTECTED] have teddy bears and even > teddy bears get drunk
Re: Fetching the PC?
On Thu, 2001-10-11 at 20:49, Dan Sugalski wrote: > At 08:25 PM 10/11/2001 -0500, Brian Wheeler wrote: > >Since we're passing guilt around, there's an equate of '*' which is the > >current PC...and I didn't document it. You can do > > set I1,* > >and it will set I1 to the current PC. It doesn't allow any math, > >though. I thought about hooking up eval to various brackets but I never > >got the time before my job got busy... > > Absolute or relative PC? Well, its relative to the start of the bytecode...which I suppose would be absolute...unless multiple bytecode chunks are placed in the same memory block, in which case it'd be relative. Now I'm confused. :) > > >Though I like Gregor's way of doing it: we know the addresses (more or > >less) at compile time, so we might as well not waste ops doing > >arithmetic that we know in advance... > > Fair enough, though we don't really know the absolute PC at assembly time, > as we're all position independent. Thinking further, having the getpc > opcode take an offset would let us do something like: > >getpc I0, FOO > > to put the absolute address of FOO into I0, suitable for jumps and jsrs. > This just comes out as a specialized add, right? In fact, isn't it this: set I0,* inc I0,FOO <-- assuming the assembler knew that this is an address How are multiple bytecode chunks (i.e. libraries) going to be handled? Are they going to be contiguous? Are they going to be segmented somehow so there's a "far jump" which takes us out of the current block? Brian
Re: Fetching the PC?
On Thu, 2001-10-11 at 19:49, Dan Sugalski wrote: > At 06:05 PM 10/11/2001 -0400, Gregor N. Purdy wrote: > >I'm guilty. > > > >I needed address arithmetic for Jako subroutine support. I also needed > >a quick and easy way to detect it in the .pasm file. I use the square > >brackes as a quotation device to make it easy to parse. Eventually > >we will need an assembler with a syntax that is *designed* rather > >than *evolved* (this is not intended as a disparaging comment), at > >which point the syntax for various constructs would be up for grabs. > > Are you sure it's not just stolen? Some of us do have extensive (if really, > really old) experience with macro assemblers :) > > >For example, I think registers should start with '%' or some other > >sigil, since I could very reasonably want to use a label that looked > >like a register reference. > > Heh. Well, too bad--use S0 and get string register 0. I'm OK with that. > > >Anyway, we definitely should have address arithmetic documented in > >the assembler docs, and I should have written it right at the outset, > >but I forgot. > > While I think I'd prefer * as the PC indicator, @ is OK and it won't > conflict with constant arithmetic someone might want to do. > Since we're passing guilt around, there's an equate of '*' which is the current PC...and I didn't document it. You can do set I1,* and it will set I1 to the current PC. It doesn't allow any math, though. I thought about hooking up eval to various brackets but I never got the time before my job got busy... > I think we need some opcode that'll fetch the current PC and stick it in a > register. Maybe: > > getpc I0 > > to put the PC for the next instruction into I0. > Sort of what happens above, though its one off. > Having "fake" opcodes would be useful, like so: > > getlabelabs I0, FOO > > to put the absolute address of the label FOO into I0, which'd probably > translate into: > > getPC I0 > add I0, I0, FOO > > or something like that. > Though I like Gregor's way of doing it: we know the addresses (more or less) at compile time, so we might as well not waste ops doing arithmetic that we know in advance... Brian > Dan > > --"it's like this"--- > Dan Sugalski even samurai > [EMAIL PROTECTED] have teddy bears and even > teddy bears get drunk
Re: Manifest constants?
On Tue, 2001-10-02 at 08:23, Dan Sugalski wrote: > >Is 'pi' a string to be looked up in a table at assemble time and > >converted to an "intrinsic constant table index" > > Yes. At some point the assembler needs to have a way to declare named > constants, we just haven't gotten there yet. > How about the attached patch? It adds these directives: DCI, DCN, DCP, DCS for "Define Constant (Integer, Number, PMC, String)" It (currently) works by abusing the equates. In the cases of Integer and PMC, it is exactly equivalent. In the case of Number and String, the value of the equate is [sc:#] or [nc:#] which is the index into the look up table. It could easily be expanded to store these constant names somewhere other than the equate hash. Since we're anticipating the release of 0.02, I didn't apply it, and I'd like feedback on it. Brian Index: Parrot/Assembler.pm === RCS file: /home/perlcvs/parrot/Parrot/Assembler.pm,v retrieving revision 1.3 diff -u -r1.3 Assembler.pm --- Parrot/Assembler.pm 2001/10/03 23:58:41 1.3 +++ Parrot/Assembler.pm 2001/10/04 00:29:23 @@ -670,7 +670,8 @@ sub has_asm_directive { return $_[0] =~ /^[_a-zA-Z]\w*\s+macro\s+.+$/i || - $_[0] =~ /^[_a-zA-Z]\w*\s+equ\s+.+$/i; + $_[0] =~ /^[_a-zA-Z]\w*\s+equ\s+.+$/i || +$_[0] =~ /^[_a-zA-Z]\w*\s+dc[inps]\s+.+$/i ; } @@ -678,8 +679,10 @@ =head2 handle_asm_directive -Processes macros and equ directives. equ directives get stored in an equ hash. -Macros store all program lines in an array. +Processes macro definitions, dc* declarations, and equ directives. equ +directives get stored in an equ hash. Macros store all program lines in an +array. dc* declarations create a constant and an equ which refers to the +constant. NOTE: This function modifies @program. @@ -691,6 +694,26 @@ my( $name, $data ) = ($1, $2); $equate{$name} = $data; return 1; + } + elsif( $line =~ /^([_a-zA-Z]\w*)\s+dc([inps])\s+(.+)$/i ) { + my( $name, $type, $data ) = ($1, $2, $3); + if( $type eq "i" ) { + # TODO: we cheat on integers, since they really don't go in the + # constants table. + $equate{$name} = $data; + } + elsif( $type eq "n" ) { + $equate{$name} = constantize_number($data); + } + elsif( $type eq "p") { + # TODO: what does a PMC constant look like? + $equate{$name} = $data; + } + elsif( $type eq "s") { + $data=~s/^"(.+)"$/$1/; + $equate{$name} = constantize_string($data); + } + return 1; } elsif( $line =~ /^([_a-zA-Z]\w*)\s+macro\s+(.+)$/i ) { # a macro definition
Re: [BUG] Mandlebrot core
On Wed, 2001-10-03 at 14:43, Leon Brocard wrote: > Leon Brocard sent the following bits through the ether: > > > It's all greek to me, Leon > > The following bytecode: > > 0075 [01d4]: 0032 0008 000a 0008 >gt_nc_ic N8, [nc:10], L2 > > is wrong. It shouldn't be [nc:10]. 10 is the numeric constant which > should have been in the constant table ([nc:10] refers to the 10th > constant, there are only constants 0-6 hence the crash), so I point > the finger at the assembler (of which I little). But I know more about > debugging coredumps now, honest. Its sort of the assembler, but an easy fix is to make the constant "10.0" instead of 10. Brian > > Leon > -- > Leon Brocard.http://www.astray.com/ > Nanoware...http://www.nanoware.org/ > > ... I tried to think but nothing happened!
Re: instructions per second benchmark (in parrot ;)
On Thu, 2001-09-20 at 16:46, Dan Sugalski wrote: > At 04:54 PM 9/20/2001 -0500, Brian Wheeler wrote: > >Since all benchmarks are crap anyway, I've written a test which tells > >the average number of instructions per second. On my athlon 700 I get > >3966053 instructions per second and on my PIII 866 I get 5081485 > >instructions per second. Do those sound like reasonable numbers? Of > >course, since time_i is one of the opcodes looped, it probably brings > >the numbers down. > > That's actually what test.pasm tests. :) I just checked in a new version > that prints labels. Yeah, I realized that _as soon as I posted it_. Doh! Heheh, for the longest time I thought test.pasm just did "stuff" for a while. Anyway, it started out as an implementation of the bogomips.c program and slowly became this. > > FWIW, my 600MHz Alpha clocks in at around 23M ops/sec. Nyah! ;-P > I get 10M ops/sec on the Athlon 700 using test.pasm. time_i is a killer, especially judging by the difference in cpu usage (test.pasm is nearly 100% user, and mine is 75/25 user/system) The 866 machine gets 15M. Maybe when/if I get my 1.26GHz machine at work I'll be able to match you :) Brian > Dan > > --"it's like this"--- > Dan Sugalski even samurai > [EMAIL PROTECTED] have teddy bears and even > teddy bears get drunk
instructions per second benchmark (in parrot ;)
Since all benchmarks are crap anyway, I've written a test which tells the average number of instructions per second. On my athlon 700 I get 3966053 instructions per second and on my PIII 866 I get 5081485 instructions per second. Do those sound like reasonable numbers? Of course, since time_i is one of the opcodes looped, it probably brings the numbers down. Brian # Compute instructions/sec using parrot asm # Brian Wheeler [EMAIL PROTECTED] # loop 6 times, throwing out the first one, and compute the average loops equ I32 counter equ I31 stime equ I30 etime equ I29 ctime equ I28 sum equ I27 mintime equ 2 main: eq loops,6,done set counter,0 timestime set etime,stime inc etime,mintime $again: inc counter timectime ne etime,ctime,$again set I1,3 set I2,mintime mul counter,counter,I1 div counter,counter,I2 # we do not count the first iteration since it has the # biggest error. (we probably started near the middle of a # second) inc loops eq loops,1,main add sum,sum,counter branch main done: set I1,5 div I2,sum,I1 print I2 print " instructions per second\n" end
Re: Changes to assemble.pl: Includes and Macros
On Thu, 2001-09-20 at 00:38, Simon Cozens wrote: > On Wed, Sep 19, 2001 at 11:29:08PM -0500, Brian Wheeler wrote: > > The only incompatibility I've introduced is now assemble.pl won't read > > from stdin...you have to give it a filename. Patches welcome! > > And, of course, these beautiful new features don't appear in the > assembler documentation. This could be because the assembler documentation > doesn't exist, but... :) > There's always time for each user to (re)read the source, but never enough time for a developer to write decent documentation :) Brian
Changes to assemble.pl: Includes and Macros
Please test this out to make sure I haven't done anything stupid! The syntax for including another file is: include 'filename' or include "filename" The file will be included as-is at that spot. The listing will reflect that a series of lines was included by printing # Start of filename # End of filename at the beginning and end of the included file. Macros are defined like: println macro thing print thing print "\n" endm and invoked like println "My happy string" in the listing, expanded macros appear with ">" in front of the code. You should be able to include up to any depth, but I didn't test it. In addition, macros should be able to use other macros. I did some basic tests, but nothing extensive. The only incompatibility I've introduced is now assemble.pl won't read from stdin...you have to give it a filename. Patches welcome! Enjoy! Brian
Re: Assembler missing byte code segment header?
As near as I can tell, its always been just the bytecode following without a length specifier. I was going to play with it, but since we're still deciding on the file format, I thought I'd leave it alone. Brian On Mon, 2001-09-17 at 16:06, Gregor N. Purdy wrote: > All -- > > I'm not certain about this, but it sure doesn't look right to me. > (note I did this after turning off fingerprint saving in the > assembler for clarity): > > $ perl -e 'print "set_i I0, 4\nend\n"' | assemble.pl | od -x > 000 55a1 0131 0002 > 020 0004 > 034 > > * The first four bytes are the magic number > * The next four bytes are all zero for "no fixup table here" > * The next four bytes are all zero for "no const table here" > * The next four bytes should be pack('l', 16) for "16 bytes >of byte code follow (NOTE: THESE SEEM TO BE MISSING) > > Now, the disassembler and interpreter don't seem to care, but I > thought the current format involved 3 or 4 segments in the > length-payload pair format. The fourth (optional) segment according > to the docs is a place to store the source code. > > > Regards, > > -- Gregor > _ > / perl -e 'srand(-2091643526); print chr rand 90 for (0..4)' \ > >Gregor N. Purdy [EMAIL PROTECTED] >Focus Research, Inc.http://www.focusresearch.com/ >8080 Beckett Center Drive #203 513-860-3570 vox >West Chester, OH 45069 513-860-3579 fax > \_/
Re: Wanted: Subroutine call example
On Mon, 2001-09-17 at 11:20, Gregor N. Purdy wrote: > > I agree that jsr/ret are what I really want, but I'm dying to play > with baby subroutines in jako, and I think I could play enough games > with a properly understood jump_i and some assembler magic to make > them work. I now have jump.pasm working right (the key was to think > in terms of word offsets, not byte offsets). :) I think I read somewhere that impatience was a good thing... > > > > It would also be helpful to have jump_ic and absjump_ic, rather than > > > having to load these things into registers. > > > > You can use branch_ic for this. absjump is probably a bad idea since it > > appears the return value of the opcode function is really a memory > > address of the host machine. If an absolute value (relative to the > > beginning of the opcode stream) was used, then it would have to be > > relocated at load-time and wouldn't be able to remain read-only. > > My mental model for this thing is "Parrot is a CPU architecture" and > a lot of what I'm trying to make it do is behave as much as possible > in that way. So, if we are talking about a CPU architecture (ISA), > lacking the full complement of relative and absolute jumps, register > and constant comparisons, etc. is a bummer. I think we're ok on relative ops, but you're right, we're short of absolute. > > Mucking with things like memory addresses of host machines, etc. is > important, but only at the interpreter level, not at the ISA design > level (IMHO). > > I don't see a relocation issue anyway, since the way I look at the > interpreter code is: > > PC === (code - start_code) > > So if an instruction says "absjump 43", then I know that I need > to do: > > code = (start_code + 43) > > and if an instruction says "(rel)jump 43" then I know that I need > to do: > > code += 43 > > neither of these is a problem in my eyes. > Ok, they're not really problems, per se, but the way the interpreter is constructed, the ops *always* return the offset to the next PC. > > Overall, I think you're barking up the wrong tree. jsr/ret are for > > subroutines. Your subroutine isn't going to work for more than one > > caller, since the return offset is hardcoded, you might as well write > > jump.pasm as: > > Not quite. First, recall jump.pasm (attached). It does work now that > I've got the offsets correct. Now, there is nothing to stop us from > moving the setting of I2 to earlier (jump2.pasm, attached). With > appropriate labels and address arithmetic (jump3.pasm, attached) it > is even almost readable. > > Now, if we also have good macros and '* == PC' and calculated labels, > we can have some real fun (jump4.pasm, attached). > Ok, fair enough. I've implemented '*' as the PC in the assembler. I've not done macros yet, but I remember seeing a patch somewhere. > Of course, this doesn't allow us to do recursion, wastes registers, > etc. But, it would allow me to play with the syntax of the language > with a simple implementation until cooler ops are available. > :) there are real architectures that couldn't (easily, if at all) do recursion. I think the PDP-8 was one of them. > > We're going to have to wait for Dan (or, you could manually apply the > > patch I posted earlier which adds a simple jsr/ret implementation) > > Probably it would be best to wait. But my Impatience is showing... > Like I said...I think I've heard its a good thing :) Brian
Re: Wanted: Subroutine call example
On Sun, 2001-09-16 at 14:26, Gregor N. Purdy wrote: > Brian -- > > > Its not going to work, if I understand it correctly. I tried doing it > > (even set up the symbol '*' to mean the current PC) and do it, but it > > seems the ops take a relative offset. Take jump_i, for example: > > Taking this into account, I modivied jump.pasm and fixed the offsets. > I tested it this way... > > $ assemble.pl --output t/jump.pbc --listing t/jump.list t/jump.pasm > $ test_prog t/jump.pbc > Jump test. > Jumping to subroutine... > $ perl disassemble.pl t/jump.pbc > > > I also did 'vi t/jump.list'. With these hard-coded relative offsets, I > think the program should be producing the desired output, but it > doesn't. If it did, I could do some tricks in jakoc and/or assemble.pl > to create a pseudo-op for jumping to any label, and another for > setting up the return-offset based on the end-label of the destination > block and the label we want it to return to. It sure would be nice, > though, to have a real 'absjump_i' opcode (really start-of-bytecode- > relative rather than current-program-counter relative). > This won't be needed once jsr/ret become available. ret is (really) the absjump_i you're looking for since its only purpose is to return to a previous place. Everything else can be setup as relative to the PC. > It would also be helpful to have jump_ic and absjump_ic, rather than > having to load these things into registers. You can use branch_ic for this. absjump is probably a bad idea since it appears the return value of the opcode function is really a memory address of the host machine. If an absolute value (relative to the beginning of the opcode stream) was used, then it would have to be relocated at load-time and wouldn't be able to remain read-only. > > Anyway, I'm attaching the revised jump.pasm example. Maybe I still > don't have the offsets calculated right, but I stared at jump.list > and the disassembly listing for a while and I *think* they are > right. > > I wish I had a tracing mode where I could watch exactly where it > was jumping to and what ops it was executing. I still wonder if > somehow its jumping out of the bytecode and landing on an op zero > (end). Jumping out of the bytecode *should* raise an error, though, > no? > > If I'm not using this right, I sure would like to see an example of > the correct use of jump_i... I've not figured out what jump_i is for :) I suspect its for calculated jumps (i.e. the switch statement) rather than for subroutines. Overall, I think you're barking up the wrong tree. jsr/ret are for subroutines. Your subroutine isn't going to work for more than one caller, since the return offset is hardcoded, you might as well write jump.pasm as: MAIN: print "Jump test.\n" print "Jumping to subroutine...\n" branch SUB RET:print "Returned from subroutine!\n" end SUB:print "Entered subroutine...\n" branch RET because the bytecodes dumped out are virtually identical. We're going to have to wait for Dan (or, you could manually apply the patch I posted earlier which adds a simple jsr/ret implementation) Brian
Re: Wanted: Subroutine call example
On Sun, 2001-09-16 at 11:51, Gregor N. Purdy wrote: > All -- > > > Anyone care to post a subcall.pasm example file that shows the > > implementation of a subroutine and a call to it? I was thinking > > of starting from euclid.pasm (since it has two args), but I'm > > not sure I understand what the calling convention is really > > supposed to look like... > > Here are a couple of attempts using just jump_i. I've played a bit > with the assembler locally to try to make it allow 'set I0, LABEL', > followed by 'jump I0', but I think it always calculates relatative > offsets for labels as needec by branch, et al. Do we need an > assembler syntax for 'absolute location of this label'? > > I think jump.pasm should work (I've hardcoded the jump-to locations > as hex numbers), but it doesn't. > > The jumpsub.pasm example is closer to what I'd really like to do... > > Any thoughts? > > Its not going to work, if I understand it correctly. I tried doing it (even set up the symbol '*' to mean the current PC) and do it, but it seems the ops take a relative offset. Take jump_i, for example: /* JUMP Ix */ MANUAL_OP jump_i { RETURN(INT_REG(P1)); } becomes /* JUMP Ix */ IV *jump_i(IV cur_opcode[], struct Parrot_Interp *interpreter) { IV return_offset = 1; return cur_opcode + INT_REG(cur_opcode[1]); return cur_opcode + return_offset; } So everything has to be done as a relative offset to the current PC. I had a patch early on which implemented jsr_ic and ret, but they were rejected pending Dan's work...and rightly so, because I cheated and returned the PC I wanted, bypassing the RETURN() macro :) Brian > Regards, > > -- Gregor > _ > / perl -e 'srand(-2091643526); print chr rand 90 for (0..4)' \ > >Gregor N. Purdy [EMAIL PROTECTED] >Focus Research, Inc.http://www.focusresearch.com/ >8080 Beckett Center Drive #203 513-860-3570 vox >West Chester, OH 45069 513-860-3579 fax > \_/
RFC: bytecode file format (again)
After reading various posts about the bytcode file format it occurred to me that we need to determine what we need :) Meta-information: * Magic cookie * version * endian/size markers * index of all chunks for fast lookup Things we need to store: * bytecode * external symbols/list of modules required * public symbols * source code * raw data (think: predefined PMCs, DATA sections ,etc) * other stuff :) So, in the spirit of KISS, what about this format: Offset Size Description 0 4 Magic Cookie 'PAR0' 4 8 Endian mark (0x0123456789ABCDEF) 12 4 Version (Major * 0x1 + Minor) 16 4 Size Marker 20 12 Padding/Reserved for future use 32 n Index (n=# of entries * 16) 32+n m Data blocks, stored sequentially Each index entry 16 bytes, and is laid out like:: Offset Size Description 0 4 Type (bytecode, comment, source, fixup, etc) 4 4 ID (to differentiate blocks of the same type) 8 4 Offset (offset from beginning of file of data) 12 4 Length (in bytes) Block IDs are used when we need to differentiate between different blocks of the same type, i.e. constant data or multiple sets of bytecode in the same file as in libraries. Given this overall format, the format of each data block varies depending on its type. Type_id Name Ignore/Empty/Invalid - This block is ignored 0001 Bytecode 0002 String Constants 0003 External Symbols 0004 Public Symbols 0005 Source Code 0006 Debugging Info Thoughts on *this* can of worms? Brian
Local labels in assemble.pl
I've commited a change which allows local labels to be used in parrot. The labels are local relative to the last non-local label defined (i.e. local labels are forgotten when a non-local is defined). Here's my test program: main: print "test 1\n" branch $ok $ng:print "ng 1\n" branch test2 $ok:print "ok 1\n" test2: print "test 2\n" branch $ok $ng:print "ng 2\n" branch done $ok:print "ok 2\n" done: end It should make automatic code generation easier for jako. If a listing is produced, local labels are _not_ shown. Brian
Re: RFC: Bytecode file format
On Fri, 2001-09-14 at 15:42, Dan Sugalski wrote: > At 03:10 PM 9/14/2001 -0500, Brian Wheeler wrote: > >I've been thinking alot about the bytecode file format lately. Its > >going to get really gross really fast when we start adding other > >(optional) sections to the code. > > > >So, with that in mind, here's what I propose: > > > >* All data sizes are in longwords (4 bytes) because that's just the way > >things are :) > > Nope. At the very least, a bytecode file needs to start with: > > 8-byte word:endianness (magic value 0x123456789abcdef0) > byte: word size > byte[7]:empty > word: major version > word: minor version > > Where all word values are as big as the word size says they are. > > The magic value can be something else, but it should byteswap such that if > you read it in you can tell whether it was a big-endian write or a > little-endian write. > > Followed by the offsets to the various segments in the file. The directory > (basically the offsets) *must* be at the very beginning, right after the > header. > > Also, don't choose a stream format that makes writing the file easy--choose > one that makes reading easy and fast. Two or three passes before writing > are fine. > And...in another message > Heh. Doing our bytecode files in IFF format would be really nifty. We > could even see about raiding the old Fred Fish library for C source to > read and write 'em... :) Fred Fish...that's a name I've not heard in a long time. I still have a floppy or two with "fish disk ###" stuck on them. Ok, I'll admit I was initially considering IFF but it seemed too scary...but then it could be residual feelings from when I was a youngster and tried to write them with BASIC :) Ok, what if we did IFF with these caveats: * all chunks must be padded to 4 bytes (instead of IFF's 2) * no nesting of FORMs Chunks we'd need are: Name: 'PINF' - Parrot Information Size: 28 bytes + size of directory Optional: No Data: longmagic cookie (or will PINF) be enough? 8-byte word:endianness (magic value 0x123456789abcdef0) byte: word size byte[7]:empty word: major version word: minor version longcount of directory entries --- directory goes here --- -- each entry as follows -- longtype of chunk longoffset Name: 'PBYT' - Parrot Bytecode Size: Varies Optional: Sure. :) Data: bytes of the bytecode Name: 'PSTR' - Parrot String Table Size: Varies Optional: Yes Data: longCount of string entries --- each string as follows --- longbyte length n bytes + pad string data Name: 'PFIX' - Parrot Fixup Table Size: Varies Optional: Yes Data: --- beats me...how are we doing fixups? --- Name: 'PNOT' - Parrot Notes Block Size: Varies Optional: Yes Data: free-form text for 'notes' about the file. How's this? Brian
Re: RFC: Bytecode file format
On Fri, 2001-09-14 at 15:44, Buddha Buck wrote: > At 03:10 PM 09-14-2001 -0500, Brian Wheeler wrote: > >I've been thinking alot about the bytecode file format lately. Its > >going to get really gross really fast when we start adding other > >(optional) sections to the code. > > > >So, with that in mind, here's what I propose: > > > > > >What do you guys think? > > Have you taken a look at the old Amiga IFF format? It consisted mainly of > "chunks" identified by a 32-bit type code and a chunk-length code. While > most implementations were for specific multi-media applications (chunks > defining sound formats, chunks defining image formats, etc), the standard > itself was data-neutral. :) That was one of the references I used. I looked up these formats on wotsit before proposing: * IFF Consistant, but you've got to scan the whole file to find out ifa chunk is even there. * Doom WAD Clean format, mini-directory at the end. * ELF Overkill, I think, but does give some ideas of the types of sections we'll be needing. > > I believe that Microsoft is using a derivative of that format for some of > its files, and I think that TIFF files are another instantiation. > > It may be worth looking at to avoid re-inventing wheels. > > Agreed. Brian
RFC: Bytecode file format
I've been thinking alot about the bytecode file format lately. Its going to get really gross really fast when we start adding other (optional) sections to the code. So, with that in mind, here's what I propose: * All data sizes are in longwords (4 bytes) because that's just the way things are :) * The file is composed of a header (which is really just a magic cookie) , a series of data chunks, and a directory (of sorts) Offset Length Description 0 1 Magic Cookie (0x013155a1) 1 n Data n+1 m Directory Table m+n+1 1 Offset of beginning of directory table (i.e. n+1) The directory is after the data so offsets can be determined as the data is written. The directory offset is at the very end, so it can be determined before the directory is written, and easily found by loaders. Each Directory Entry consists of 3 longs: Offset Length Description 0 1 ID/Type of chunk 1 1 Size of chunk 2 1 Offset from beginning of file Chunk types we've seen so far are: Bytecode, Fixup, String Table. Just guessing, but we'll probably need things likes symbol tables, source chunks, notes, etc. What do you guys think? Brian
Re: Calls for a Macro Assembler Guy
On Fri, 2001-09-14 at 10:20, Dan Sugalski wrote: > Okay, we've had a number of people in favor of a good macro assembler for > Parrot. Given that, do we have anyone who'll volunteer to define, maintain, > and extend the thing? (Presuming we jump off of the current assembler, > which seems reasonable) > > There probably isn't a huge amount to do with the thing--maintain macro > substitutions, handle local labels, manage sub definitions, and suchlike > things. > Wouldn't it largely be just filtering the input through cpp? Brian > Anyone? > > Dan > > --"it's like this"--- > Dan Sugalski even samurai > [EMAIL PROTECTED] have teddy bears and even > teddy bears get drunk
segfaults in today's cvs
I've tracked it down to string problems. Looks like: init_bytecode(program_code); calls read_constants_table(&program_code); calls Parrot_string_constants[i++] = string_make(*program_code /* ouch */, buflen, encoding, flags, type); calls string_compute_strlen(s); which segfaults. Any ideas what's going on? Brian
patch: assembly listings from assembler
This patch does a couple of things: * uses Getopt::Long for options. -c is now --checksyntax. I wasn't sure how to keep compatible (patches welcome!) * options include: --help --version --verbose --output=file --listing=file --checksyntax * produces verbose listing of what the assembler saw :) Only one nitpick with it: unknown symbols are given as 0x, unfortunately, this includes symbols which may be defined later in the file (i.e. forward jumps). Brian Index: assemble.pl === RCS file: /home/perlcvs/parrot/assemble.pl,v retrieving revision 1.14 diff -r1.14 assemble.pl 7a8 > use Getopt::Long; 9,12c10,33 < my $opt_c; < if (@ARGV and $ARGV[0] eq "-c") { < shift @ARGV; < $opt_c = 1; --- > my %options; > GetOptions(\%options,('checksyntax', > 'help', > 'version', > 'verbose', > 'output=s', > 'listing=s')); > > if($options{'version'}) { > print $0,'Version $Id$ ',"\n"; > exit; > } > > if($options{'help'}) { > print "$0 - Parrot Assembler > Options: >--checksyntaxCheck assembler syntax only, no output >--help This text >--listingDump assembly listing to file >--output File to dump bytecode into >--verboseShow what's going on >--versionShow assembler version > "; > exit; 14a36,44 > if(exists($options{'output'}) && $options{'output'} eq "") { > print STDERR "You must provide a file with --output flag!\n"; > exit; > } > > if(exists($options{'listing'}) && $options{'listing'} eq "") { > print STDERR "You must provide a file with --listing flag!\n"; > exit; > } 54a85,87 > my $listing="PARROT ASSEMBLY LISTING - ".scalar(localtime)."\n\n"; > > 62a96 > my $sline=$_; 65c99,104 < next if(/^\#/ || $_ eq ""); --- > if(/^\#/ || $_ eq "") { > if($options{'listing'}) { > $listing.=sprintf("%4d %08x %-44s %s\n", $line, $op_pc, '',$sline); > } > next; > } 121a161 > log_message("substituting $op for $opcode"); 161a202,209 > if($options{'listing'}) { > # add line to listing. > my $odata; > foreach (unpack('l*',substr($bytecode,$op_pc))) { > $odata.=sprintf("%08x ",$_); > } > $listing.=sprintf("%4d %08x %-44s %s\n", $line, $op_pc, $odata,$sline); > } 163a212 > $listing.="\n" if($options{'listing'}); 173c222,229 < # FIXUP --- > # FIXUP (also, dump listing symbols) > if($options{'listing'}) { > $listing.="DEFINED SYMBOLS:\n"; > foreach (sort(keys(%label))) { > $listing.=sprintf("\t%08x %s\n",$label{$_},$_); > } > } > 184c240,244 < exit; --- > $listing.="\nUNDEFINED SYMBOLS:\n"; > foreach (sort(keys(%fixup))) { > $listing.="\t$_\n"; > } > exit; # some day, unresolved symbols won't be an error! 195a256,258 > if($options{'listing'}) { > $listing.="\nSTRING CONSTANTS\n"; > } 196a260 > my $counter=0; 204c268,271 <} --- > } > $listing.=sprintf("\t%04x %08x [[%s]]\n",$counter,length($_),$_) > if($options{'listing'}); > $counter++; 217,218c284,298 < if(!$opt_c) { < print $output; --- > if(!$options{'checksyntax'}) { > if($options{'output'} ne "") { > open O,">$options{'output'}" || die $!; > print O $output; > close O; > } else { > print $output; > } > } > > > if($options{'listing'}) { > open L,">$options{'listing'}" || die $!; > print L $listing; > close L; 224a305,311 > } > > sub log_message { > my($message)=@_; > if($options{'verbose'}) { > print STDERR "INFO ($line): $message\n"; > }
patch: bitops with constants
This patch allows you to do thingies like: and I1,I2,0x 'and', 'or', and 'xor' have been adapted to use this. Also, shl and shr can take an integer register as the amount to shift. Brian ? pasm.pl ? patch ? test2.pbc ? test3.pbc ? euclid0.pbc ? euclid1.pbc ? euclid.pbc ? const_bitops.diff ? bitops.pbc ? t/bitops.pasm Index: basic_opcodes.ops === RCS file: /home/perlcvs/parrot/basic_opcodes.ops,v retrieving revision 1.14 diff -u -r1.14 basic_opcodes.ops --- basic_opcodes.ops 2001/09/13 16:16:38 1.14 +++ basic_opcodes.ops 2001/09/13 19:13:43 @@ -562,6 +562,12 @@ INT_REG(P1) = INT_REG(P2) & INT_REG(P3); } +/* AND_i_ic */ +AUTO_OP and_i_ic { + INT_REG(P1) = INT_REG(P2) & P3; +} + + /* NOT_i */ AUTO_OP not_i { INT_REG(P1) = ! INT_REG(P2); @@ -572,17 +578,37 @@ INT_REG(P1) = INT_REG(P2) | INT_REG(P3); } +/* OR_i_ic */ +AUTO_OP or_i_ic { + INT_REG(P1) = INT_REG(P2) | P3; +} + /* SHL_i_ic */ AUTO_OP shl_i_ic { INT_REG(P1) = INT_REG(P2) << P3; } +/* SHL_i */ +AUTO_OP shl_i { + INT_REG(P1) = INT_REG(P2) << INT_REG(P3); +} + /* SHR_i_ic */ AUTO_OP shr_i_ic { INT_REG(P1) = INT_REG(P2) >> P3; } +/* SHR_i */ +AUTO_OP shr_i { + INT_REG(P1) = INT_REG(P2) >> INT_REG(P3); +} + /* XOR_i */ AUTO_OP xor_i { INT_REG(P1) = INT_REG(P2) ^ INT_REG(P3); +} + +/* XOR_i_ic */ +AUTO_OP xor_i_ic { + INT_REG(P1) = INT_REG(P2) ^ P3; } Index: opcode_table === RCS file: /home/perlcvs/parrot/opcode_table,v retrieving revision 1.13 diff -u -r1.13 opcode_table --- opcode_table2001/09/13 16:16:38 1.13 +++ opcode_table2001/09/13 19:13:44 @@ -152,9 +152,15 @@ # Bitwise Ops and_i 3 I I I +and_i_ic 3 I I i not_i 2 I I or_i 3 I I I +or_i_ic3 I I i +shl_i 3 I I I shl_i_ic 3I I i +shr_i 3 I I I shr_i_ic 3I I i xor_i 3 I I I +xor_i_ic 3 I I i +
Yet another patch: assemble.pl - better opcode guessing
This does a better job at guessing the correct opcode: the constant is compared to a regex and determined which kind it is, instead of saying "its just some sort of constant". This fixes the guessing problems with my (print_ic print_sc print_nc) patch. Brian Index: assemble.pl === RCS file: /home/perlcvs/parrot/assemble.pl,v retrieving revision 1.12 diff -r1.12 assemble.pl 95,97d94 < } elsif(m/^\d+$/) { < # a constant of some sort < push @arg_t,'(ic|nc|sc)'; 99,100c96,109 < # a label < push @arg_t,'ic'; --- > # a constant of some sort > if(m/^\[(\d+)\]$/) { > # string > push @arg_t,'sc'; > } elsif(m/^((-?\d+)|(0b[01]+)|(0x[0-9a-f]+))$/i) { > # integer > push @arg_t,'ic'; > } elsif(m/^[a-z][\w]*$/i) { > # label > push @arg_t,'ic'; > } else { > # numeric > push @arg_t,'nc'; > } 109a119 > print STDERR "test: $test\n"; 112c122 < if($op=~/$test/) { --- > if($op eq $test) { 121,122c131 < error("Wrong arg count--got ".scalar(@args)." needed < ".$opcodes{$opcode}{ARGS}); --- > error("Wrong arg count--got ".scalar(@args)." needed >".$opcodes{$opcode}{ARGS}); 131,132c140 < if($rtype eq "I" || $rtype eq "N" || $rtype eq "P" || $rtype eq < "S") { --- > if($rtype eq "I" || $rtype eq "N" || $rtype eq "P" || $rtype eq "S") { 145a154,156 > } elsif($rtype eq 's') { > $args[$_]=~s/[\[\]]//g; > $pc+=$sizeof{$rtype}; 220,222c231,238 < return $constants{$s} if exists $constants{$s}; < push @constants, $s; < return $constants{$s} = $#constants; --- > # handle \ characters in the constant > my %escape = ('a'=>"\a",'n'=>"\n",'r'=>"\r",'t'=>"\t",'\\'=>'\\',); > $s=~s/\\([anrt\\])/$escape{$1}/g; > if(!exists($constants{$s})) { > push(@constants,$s); > $constants{$s}=$#constants; > } > return "[".$constants{$s}."]"; 223a240 >
patch: print op cleanups, new ops, assembler support of \(whatever)
This patch gives the assembler support of '\a','\n','\r','\t', and '\\' in string constants. In addition, it changes (for all registers) "I reg %li is ..." to just the value of the register. Printing constants is also supported, but alas, you have to specify the type (print_sc, print_ic, print_nc). Brian ? pasm.pl ? patch ? test2.pbc ? test3.pbc ? euclid0.pbc ? euclid1.pbc ? euclid.pbc ? print_cleanups.diff ? bitops.pbc ? bitops+assembler.patch ? t/bitops.pasm Index: assemble.pl === RCS file: /home/perlcvs/parrot/assemble.pl,v retrieving revision 1.12 diff -r1.12 assemble.pl 112c112 < if($op=~/$test/) { --- > if($op=~/^$test$/) { 121,122c121 < error("Wrong arg count--got ".scalar(@args)." needed < ".$opcodes{$opcode}{ARGS}); --- > error("Wrong arg count--got ".scalar(@args)." needed >".$opcodes{$opcode}{ARGS}); 131,132c130 < if($rtype eq "I" || $rtype eq "N" || $rtype eq "P" || $rtype eq < "S") { --- > if($rtype eq "I" || $rtype eq "N" || $rtype eq "P" || $rtype eq "S") { 219a218,220 > # handle \ characters in the constant > my %escape = ('a'=>"\a",'n'=>"\n",'r'=>"\r",'t'=>"\t",'\\'=>'\\',); > $s=~s/\\([anrt\\])/$escape{$1}/g; 223a225 > Index: basic_opcodes.ops === RCS file: /home/perlcvs/parrot/basic_opcodes.ops,v retrieving revision 1.13 diff -r1.13 basic_opcodes.ops 120c120 < printf("I reg %li is %li\n", P1, INT_REG(P1)); --- > printf("%li", INT_REG(P1)); 121a122,127 > > /* PRINT ic */ > AUTO_OP print_ic { > printf("%li", P1); > } > 212c218 < printf("N reg %li is %f\n", P1, NUM_REG(P1)); --- > printf("%f", NUM_REG(P1)); 214a221,225 > /* PRINT nc */ > AUTO_OP print_nc { > printf("%f", P1); > } > 317c328,334bufstart); --- > printf("%.*s",(int)string_length(s),(char *) s->bufstart); > } > > /* PRINT sc */ > AUTO_OP print_sc { > STRING *s = Parrot_string_constants[P1]; > printf("%.*s",(int)string_length(s),(char *) s->bufstart); 318a336 > Index: opcode_table === RCS file: /home/perlcvs/parrot/opcode_table,v retrieving revision 1.12 diff -r1.12 opcode_table 56a57 > print_sc 1 s 86a88 > print_ic 1 i 88a91 > print_nc 1 n
Re: patch: assemble.pl choosing wrong op sometimes...
On Thu, 2001-09-13 at 09:52, Jarkko Hietaniemi wrote: > On Thu, Sep 13, 2001 at 09:54:35AM -0500, Brian Wheeler wrote: > > I caught it trying to use inc_i_ic instead of inc_i in a test program I > > was running. this patch fixes it. > > > > Brian > > > > > > Index: assemble.pl > > === > > RCS file: /home/perlcvs/parrot/assemble.pl,v > > retrieving revision 1.12 > > diff -r1.12 assemble.pl > > 112c112 > > < if($op=~/$test/) { > > --- > > > if($op=~/^$test$/) { > > Unless $test is really a pattern here this is $op eq $test... > It is a pattern. If you have a constant, it doesn't know what kind, so the pattern it creates is "(ic|nc|sc)" Brian
patch: assemble.pl choosing wrong op sometimes...
I caught it trying to use inc_i_ic instead of inc_i in a test program I was running. this patch fixes it. Brian Index: assemble.pl === RCS file: /home/perlcvs/parrot/assemble.pl,v retrieving revision 1.12 diff -r1.12 assemble.pl 112c112 < if($op=~/$test/) { --- > if($op=~/^$test$/) {
Re: pasm.pl: a different parrot assembler
On Thu, 2001-09-13 at 02:23, Simon Cozens wrote: > On Wed, Sep 12, 2001 at 11:23:27PM -0500, Brian Wheeler wrote: > > I've been having tons of problems with labels in the current > > assembler...so I wrote my own. It should provide all of the features > > that the current assembler has. I'ved tested and ran all of the current > > t/*.pasm files. > > > > Here it is...feedback is always welcome. > > Cool. Now, if it only handled abbreviated op names, I'd apply it... :) > > Simon You, sir, are a very picky man. :) Here it is...I tested it on this euclid.pasm: MAIN: set I1, 96 set I2, 64 set I3, 0 set S1, "Algorithm E (Euclid's algorithm)" print S1 E1: mod I4, I1, I2 E2: eq I4, I3, DONE, E3 E3: set I1, I2 set I2, I4 branch E1 DONE: print I2 end Enjoy! Brian #! /usr/bin/perl -w # # pasm.pl - take a parrot assembly file and spit out a bytecode file # This is based heavily on assemble.pl # Brian Wheeler ([EMAIL PROTECTED]) use strict; my $opt_c; if (@ARGV and $ARGV[0] eq "-c") { shift @ARGV; $opt_c = 1; } # define data types my(%pack_type)=('i'=>'l','n'=>'d'); my(%real_type)=('I'=>'i','i'=>'i', 'N'=>'i','n'=>'n', 'S'=>'i','s'=>'i', 'D'=>'i'); # compute sizes my(%sizeof); foreach (keys(%real_type)) { $sizeof{$_}=length(pack($pack_type{$real_type{$_}},0)); } # get opcodes from guts. open GUTS, "interp_guts.h"; my %opcodes; while () { next unless /\tx\[(\d+)\] = ([a-z_]+);/; $opcodes{$2}{CODE} = $1; } close GUTS; # get opcodes and their arg lists open OPCODES, ") { next if /^\s*#/; chomp; s/^\s+//; next unless $_; my ($name, $args, @types) = split /\s+/, $_; my @rtypes=@types; @types=map { $_ = $real_type{$_}} @types; $opcodes{$name}{ARGS} = $args; $opcodes{$name}{TYPES} = [@types]; $opcodes{$name}{RTYPES}=[@rtypes]; } close OPCODES; # read source and assemble my $pc=0; my $op_pc=0; my ($bytecode,%label,%fixup,%constants,@constants); my $line=0; while(<>) { $line++; chomp; s/^\s*//; s/\s*$//; next if(/^\#/ || $_ eq ""); if(m/^((\S+):)?\s*(.+)?/) { my($label,$code)=($2,$3); if(defined($label) && $label ne "") { if(exists($label{$label})) { error("'$label' already defined!"); } if(exists($fixup{$label})) { # backpatch everything with this PC. while(scalar(@{$fixup{$label}})) { my $op_pc=shift(@{$fixup{$label}}); my $offset=shift(@{$fixup{$label}}); substr($bytecode,$offset,4)=pack('l',($pc-$op_pc)/4); } delete($fixup{$label}); } $label{$label}=$pc; # store it. } next if(!defined($code)); 1 while $code=~s/\"([^\\\"]*(?:\\.[^\\\"]*)*)\"/constantize($1)/eg; $code=~s/,/ /g; my($opcode,@args)=split(/\s+/,$code); $opcode=lc($opcode); if (!exists $opcodes{$opcode}) { # try to determine _real_ opcode. my @arg_t=(); foreach (@args) { if(m/^([INPS])\d+$/) { # a register. push @arg_t,lc($1); } elsif(m/^\d+$/) { # a constant of some sort push @arg_t,'(ic|nc|sc)'; } else { # a label push @arg_t,'ic'; } } my $test; my($first,$last)=($arg_t[0],$arg_t[-1]); if($first ne $last) { $test="$opcode\_$first\_$last"; } else { $test="$opcode\_$first"; } my($found_op)=0; foreach my $op (grep($_=~/^$opcode/,keys(%opcodes))) { if($op=~/$test/) { $opcode=$op; $found_op=1; last; } } error("No opcode $opcode in <$_>") if(!$found_op); } if (@args != $opcodes{$opcode}{ARGS}) { error("Wrong arg count--got ".scalar(@args)." needed ".$opcodes{$opcode}{ARGS}); } $bytecode .= pack "l", $opcodes{$opcode}{CODE}; $op_pc=$pc; $pc+=4; foreach (0..$#args) { my($rtype)=$opcodes{$opcode}{RTYPES}[$_]; my($type)=$opcodes{$opcode}{TYPES}[$_]; if($rtype eq "I" || $rtype eq "N" || $rtype eq "P" || $rtype eq "S") { # its a register argument $args[$_]=~s/^[INPS](\d+)$/$1/i; $pc+=$sizeof{$rtype} } elsif($rtype eq "D") {
jsr_ic & ret support
This diff adds jsr_ic and ret to the interpreter. I don't know if my way of returning is legal, and I know there's probably issues with 64 bit machines, but it works...and that's the important part :) Right now it only has a depth of 32 and no bounds checking, but its enough to get started. Brian Index: basic_opcodes.ops === RCS file: /home/perlcvs/parrot/basic_opcodes.ops,v retrieving revision 1.8 diff -r1.8 basic_opcodes.ops 331a332,345 > > // JSR > MANUAL_OP jsr_ic { > interpreter->call_stack[interpreter->call_sp]=cur_opcode+2; > interpreter->call_sp++; > RETURN(P1); > } > > // RET > MANUAL_OP ret { > interpreter->call_sp--; > // probably not legal :) > return interpreter->call_stack[(interpreter->call_sp)]; > } Index: interpreter.h === RCS file: /home/perlcvs/parrot/interpreter.h,v retrieving revision 1.3 diff -r1.3 interpreter.h 31a32,34 > > unsigned int call_sp; > unsigned long call_stack[32]; Index: opcode_table === RCS file: /home/perlcvs/parrot/opcode_table,v retrieving revision 1.9 diff -r1.9 opcode_table 75a76,77 > jsr_ic1 D > ret 0
pasm.pl: a different parrot assembler
I've been having tons of problems with labels in the current assembler...so I wrote my own. It should provide all of the features that the current assembler has. I'ved tested and ran all of the current t/*.pasm files. Here it is...feedback is always welcome. Brian #! /usr/bin/perl -w # # pasm.pl - take a parrot assembly file and spit out a bytecode file # This is based heavily on assemble.pl # Brian Wheeler ([EMAIL PROTECTED]) use strict; my $opt_c; if (@ARGV and $ARGV[0] eq "-c") { shift @ARGV; $opt_c = 1; } # define data types my(%pack_type)=('i'=>'l','n'=>'d'); my(%real_type)=('I'=>'i','i'=>'i', 'N'=>'i','n'=>'n', 'S'=>'i','s'=>'i', 'D'=>'i'); # compute sizes my(%sizeof); foreach (keys(%real_type)) { $sizeof{$_}=length(pack($pack_type{$real_type{$_}},0)); } # get opcodes from guts. open GUTS, "interp_guts.h"; my %opcodes; while () { next unless /\tx\[(\d+)\] = ([a-z_]+);/; $opcodes{$2}{CODE} = $1; } close GUTS; # get opcodes and their arg lists open OPCODES, ") { next if /^\s*#/; chomp; s/^\s+//; next unless $_; my ($name, $args, @types) = split /\s+/, $_; my @rtypes=@types; @types=map { $_ = $real_type{$_}} @types; $opcodes{$name}{ARGS} = $args; $opcodes{$name}{TYPES} = [@types]; $opcodes{$name}{RTYPES}=[@rtypes]; } close OPCODES; # read source and assemble my $pc=0; my $op_pc=0; my ($bytecode,%label,%fixup,%constants,@constants); my $line=0; while(<>) { $line++; chomp; s/^\s*//; s/\s*$//; next if(/^\#/ || $_ eq ""); if(m/^((\S+):)?\s*(.+)?/) { my($label,$code)=($2,$3); if(defined($label) && $label ne "") { if(exists($label{$label})) { error("'$label' already defined!"); } if(exists($fixup{$label})) { # backpatch everything with this PC. while(scalar(@{$fixup{$label}})) { my $op_pc=shift(@{$fixup{$label}}); my $offset=shift(@{$fixup{$label}}); substr($bytecode,$offset,4)=pack('l',($pc-$op_pc)/4); } delete($fixup{$label}); } $label{$label}=$pc; # store it. } next if(!defined($code)); 1 while $code=~s/\"([^\\\"]*(?:\\.[^\\\"]*)*)\"/constantize($1)/eg; $code=~s/,/ /g; my($opcode,@args)=split(/\s+/,$code); if (!exists $opcodes{lc $opcode}) { error("No opcode $opcode in <$_>"); } if (@args != $opcodes{$opcode}{ARGS}) { error("Wrong arg count--got ".scalar(@args)." needed ".$opcodes{$opcode}{ARGS}); } $bytecode .= pack "l", $opcodes{$opcode}{CODE}; $op_pc=$pc; $pc+=4; foreach (0..$#args) { my($rtype)=$opcodes{$opcode}{RTYPES}[$_]; my($type)=$opcodes{$opcode}{TYPES}[$_]; if($rtype eq "I" || $rtype eq "N" || $rtype eq "P" || $rtype eq "S") { # its a register argument $args[$_]=~s/^[INPS](\d+)$/$1/i; $pc+=$sizeof{$rtype} } elsif($rtype eq "D") { # a destination if(!exists($label{$args[$_]})) { # we have not seen it yet...put it on the fixup list push(@{$fixup{$args[$_]}},$op_pc,$pc); $args[$_]=0x; } else { $args[$_]=($label{$args[$_]}-$op_pc)/4; } $pc+=$sizeof{$rtype}; } else { $args[$_]=oct($args[$_]) if($args[$_]=~/^0/); $pc+=$sizeof{$rtype}; } $bytecode .= pack $type, $args[$_]; } } } my $output; # build file in memory # MAGIC COOKIE $output=pack($pack_type{i},0x13155a1); # FIXUP if(keys(%fixup)) { print STDERR "SQUAWK! These symbols were referenced but not defined:\n"; foreach (sort(keys(%fixup))) { print STDERR "\t$_ at pc: "; foreach my $pc (@{$fixup{$_}}) { print STDERR sprintf("%08x ",$pc); } print STDERR "\n"; } exit; } else { # dump empty header $output.=pack($pack_type{i},0); } # CONSTANTS if(@constants) { my($const); # Then spit out how many constants there are, so we can allocate $const .= pack($pack_type{i}, scalar @constants); # Now emit each constant for (@constants) {
Resend: working patch for bitops
This seems to be working: * fixes for label-only lines in assembler * recognition of 0x, 0b, etc in constants * and, not, or, shl, shr, xor Enjoy! Brian Patch follows Index: assemble.pl === RCS file: /home/perlcvs/parrot/assemble.pl,v retrieving revision 1.8 diff -u -r1.8 assemble.pl --- assemble.pl 2001/09/12 09:54:46 1.8 +++ assemble.pl 2001/09/13 00:46:50 @@ -70,11 +70,12 @@ # Now assemble $pc = 0; my $line = 0; -while ($_ = shift @code) { +foreach (@code) { $line++; chomp; +next if(m/^\s*$/); # blank lines s/,/ /g; - + my ($opcode, @args) = split /\s+/, $_; if (!exists $opcodes{lc $opcode}) { @@ -93,6 +94,8 @@ } elsif($rtype eq "D") { # a destination $args[$_]=fixup($args[$_]); + } else { + $args[$_]=oct($args[$_]) if($args[$_]=~/^0/); } $output .= pack $type, $args[$_]; } Index: basic_opcodes.ops === RCS file: /home/perlcvs/parrot/basic_opcodes.ops,v retrieving revision 1.8 diff -u -r1.8 basic_opcodes.ops --- basic_opcodes.ops 2001/09/12 18:39:12 1.8 +++ basic_opcodes.ops 2001/09/13 00:46:51 @@ -329,3 +329,33 @@ // NOOP AUTO_OP noop { } + +// AND_i +AUTO_OP and_i { + INT_REG(P1) = INT_REG(P2) & INT_REG(P3); +} + +// NOT_i +AUTO_OP not_i { + INT_REG(P1) = ! INT_REG(P2); +} + +// OR_i +AUTO_OP or_i { + INT_REG(P1) = INT_REG(P2) | INT_REG(P3); +} + +// SHL_i_ic +AUTO_OP shl_i_ic { + INT_REG(P1) = INT_REG(P2) << P3; +} + +// SHR_i_ic +AUTO_OP shr_i_ic { + INT_REG(P1) = INT_REG(P2) >> P3; +} + +// XOR_i +AUTO_OP xor_i { + INT_REG(P1) = INT_REG(P2) ^ INT_REG(P3); +} Index: opcode_table === RCS file: /home/perlcvs/parrot/opcode_table,v retrieving revision 1.9 diff -u -r1.9 opcode_table --- opcode_table2001/09/12 18:39:12 1.9 +++ opcode_table2001/09/13 00:46:51 @@ -102,3 +102,9 @@ clear_n0 clear_p0 +and_i 3 I I I +not_i 2 I I +or_i 3 I I I +shl_i_ic 3 I I i +shr_i_ic 3 I I i +xor_i 3 I I I
patches to assembler & opcode_table
The assembler patches: * handle blank lines containing a label * handle constants in decimal,octal, or hex. Opcode table patch (and basic_opcodes.ops): * adds and, or, not, xor, shl, and shr. Builds ok, but coredumps in the interpreter. Any hints on what I did wrong welcome! Attached is also a test program for the new ops (which coredumps). Brian ? op.h ? config.h ? patch ? interp_guts.h ? basic_opcodes.c ? test_prog ? test2.pbc ? test3.pbc ? euclid.pbc ? bitops.pbc ? bitops+assembler.patch ? t/bitops.pasm Index: assemble.pl === RCS file: /home/perlcvs/parrot/assemble.pl,v retrieving revision 1.8 diff -u -r1.8 assemble.pl --- assemble.pl 2001/09/12 09:54:46 1.8 +++ assemble.pl 2001/09/12 20:24:07 @@ -62,19 +62,17 @@ push @code, $_; $pc += 1+@args; } - emit_magic(); emit_fixup_section(); emit_constants_section(); - # Now assemble $pc = 0; my $line = 0; -while ($_ = shift @code) { +foreach (@code) { $line++; chomp; s/,/ /g; - +next if(/^\s*$/); my ($opcode, @args) = split /\s+/, $_; if (!exists $opcodes{lc $opcode}) { @@ -93,6 +91,8 @@ } elsif($rtype eq "D") { # a destination $args[$_]=fixup($args[$_]); + } else { + $args[$_]=oct($args[$_]) if($args[$_]=~/^0/); } $output .= pack $type, $args[$_]; } Index: basic_opcodes.ops === RCS file: /home/perlcvs/parrot/basic_opcodes.ops,v retrieving revision 1.8 diff -u -r1.8 basic_opcodes.ops --- basic_opcodes.ops 2001/09/12 18:39:12 1.8 +++ basic_opcodes.ops 2001/09/12 20:24:07 @@ -329,3 +329,34 @@ // NOOP AUTO_OP noop { } + +// AND_I +AUTO_OP and_i { + INT_REG(P1)=INT_REG(P2) & INT_REG(P3); +} + +// NOT_I +AUTO_OP not_i { + INT_REG(P1)=!INT_REG(P2); +} + +// OR_I +AUTO_OP or_i { + INT_REG(P1)=INT_REG(P2) | INT_REG(P3); +} + +// SHL_I +AUTO_OP shl_i_ic { + INT_REG(P1)=INT_REG(P2) << P3; +} + + +// SHR_I +AUTO_OP shr_i_ic { + INT_REG(P1)=INT_REG(P2) >> P3; +} + +// XOR +AUTO_OP xor_i { + INT_REG(P1)=INT_REG(P2) ^ INT_REG(P3); +} Index: opcode_table === RCS file: /home/perlcvs/parrot/opcode_table,v retrieving revision 1.9 diff -u -r1.9 opcode_table --- opcode_table2001/09/12 18:39:12 1.9 +++ opcode_table2001/09/12 20:24:07 @@ -102,3 +102,10 @@ clear_n0 clear_p0 +# Bitops +and_i 3 I I I +not_i 2 I I +or_i 3 I I I +shl_i_ic 3 I I i +shr_i_ic 3 I I i +xor_i 3 I I I # bitops.pasm : test bitops. # and, not, or, shl, shr, xor # Brian Wheeler ([EMAIL PROTECTED]) MAIN: set_i_icI1,0b set_i_icI2,0b set_i_icI3,0b10101010 set_i_icI4,0b01010101 set_i_icI5,0 set_s_scS1,"OK" set_s_scS2,"NG" set_s_scS3,"Test Number" # test 'and' set_s_scS5,"AND Test" print_s S5 AND_T1: inc_i I5 print_s S3 print_i I5 and_i I6,I1,I2 print_i I6 eq_i_ic I6,I2,A1_OK,A1_NG A1_OK: print_s S1 branch_ic AND_T2 A1_NG: print_s S2 branch_ic AND_T2 AND_T2: inc_i I5 print_s S3 print_i I5 and_i I6,I1,I3 eq_i_ic I6,I3,A2_OK,A3_NG A2_OK: print_s S1 branch_ic AND_T3 A3_NG: print_s S2 branch_ic AND_T3 AND_T3: inc_i I5 print_s S3 print_i I5 and_i I6,I3,I4 eq_i_ic I6,I2,A3_OK,A3_NG A3_OK: print_s S1 branch_ic OR_T A3_NG: print_s S2 branch_ic OR_T # test or OR_T: set_s_scS5,"OR Test" print_s S5 set_i_icI5,0 end
Another Patch...
This patch (which is pretty big) does: * Changes the opcode_table file to provide additional information about the operands. Case shouldn't be a problem since that data never becomes a C symbol [this is pretty much as before] * Padding errors solved: assemble.pl and bytecode.c were padding the constants incorrectly. It should have been 4-(size % 4), not just (size % 4). It is now fixed in both places. * assembler has less special cases, and should be easier to hang error checking on * disassembler dumps constant table and the format is a bit prettier, including register names, etc. Test2.pbc dumps as this: # Constants: 1 entries (32 bytes) # ID FlagsEncoding Type Size Data : 000b Hello World # Code Section : set_i_ic I2, 1 000c: set_i_ic I1, 0 0018: set_s_sc S1, [string ] 0024: eq_i_ic I1, I2, 0060, 0038 0038: length_s_i S1, I1 0044: print_s S1 004c: chopn_s_ic S1, 1 0058: branch_ic0024 0060: end Let me know what you guys think! Brian [Crap, there's some wordwrapping below. Too bad you can plug emacs into evolution :) ] Index: assemble.pl === RCS file: /home/perlcvs/parrot/assemble.pl,v retrieving revision 1.6 diff -u -r1.6 assemble.pl --- assemble.pl 2001/09/10 21:26:08 1.6 +++ assemble.pl 2001/09/11 03:14:32 @@ -9,7 +9,16 @@ my %pack_type; %pack_type = (i => 'l', n => 'd', - ); + ); + +my %real_type=('i'=>'i', + 'n'=>'n', + 'N'=>'i', + 'I'=>'i', + 'S'=>'i', + 's'=>'i', + 'D'=>'i'); + my $sizeof_packi = length(pack($pack_type{i},1024)); open GUTS, "interp_guts.h"; @@ -26,8 +35,11 @@ s/^\s+//; next unless $_; my ($name, $args, @types) = split /\s+/, $_; +my @rtypes=@types; +@types=map { $_ = $real_type{$_}} @types; $opcodes{$name}{ARGS} = $args; $opcodes{$name}{TYPES} = [@types]; +$opcodes{$name}{RTYPES}=[@rtypes]; } my $pc = 0; @@ -65,23 +77,17 @@ die "wrong arg count--got ". scalar @args. " needed " . $opcodes{$opcode}{ARGS}; } -$args[0] = fixup($args[0]) -if $opcode eq "branch_ic" and $args[0] =~ /[a-zA-Z]/; - -#if ($opcode eq "eq_i_ic" or $opcode eq "lt_i_ic") { -if ($opcode =~ /^(eq|ne|lt|le|gt|ge)_i_ic$/) { -$args[2] = fixup($args[2]) if $args[2] =~ /[a-zA-Z]/; -$args[3] = fixup($args[3]) if $args[3] =~ /[a-zA-Z]/; -} -if ($opcode eq "if_i_ic") { -$args[1] = fixup($args[1]) if $args[1] =~ /[a-zA-Z]/; -$args[2] = fixup($args[2]) if $args[2] =~ /[a-zA-Z]/; -} - print pack "l", $opcodes{$opcode}{CODE}; foreach (0..$#args) { - $args[$_] =~ s/^[INPS]?(\d+)$/$1/i; - my $type = $pack_type{$opcodes{$opcode}{TYPES}[$_]}; + my($rtype)=$opcodes{$opcode}{RTYPES}[$_]; + my($type)=$opcodes{$opcode}{TYPES}[$_]; + if($rtype eq "I" || $rtype eq "N" || $rtype eq "P" || $rtype eq "S") { + # its a register argument + $args[$_]=~s/^[INPS](\d+)$/$1/i; + } elsif($rtype eq "D") { + # a destination + $args[$_]=fixup($args[$_]); + } print pack $type, $args[$_]; } $pc += 1+@args; @@ -112,7 +118,10 @@ for (@constants) { $size += 4*$sizeof_packi; $size += length($_); -$size += length($_) % $sizeof_packi; # Padding + my($pad)=length($_) % $sizeof_packi; + if($pad) { + $size+=$sizeof_packi-$pad; + } } $size += $sizeof_packi if @constants; # That's for the number of constants @@ -127,6 +136,9 @@ print pack($pack_type{i},0) x 3; # Flags, encoding, type print pack($pack_type{i},length($_)); # Strlen followed by that many bytes. print $_; -print "\0" x (length($_) % $sizeof_packi); # Padding; + my $pad=(length($_) % $sizeof_packi); + if($pad) { + print "\0" x ($sizeof_packi-(length($_) % $sizeof_packi)); # Padding; + } } } Index: bytecode.c === RCS file: /home/perlcvs/parrot/bytecode.c,v retrieving revision 1.4 diff -u -r1.4 bytecode.c --- bytecode.c 2001/09/10 21:47:26 1.4 +++ bytecode.c 2001/09/11 03:14:33 @@ -79,6 +79,7 @@ IV encoding = GRAB_IV(program_code); IV type = GRAB_IV(program_code); IV buflen = GRAB_IV(program_code); + int pad; len -= 4 * sizeof(IV); @@ -87,9 +88,11 @@ len -= buflen; /* Padding */ -if (buflen % sizeof(IV)) { -len -= buflen % sizeof(IV); -(char*)*program_code += buflen % sizeof(IV); + pad=buflen % sizeof(IV); + if(pad) { + pad=sizeof(IV)-pad; + len -= pad; + (char*)*program_code += pad; }
Re: Patch to assembler/disassembler + parrot asm inconsistancies
On Mon, 2001-09-10 at 20:52, Dan Sugalski wrote: > At 07:25 PM 9/10/2001 -0400, Bryan C. Warnock wrote: > >I think Dan mentioned this, but it looks like the suffixes can be derived > >from the args being passed in. That would greatly simply the assembler to > >just the function names: set, eq, add, branch. > > > >Were there problems with the scheme, is someone working on it, or did it > >fall through the cracks? (I'm very much in favor of such a change, and will > >pick it up if no one else is working on it.) > > No, I dont' think so, and yes, respectively. (Or, rather, we did the easy > literal stuff first and planned on smartening up the assembler later. It's > in the TODO even... :) > > Hint, Nudge, Wink: the last patch I sent that hits on the assembler should make it very easy to add better assembly checking For example, register checking could be done like this: if($rtype eq "I" || $rtype eq "N" || $rtype eq "P" || $rtype eq "S") { # its a register argument if($args[$_]=~m/^[INPS](\d+)$/) { my($arg_num)=$1; if($arg_num > 32) { # bad register number } else { $args[$_]=$arg_num; } } else { # non-register being used! } } ... Brian
Re: Patch to assembler/disassembler + parrot asm
On Mon, 2001-09-10 at 19:54, Dan Sugalski wrote: > At 07:45 PM 9/10/2001 -0500, Brian Wheeler wrote: > >If eq_i_ic is really treated as /eq(_i)+_ic/ then this code still > >doesn't work: > > > >eq_i_ic I1,I2,NEXT,DONE > > > >because that'd be like eq_i_i_ic_ic, right? > > Right. But don't forget, I screwed up the eq op--it ought to have a single > destination. :) > DOH! That's the trick :) > >I assume that opcodes aren't going to have variable arguments at this level, > >so there should be a one-to-one mapping between function and opcode, right? > > Each opcode number has a single function, yes. The same "high-level" > opcode, for example eq or add, might map to two or more different 'real' > opcodes based on the types of the args. There won't be any runtime > morphing--it's more "The assembler sees the first arg of foo as a numberic > register and the second as a constant, so it must be foo_n_nc". > > >A thought (though gross): if we restrict mneumonics to not use the > >underscore, > >then anything after _ can be the op signature. > > Too gross. We don't need to go there. :) > Just checking. > >Also, doing it this way takes out the special cases for the comparison and > >jump ops: the fixups are known to be done with things that have type 'a' > > The jump ops will be easy to figure--either they'll take a register, a > constant number, or a label. We don't allow labels that could be confused > with registers. (No I0: anywhere...) > > I've had more thoughts about my first patch. The case issue isn't an issue since it only touches the generation tools, not the C code that's generated (at least, not directly). It also provides the additional information needed to let the assembler choose the correct opcode, and the disassembler to dump things nicely :) I've also fixed up the supporting tools. As a test case, I rebuilt test_prog, assembled test.parm, ran it and dissassembled it. Looks for for that one, at least :) Please consider this new patch. Brian Index: assemble.pl === RCS file: /home/perlcvs/parrot/assemble.pl,v retrieving revision 1.6 diff -u -r1.6 assemble.pl --- assemble.pl 2001/09/10 21:26:08 1.6 +++ assemble.pl 2001/09/11 02:02:15 @@ -9,7 +9,16 @@ my %pack_type; %pack_type = (i => 'l', n => 'd', - ); + ); + +my %real_type=('i'=>'i', + 'n'=>'n', + 'N'=>'i', + 'I'=>'i', + 'S'=>'i', + 's'=>'i', + 'D'=>'i'); + my $sizeof_packi = length(pack($pack_type{i},1024)); open GUTS, "interp_guts.h"; @@ -26,8 +35,11 @@ s/^\s+//; next unless $_; my ($name, $args, @types) = split /\s+/, $_; +my @rtypes=@types; +@types=map { $_ = $real_type{$_}} @types; $opcodes{$name}{ARGS} = $args; $opcodes{$name}{TYPES} = [@types]; +$opcodes{$name}{RTYPES}=[@rtypes]; } my $pc = 0; @@ -65,23 +77,17 @@ die "wrong arg count--got ". scalar @args. " needed " . $opcodes{$opcode}{ARGS}; } -$args[0] = fixup($args[0]) -if $opcode eq "branch_ic" and $args[0] =~ /[a-zA-Z]/; - -#if ($opcode eq "eq_i_ic" or $opcode eq "lt_i_ic") { -if ($opcode =~ /^(eq|ne|lt|le|gt|ge)_i_ic$/) { -$args[2] = fixup($args[2]) if $args[2] =~ /[a-zA-Z]/; -$args[3] = fixup($args[3]) if $args[3] =~ /[a-zA-Z]/; -} -if ($opcode eq "if_i_ic") { -$args[1] = fixup($args[1]) if $args[1] =~ /[a-zA-Z]/; -$args[2] = fixup($args[2]) if $args[2] =~ /[a-zA-Z]/; -} - print pack "l", $opcodes{$opcode}{CODE}; foreach (0..$#args) { - $args[$_] =~ s/^[INPS]?(\d+)$/$1/i; - my $type = $pack_type{$opcodes{$opcode}{TYPES}[$_]}; + my($rtype)=$opcodes{$opcode}{RTYPES}[$_]; + my($type)=$opcodes{$opcode}{TYPES}[$_]; + if($rtype eq "I" || $rtype eq "N" || $rtype eq "P" || $rtype eq "S") { + # its a register argument + $args[$_]=~s/^[INPS](\d+)$/$1/i; + } elsif($rtype eq "D") { + # a destination + $args[$_]=fixup($args[$_]); + } print pack $type, $args[$_]; } $pc += 1+@args; Index: disassemble.pl === RCS file: /home/perlcvs/parrot/disassemble.pl,v retrieving revision 1.3 diff -u -r1.3 disassemble.pl --- disassemble.pl 200
Re: Patch to assembler/disassembler + parrot asm
another thought... > > A thought (though gross): if we restrict mneumonics to not use the underscore, > then anything after _ can be the op signature. > > The opcode_table could use these characters for different data types: > integer i > integer constant j > numeric n > numeric constant o > address a > strings > string constant t > > The file could be reorganized as: > > set 2 i j > set 2 i i > set 2 n o > set 2 s t > what if the table had another column (optional) at the end: set 2 i j set_i which gave the name of the C function which implemented it. that way the assembly ops are independant of the C function names and multiple ops could map to a single C routine (if needed) I've got to know...what's the significance of the magic number? :) Brian
Re: Patch to assembler/disassembler + parrot asm
> > At 05:23 PM 9/10/2001 -0500, Brian Wheeler wrote: > >First off, here's an inconsistancy I found: In test.pasm > > > >REDO: eq_i_ic I2, I4, DONE, NEXT > > > >appears. Shouldn't this be comparing to a constant, not a register? > > Nope, though if I let you in on the actual secret it's help. > > That should really be eq_i_ic_ic. (Well, actually there should be only one > label, and we fall through otherwise. It's a bug in implementation and > assembly, not opcode name... :) The intention is the last _x covers the > last arg, the next to last covers the next to last arg, and so on. When we > run out, we repeat the innermost type. > > The ultimate intention is that you'd write that as a plain: > > eq I2, I4, DONE, NEXT > > or probably > > eq I2, I4, DONE > > and either way the assembler would know DONE was a constant and we needed i > registers since that was specified, and emit the eq_i_ic opcode. BUT, I'm more confused now :) If eq_i_ic is really treated as /eq(_i)+_ic/ then this code still doesn't work: eq_i_ic I1,I2,NEXT,DONE because that'd be like eq_i_i_ic_ic, right? I assume that opcodes aren't going to have variable arguments at this level, so there should be a one-to-one mapping between function and opcode, right? A thought (though gross): if we restrict mneumonics to not use the underscore, then anything after _ can be the op signature. The opcode_table could use these characters for different data types: integer i integer constantj numeric n numeric constanto address a string s string constant t The file could be reorganized as: set 2 i j set 2 i i set 2 n o set 2 s t The perl scripts which create the interfaces (process_opfunc.pl, etc) could use this information to create 4 opcodes: set_ij set_ii set_no set_st When the assembler comes across 'set I1,I2', It knows the set_ii form is the one to use. The disassembler can dump it as 'set_ii I1,I2' or (I suppose) as 'set I1,I2' Also, doing it this way takes out the special cases for the comparison and jump ops: the fixups are known to be done with things that have type 'a' > > >It > >became a little obvious when I made a few changes to the > >assembler/disassembler to give more details about the data (and to allow > >shortcuts like "add I1,I2,I3" to go to "add_i I1,I2,I3", etc) > > > >The opcode_table patch changes the argument encoding to use these terms: > ># i Integer constant > ># I Integer register > ># n Numeric constant > ># N Numeric register > ># s String constant? > ># S String register > ># D Destination > > I was using a trailing c to note a constant since we're using the opcode > name as a C function name, and we're not counting on case-sensitivity in > symbols. > fair enough... > Other than that (well, and Simon has a patch in to the repository to yank > out the opcode numbers entirely from opcode_table) it looks keen. > I'll take a peek and see what all depends on the opcode_table file... Brian > Dan > > --"it's like this"--- > Dan Sugalski even samurai > [EMAIL PROTECTED] have teddy bears and even > teddy bears get drunk >
Patch to assembler/disassembler + parrot asm inconsistancies
First off, here's an inconsistancy I found: In test.pasm REDO: eq_i_ic I2, I4, DONE, NEXT appears. Shouldn't this be comparing to a constant, not a register? It became a little obvious when I made a few changes to the assembler/disassembler to give more details about the data (and to allow shortcuts like "add I1,I2,I3" to go to "add_i I1,I2,I3", etc) There's 3 pieces: opcode_table, disassemble.pl and assemble.pl The opcode_table patch changes the argument encoding to use these terms: # i Integer constant # I Integer register # n Numeric constant # N Numeric register # s String constant? # S String register # D Destination The opcodes definitions were changed accordingly. Disassemble.pl takes the new definitions and prints things out a little prettier (test.pbc): time_i I1 0008 set_i_ic I2 0 0014 set_i_ic I3 1 0020 set_i_ic I4 1000 002c eq_i_ic I2 4 0058 0040 0040 add_i I2 I2 I3 0050 branch_ic 002c 0058 time_i I5 0060 print_i I1 0068 print_i I5 0070 print_i I2 0078 sub_i I2 I5 I1 0088 print_i I2 0090 set_i_ic I1 3 009c mul_i I4 I4 I1 00ac iton_n_i N1 I4 00b8 iton_n_i N2 I2 00c4 print_i I4 00cc print_n N1 00d4 print_i I2 00dc print_n N2 00e4 div_n N1 N1 N2 00f4 print_n N1 It also skips the magic number, and skips (but doesn't handle) the constant data. String registers aren't handled either...yet assemble.pl: this just contains workarounds to the new opcode_table format. Brian Lastly, here's the patch: Index: assemble.pl === RCS file: /home/perlcvs/parrot/assemble.pl,v retrieving revision 1.5 diff -u -r1.5 assemble.pl --- assemble.pl 2001/09/10 17:30:29 1.5 +++ assemble.pl 2001/09/10 22:04:29 @@ -10,6 +10,15 @@ %pack_type = (i => 'l', n => 'd', ); +my %real_type=('i'=>'i', + 'n'=>'n', + 'N'=>'i', + 'I'=>'i', + 'S'=>'i', + 's'=>'i', + 'D'=>'i'); + + my $sizeof_packi = length(pack($pack_type{i},1024)); @@ -20,6 +29,7 @@ s/^\s+//; next unless $_; my ($code, $name, $args, @types) = split /\s+/, $_; +@types=map { $_ = $real_type{$_}} @types; $opcodes{$name} = {CODE => $code, ARGS => $args, TYPES => [@types] Index: disassemble.pl === RCS file: /home/perlcvs/parrot/disassemble.pl,v retrieving revision 1.1 diff -u -r1.1 disassemble.pl --- disassemble.pl 2001/08/29 12:07:02 1.1 +++ disassemble.pl 2001/09/10 22:04:29 @@ -7,10 +7,20 @@ my %unpack_type; %unpack_type = (i => 'l', + I => 'l', n => 'd', + N => 'l', + D => 'l', + S => 'l', + s => 'l', ); my %unpack_size = (i => 4, n => 8, + I => 4, + N => 4, + D => 4, + S => 4, + s => 4, ); open OPCODES, "; +my $constants=<>; + $constants=<>; + +my $offset=0; while (<>) { $code = unpack 'l', $_; $args = $opcodes[$code]{ARGS}; -print $opcodes[$code]{NAME}; +my $op_offset=$offset; +print sprintf("%08x",$offset)," ",$opcodes[$code]{NAME}; +$offset+=4; + if ($args) { foreach (1..$args) { local $/ = \$unpack_size{$opcodes[$code]{TYPES}[$_-1]}; $data = <>; - print " ", unpack $unpack_type{$opcodes[$code]{TYPES}[$_-1]}, $data;+ $offset+=$ {$/ }; + if($opcodes[$code]{TYPES}[$_-1] eq "N" || + $opcodes[$code]{TYPES}[$_-1] eq "I") { + print " ",$opcodes[$code]{TYPES}[$_-1],unpack $unpack_type{$opcodes[$code]{TYPES}[$_-1]}, $data; + } elsif($opcodes[$code]{TYPES}[$_-1] eq "D") { + # handle destination + print " ",sprintf("%08x",$op_offset+unpack($unpack_type{$opcodes[$code]{TYPES}[$_-1]},$data)*4); + } else { + print " ", unpack $unpack_type{$opcodes[$code]{TYPES}[$_-1]}, $data; + } } } print "\n"; Index: opcode_table === RCS file: /home/perlcvs/parrot/opcode_table,v retrieving revision 1.5 diff -u -r1.5 opcode_table --- opcode_table2001/09/10 15:48:36 1.5 +++ opcode_table2001/09/10 22:04:30 @@ -9,61 +9,70 @@ # not the type of the register or anything. So N3 is still an i, since that # 3 specifying the register should be packed as an integer. +# Revised arg types: +# i Integer constant +# I Integer register +# n Numeric constant +# N Numeric register +# s String constant? +#
Re: Math functions? (Particularly transcendental ones)
On Mon, 2001-09-10 at 09:16, Bryan C. Warnock wrote: > On Monday 10 September 2001 10:28 am, Brian Wheeler wrote: > > > > I was thinking about NOP this morning, and I realized that it might very > > well be necessary. If someone was writing a "simple" assembler for > > parrot, it might be useful for padding. > > Pad what? > How about preserving offsets during an optimization phase: add i3,i1,1 add i3,i3,8 could become add i3,i1,9 nop without having to recompute offsets for later bytecode. In the same way, you could also use it for reserving space for things like debugging code, like adding 10 nops if debugging is turned off, and using those 10 instructions for debugging if it is turned onmaintaining the relative addresses of things. Of course, one could just recompile using the parrot assembler, so this would only be for those tinkering with their own assembler, I suppose. Honestly, I don't care either way, since add i0,i0,0 is the same (basically) as a nop, but takes a little more cpu. One could always #define nop add i0,i0,0 :) Brian
Re: Math functions? (Particularly transcendental ones)
On Mon, 2001-09-10 at 08:47, Dan Sugalski wrote: > At 08:07 PM 9/9/2001 -0400, Uri Guttman wrote: > > > "DS" == Dan Sugalski <[EMAIL PROTECTED]> writes: > > > > DS> Yeah, I can't think of a good reason for a noop. We might have one > > DS> anyway, though, just in case one comes along anyway. > > > >in a hardware cpu they were commonly used to fill an instruction slot to > >keep a pipeline filled, or to follow a branch decision, or to pad a long > >running op. > > Yup, I realize that. I wasn't sure that we might not have some sort of > in-memory opcode whiteout thing we need to do, in which case it'd be useful > and potentially faster than recalculating a bunch of jump addresses. > > > >> Here's a dumb question: will parrot allow bytecode which is stored in a > > >> perl scalar to be executed? > > > > DS> Yup, in a restricted sandbox too, if you want. That way we'll be > > DS> able to serialize code to bytestreams, spit them across the 'net, > > DS> and execute them on the other end. > > > >will the op code table need to be sent over if it is code from a module > >which defines new op codes? > > Basically we'll build a small "freeze to disk" section and send it over the > wire instead of freezing to disk. It'll have all the standard stuff--fixup > section, constants section, and code. > I was thinking about NOP this morning, and I realized that it might very well be necessary. If someone was writing a "simple" assembler for parrot, it might be useful for padding. Brian > > Dan > > --"it's like this"--- > Dan Sugalski even samurai > [EMAIL PROTECTED] have teddy bears and even > teddy bears get drunk
Re: Math functions? (Particularly transcendental ones)
On Sat, 2001-09-08 at 22:24, Uri Guttman wrote: > >>>>> "BW" == Brian Wheeler <[EMAIL PROTECTED]> writes: > > > BW> =item and tx, ty, tz * > > BW> Bitwise And all bits in y with z and store the result in register x. > BW> (x = y & z) > > just a minor thought on parrot assembler argument order. dan seems to > have picked the result register to be first. my experience with the > pdp-11 left me with the result on the right. i think the vax (dan's > favorite) followed dec's tradition and kept the result on the right. is > this ordering final? > gcc output (x86 linux, vax linux): destination on right. NASM, MASM, TASM (x86): destination on left cc/list/mach (vax/vms): destination on right gcc output (alpha linux): destination on left Looking at the opcodes as presented in the PDD, they're hauntingly like the alpha codes (maybe Dan's favorite isn't the vax, but the alpha :) If I hadn't let my hobbyist licenses expire on my vms/alpha machine, I'd check it too... > BW> =item eqv tx, ty, tz * > > BW> Bitwise Equivalence all bits in y with z and store the result in > BW> register x. > BW> [do we really need this one? BASIC has it, but I don't know if I've > BW> ever found a use for it. Its (p&q)|(!p&!q), I think] > > that is just !(y xor z). we can provide the op but perl as we know it > has no operator that will map to it. maybe another front end would want > it so let's have it. it is trivial to code up and won't take any space. > > BW> =item imp tx, ty, tz * > > BW> Bitwise Implication (of) all bits in y with z and store the result in > BW> register x. [another questionable one. Again, its a BASIC thing. The > BW> boolean expression for it is (!p&!q)|q ] > > wow, that is the old boolean 'if' (if p then q) operator. i have never > seen that needed. let the coder do it themselves at the language level. > I agree with not including these two, but they were ones that I remembered having in BASIC from many moons ago. > BW> =item rol tx, ty, tz * > > BW> Roll y left z bits and store the result in x. > BW> [what are the valid values for z?] > > isn't that rotate left? rotate should require z to be the word size or > less. or we can define it to work modulo the word size. which reminds > me, is there going to be a simple language way to get the word size > (other than %Config)? Yep, rotate. I really need to wake up before I post email. Well, what if a PMC or Numeric registers are passed to rol, rol, shr, shl? What would be the size then? We could disallow them since they really don't make sense...or do they? Is the VM going go have a fixed size register set regardless of platform? If not, then the shift will behave differently on each platform the bytecode is run...and that's bad, especially if someone (cleverly) decides that a "SHL I0,I0,32" clears the register. > > BW> =item shl tx, ty, tz * > > BW> Shift y left z bits and store the result in x. (x = y << z) > BW> [what are the valid values of z?] > > is 0 shifted in from the right? > I would assume so, since that's the usual behavior. > > why the abbreviated op names? we don't want another creat(2) > controversy. :) i don't think we need to limit our op code names to any > length. old assemblers did so because of the odd speed up tricks. since > the parrot assembler will rarely be used directly and it will be written > in perl. shift_l is clearer. but then, i and many others learned short > assembler op names without trouble. we should have some consensus on op > name styles. underscores? longer names encouraged? etc. > Honestly I pulled the mneumonics from the x86 set, though the ROL/ROR ops are the same as m68k. > > BW> =item shr tx, ty ,tz * > > BW> Shift y right z bits and store the result in y. (x = y >> z) > BW> [what are the valid values of z?] > > arithemetic or logical shift? does the sign bit propogate down? > Beats me. I was just thinking of the args as n-bit unsigned numbers. > BW> =item xor tx, ty, tz * > > BW> Bitwise Exclusive-Or of all bits in y with z and store the result in > BW> register x. (x = y ^ z) > > should be near eqv as it is just its negation. > I just put them in alphabetical order, as that was the easiest :) Of course, if its name becomes "EOR" (as in m68k & 6502) then alphabetically it is right there :) Brian > uri >
RE: Math functions? (Particularly transcendental ones)
On Sat, 2001-09-08 at 21:43, Wizard wrote: > Questions regarding Bitwise operators: > > > =item rol tx, ty, tz * > ... > > =item ror tx, ty, tz * > > Are these with or without carry? > If not, is there a need for a RCL/RCR (with carry...and carry where)? > I'd think without, since I've not seen any references to a carry flag in the VM docs (I may have missed it though). > And what about a negate operator (neg)? > Wouldn't that just be "mul tx,ty,tz" where tz (heck, even ty) is -1? Out of curiosity, will there be a NOP instruction? I guess we really wouldn't need one, since things like ADD I0,I0,0 is effectively a NOP and the Dan has indicated he wanted to keep the bytecode read-only, so there'd be no need for it after optimization (after the bytecode has been fixed to the disk or in memory). Here's a dumb question: will parrot allow bytecode which is stored in a perl scalar to be executed? http://www.allegedlyfunny.com/opcodes.html I think DWIM might be a bit much, but HCF (Halt, Catch Fire) might be fun :) Brian
Re: Math functions? (Particularly transcendental ones)
On Sat, 2001-09-08 at 15:28, Dan Sugalski wrote: > At 03:14 PM 9/8/2001 -0500, Brian Wheeler wrote: > >On Sat, 2001-09-08 at 11:00, Dan Sugalski wrote: > > > Okay, I'm whipping together the "fancy math" section of the interpreter > > > assembly language. I've got: > > > > > > sin, cos, tan : Plain ones > > > asin, acos, atan : arc-whatevers > > > shinh, cosh, tanh : Hyperbolic whatevers > > > log2, log10, log : Base 2, base 10, and explicit base logarithms > > > pow : Raise x to the y power > > > > > > Can anyone think of things I've forgotten? It's been a while since I've > > > done numeric work. > > > > > > Dan > > > > > > >While not math, per se, there are bitops (and, or, not, xor, eqv) and > >shifts (though they can be simulated by "mul tx,ty,(2^bits)" and "div > >tx,ty,(2^bits)") > > Those are next. I'm getting into the math bits first. > > If you want to make a list of bit operations, though, that'd be fine. > > >I doubt rolls would be useful :) > > You never know--someone might want to build an RC5 encoder/decoder in perl. > Rolls are useful there. > > >Are there going to be string ops as well, or would add and mul work on > >string registers? > > String ops are separate. I don't want to overload things too badly until I > really have to. > > Dan > > --"it's like this"--- > Dan Sugalski even samurai > [EMAIL PROTECTED] have teddy bears and even > teddy bears get drunk Here's a list of bitops I came up with. I don't know how useful EQV and IMP are, but I included them anyway. Brian =head2 Bitwise Operations These ops handle manipulating bitwise data in registers. All registers must be of the same type, PMC, integer, or number. =over 4 =item and tx, ty, tz * Bitwise And all bits in y with z and store the result in register x. (x = y & z) =item eqv tx, ty, tz * Bitwise Equivalence all bits in y with z and store the result in register x. [do we really need this one? BASIC has it, but I don't know if I've ever found a use for it. Its (p&q)|(!p&!q), I think] =item imp tx, ty, tz * Bitwise Implication (of) all bits in y with z and store the result in register x. [another questionable one. Again, its a BASIC thing. The boolean expression for it is (!p&!q)|q ] =item not tx, ty * Bitwise Not all bits in y and store the result in x. (x = !y) =item or tx, ty, tz * Bitwise Or all bits in y with z and store the result in register x. (x = y | z) =item rol tx, ty, tz * Roll y left z bits and store the result in x. [what are the valid values for z?] =item ror tx, ty, tz * Roll y right z bits and store the result in x. [what are the valid values for z?] =item shl tx, ty, tz * Shift y left z bits and store the result in x. (x = y << z) [what are the valid values of z?] =item shr tx, ty ,tz * Shift y right z bits and store the result in y. (x = y >> z) [what are the valid values of z?] =item xor tx, ty, tz * Bitwise Exclusive-Or of all bits in y with z and store the result in register x. (x = y ^ z)
Re: Math functions? (Particularly transcendental ones)
On Sat, 2001-09-08 at 11:00, Dan Sugalski wrote: > Okay, I'm whipping together the "fancy math" section of the interpreter > assembly language. I've got: > > sin, cos, tan : Plain ones > asin, acos, atan : arc-whatevers > shinh, cosh, tanh : Hyperbolic whatevers > log2, log10, log : Base 2, base 10, and explicit base logarithms > pow : Raise x to the y power > > Can anyone think of things I've forgotten? It's been a while since I've > done numeric work. > > Dan > While not math, per se, there are bitops (and, or, not, xor, eqv) and shifts (though they can be simulated by "mul tx,ty,(2^bits)" and "div tx,ty,(2^bits)") I doubt rolls would be useful :) Are there going to be string ops as well, or would add and mul work on string registers? Brian
parrot question
While waiting for Parrot (dammit, I took the wrong week off), I've been scanning the various documents and samples which have been floating around on the list. Is there a document describing Parrot syntax yet? Or is that a "will be released on monday" thing as well? Brian Wheeler [EMAIL PROTECTED]