Re: archive search?
On Jan-11, Peter Christopher wrote:
> 
> Last I checked (which was moments ago) it was either (a)
> impossible to, or (b) I couldn't figure out how to, search the parrot
> mailing list archive. I.e. I can't search
> www.nntp.perl.org/group/perl.perl6.internals
> for key words. Could someone give me the heads up on how to search
> this list? If the search can't be done, is there a way that I can
> download the archive en masse, so that I can grep to my heart's
> delight?

Use Google Groups search. Specifically: go to groups.google.com. To
search for "aardvark", enter

    group:perl.perl6.internals aardvark
Re: Parrot & Strong typing
On Dec-01, Dan Sugalski wrote:
> 
> C, for example, is weakly typed. That is, while you tell the system
> that a variable is one thing or another (an int, or a float), you're
> perfectly welcome to treat it as another type. This is *especially*
> true of values you get to via pointers. For example, this snippet
> (and yes, it's a bit more explicit than it needs to be. Cope, you
> pedants :):
> 
>     char foo[4] = "abcd";
>     printf("%i", *(int *)&foo[0]);
> 
> tells the C compiler that foo is a 4 character string with the value
> "abcd", but in the next statement we get a pointer to the start of
> the string, tell the compiler "No, really, this is a pointer to an
> int. Really!" and then dereference it as if the string "abcd" really
> *was* an integer. If C were strongly typed you couldn't do that.

A wholly off-topic comment about how to make use of this: when running
gdb on a C (or better, STL-happy C++) program, it's nice to be able to
set conditional breakpoints:

    b myfile.c:328
    cond 1 somevar == 17

but gdb can often get confused or very slow if you try to do something
similar with a char* value:

    cond 1 strcmp(mystring, "badness") == 0

It goes crazy making a function call every time that breakpoint is
reached. I can get gdb to segfault this way without too much trouble.
So instead, use this trick:

    cond 1 *(int*)mystring == *(int*)"badness"

and it'll go back to doing a simple integer comparison. And it's very
fast about that. Note that because you're using a probably 32-bit
integer, that isn't really looking at the whole string; it'll have
exactly the same effect if you say

    cond 1 *(int*)mystring == *(int*)"badn"

I often use this in combination with std::basic_string types in C++,
since the templatized types and other implementation-dependent
weirdnesses end up making things much harder than if you were using
simple char*'s.
So it would look something like:

    cond 1 *(int*)mystring.data() == *(int*)"badn"

Or maybe it's slightly safer to do this, I dunno:

    cond 1 *(int*)mystring.c_str() == *(int*)"badn"

Sorry for the diversion. Um... if I had to say something on-topic, I'd
point out that Perl's type system isn't complete ("not THAT strong"),
since there are some corners in the language where you can sneak
around it: pack("p"), some system calls, and other things I can't
think of. But maybe the very oddness of those things is evidence that
Perl does indeed have a strong type system. ("Strong" in the
technical, not comparative, sense.) Not that anyone seems to be able
to agree on the exact definition of "strong typing".
Re: [perl #31208] dynclasses/README's instructions fail on OS X
On Nov-10, Will Coleda via RT wrote:
> This is now obsolete, neh?
> 
> > > I hack round this with
> > >
> > >     $ cp dynclasses/foo.dump .
> >
> > Alternatively, change line 609 of pmc2c2.pl to read
> >
> >     unshift @include, ".", "$FindBin::Bin/..", $FindBin::Bin;
> >
> > adding "." to the search path

I believe so. I never used either of those workarounds when I was
messing with that stuff, so at least on the configurations I was using
(i.e., your machine and mine), it wasn't needed.
Re: [perl #32137] stack walking failing to detect pointer in local variable on x86 Linux
This doesn't address the deeper problem, but we could also simplify
the whole function by just doing:

    static size_t find_common_mask(size_t val1, size_t val2)
    {
        size_t mask = ~0;
        size_t diff = val1 ^ val2;

        while (diff & mask)
            mask <<= 1;

        return mask;
    }

Bit twiddling is such fun. And error prone. So I won't commit this;
I'll just attach the patch.

Index: src/dod.c
===
RCS file: /cvs/public/parrot/src/dod.c,v
retrieving revision 1.138
diff -u -r1.138 dod.c
--- src/dod.c	26 Oct 2004 15:01:29 -	1.138
+++ src/dod.c	26 Oct 2004 19:11:48 -
@@ -921,27 +921,13 @@
 static size_t
 find_common_mask(size_t val1, size_t val2)
 {
-    int i;
-    int bound = sizeof(size_t) * 8;
+    size_t mask = ~0;
+    size_t diff = val1 ^ val2;
 
-    /* Shifting a value by its size (in bits) or larger is undefined behaviour.
-       so need an explict check to return 0 if there is no prefix, rather than
-       attempting to rely on (say) 0x << 32 being 0 */
-    for (i = 0; i < bound; i++) {
-        if (val1 == val2) {
-            return ~(size_t)0 << i;
-        }
-        val1 >>= 1;
-        val2 >>= 1;
-    }
-    if (val1 == val2) {
-        assert(i == bound);
-        return 0;
-    }
+    while (diff & mask)
+        mask <<= 1;
 
-    internal_exception(INTERP_ERROR,
-        "Unexpected condition in find_common_mask()!\n");
-    return 0;
+    return mask;
 }
 
 /*
Re: Cross-compiling Parrot
On Oct-17, Dan Sugalski wrote:
> At 9:49 AM -0400 10/17/04, Jacques Mony wrote:
> >Hello,
> >
> >I'm trying to port parrot to the unununium operating system, which
> >uses a modified version of 'diet lib c'. Can anyone tell me if this
> >is actually possible to force the use of this library using the
> >current Configure.pl script or if I will need to change it a lot...
> >or even replace it with my own?
> 
> There's a pretty good bet you're going to have to alter the configure
> script quite a bit, but it shouldn't require a full rewrite. Teaching
> it to read from a pre-done configuration data file would be a good
> place to start, which'd let us feed in the cross-compilation
> settings. (And we could then leverage for the upgrade settings too)

It's not exactly that, but you can set pretty much anything you want
in a config/init/hints/local.pl file.
Re: Python and Perl interop
Perl5 has the notion of contexts, where an expression may behave very differently in string, boolean, or list context. Perl6 intends to expand that notion. What if the whole context notion were moved down into Parrot? Every function call and every MMD dispatch could have an additional context parameter, and 'add' could then behave very differently in Pythonic context as opposed to Perl Boolean context? I guess I'm not actually saying anything here, other than suggesting the possibility of unifying this problem with Perl's context problem.
ICU causing make to fail w/o --prefix=`pwd`
If I just do

    perl Configure.pl
    make

right now, it builds the parrot executable ok but then fails when it
tries to compile the library .imc files. It's looking for the icu data
dir in $(prefix)/blib/lib/2.6.1. It works if I do

    perl Configure.pl --prefix=$(pwd)
    make

or set PARROT_ICU_DATA_DIR, but this seems like an unfriendly default
for developers. I have a similar problem with the search path for
loadable modules.

I think I probably broke this, btw, when I repaired 'make install'. I
had previously bandaged over the problem by defaulting ${prefix} to
the top-level directory. But I'm not sure how to fix it.
Re: dynamically loadable modules
On Oct-08, Andy Dougherty wrote:
> 
> Sorry -- offhand I don't have any sense of any "standard" names, and
> I won't have time till late next week to look at it at all. The most
> important thing is to *DOCUMENT CAREFULLY* exactly what the names are
> and what they mean.
> 
> Whatever names you add, please list them in config/init/data.pl along
> with a nice good long verbose description of exactly what they are.
> See the existing entries for cc, link, and ld for some good examples.
> See ld_shared and ld_shared_flags for some bad examples.

Well, my documentation isn't very verbose, but it's there. I suppose I
should expand it a little bit.

As for ld_shared and ld_shared_flags -- I really couldn't figure out
what the intention was, because the implications of the names didn't
match how they were actually being used. I guessed that ld_shared
would be the command used to link shared libraries, and ld would be
the command to link executables. But they weren't. So I deleted both
and replaced them with a single ld_share_flags. If we need a different
linker for shared libs or loadable modules on some platform, we can
reintroduce it when we encounter the problem.
Re: dynamically loadable modules
Ok, it's in. I did not add the 'cd dynclasses; make' to the default
target; I thought I'd see what regular builds I broke first. Testers
wanted, especially on platforms other than darwin and linux.
Re: dynamically loadable modules
On Oct-07, Dan Sugalski wrote:
> At 9:55 PM +0200 10/7/04, Leopold Toetsch wrote:
> >Steve Fink <[EMAIL PROTECTED]> wrote:
> >
> >> Clearly, I'm not very experienced with dealing with these things
> >> across platforms, so I was hoping somebody (Andy?) might have a
> >> better sense for what these things are called.
> >
> >AOL ;)
> 
> Heh. If Andy doesn't have a good name, let's call them shareable and
> loadable libraries, LD_SHARE_FLAGS and LD_LOAD_FLAGS for the flags,
> and SHARE_EXT and LOAD_EXT for the extensions. (All subject to
> wholesale pitching if there's a better name :)

Well, it's a bit longer than the $(SO) that we have everywhere now,
but it works fine for me. If I can disentangle my patch from some
other stuff that somehow crept in (I've no idea how; I'm using a
virgin tree for this), I'll commit it under those names for now.
dynamically loadable modules
I've been struggling with getting Darwin to build the dynclasses/
stuff, and I seem to have it working now. (Oh, and it fixes the Linux
build too.) It's a fairly large change, and I would like to use
standard naming conventions throughout, but I haven't really found any
convincing, definitive source of terminology.

The issue is that on some platforms, dynamically loadable modules and
shared libraries are different things, although you may know them
under different names (and one of those names is often used for the
other). A dynamically loadable module is something you explicitly load
after startup time, via dlopen() or some similar interface. A shared
library is implicitly loaded at startup time; you can see a list of
these under unix with ldd. Under Linux, they both end in a .so
extension and are built the same way. Under Darwin, shared libraries
end in .dylib and d-l-modules end in whatever you want them to. The
former is compiled with something like -dynamiclib and the latter with
-bundle (or something; I don't remember exactly).

So what I need is names for these. At the moment, I'm mostly using
$(SO) for shared lib extensions, $(DYNMOD) for d-l-modules. The build
flags I generally call $(LD_SHARED) or something with shared for
shared libs, and something like $(LD_DYNMOD_FLAGS) for d-l-modules.

Clearly, I'm not very experienced with dealing with these things
across platforms, so I was hoping somebody (Andy?) might have a better
sense for what these things are called.
Re: [perl #31849] [PATCH] Slightly quieter and more informative compilation
On Oct-06, Leopold Toetsch wrote:
> Andy Dougherty <[EMAIL PROTECTED]> wrote:
> 
> There are some changes e.g. when different CFLAGS settings are used,
> or for compiling classes. When there is a problem with compiling,
> just type another 'make' and you'll get again "Compiling with ...".

I think this may be gmake-specific, but in the past I have used

    ifeq (,$(VERBOSE))
    BREVITY=@
    else
    BREVITY=
    endif

    .c.o:
    	$(BREVITY)$(CC) $(CFLAGS)...

To make it make-version independent, we could do it as

    SUPPRESS=@

    .c.o:
    	$(SUPPRESS)$(CC) $(CFLAGS)...

and then run make as 'make SUPPRESS=' when we wanted to see the
messages.
Re: dynclasses build (was: Towards 0.1.1 - timetable)
On Oct-06, Leopold Toetsch wrote:
> William Coleda <[EMAIL PROTECTED]> wrote:
> > Any chance of getting:
> >
> >     'cd dynclasses; make'
> >
> > working on OS X by then?
> 
> It's broken on Linux too. The problem seems to be that non-existing
> shared libs are used for the final "perl build.pl copy" phase. These
> libs seem to be bundled into lib-*$(SO).

Whoops, I thought that was only after my local changes.

> For OS X, it could be that $LD_SHARED_FLAGS = "-fPIC" is necessary.

Nope, it's something more like "-bundle -undefined suppress". But the
real problem is that we don't yet distinguish between building shared
libraries and dynamically loadable modules. (Linux doesn't make this
distinction.) I'm working on it. Will gave me an account on his Darwin
box, and I have it building and installing. I'm still trying to track
down a problem where it doesn't seem to call the init function. (Or
something. I ran out of time when I got to that point.)
Re: Towards 0.1.1 - timetable
On Oct-05, Leopold Toetsch wrote:
> Wed  6.10. 18:00 GMT - feature freeze
> Sat  9.10.  8:00 GMT - code freeze - no checkins please
> 
> - Parrot 0.1.1 will go out on Saturday.
> - nice release name wanted

0.1.1  - Hydroparrot
0.1.2  - Helioparrot
0.1.3  - Parrolith
0.1.4  - Perylous
0.1.5  - Porn (um... Borot?)
0.1.6  - Carrot
0.1.7  - Nitroparrot
0.1.8  - Parrot Oxide
0.1.9  - Fluoridated Parrot
0.1.10 - Neon Parrot
0.1.11 - Podium
Re: [perl #31850] [PATCH] Remove obsolete files from MANIFEST.generated
On Oct-05, Andy Dougherty wrote:
> 
> This patch removes two files that are no longer generated from
> MANIFEST.generated.

Thanks, applied.
Re: [perl #31849] [PATCH] Slightly quieter and more informative compilation
On Oct-05, Andy Dougherty wrote:
> 
> The following patch makes compilation both slightly quieter and also
> slightly more informative.
> 
> Or, with less "spin", it fixes bad advice I gave previously.
> Specifically, I had previously noted that it's generally helpful if
> the Makefile prints out the commands it is trying to execute so that
> it's easier to track down problems when they fail. When I suggested
> that the '@' be removed from the .pbc.imc rule, I suggested at the
> same time that the '@' also be removed from the compilation step
> above it.
> 
> Alas, that wasn't quite the right place to do it. What's ultimately
> of interest is the command that actually gets called. Accordingly,
> this patch puts the '@' back in front of the invocation of
> tools/dev/cc_flags.pl, but changes the print statement in
> tools/dev/cc_flags.pl to show the actual compilation command being
> issued.
> 
> Again, I have found that information to be useful on numerous
> occasions. Also, considering how noisy the whole ICU build is, I
> think the extra clutter for parrot's sources is not a significant
> additional burden.

Makes sense to me. Applied, thanks.

Tell me again why Andy doesn't have commit privs yet?
Re: [perl #31807] make install not portable
On Oct-02, Nicholas Clark wrote:
> $ make install
> /home/nick/Install/bin/perl5.8.4 tools/dev/install_files.pl --buildprefix=
> --prefix=/home/nick/Install/parrot --exec-prefix=/home/nick/Install/parrot
> --bindir=/home/nick/Install/parrot/bin --libdir=/home/nick/Install/parrot/lib
> --includedir=/home/nick/Install/parrot/include MANIFEST MANIFEST.generated | sh
> 
> We have perl. Which is guaranteed to be on all platforms we build on.
> So why are we making a big list of commands and then feeding them to
> Unix shell? Which isn't going to be on all platforms that we build
> on.

Because long ago, when I implemented 'make install', I only did the
minimum necessary to get RPM building working. And RPMs don't work so
well on Windows, or so I hear.

The "generate a script and pipe it through sh" approach is something I
frequently use when cooking things up quickly, because you can develop
without the pipe, and then when everything looks good, add it on and
know what's going on. It's a nice way to get visibility into what
something's doing. And the script itself uses forward slashes
everywhere, so why bother pretending to be portable?

Not that any of this matters. I've committed fixes to hopefully make
it work portably. Of course, the MANIFEST.generated file is woefully
incomplete, so the files actually installed aren't terribly useful.
Re: Why lexical pads
On Sep-24, Aaron Sherman wrote:
> On Fri, 2004-09-24 at 10:03, KJ wrote:
> 
> > So, my question is, why would one need lexical pads anyway (why are
> > they there)?
> 
> They are there so that variables can be found by name in a lexically
> scoped way. One example, in Perl 5, of this need is:
> 
>     my $foo = 1;
>     return sub { $foo ++ };
> 
> Here, you keep this pad around for use by the anon sub (and anyone
> else who still has access to that lexical scope) to find and modify
> the same $foo every time. In this case it doesn't look like a
> "by-name" lookup, and once optimized, it probably won't be, but
> remember that you are allowed to say:
> 
>     perl -le 'sub x {my $foo = 1; return sub { ${"foo"}++ } }$x=x();print $x->(), $x->(), $x->()'
> 
> Which prints "012" because of the ability to find "foo" by name.

Umm, maybe I'm confused, but I'd say that your example prints "012"
because of the *inability* to find "foo" by name. If it could find
"foo" by name, it would be printing 123. Your snippet is actually
finding the global $main::foo, not the lexical $foo.

But I agree that it is doing a name lookup in the string eval case.
Although if you try it, you get puzzling results:

    perl -le 'sub x {my $foo = 1; return sub { eval q($foo++) } };$x=x();print $x->(), $x->(), $x->()'

prints 012 again. Which confused me, because Perl *can* do named
lookups of lexicals. The problem, apparently, is that it's doing the
lookup but not finding it. If you add in a nonsensical use of $foo to
make sure it sticks around to be found, it works:

    perl -le 'sub x {my $foo = 1; return sub { $foo; eval q($foo++) } };$x=x();print $x->(), $x->(), $x->()'

Now apparently the closure captures the lexical $foo, and thus the
eval is able to find it. On the other hand, your original example
still doesn't work, and I think that's because symbolic references do
not do pad lookups:

    perl -le 'sub x {my $foo = 1; return sub { $foo; ${"foo"}++ } }$x=x();print $x->(), $x->(), $x->()'

still prints 012. Yep.
From perlref:

    Only package variables (globals, even if localized) are visible to
    symbolic references. Lexical variables (declared with my()) aren't
    in a symbol table, and thus are invisible to this mechanism. For
    example:
Re: Compile op and building compilers
On Sep-20, Dan Sugalski wrote:
> 
> Now, the issue is how to actually build a compiler. Right now a
> compiler is a simple thing -- it's a method hanging off the __invoke
> vtable slot of the PMC. I'm not sure I like that, as it seems really,
> really hackish. Hacks are inevitable, of course, but it seems a bit
> early for that. (We ought to at least wait until we do a beta
> freeze...) On the other hand it does make a certain amount of sense
> -- it's a compilation subroutine we're getting, so we ought to invoke
> it, and I can certainly live with that.
> 
> Time to weigh in with opinions, questions, and whatnot. There's not
> much reason to JFDI and make the decisions final, so weigh away and
> we'll just nail it all down on wednesday.

My preference, as I've stated before, is to leave compilers as
invoke-able PMCs -- and further, I think that compilers will sometimes
be coroutines, or return multiple continuations, or play other such
tricks available via C (if appropriate for what they do). Which is
easy if you forget about compilation as being something special, but
instead just say it's invocable and thereby inherit all of the PIR
syntactic sugar for Subs.

On the other hand, that opinion assumes that compilers are used in the
funky ways that I am thinking of, which involve a lot of switching
between languages, using other languages' facilities for implementing
pieces of your language, etc. If Parrot is primarily going to be
mixing languages by having one language call another's libraries, then
I can see some utility in having a separate C op, even if its only
purpose is to explicitly declare that compilers must take only a
single string and produce a callable PMC, and no more. (Though I
wonder if you might sometimes want to use a filename rather than a
string-containing-enormous-chunk-of-code.) Screwballs like me would
then make our languages compiled via a different mechanism, and we
wouldn't play in the same sandbox as "regular" compilers.
However, then we'd need to decide whether those types of compilers
should be registered via compreg, or whether anything registered via
compreg is required to do something meaningful when invoked with a
single string argument containing code (or whatever C ends up doing;
that's just what it does now).

A question: when last we talked about this, you mentioned that you
didn't envision it being useful for compilers to take arguments. I
think you were only talking about configuration, but in any case, what
sorts of mechanisms do you feel are appropriate for setting options,
pointing to libraries or include paths, etc.?

Also, is Parrot supposed to provide a rich enough set of core
functionality that compilers will never need to communicate directly
with the "host" language? As a simple example, say you have an
embedded language that wants to add a new local variable. Parrot has
pads for this purpose, but what if you need to specify some sort of
rich type information or register it with some host language-specific
registry singleton of some sort? I don't know if these sorts of things
are useful, but they're easily within the scope of imagination. :-)
Re: [perl #31682] [BUG] Dynamic PMCS [Tcl]
On Sep-22, Will Coleda wrote:
> ld: /Users/coke/research/parrot/blib/lib/libparrot.dylib is input for
> the dynamic link editor, is not relocatable by the static link editor
> again
> compile foo.c failed (256)
> 
> As for the next error... huh?

Not surprising. What architecture and linker are you using? Does 'make
shared' at the toplevel work for you? If so, can you send the output
of it (so I can see the command it runs)? Or better yet, do you have
an example of a valid link line?

I don't have any remotely interesting systems to test on, so I don't
know how much help I can be, but I'll take a shot.
Re: Problems Re-Implementing Parrot Forth
On Sep-17, Matt Diephouse wrote:
> Having mentally absorbed the forth.pasm code, I thought I'd rewrite
> it in PIR and try to make it a full parrot compiler and, hopefully, a
> bit more approachable. I've already begun on this (see attached
> file). Unfortunately, I've run into a few issues/questions along the
> way.
> 
> In the following, the term "eval" refers to the use of the
> compreg/compile opcodes.
> 
> o In current builds evaling non PASM/PIR code segfaults (or gives a
> bus error on OS X) as a result of the last patch to op/core.ops (See
> #31573). Any plans here?

Leo made the change as a result of a discussion I kicked off, so it's
probably my fault. :-| But more to the point, I abandoned that
particular interface at the same time, and at least until the bug is
fixed, I recommend you do the same. Instead of using the compile op,
just call your PMC as a subroutine:

    $P0 = compreg "forth"
    $P1 = $P0("...forth code...")

Personally, I plan to stick to this interface unless C becomes useful,
but other people (eg, Dan) disagree that my usage is correct.

> o Evaling PIR code requires a named subroutine and the use of the
> 'end' opcode. Someone mentioned on IRC that this might not be the
> desired behavior. Is it?

Leo talked about adding an @ANON attribute to subs, but I don't
believe he has implemented it yet. For now, just use a fixed name
every time you compile; the symbol table entry will be overwritten
every time you compile, but the compilation should return the sub
object for you to keep track of, so you don't need to care about the
symbol table.

> o Calling subroutines from an eval creates a copy of the user stack,
> so all changes are lost (rendering my Forth code unusable). Is this
> behavior correct? If so, how should I go about this?

Dunno. Don't use an Eval? (Use a plain Sub or something instead)

> Any clarifications or statuses, as well as any comments on the PIR
> code, would be very much appreciated.
> I'm more or less stalled at this point until I get some answers/help.
> It would be nice to get this to the point where it can be used to
> test parrot (as well as serve as an example to anyone wanting to
> write a compiler).

I have finally gotten around to committing my example compiler in
languages/regex. Read the README for usage instructions. Or for a
painful account of my attempts to accomplish what you're doing, just
for a different language, see

    http://0xdeadbeef.net/wiki/wiki.pl?FinkBlog

parrot/examples/japh/japh15.pasm is a good source of example code too.
Re: [PATCH] dynamic pmc libraries
On Sep-09, Brent 'Dax' Royal-Gordon wrote:
> 
> Tiny nit: for consistency with other Configure source files, this
> should probably be named dynclasses_pl.in. No big deal, though.

Consistency is good, and you're the authority. Change committed.
Re: [perl #31493] Overlapping memory corruption
On Sep-09, Leopold Toetsch wrote:
> Steve Fink (via RT) wrote:
> 
> >I won't go through all the details of what I looked at (though I'll
> >post them in my blog eventually), but what's happening is that this
> >line (from perlhash.pmc's clone() implementation) is corrupting the
> >flags field:
> >
> >    ((Hash*)PMC_struct_val(dest))->container = dest;
> 
> Ah, yep. PMC_struct_val(dest) doesn't hold the hash yet, it is
> created in hash_clone() only after this line.
> 
> >The problem is that the dest PMC contains a Hash structure in its
> >struct_val field
> 
> No. That's the pointer of the free_list, pointing to the previous PMC
> in that size class.
> Putting above line after the hash_clone() fixes that bug.

Hey, your reason is much better than my reason. Still, why do the
_noinit stuff and duplicate the creation code? Why not just call
pmc_new as in my replacement code?
Re: Semantics for regexes - copy/snapshot
On Sep-09, [EMAIL PROTECTED] wrote:
> On Wed, 8 Sep 2004, Chip Salzenberg wrote:
> 
> > According to [EMAIL PROTECTED]:
> > > So how many stores do we expect for
> > >
> > >     ($a = "xxx") =~ s/a/b/g
> > >
> > > and which of the possible answers would be more useful?
> >
> > I think it depends on C<($a = "aaa") =~ s/a/b/g>.
> 
> I would agree with you in general, but since we're generally after
> speed, surely we want to allow for optimisations such as "don't store
> unless something's changed"; this would also be compatible with the
> boolean context value of s///.

I vote for leaving all of these sorts of cases undefined. Well,
partially defined -- I'd rather we didn't allow

    ($a = "aaa") =~ s/a/b/g

to turn $a into "gawrsh". At the very least, define the exact number
of outputs and stores for "strict aka slow mode", but have an optional
optimization flag that explicitly drops those guarantees. It would
allow for more flexibility in implementations.
Re: [PATCH] dynamic pmc libraries
On Sep-07, Leopold Toetsch wrote:
> Steve Fink <[EMAIL PROTECTED]> wrote:
> 
> > This patch introduces something that feels suspiciously like
> > libtool, despite the fact that libtool has never been very kind to
> > me. But for now I am targeting this only at the dynamic PMC
> > generation problem; this solution could be expanded to ease porting
> > of other parts of the build procedure, but I think other people are
> > already working on that.
> 
> Looks good.
> 
> > I am not committing this patch directly because I know that other
> > people are currently actively working on the dynamic PMC stuff and
> > the build system, and I didn't want to step on anyone's toes.
> 
> So please give it a try.

Ok, it's in. See dynclasses/README for brief usage instructions. I'll
probably be committing a couple of dynamic PMCs soon (when I can
figure out why they're complaining about not having a destroy()
defined, when they don't have the active_destroy flag set.)
Re: TODOish fix ops
On Sep-06, Jens Rieks wrote:
> Leopold Toetsch wrote:
> > So first:
> > - do we keep these opcodes?
> >   If yes some permutations are missing.
> > - if no, we should either not include experimental.ops in the
> >   default opcode set or move it to dynops.
> I have not used them yet, but I think that they can be useful.
> Has anyone else except Leo and Dan used them?

I use them for debugging printouts, when I want to print the status of
something without defining a bunch of labels and contorting the
control flow. I also use them for simple non-short-circuiting ors and
ands. Nothing terribly important or irreplaceable.
[PATCH] dynamic pmc libraries
Mattia Barbon recently implemented the capability to group multiple
dynamic PMCs into a single library. It took me a while, but the
correct way of using it finally percolated through my thick skull.

One remaining problem is that the build process is very
platform-dependent. This patch doesn't fix that, but it does eliminate
the gmake dependency. Another problem is that you have to specifically
write Makefile rules to build your group of dynamic PMCs into a
library, and that is very difficult to do portably.

This patch introduces something that feels suspiciously like libtool,
despite the fact that libtool has never been very kind to me. But for
now I am targeting this only at the dynamic PMC generation problem;
this solution could be expanded to ease porting of other parts of the
build procedure, but I think other people are already working on that.

The patch adds an additional target to config/gen/makefiles.pl:
instead of just converting config/gen/makefiles/dynclasses.in to
dynclasses/Makefile, it also converts
config/gen/makefiles/dynclasses.pl.in to dynclasses/build.pl, and
changes that Makefile to call build.pl to do all the real work. It is
thus able to pick up config/init/data.pl's notions of all of the
${cc}, ${ld}, etc. definitions, but leaves the description of which
PMCs to build with the original Makefile (which probably isn't the
greatest place, but I'm trying to change as little as possible.)

My guess is that this will not immediately cause dynamic PMCs to start
working on the platforms where they do not currently work, but it
should make it easier to get them to work.

It also implements a new pmclass attribute in .pmc files (only
meaningful for dynamic PMCs): C, which will get automatically picked
up by the new dynclasses/build.pl to generate a single shared library
out of all PMCs with the same group tag.
So to implement two new dynamic PMCs 'mylangPmc1' and 'mylangPmc2',
you would:

* Implement the .pmc files, and include 'group mylang' in their
  pmclass lines
* Add mylangPmc1 and mylangPmc2 to config/gen/makefiles/dynclasses.in
* Re-run Configure.pl

That is the same procedure as is currently used to implement
independent dynamic PMCs right now, except for the addition of the
'group mylang' tag.

I am not committing this patch directly because I know that other
people are currently actively working on the dynamic PMC stuff and the
build system, and I didn't want to step on anyone's toes.

Note that build.pl is NOT a general build tool, although it covers
everything needed for the dynclasses/ directory. At the moment, it
doesn't even bother to do dependency analysis for the grouped PMCs,
although it does for all of the rest.

Still, this patch gets stuff working that currently doesn't exist, and
doesn't break anything that currently works AFAIK. I fully expect (and
hope) that it will be replaced by something more general someday. But
I'd rather not wait for that day, having first-hand experience with
how much "fun" it is to get partial linking of dynamic libraries
working on multiple platforms.
Index: config/gen/makefiles.pl
===
RCS file: /cvs/public/parrot/config/gen/makefiles.pl,v
retrieving revision 1.34
diff -u -r1.34 makefiles.pl
--- config/gen/makefiles.pl	19 Jun 2004 09:33:09 -	1.34
+++ config/gen/makefiles.pl	5 Sep 2004 22:28:23 -
@@ -81,6 +81,8 @@
             commentType => '#', replace_slashes => 1);
     genfile('config/gen/makefiles/dynclasses.in', 'dynclasses/Makefile',
             commentType => '#', replace_slashes => 1);
+    genfile('config/gen/makefiles/dynclasses.pl.in', 'dynclasses/build.pl',
+            commentType => '#', replace_slashes => 0);
     genfile('config/gen/makefiles/dynoplibs.in', 'dynoplibs/Makefile',
             commentType => '#', replace_slashes => 1);
     genfile('config/gen/makefiles/parrot_compiler.in', 'languages/parrot_compiler/Makefile',
Index: classes/pmc2c2.pl
===
RCS file: /cvs/public/parrot/classes/pmc2c2.pl,v
retrieving revision 1.16
diff -u -r1.16 pmc2c2.pl
--- classes/pmc2c2.pl	22 Aug 2004 09:15:51 -	1.16
+++ classes/pmc2c2.pl	5 Sep 2004 22:28:24 -
@@ -135,12 +135,6 @@
 Used with C: No C code is generated.
 
-=item C<dynpmc>
-
-The class is a dynamic class. These have a special C
-routine suitable for dynamic loading at runtime. See the F
-directory for an example.
-
 =item C
 
 Classes with this flag get 2 vtables and 2 enums, one pair with
@@ -164,6 +158,18 @@
 library ref
 
+=item C<dynpmc>
+
+The class is a dynamic class. These have a special C
+routine suitable for dynamic loading at runtime. See the F
+directory for an example.
+
+=item C<group>
+
+The class is part of a group of interrelated PMCs that should be
+compiled together into a single shared library of the given name. Only
+valid for dynamic PMCs.
+
 =back
 
 =item 3.
 
@@ -318,7 +324,7 @@
     my $c = shift;
     $$c =~ s
Re: Semantics for regexes
On Sep-01, Dan Sugalski wrote:
> 
> This is a list of the semantics that I see as needed for a regex
> engine. When we have 'em, we'll map them to string ops, and may well
> add in some special-case code for faster access.
> 
> *) extract substring
> *) exact string compare
> *) find string in string
> *) find first character of class X in string
> *) find first character not of class X in string
> *) find boundary between X and not-X
> *) Find boundary defined by arbitrary code (mainly for word breaks)

Huh? What do you mean by "semantics"? The only semantics needed are the minimum necessary to answer the question "is the fred at offset i equal to the fred X?" (Sorry, not sure if fred is actually character or codepoint or whatever; it is probably all of them at different levels.) We also almost certainly need to be able to do character class comparisons, although if you assume that you can always transcode to what the regex was compiled with, then you don't even need that -- instead, you need to be able to convert to something like a difference list of numbered freds. But if we're talking about semantics, then yes, you need the character class manipulation.

Everything else in this list sounds like optimizations to me, and probably not the right optimizations (I don't think it's possible to predict what will be useful yet.) For other things that parrot will be used for, I suspect that the first 3 will be needed.

I'm curious as to how you came up with that list; it seems to imply a particular way of implementing the grammar engine. I would expect all of that, barring certain optimizations, to be done directly with existing pasm instructions. There will be a need for saving a stack of former values of hypothetical variables, which can also be done with pasm ops but might interact with overloaded assignment or something wacky like that.
Re: Proposal for a new PMC layout and more
On Sep-01, Leopold Toetsch wrote:
> Below is a pod document describing some IMHO worthwhile changes. I hope
> I didn't miss some issues that could inhibit the implementation.

Overall, I like it, although I'm sure I haven't thought of all of the repercussions.

The one part that concerns me is the loss of the flags -- flags just seem generally useful for a number of things. In the limit, each flag could be replaced by an equivalent vtable entry that just returned true or false, but that will only work for rarely-used flags because of the extra levels of indirection. I suppose we could also have a large class of PMCs that contained a flag word, with only the primitive PMCs lacking it, but then the flags could not be used without knowing the type of PMC. It all comes down to the specific current and future uses of flags. You've dealt with the GC flags; what about the rest?

The proposal would also expand the size of the vtable a bit, due to the string vtable stuff. I don't know how much that is, percentage-wise. And I suppose that increase is dwarfed by the decrease from eliminating the S variants. (Although that's another part of the proposal that makes me nervous -- will MMD really take care of all of the places where we care that we're going to a string, specifically, rather than any other random PMC type? Strings are a pretty widespread concept throughout the code base, and this is the only highly user-visible part of the change.)

I also view the proposal as being composed of several fairly independent pieces. Something like:

* Merging PMCs and Buffers
* Merging STRINGs and PMCs
* Removing GC-related flags and moving them to GC implementations
* Removing the rest of the flags
* Using Null instead of Undef
* Moving "extra" stuff to before the PMC pointer
* Using Refs to expand PMCs
* Using DOD to remove the Ref indirection
* Shrinking the base PMC size

...and whatever else I forgot. Not all of these are dependent on each other, and they could be implemented separately.
And some are only dependent in the sense that you'll make space or time performance worse until you make the rest of the related changes. You could call those design-dependent, rather than implementation-dependent.
Re: Compile op with return values
On Aug-30, Dan Sugalski wrote:
> I've been watching this thread with some bemusement -- I've got to
> admit, I don't see the problem here.
> 
> I'm not sure what the point of passing in parameters to the
> compilation is. (Not that I don't see the point of having changeable
> settings for compilers, but that's something separate) The interface
> is simple on purpose -- in most cases either there *are* no
> parameters possible (Perl's eval and its equivalent in other
> languages) or there's no reasonable way to know what the parameters
> are (Perl's eval evaluating code of a different language). The syntax
> just isn't there to have them, and is really unlikely to ever
> materialize, so there's little point in putting in parameters to the
> compilation. In those cases where the programmer may know what to
> change, they can tweak any external knobs the compiler module might
> have programmatically.

I understand that perspective, but I guess I'm thinking about embedded compilers somewhat differently. For example, consider a regex compiler. It needs to be able to compile embedded code in whatever the host language is. In fact, it needs to be able to switch back and forth freely between the regex compile and the host language compile, and the compilation of the inner language might need to be tailored to fit into whatever the regex compiler needs. Maybe that's as simple as saying "don't provide a main()", in which case it can be done by having two C<compreg> registration strings for the same language. But you have to get the name of that language into the regex compiler in the first place. (Ok, you might be able to avoid that in this particular case by making the regex compiler into a coroutine, but I don't want to get too caught up in one particular example.) And the compiler needs to be reentrant, for the cases where the language within the regex rule invokes another regex match.
I mention that only to say that you can't just set properties on the PMC returned from C<compreg>, because that PMC will be shared during reentrant calls. You could always clone it and then configure it, I suppose.

Anyway, I'm just trying to come up with situations where compilers need to know more than just the language they're compiling, and especially cases where you want different configuration for every compile. Another example of this would be if your regex syntax involved binding hypothetical variables (or something similar), and the inner language needed to know at compile time which variables had been defined at that point. I'm sure I could come up with workarounds for all of these issues, but I was expecting that much of the usefulness of Parrot would be in mixing together (and nesting) several languages in one program, and it seems like in many cases nested compiles are going to need to communicate nontrivial amounts of information. I'm okay with things if the answer is "don't do that" -- meaning if you need complex cases like this, then forget about C<compile> and do everything with straight subroutines or whatever else -- but I would like to understand the intent of the C<compile> op better, so I can forget about trying to make my stuff fit into its mold if what I'm doing is just different.

> The whole "name for the function I'm compiling" thing isn't an issue
> either, or at least it shouldn't be. The code being compiled is
> implicitly a subroutine -- you don't have to have code that reads:
> 
>    .sub foo_1234423_some_random_text
>    .
>    .
>    .
>    .end
> 
> and go look for 'foo_1234423_some_random_text' in a namespace
> somewhere. Just leave out the .sub/.end (they should be implied) and
> the returned PMC is a sub PMC for your nicely anonymous sub. Which is
> fine, and as it should be.

That would work fine for me.
The current state also works ok since overriding is allowed, but it feels wrong to construct a sub with a specific name and then disavow all knowledge of that name even though it's been registered in some global table. Leo's @ANON implementation of your scheme works great for me (I have no problem wrapping that around my code.)

All this does raise the question of garbage collection for packfile objects; is there any? Both my current day job project and (I'm guessing) mod_perl hope the answer is "yes". :-)
Re: [perl #31268] [PATCH] Dynamic library with multiple PMCs
On Aug-21, Mattia Barbon wrote:
> 
> Hello,
> as promised with this patch:
> 
> pmc2c2 ... --library foo --c pmc1.pmc pmc2.pmc pmc3.pmc ...
> 
> outputs pmcX.c and pmc_pmcX.h as it did before, plus
> foo.c and pmc_foo.h containing a single Parrot_lib_foo_load
> that initializes vtables and MMD dispatch for all the PMCs,
> taking into account circular PMC dependencies in MMD dispatch.

I am trying to use this facility right now, and am encountering problems. For a detailed blow-by-blow account of what I'm doing, see

http://0xdeadbeef.net/wiki/wiki.pl?FinkBlog/SharedLibraryHell

But that's much too verbose, so I'll just describe the problem. I have two PMCs that need to know about each other, match.pmc and matchrange.pmc. (Ok, actually just one needs to know about the other, but whatever.) So I bound them together into a library 'match_group' using your --library flag. But the resulting shared library uses symbols defined in its constituents, and so when parrot tries to load it, it cannot resolve those symbols.

I can fix this by changing the link line that builds match_group.so to explicitly list match.so and matchrange.so, which enters them as NEEDED in the shared library match_group.so. So far so good. But then I still can't get it to work, because parrot is only able to find match_group.so because it explicitly constructs the path runtime/parrot/dynext/LIBRARY.so; when dlopen internally tries to load match.so, it doesn't have that path in its RPATH. This is fixable by adding the path to LD_LIBRARY_PATH, but if you're going to do that, then why bother with the explicit path construction within parrot?

It seems to me that we need to either add the full RPATH to the parrot binary, or teach parrot its absolute path so it can add it at runtime. And I'm not totally sure that the runtime approach will work. Or we could punt and wrap everything for a PMC library into the same .so. But that doesn't seem very nice. How does perl5 handle this?
Also, it seems that dynclasses/Makefile has a few gmake-isms thrown in, which I imagine nobody has complained about because the shared library stuff probably only works on Linux anyway right now. (And in my local version, I added lots more to get the match_group.so linked up correctly.)
Re: Library loading
On Aug-28, Dan Sugalski wrote:
> 
> We dynamically load libraries. Whee! Yay, us. We need a set of
> semantics defined and an API to go with them so we can meaningfully
> and reliably work with them.

Hm. Today I was working with the current implementation of this stuff, and uncovered a bunch of questions. I'm not sure this thread is really the right one for my questions, but it was too timely to pass up.

> 1) Load the shared library from disk
> 
> The equivalent of dlopen. (May well *be* dlopen)

I'm running into problems with loading libraries that are dependent on other (user-provided, not system) libraries. I'm thinking it would be nice to have an interface for setting search paths.

Take my particular case: I have something called match_group.so which contains some undefined symbols from both match.so and matchrange.so. All of these are from dynamic PMCs, so they should be found in runtime/parrot/dynext/. And if I run from the top-level directory, match_group.so *is* found -- but it fails to load, because it can't find its dependencies. match_group.so is found because dynext.c:get_path() explicitly looks through runtime/parrot/dynext, but that doesn't help the implicit loading of match.so. I can get it to work by explicitly setting LD_LIBRARY_PATH to the absolute path of the dynext/ directory, but that's not a good long-term solution.

So there are two problems. One is that parrot looks in a relative directory when it's searching for dynamic libraries. If this were an absolute path, then you could run parrot from some other directory and still find your libraries, but I don't know if parrot knows where it is running from yet. (FindBin, anyone?) The second problem is that parrot must be able to find dependent libraries implicitly, so it can't just manually construct paths and have things work. One solution for this would be to add an API entry to set the search path.
On Unix, I think that means setting LD_LIBRARY_PATH, but I haven't tried that to see if it actually works if you set it while a program is running -- there were some hints on Google that it might fail. If it doesn't work, I don't know of any other way to set the implicit search path programmatically, so perhaps it shouldn't be in the API. :-) Alternatively, we can hardcode the absolute path to the dynext/ directory in the RPATH tag of the dynamic section (for ELF).

> We're also going to want to allow embedding applications to pass in
> handles to existing libraries (so, for example, we don't try and load
> in half a dozen versions of the expat library...) that it's already
> loaded in.

I believe glibc already caches all of the handles and just gives you back the same handle if you ask for it again. So unless other systems don't, this isn't needed.

I'll write another message specifically talking about the problems I am encountering, because the remainder has no bearing on the API.
Re: Compile op with return values
On Aug-27, Leopold Toetsch wrote:
> Steve Fink wrote:
> >On Aug-26, Leopold Toetsch wrote:
> 
> >>.sub @regex_at_foo_imc_line_4711 # e.g.
> 
> >Yes, this illustrates what I was really getting at. My compiler can
> >certainly take a subroutine name (or file and line number, or whatever)
> >to use to generate the code with, but what is the proper way to pass
> >that information in through the compile op?
> 
> I don't know how your compiler generates the code. But you are probably
> concatenating a string of PIR instructions and pass that over to the
> C<compile> opcode.
> Anyway, the identifier you are using for the C<.sub> directive gets
> stored in globals and is the name of the subroutine.

Um... sorry, I was unclear. I am talking about how to get data *into* my compile op. You are describing how to get it *out*, which I am well aware of.

> >... I can just stick it in some
> >register, but it seems like there ought to be some standard-ish way of
> >passing parameters to compilers. Which then makes me wonder why compile
> >is done as an opcode rather than a method invocation on the return value
> >of compreg.
> 
> C<compile> as a method call for the compiler would really be a
> worthwhile extension. But you can provide your own compiler wrapper and
> pass the subroutine name to that function. [1]

Yes, that is a reasonable way of implementing it. I probably wouldn't do it exactly that way, because I'd rather have the generated code be a completely valid PIR snippet on its own -- in your example I would have some orphaned ".param x" lines in what my compile op returned, which would require the surrounding ".sub" to be valid. But it does encapsulate things nicely, and makes it clear how to pass such information in. But that's not my issue.

To stave off confusion, I should mention here that I am not blocked on any of this; I can think of half a dozen ways of doing what I want, now including the one you suggested.
I am now only asking these questions so that I might better write up a FAQ entry, and what I am still unsure about is how to explain the purpose of the compile op, and what the "official" way is to pass in parameters that influence the compiled code.

> >... I see that for Compiler and NCI PMCs, that's exactly what it
> >does, but for anything else it does the Parrot_runops_fromc_args_save
> >thing; couldn't that be somehow exposed separately so that the compile
> >op goes away? My only complaint about C<compile> is that it isn't
> >transparent how to use it, whereas I am comfortable with invoking things
> >and following the calling conventions.
> 
> Well, there isn't much difference. The compile function is called as a
> plain function. A method call would additionally pass C, which you
> can pass as an argument too, if you need it.

Right. So why does the compile op exist?

I assert that many compilers will need some form of additional contextual information in order to properly compile the code they are passed. My one example so far is the key needed to ensure that the returned Sub PMC is associated with a unique name. This isn't a very good example, because it could be done wholly within the compile op and therefore doesn't *really* need any information passed in. However, there are many other examples: include paths, library paths, optimization options, etc.

It seems like there ought to be a standard way of communicating this sort of information to embedded compilers. Perhaps a way that is standard enough that the language to be compiled could be treated as a dynamic parameter -- in which case, all compilers would need to interpret the passed-in contextual information in the same way. We already have one way of communicating information, and that is to use the register calling conventions.
If we really wanted different languages' compilers to be able to interpret the same contextual information (which is nice to have, but hardly a necessity), then we would additionally need to specify a common signature -- perhaps to compile something, you look up an invokable compiler PMC using compreg, then "call" it with a single parameter representing named arguments. Or something.

The question then becomes how to call the compiler. We could use the C<invoke> op, but it would be doing exactly the same thing as the C<compile> op, so why not just use that? If I look at the code, it shows that there sometimes is a difference between the two -- namely, if the compiler PMC is anything other than a Compiler or an NCI, then it calls Parrot_runops_fromc_args_save rather than simply invoking it. Whatever that function does, it must be necessary, so... why? Then at last I will understand why there is a need for a C<compile> separate from C<invoke>.
Re: Compile op with return values
On Aug-26, Leopold Toetsch wrote:
> Steve Fink wrote:
> 
> >I can store some global counter that makes it generate different sub
> >names each time, but that seems a bit hackish given that I don't really
> >want the subroutine to be globally visible anyway; I'm just using one so
> >that I can use PIR's support for handling return values.
> 
> I don't think its hackish. And you might want to keep the compiled regex
> around for later (and repeated) invocation.

I certainly do, but I can always do that by remembering the PMC returned from the compile op myself.

> Visibility is another issue though. You could mangle[1] the subroutine
> name or use a distinct namespace, which might reduce the possibility of
> name collisions.
> 
> leo
> 
> .sub @regex_at_foo_imc_line_4711 # e.g.
> .end

Yes, this illustrates what I was really getting at. My compiler can certainly take a subroutine name (or file and line number, or whatever) to use to generate the code with, but what is the proper way to pass that information in through the compile op? I can just stick it in some register, but it seems like there ought to be some standard-ish way of passing parameters to compilers. Which then makes me wonder why compile is done as an opcode rather than a method invocation on the return value of compreg. I see that for Compiler and NCI PMCs, that's exactly what it does, but for anything else it does the Parrot_runops_fromc_args_save thing; couldn't that be somehow exposed separately so that the compile op goes away? My only complaint about C<compile> is that it isn't transparent how to use it, whereas I am comfortable with invoking things and following the calling conventions.
Re: Compile op with return values
On Aug-22, Leopold Toetsch wrote:
> Steve Fink <[EMAIL PROTECTED]> wrote:
> > I am experimenting with registering my own compiler for the "regex"
> > language, but the usage is confusing. It seems that the intention is
> > that compilers will return a code object that gets invoked, at which
> > time it runs until it hits an C<end> opcode. But what if I want to
> > return some values from the compiled code? I see the following
> > options:
> 
> An Eval object isa Closure. The only difference to a subroutine is IIRC
> that it supports direct jumps out of the evaled code segment via the
> inter-segment branch_cs opcode:
> 
> .
> .
> .
> 
> Thus it should be totally valid to return a Sub or Closure object from
> your compiler. The C<invoke> of these two handles the packfile segment
> switching.
> 
> > $P0 = compreg "pig-latin"
> > $P1 = compile $P0, "eturnray oneay oremay anthay asway assedpay inay"
> > $I0 = $P1(41)
> > print $I0 # Should print out 42
> 
> That's fine.

Ok, that makes sense. I didn't realize the relationship between the Eval PMC and Sub. Thanks.

So then what is the proper way to return a Sub from the compile op? Right now, I always compile to the same subroutine name "_regex", and fetch it using find_global _regex. This works, even when I compile multiple chunks of regex code. I guess whatever is entering things into the global symbol table is overriding the previous definition, and I grab it out immediately thereafter. But is this safe to rely on, or will it later become an error to override a global subroutine?

I can store some global counter that makes it generate different sub names each time, but that seems a bit hackish given that I don't really want the subroutine to be globally visible anyway; I'm just using one so that I can use PIR's support for handling return values.
Compile op with return values
I am experimenting with registering my own compiler for the "regex" language, but the usage is confusing. It seems that the intention is that compilers will return a code object that gets invoked, at which time it runs until it hits an C<end> opcode. But what if I want to return some values from the compiled code? I see the following options:

1) Manually set up the return values in the appropriate registers.

2) Use .pcc_begin_return/.pcc_end_return to set up the return values, but override the P1 return continuation to go to some subroutine that immediately calls C<end>.

3) Use .pcc_begin_return/.pcc_end_return to set up the return values, and have the user of the compiler grab out some magic-named subroutine using C<find_global> and invoke it, rather than directly invoking the return value of the C<compile> op.

4) Make my compiler behave differently from the built-in compilers, by arranging for it to return a Sub object from the C<compile> op rather than an Eval object. Then the caller can directly invoke the return value of C<compile> and get back return values as if it were calling a normal subroutine, because it is.

#1 is painful and extremely error-prone. #2 is a total hack. #3 forces the user of the compiler not only to call things differently depending on which language it's using, but also to know the magic subroutine name. #4 seems to make compilers inconsistent with each other, and I worry that the Eval PMC does more than just running code until it hits an C<end> op.

If my explanation was confusing, then here are some examples.
I expect the user of the compiler to look like this:

  $P0 = compreg "pig-latin"
  $P1 = compile $P0, "eturnray oneay oremay anthay asway assedpay inay"
  $I0 = $P1(41)
  print $I0 # Should print out 42

Option #3 would change this to:

  $P0 = compreg "pig-latin"
  compile $P0, "eturnray oneay oremay anthay asway assedpay inay"
  $P1 = find_global "_pig_latin_eval_block"
  $I0 = $P1(41)
  print $I0 # Should print out 42

Rather than writing out example code for the compiler and generated code to illustrate each of the cases, I think I'll wait to see if this needs clarification first. :-)
Parrot experiences log
As I prepared to dive into a big area of parrot that I'm completely unfamiliar with, I decided to log my travels in hopes of helping out the next poor soul who happens along a similar path. For now, the focus is on converting my toy languages/regex compiler into more of a real Perl6-style rule compiler callable directly by Parrot. I know someone else was blogging his initial experiences slogging through the Parrot source -- that's where I got the idea, though I'm not exactly a newbie around here so I'll be assuming a bit more familiarity with things. (Though I'm pretty out of touch right now, so maybe not...) Oh, right, the url: http://0xdeadbeef.net/wiki/FinkBlog At least for now. That's my server, and my DSL line is flaky, but I wanted to do the minimal amount of setup, and I'm familiar with this particular wiki syntax. And I expect to get bored of writing stuff up before reaching the point at which moving would be worthwhile. Yes, I'm aware of the wiki at http://www.vendian.org/parrot/wiki/bin/view.cgi/Main/WebHome but I couldn't find an appropriate place to put this there. Also, I wanted something I can get to while off-line, and something I can molest more easily to make writing the log easier. I've modified my wiki a fair amount, and have gotten dependent on some of the shortcuts.
Re: [PATCH] Match PMC
Oh, and here's my test code for the Match PMC. This is really just a copy of t/pmc/perlhash.t (since the Match PMC is supposed to behave like a hash for the most part), but with one added test case at the end showing how this would be used to store and retrieve hypotheticals. Index: t/pmc/match.t === RCS file: t/pmc/match.t diff -N t/pmc/match.t --- /dev/null 1 Jan 1970 00:00:00 - +++ t/pmc/match.t 17 Aug 2004 17:28:17 - @@ -0,0 +1,1256 @@ +#! perl + +# Copyright: 2001-2003 The Perl Foundation. All Rights Reserved. +# $Id: match.t,v 1.44 2004/04/19 12:15:22 leo Exp $ + +=head1 NAME + +t/pmc/match.t - Match Objects + +=head1 SYNOPSIS + + % perl t/pmc/match.t + +=head1 DESCRIPTION + +Tests the C PMC. Does standard hashtable testing. Then tests +various aspects of retrieving ranges of the input string. + +Probably ought to do nested match objects too. + +=cut + +use Parrot::Test tests => 34; +use Test::More; + +output_is(< "a", b => [undef, undef] } + +clone P1, P0 +set P0["c"], 4 +set P3, P0["b"] +set P3, 3 +set P0["b"], P3 +set P1["a"], "A" + +# P0 = { a => "a", b => [undef, undef, undef], c => 4 } +# P1 = { a => "A", b => [undef, undef] } + +set S0, P0["a"] +eq S0, "a", ok1 +print "not " +ok1: +print "ok 1\n" + +set P5, P0["b"] +set I0, P5 +eq I0, 3, ok2 +print "not " +ok2: +print "ok 2\n" + +set I0, P0["c"] +eq I0, 4, ok3 +print "not " +ok3: +print "ok 3\n" + +set S0, P1["a"] +eq S0, "A", ok4 +print "not " +ok4: +print "ok 4\n" + +set P5, P1["b"] +set I0, P5 +eq I0, 2, ok5 +print "not (" +print I0 +print ") " +ok5: +print "ok 5\n" + +# XXX: this should return undef or something, but it dies instead. 
+# set P3, P0["c"] +# unless P3, ok6 +# print "not " +# ok6: +# print "ok 6\n" + end +CODE +ok 1 +ok 2 +ok 3 +ok 4 +ok 5 +OUTPUT + +output_is(<<'CODE', <\n" +ret + +subtest: +print "subrule/" +print S0 +print ":" +set S0, P1["subrule";S0] +isnull S0, report_null +print S0 +print "\n" +ret +subreport_null: +print "\n" +ret +CODE +empty_at_start: +empty_at_middle: +whole:the full input string +full:full +no_start: +no_end: +regular_key:boring old value +subrule/empty_at_start: +subrule/empty_at_middle: +subrule/whole:the full input string +subrule/full:full +subrule/no_start: +subrule/no_end: +subrule/regular_key:boring old value +Direct access to start,end for 'full': 4,7 +OUTPUT + +1;
[PATCH] Match PMC
I needed to create a Match PMC object for holding the match groups (parenthesized expressions and capturing rules) from a regex match. Unfortunately, it works by using another new PMC type, the MatchRange PMC, to signal that an element of its hashtable should be interpreted specially (as a substring of the input string). One PMC knowing about another currently means they need to be static PMCs, not dynamic. (AFAIK)

So this is the patch of what I am currently using. I cannot guarantee it will actually be useful for any other regex implementors, so I feel uncomfortable committing it myself. (OTOH, if someone needs something different, they can just add it as a different name.) The point is, this is something I need for my stuff and the future of languages/regex is with some version of it, so I can't commit those changes without this. Although I fully expect the Match PMC will need to be substantially beefed up to become a full grammar object (or something...), this is base functionality that it needs to start with.

With these two PMCs, I can construct a match object containing the hypotheticals $1, $2, etc., as well as a full parse tree comprised of nested match objects. This does *not* handle saving and restoring previous hypothetical values, as is needed in the case of (a)+b -- in my compiler, that is handled by the compiled engine code.

Index: classes/match.pmc
===
RCS file: classes/match.pmc
diff -N classes/match.pmc
--- /dev/null	1 Jan 1970 00:00:00 -
+++ classes/match.pmc	17 Aug 2004 17:02:01 -
@@ -0,0 +1,205 @@
+/*
+Copyright: 2004 The Perl Foundation. All Rights Reserved.
+$Id$
+
+=head1 NAME
+
+classes/match.pmc - Match object for rules
+
+=head1 DESCRIPTION
+
+This is a match object for holding hypothetical variables, the input string,
+etc.
+
+For now, it is really just proof-of-concept code, and I fully expect
+anyone who reads this to hurl. Violently.
+
+=head2 Functions
+
+=over 4
+
+=cut
+
+*/
+
+#include
+#include "parrot/parrot.h"
+
+STRING * hash_get_idx(Interp *interpreter, Hash *hash, PMC *key);
+
+static STRING* make_hash_key(Interp* interpreter, PMC * key)
+{
+    if (key == NULL) {
+        internal_exception(OUT_OF_BOUNDS,
+            "Cannot use NULL key for Match!\n");
+        return NULL;
+    }
+    return key_string(interpreter, key);
+}
+
+static STRING* match_range(Interp* interp, PMC* self, PMC* range)
+{
+    STRING* input_key = const_string(interp, "!INPUT");
+    Hash* hash = (Hash*) PMC_struct_val(self);
+    HashBucket *b;
+    STRING* input;
+    int start, end;
+
+    b = hash_get_bucket(interp, hash, input_key);
+    if (!b) {
+        internal_exception(1, "Match: input string not set");
+        return NULL;
+    }
+
+    input = VTABLE_get_string(interp, (PMC*) b->value);
+    /* These could both be converted to grab UVal_int directly, but
+     * I'll leave it like this for now because it'll test the vtable
+     * members. */
+    start = VTABLE_get_integer_keyed_int(interp, range, 0);
+    end = VTABLE_get_integer_keyed_int(interp, range, 1);
+
+    if (start == -2 || end == -2 || end < start - 1)
+        return NULL;
+    else
+        return string_substr(interp, input, start, end - start + 1, NULL, 0);
+}
+
+static STRING* fetch_string(Interp* interp, PMC* matchobj, PMC* val)
+{
+    if (val->vtable->base_type == enum_class_MatchRange) {
+        return match_range(interp, matchobj, val);
+    } else {
+        return VTABLE_get_string(interp, val);
+    }
+}
+
+static INTVAL fetch_integer(Interp* interp, PMC* matchobj, PMC* val)
+{
+    if (val->vtable->base_type == enum_class_MatchRange) {
+        STRING* valstr = match_range(interp, matchobj, val);
+        return string_to_int(interp, valstr);
+    } else {
+        return VTABLE_get_integer(interp, val);
+    }
+}
+
+pmclass Match extends PerlHash {
+
+/*
+
+=item C<get_string_keyed_str>
+
+=cut
+
+*/
+
+    STRING* get_string_keyed_str (STRING* key) {
+        PMC* value;
+        Hash* hash = (Hash*) PMC_struct_val(SELF);
+        HashBucket *b = hash_get_bucket(INTERP, hash, key);
+        if (b == NULL) {
+            /* XXX Warning: use of uninitialized value */
+            /* return VTABLE_get_string(INTERP, undef); */
+            return NULL;
+        }
+        return fetch_string(INTERP, SELF, (PMC*) b->value);
+    }
+
+/*
+
+=item C<get_string_keyed>
+
+Returns the string value for the element at C<*key>.
+
+=cut
+
+*/
+
+    STRING* get_string_keyed (PMC* key) {
+        PMC* valpmc;
+        STRING* keystr;
+        HashBucket *b;
+        Hash *hash = PMC_struct_val(SELF);
+        PMC* nextkey;
+
+        switch (PObj_get_FLAGS(key) & KEY_type_FLAGS) {
+        case KEY_integer_FLAG:
+            /* called from iterator with an integer idx in key */
+            /* BUG! This will iterate through the input string as
+             * well as all of the real values. */
+            if (hash->key_type == Hash_key_type_int) {
Re: [perl #31128] Infinite loop in key_string
Oh, and while I have my fingers crossed, I may as well throw in the original test patch as well. I'll let these messages go to hell together.

Urk! Except I used stupid filenames, and swapped the attachments. So this attachment is actually the patch. Need more sleep.

? src/py_func.str
Index: src/key.c
===================================================================
RCS file: /cvs/public/parrot/src/key.c,v
retrieving revision 1.51
diff -u -r1.51 key.c
--- src/key.c	8 Jul 2004 10:19:11 -	1.51
+++ src/key.c	17 Aug 2004 17:00:08 -
@@ -357,6 +357,10 @@
     case KEY_pmc_FLAG | KEY_register_FLAG:
         reg = interpreter->pmc_reg.registers[PMC_int_val(key)];
         return VTABLE_get_string(interpreter, reg);
+    case KEY_integer_FLAG:
+        return string_from_int(interpreter, PMC_int_val(key));
+    case KEY_integer_FLAG | KEY_register_FLAG:
+        return string_from_int(interpreter, interpreter->int_reg.registers[PMC_int_val(key)]);
     default:
     case KEY_pmc_FLAG:
         return VTABLE_get_string(interpreter, key);
[PATCH] Re: [perl #31128] Infinite loop in key_string
I don't know what's eating my mail, but evidently the attachment never made it out.

I tracked down this particular problem and fixed it for the actual case I was using, which was not a PerlHash at all but rather my own custom Match PMC for use in regexes. The attached patch resolves the exact symptom I was seeing, but actually doesn't fix the problem in either the PerlHash or the Match case, for different reasons. For PerlHash, P0["foo";3] seems to be interpreted as an iterator access? I hope there's some other way of indicating that. For my Match PMC, I needed to avoid the whole conversion to string anyway. Still, I won't commit this patch directly, because I have only recently delved into the latest incarnation of the keyed code, and it scares me.

Oh boy. What are the odds of this message actually making it out?

Index: t/pmc/perlhash.t
===================================================================
RCS file: /cvs/public/parrot/t/pmc/perlhash.t,v
retrieving revision 1.44
diff -r1.44 perlhash.t
22c22
< use Parrot::Test tests => 34;
---
> use Parrot::Test tests => 35;
693a694,709
> output_is(<< 'CODE', << 'OUTPUT', "Getting PMCs from string;int compound keys");
>     new P0, .PerlHash
>     new P1, .PerlHash
>     new P2, .PerlInt
>     set P2, 4
>     set P1[0], P2
>     set P0["a"], P1
>     set I0, P0["a";0]
>     print "Four is "
>     print I0
>     print "\n"
>     end
> CODE
> Four is 4
> OUTPUT
> 
Re: The new Perl 6 compiler pumpking
Sorry if this is a repeat, but I didn't get my own mail back, so I think I may have had sending problems.

On Aug-09, Patrick R. Michaud wrote:
> 
> Luke Palmer and I started work on the grammar engine this past week.
> It's a wee bit too early in the process for us to be making any
> promises about when people might be seeing releases and the like.
> But I think he and I are in agreement that we'd like to have a grammar
> engine substantially completed (at least to the level of being able
> to "bootstrap" a Perl 6 compiler) within the next 3-4 months.

I have one of those too, in languages/regex. The parser is crap, and the implementation of the translation has probably seen a few too many rounds of evolution, but I believe the output is the Right Way To Do It and is relatively easy to extend to support the rest of the Perl6 features. Okay, maybe it's only One Of The Right Ways To Do It. I'd be interested in knowing what your output looks like, and encourage you -- if it's close enough -- to steal my rewrite rules wholesale. (Actually, the parser used in the languages/perl6 directory is pretty nice; the one in languages/regex that I use for testing is crap.)

See languages/regex/docs/regex.pod for an introductory document on this style of rule implementation. I wrote it a while back, and have added at least one more fundamental concept to what I laid out there, but it should be more or less valid nonetheless.

I left off working on it long ago to concentrate on beefing up the Perl6 compiler enough to make it an interesting host language, but recently I've (locally) added match objects and a parse tree. I have rules, but I was holding off on grammars until someone braver and smarter than I had beaten on Parrot's object support a bit more. The various cuts are straightforward additions that I've already sketched out, and everything else is gravy.

Heh. And I'm really wishing I had picked Jako or one of the other languages/* as a sample host language...
[PATCH] Re: register allocation
On Aug-07, Leopold Toetsch wrote: > Sean O'Rourke <[EMAIL PROTECTED]> wrote: > > [EMAIL PROTECTED] (Leopold Toetsch) writes: > >> The interference_graph size is n_symbols * n_symbols * > >> sizeof(a_pointer). This might already be too much. > >> > >> 2) There is a note in the source code that the interference graph could > >> be done without the N x N graph array. Any hints welcome (Angel Faus!). > > > It looks like the way things are used in the code, you can use an > > adjacency list instead of the current adjacency matrix for the graph. > > Yeah. Or a bitmap. An adjacency list would definitely be much smaller, but I'd be concerned that it would slow down searches too much. I think the bitmap might be worth a try just to see how much the size matters. Since this is an n^2 issue, splitting out the four different register types could help -- except that I'd guess that most code with excessive register usage probably uses one type of register much more than the rest. Anyway, I've attached a patch that uses bitmaps instead of SymReg*'s, which should give a factor of 32 size reduction. I've only tested it by doing a 'make test' and verifying that the several dozen test failures are the same before and after (I don't think things are actually that broken; I think the make system is), but for all I know it's not even calling the code. That's what you get when I only have a two hour hacking window and I've never looked at the code before. > Or still better, create the interference graph per basic block. > Should be much smaller then. Huh? Is register allocation done wholly within basic blocks? I thought the point of the graph was to compute interference across basic blocks. I guess I should go and actually read the code. 
Index: imcc/reg_alloc.c
===================================================================
RCS file: /cvs/public/parrot/imcc/reg_alloc.c,v
retrieving revision 1.14
diff -u -r1.14 reg_alloc.c
--- imcc/reg_alloc.c	23 Apr 2004 14:09:33 -	1.14
+++ imcc/reg_alloc.c	7 Aug 2004 07:11:08 -
@@ -41,7 +41,7 @@
 static void compute_du_chain(IMC_Unit * unit);
 static void compute_one_du_chain(SymReg * r, IMC_Unit * unit);
 static int interferes(IMC_Unit *, SymReg * r0, SymReg * r1);
-static int map_colors(int x, SymReg ** graph, int colors[], int typ);
+static int map_colors(IMC_Unit *, int x, int * graph, int colors[], int typ);
 #ifdef DO_SIMPLIFY
 static int simplify (IMC_Unit *);
 #endif
@@ -58,12 +58,46 @@
 
 /* XXX FIXME: Globals: */
 static IMCStack nodeStack;
-static SymReg** interference_graph;
-/*
-static SymReg** reglist;
-*/
+static int* interference_graph;
 static int n_symbols;
 
+static int* ig_get_word(int i, int j, int N, int* graph, int* bit_ofs)
+{
+    int bit = i * N + j;
+    *bit_ofs = bit % sizeof(*graph);
+    return &graph[bit / sizeof(*graph)];
+}
+
+static void ig_set(int i, int j, int N, int* graph)
+{
+    int bit_ofs;
+    int* word = ig_get_word(i, j, N, graph, &bit_ofs);
+    *word |= (1 << bit_ofs);
+}
+
+static void ig_clear(int i, int j, int N, int* graph)
+{
+    int bit_ofs;
+    int* word = ig_get_word(i, j, N, graph, &bit_ofs);
+    *word &= ~(1 << bit_ofs);
+}
+
+static int ig_test(int i, int j, int N, int* graph)
+{
+    int bit_ofs;
+    int* word = ig_get_word(i, j, N, graph, &bit_ofs);
+    return *word & (1 << bit_ofs);
+}
+
+static int* ig_allocate(int N)
+{
+    // size is N*N bits, but we don't want to allocate a partial
+    // word, so round up to the nearest multiple of sizeof(int).
+    int need_bits = N * N;
+    int num_words = (need_bits + sizeof(int) - 1) / sizeof(int);
+    return (int*) calloc(num_words, sizeof(int));
+}
+
 /* imc_reg_alloc is the main loop of the allocation algorithm. It operates
  * on a single compilation unit at a time.
  */
@@ -446,6 +480,12 @@
 
 /* creates the interference graph between the variables.
  *
+ * data structure is a 2-d array 'interference_graph' where row/column
+ * indices represent the same index in the list of all symbols
+ * (unit->reglist) in the current compilation unit. The value in the
+ * 2-d array interference_graph[i][j] is the symbol unit->reglist[j]
+ * itself.
+ *
 * two variables interfere when they are alive at the
 * same time
 */
@@ -461,7 +501,7 @@
 
     /* Construct a graph N x N where N = number of symbolics.
      * This piece can be rewritten without the N x N array
      */
-    interference_graph = calloc(n_symbols * n_symbols, sizeof(SymReg*));
+    interference_graph = ig_allocate(n_symbols);
     if (interference_graph == NULL)
         fatal(1, "build_interference_graph","Out of mem\n");
     unit->interference_graph = interference_graph;
@@ -475,8 +515,8 @@
             if (!unit->reglist[y]->first_ins)
                 continue;
             if (interferes(unit, unit->reglist[x], unit->reglist[y])) {
-                interference_graph[x*n_symbols+y] = unit->reglist[y];
-                interference_graph[y*n_symbols+x] = unit->reglist[x];
+
Re: Regexp::Parser v0.02 on CPAN (and Perl 6 regex question)
On Jul-04, Jeff 'japhy' Pinyan wrote: > I want to make sure I haven't misunderstood anything. *What* purpose will > my module that will be able to parse Perl 6 regexes into a tree be? You > must be aware that I have no power Damian does not possess, and I cannot > translate *all* Perl 6 regexes to Perl 5 regexes. All I can promise is a > tree structure and limited (albeit correct) translation to Perl 5. In general, it could perhaps be used as a piece of the implementation of Perl6-style regexes for any Parrot-hosted language. Personally, I could see using it with the current prototype perl6 compiler to take over the parsing whenever a regex is seen. The resulting tree structure would then be translated into a languages/regex-style tree, and from there converted into PIR instructions. The translation step could perhaps be skipped if your parser uses some extensible factory-like pattern so that I could produce my preferred regex tree nodes directly -- or if I converted my regex compiler to use your tree nodes as their native representation. In order to get the parsing correct, however, I would need the ability to call back into my native perl6 parser when you encounter perl6 code during your parse -- and perhaps call you again within that code. I don't know if this is in the scope of what you were planning for your parser; now I'm wondering if you were intending to write something akin to Perl6::Rules in that it translates Perl6 rules into perl5-edible chunks, and all this business of reentrancy and external parsing callouts is not at all what you're interested in dealing with. If so, then I would still find a use for it in providing a better Perl6-style regex parser for languages/regex. It would be used mostly for testing, but eventually I hope to get around to plugging languages/regex into Parrot as a directly-callable compiler. This has the same reentrancy etc. issues for the host language, but then they're that language's author's problem, not mine. 
:-) I'll go download Regexp::Parser now, just so I'm not speculating quite so much.
Re: Perl 6 regex parser
On Jun-27, Jeff 'japhy' Pinyan wrote: > On Jun 27, Steve Fink said: > > >On Jun-26, Jeff 'japhy' Pinyan wrote: > >> I am currently completing work on an extensible regex-specific parsing > >> module, Regexp::Parser. It should appear on CPAN by early July (hopefully > >> under my *new* CPAN ID "JAPHY"). > >> > >> Once it is completed, I will be starting work on writing a subclass that > >> matches Perl 6 regexes, Regexp::Perl6 (or Perl6::Regexp, or > >> Perl6::Regexp::Parser). I think this might be of some use to the Perl 6 > >> dev crew, but I'm not sure how. > > > >Sounds interesting, but I'm a bit confused about what it is. Clearly, > >it parses regexes, but what is the output? A parse tree? Tables and/or > >code that implement a matching engine for that regex? PIR? A training > >regimen that can be used to condition a monkey to push a "yes" or "no" > >button whenever you give it a banana with an input string inscribed on > >it? > > It creates a tree structure, not identical but similar to the array of > nodes Perl uses internally. Ah, good. Then I am interested. When I manage to find some time for hacking again, I'll graft it onto languages/regex as a replacement for the ridiculous parser I have there now. languages/regex is meant to be a language-independent regex engine, and has a silly stub parser to get basic stuff into it for testing. languages/perl6 uses the engine too, but provides its own parser. But nobody's done anything with that parser since Sean O'Rourke stopped working on it (admittedly, he implemented a surprisingly large portion of the syntax), and it'd be great to be working with something that's maintained. (But mostly I like the idea of using a language-independent front-end with a language-independent backend.) I should just look at the code, but I'm wondering what you do with language-specific constructs. Embedded code, for example. How do you find the end of it? 
And will you be supporting things like Perl6's / $x := (a*b) / where '$x' is a language-dependent variable name syntax?
Re: Perl 6 regex parser
On Jun-26, Jeff 'japhy' Pinyan wrote: > I am currently completing work on an extensible regex-specific parsing > module, Regexp::Parser. It should appear on CPAN by early July (hopefully > under my *new* CPAN ID "JAPHY"). > > Once it is completed, I will be starting work on writing a subclass that > matches Perl 6 regexes, Regexp::Perl6 (or Perl6::Regexp, or > Perl6::Regexp::Parser). I think this might be of some use to the Perl 6 > dev crew, but I'm not sure how. Sounds interesting, but I'm a bit confused about what it is. Clearly, it parses regexes, but what is the output? A parse tree? Tables and/or code that implement a matching engine for that regex? PIR? A training regimen that can be used to condition a monkey to push a "yes" or "no" button whenever you give it a banana with an input string inscribed on it? If it just parses the regex, then I would be interested in it for both languages/regex and languages/perl6. If it does more, then you're on your own, because it's been difficult enough to graft the current regex engine onto the languages/perl6 code; I have no problems with someone doing the same for a different engine, but I'm not going to be the one! Also, I find that regex stack overflows can sometimes trigger the monkey to begin wildly throwing feces, and I've no desire to experience that again.
Re: Simple trinary ops?
On Jun-16, Dan Sugalski wrote:
> At 8:24 PM +0200 6/16/04, Leopold Toetsch wrote:
> >Dan Sugalski <[EMAIL PROTECTED]> wrote:
> >> I'm wondering if it'd be useful enough to be worthwhile to have
> >> non-flowcontrol min/max ops. Something like:
> >
> >> min P1, P2, P3
> >> max P1, P2, P3
> >
> >Which cmp operation of the three we have? I smell opcode bloat.
> 
> Yeah, I've already given up on it. :)

    ### min P1, P2, P3 ###
    isgt I0, P2, P3
    choose P1, I0, P2, P3
Re: $ENV{ICU_DATA_DIR}
On May-31, Nicholas Clark wrote: > On Sat, May 29, 2004 at 11:03:12PM -0700, Steve Fink wrote: > > > +/* DEFAULT_ICU_DATA_DIR is configured at build time, or it may be > > + set through the $ICU_DATA_DIR environment variable. Need a way > > + to specify this via the command line as well. */ > > +data_dir = Parrot_getenv("ICU_DATA_DIR", &free_data_dir); > > Is ICU_DATA_DIR something the ICU folks define? Or something we define? > And if the latter, shouldn't it be called PARROT_ICU_DATA_DIR? Fair enough. The rename has been committed.
Q: MMD splice
The Perl6 compiler often appends one array onto another. Right now, it does this by iterating over the source array. I would like to just use the C<splice> op, but I am now getting a mixture of PerlArrays (from Perl6) and Arrays (from C), and the C<splice> vtable entry only works if the types of the arrays are identical.

This sounded to me like a perfect application of MMD: I could define splice(PerlArray,PerlArray), splice(PerlArray,Array), and splice(other, other). But my initial foray into the MMD code raised enough questions that I was hoping I could borrow a few clues from someone who's been paying more attention to things. Is this a reasonable thing to do with the current MMD setup? Where should I register my routines? The MMD stuff I looked at seemed intended only for binary C operators; splice needs to dispatch on only its two PMC entries, but has a function signature that takes a few other parameters (integer offset and count).
Re: $ENV{ICU_DATA_DIR}
On May-30, Leopold Toetsch wrote: > Steve Fink <[EMAIL PROTECTED]> wrote: > > > Anyone mind if I commit this? > > The patch is fine. > > > ... One thing I'm not sure of, though -- I > > try to behave myself and use Parrot_getenv rather than a plain > > getenv(), but I'm not convinced the API is complete -- Parrot_getenv > > saves back a boolean saying whether to free the returned string or > > not, but what should I call to free it? > > It's for Win32. config/gen/platform/win32/env.c uses > C, so it should be C. Thanks, patch applied. I went to add documentation on Parrot_getenv, but found that it was already there in platform_interface.h. Doh!
$ENV{ICU_DATA_DIR}
Anyone mind if I commit this? One thing I'm not sure of, though -- I try to behave myself and use Parrot_getenv rather than a plain getenv(), but I'm not convinced the API is complete -- Parrot_getenv saves back a boolean saying whether to free the returned string or not, but what should I call to free it? I could call Parrot_free_memalign, but who said anything about alignment? Perhaps Parrot_getenv should simply return the string, and we should have a platform-specific Parrot_freeenv that either frees the string or is a no-op?

Whatever. For now, I'm just calling free() if Parrot_getenv() tells me to. Which it probably never will.

Index: src/string.c
===================================================================
RCS file: /cvs/public/parrot/src/string.c,v
retrieving revision 1.202
diff -u -r1.202 string.c
--- src/string.c	25 May 2004 08:34:24 -	1.202
+++ src/string.c	30 May 2004 05:55:29 -
@@ -243,9 +243,18 @@
 string_init(Parrot_Interp interpreter)
 {
     size_t i;
-    /* DEFAULT_ICU_DATA_DIR is configured at build time. Need a way to
-       specify this at runtime as well. */
-    string_set_data_directory(DEFAULT_ICU_DATA_DIR);
+    char *data_dir;
+    int free_data_dir = 0;
+
+    /* DEFAULT_ICU_DATA_DIR is configured at build time, or it may be
+       set through the $ICU_DATA_DIR environment variable. Need a way
+       to specify this via the command line as well. */
+    data_dir = Parrot_getenv("ICU_DATA_DIR", &free_data_dir);
+    if (data_dir == NULL)
+        data_dir = DEFAULT_ICU_DATA_DIR;
+    string_set_data_directory(data_dir);
+    if (free_data_dir)
+        free(data_dir);
 
     /* encoding_init();
        chartype_init();
Re: First draft, IO & event design
On May-25, Dan Sugalski wrote: > At 10:31 AM +0200 5/25/04, Leopold Toetsch wrote: > >Dan Sugalski <[EMAIL PROTECTED]> wrote: > >> An unsolicited event, on the other hand, is one that parrot generates > >> as the result of something happening external to itself, or as the > >> result of some recurring event happening. Signals and GUI events, for > >> example, are unsolicted as are recurring timer events. > > > >I don't think that there is much difference between these two types of > >events. You don't get signals if you don't do the appropriate sigaction > >call. You ask the OS for an one-shot timer or for a recurring one, so > >you'll get one or more events. That's all known. > > The difference there is that a solicited event is one you have asked > for *and* received an event/request object for, so you can identify > the request/event as it makes its way through a stream. You can't do > that with the unsolicited ones, since you don't know they exist until > they've shown up. Perhaps that's a better thing to use to describe them, then. I understand the intuitive difference between expected/unexpected or solicited/unsolicited, but upon closer examination that particular difference gets really fuzzy. Perhaps registered/unregistered? You really want to say that you have a handle ("event"? "object"?) associated with your solicited event, but I don't know how to turn that into an adjective. Preallocated? Prepared? Identified? Labeled? Named? Tracked? Hey, that last one might work.
Re: compiler-faq
On May-29, Brent 'Dax' Royal-Gordon wrote: > William Coleda wrote: > >=head2 How do I generate a sub call with a variable-length parameter > >list in PIR? > > > >This is currently not trivial. > ... > >=head2 How do I retrieve the contents of a variable-length parameter > >list being passed to me? > > > >The easiest way to do this is to use the C opcode to take a > >variable > >number of PMC arguments and wrap them in an C PMC. > > I may just be an idiot, but why can't someone just write C > (or somesuch) as the complement of C? You mean .flatten_arg? It's not an op, it's a PIR directive, but it sounds like what the question is looking for. (And an op would make it faster.)
Re: PARROT_API, compiler and linker flags (was TODO: Linker magic step for configure)
On May-15, Jeff Clites wrote: > > When linking against ("using") a static library version of ICU, we need > a C++-aware linker (because ICU contains C++ code); with a > dynamic-library version of ICU presumably we wouldn't. I don't know if this applies here, but there is a good reason to use a C++-compatible linker even if you aren't including any C++ code. By default, many C linkers will not allow C++ exceptions to propagate through their stack frames. Unwinding the stack for an exception requires some additional information stored in the stack frames. I had to compile my own version of perl for my day job code, since we are writing in C++ and embedding the Perl interpreter, and if C++ calls perl calls C++ throws an exception, then I need the outer C++ try block to catch the exception.
Re: P6C: Parser Weirdness
Top-down and bottom-up are not mutually exclusive. At least not completely. But self-modifying parsers are *much* easier to do with top-down than bottom-up, because the whole point of bottom-up is that you can analyze the grammar at "compile" (parser generation) time, and propagate the knowledge throughout the rule engine. A simple example is /fish|job|petunia/. Rather than trying to match /fish/ and upon failure, trying /job/, and as a last resort /petunia/, you could do a dispatch table on the first letter and never have to fall back to anything. In the case of /fish|job|/, however, you can't guarantee that FIRST() will always be "p". You could generate both parsers and then use a notification or dependency system to pick which to use, but done naively you end up with an exponential number of parsers and the logic is likely to be both slow and a bug magnet. You could compile the parser assuming everything is final and then at runtime regenerate the entire parser if anything changes. Which wouldn't work for long, but perhaps you could break the grammar down into components that are individually compiled bottom-up, but coordinated top-down. Then you could limit the scope of recompiles in the bottom-up components, and not need recompiles in the top-down structure. The decisions of where to break things down would be quite similar to a regular compiler's inlining decisions. It may be that grammars are just too recursive for this to help much, though. I suspect a slight variant of the above may work best. Rather than doing a full-out LALR(1) parser for the bottom-up components, you'd do a somewhat more naive but still table-driven (shift/reduce) parser, carefully limiting what it is assuming about the FIRST() etc. of the rules within it. That should limit the impact of changes, and simplify the logic of what needs to be done differently when a change is detected. 
On May-11, Matt Fowles wrote: > > Perhaps Perl 6 grammars should provide an is parsed trait that > allows one to specify which type of parsing to use, then we could > dictate that the default behavior for parsing or perl itself is > shift reduce parsing rather than recursive descent. Optimization hints could also be very helpful, or we could even default to a total recursive-descent parser and only attempt bottom-up precomputation if the grammar author specifically says it's ok. The main problem being that people will say lots of things if it makes their code faster, without having any idea what it actually means.
Re: P6C: Parser Weirdness
On May-10, Joseph Ryan wrote: > > The Parse::RecDescent in parrot/lib is a hacked version that removes > a bunch of stuff (tracing code, iirc) from the outputted grammer so > that it runs many orders faster than the regular version. Or, to > put it another way, it increases P6C's runspeed from "infuriating" > to "slow" :) I think I've been told that at least once before, and forgotten. Is there a good place on the wiki FAQ where this could be inserted?
Re: [PATCH: P6C] update calling convention
On May-09, Allison Randal wrote: > > BTW, should I keep working on P6C? As A12 has just come out P6C may be > > heavily under construction, and I don't want to be in the way... > > Please do. I'm working on a first rough implementation of classes, but > it shouldn't interfere with general patches. I am very slowly working on a set of changes that both your patch and Allison's last patch interfere with -- but that's purely my problem, because I still don't have the time to finish it off enough to commit anytime soon. So please go ahead and work away; I'll deal with merging my own mess eventually. Is anyone other than the three of us currently working on P6C at all? Just curious.
Re: P6C: Parser Weirdness
On May-09, Abhijit A. Mahabal wrote: > On Sat, 8 May 2004, Abhijit A. Mahabal wrote: > > > I was writing a few tests for the P6 parser and ran into a weird problem. > > If I have the following in a file in languages/perl6, it works as > > expected: > > [...] > > > Now, if I put exactly the same contents in a file in > > languages/perl6/t/parser, then I get the following error: > > > Okay, I traced the problem to a "use FindBin" in P6C::Parser.pm. Is it > okay to change > > use lib $FindBin::Bin/../../lib; > > to > > use lib ../../lib; > > or is there a good reason not to? Neither of those seems right to me. The first keys off of the position of the binary, which could be anywhere with respect to the library module you're in; the second is relative to whatever the current directory is while you're running the script. I would think that something like use File::Basename qw(dirname); use lib dirname($INC{"P6C/Parser.pm"})."/../../../../lib"; (untested and probably not quite the right path) would be better. Or perhaps it should be ripped out entirely, and any script using P6C::Parser should be required to set the lib path correctly? It partly depends on whether we want to ensure that P6C::Parser preferentially uses the Parse::RecDescent from parrot/lib rather than a system-provided one. Which probably is not the case? > The current version makes it necessary > to put all files that "use P6C::Parser" in the same directory, and the > change would allow: > > perl t/parser/foo.t > > to work. Just make sure cd t; perl parser/foo.t works too.
Re: Q: status of IntList
On Apr-21, Leopold Toetsch wrote: > Is IntList used outside of some tests? > Can we rename it to IntvalArray? Yes, it is used in the languages/regex compiler (at least when embedded in Perl6, but IIRC in all cases.) But yes, go ahead and rename it.
Re: [CVS ci] cpu specfic config
On Apr-24, Leopold Toetsch wrote: > I've extended the config system by CPU specific functionality: > - new config step config/gen/cpu.pl is run after jit.pl > - this step probes for config/gen/cpu/$cpu/auto.pl and runs it if present > > For i386 we have: > - a new tool as2c.pl, which creates hopefully platform independent C > code from a gcc source file > - memcpy_mmx*.c > - and i386/auto.pl which runs this file as a test and sets config vars > > Next step will be to incorporated these files in platform code. And then we will have a natural choice for the next code name. Parrot will be the fastest bird in existence, but still won't quite fly, so let's call it a Cheeto (cheetah + dodo).
Re: ICU data file location issues
On Apr-14, Jeff Clites wrote:
> For Unix platforms at least, you should be able to do this:
> 
> executablePath = isAbsolute($0) ? dirname($0) : cwd().dirname($0)

Nope.

    sub executablePath {
        return dirname($0) if isAbsolute($0);
        return cwd().dirname($0) if hasSlash($0);
        foreach dir in $PATH {
            return $dir if -x "$dir/$0";
        }
        return "bastard process";
    }

which is why on Linux I give up on portability and say:

    return readlink("/proc/self/exe");

(ok, to match that'd need to be dirname(readlink(...)))
Re: [RESEND] [PATCH] Interpreter PMC
On Apr-09, Will Coleda wrote: > Subject: [perl #16414] [PATCH] Interpreter PMC > Created: 2004-04-09 02:59:29 > Content: There is now a ParrotInterpreter class which seems to provide > most of this functionality > - Is there anything you feel is still missing, or can we resolve the > call? Seems good enough. I resolved the ticket.
Re: Windows tinder builds
On Mar-26, Dan Sugalski wrote: > The VS/.NET build works fine, though three of the tests fail for odd > reasons. Those look like potential test harness errors. > > The cygwin build sorta kinda works OK, but the link fails because of > a missing _inet_pton. I seem to remember this cropping up in the past > and I thought we'd gotten it fixed, but apparently not. > > Anyway, these are set for hourly builds at half-hour offsets, so if > you check in any significant changes it'd be advisable to take a look > at the results. For those that don't know, all the tinderbox info is > web-accessable at > http://tinderbox.perl.org/tinderbox/bdshowbuild.cgi?tree=parrot A couple of weeks back, I also beefed up the error parsing stuff a bit on my Parrot tinderbox summarizer at http://foxglove.dnsalias.org/parrot/ so that it should give a good "what's broken and why" summary at a glance. Note that because I only run the summarizer once an hour at 17 minutes past the hour, it is not the right place to go if you're watching to see if your commit broke anything. For that, see the official URL above.
Re: Safety and security
On Mar-24, Dan Sugalski wrote: > At 12:36 PM +1100 3/24/04, [EMAIL PROTECTED] wrote: > >On 24/03/2004, at 6:38 AM, Dan Sugalski wrote: > > > >This is a question without a simple answer, but does Parrot provide > >an infrastructure so that it would be possible to have > >proof-carrying[1] Parrot bytecode? > > In the general sense, no. The presence of eval and the dynamic nature > of the languages we're looking at pretty much shoots down most of the > provable bytecode work. Unfortunately. ? I'm not sure if I understand why. (Though I should warn that I did not read the referenced paper; my concept of PCC comes from reading a single CMU paper on it a couple of years ago.) My understanding of PCC is that it freely allows any arbitrarily complex code to be run, as long as you provide a machine-interpretable (and valid) proof of its safety along with it. Clearly, eval'ing arbitrary strings cannot be proved to be safe, so no such proof can be provided (or if it is, it will discovered to be invalid.) But that just means that you have to avoid unprovable constructs in your PCC-boxed code. Eval'ing a specific string *might* be provably safe, which means that we should have a way for an external (untrusted) compiler to not only produce bytecode, but also proofs of the safety of that bytecode. We'd also need, of course, the trusted PCC-equipped bytecode loader to verify the proof before executing the bytecode. (And we'd need that anyway to load in and prove the initial bytecode anyway.) This would largely eliminate one of the main advantages of PCC, namely that the expensive construction of a proof need not be paid at runtime, only the relatively cheap proof verification. But if it is only used for small, easily proven eval's, then it could still make sense. The fun bit would be allowing the eval'ed code's proof to reference aspects of the main program's proof. But perhaps the PCC people have that worked out already? 
Let me pause a second to tighten the bungee cord attached to my desk -- all this handwaving, and I'm starting to lift off a little. The next step into crazy land could be allowing the proofs to express detailed properties of strings, such that they could prove that a particular string could not possibly compile down to unsafe bytecode. This would only be useful for very restricted languages, of course, and I'd rather floss my brain with diamond-encrusted piano wire than attempt to implement such a thing, but I think it still serves as a proof of concept that Parrot and PCC aren't totally at odds. Back to reality. I understand that many of Parrot's features would be difficult to prove, but I'm not sure it's fundamentally any more difficult than most OO languages. (I assume PCC allows you to punt on proofs to some degree by inserting explicit checks for unprovable properties, since then the guarded code can make use of those properties to prove its own safety.)
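The trust structure being described above can be sketched concretely. The following toy Python model is not real PCC (real certificates are formal safety proofs over the actual code, and all names here are invented for illustration); it only models the shape of the protocol: the untrusted compiler ships bytecode plus a certificate that is expensive to build but cheap to check, and the trusted loader verifies the certificate before executing anything.

```python
# Toy model of the proof-carrying-code loading protocol.
# The "proof" here is just a claimed list of opcodes used; real PCC
# certificates are formal safety proofs, but the division of labor is
# the same: expensive to construct, cheap to verify.

UNSAFE_OPS = {"syscall", "poke_memory"}

def compile_with_proof(source_ops):
    """Untrusted side: produce bytecode plus a safety certificate."""
    bytecode = list(source_ops)
    proof = sorted(set(bytecode))      # expensive analysis would go here
    return bytecode, proof

def load(bytecode, proof):
    """Trusted side: verify the certificate, then run."""
    if set(proof) != set(bytecode):    # proof must describe this bytecode
        raise ValueError("proof does not match bytecode")
    if UNSAFE_OPS & set(proof):        # and must establish safety
        raise ValueError("bytecode is provably unsafe")
    return list(bytecode)              # "execute" it

bytecode, proof = compile_with_proof(["push", "add", "print"])
print(load(bytecode, proof))           # → ['push', 'add', 'print']
try:
    load(["syscall"], ["syscall"])
except ValueError as e:
    print(e)                           # → bytecode is provably unsafe
```

The point of the sketch is the one made in the message: an eval'd string can participate as long as its (untrusted) compiler can also emit a certificate, with the loader's cheap verification standing between compilation and execution.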
Re: imcc concat and method syntax
On Mar-13, Luke Palmer wrote: > luka frelih writes: > > >But how should the two interpretations of x.x be resolved? Is that > > >concatenation or method calling? > > > > currently, the pir line > > S5 = S5 . 'foo' > > produces > > error:imcc:object isn't a PMC > > > > concatenation with . seems to be gone > > i cannot think of a good replacement for it > > Well, Perl 6 uses ~ . I think that would be a fair replacement: > > S5 = S5 ~ 'foo' That currently means binary xor in imcc, so if we changed it we'd break compatibility with current compilers and scripts. OTOH, it sounds like I already broke it by changing the outcome of the ambiguous x.x interpretation -- oops. I can change it back with precedence games, but would rather not exert the effort, since I think using ~ is a better way to go. (Barring other better ideas, that is.) I tend to use the 'concat' op in my own code anyway. So I'll abide by Leo's or Melvin's ruling.
Re: [BUG] can not call methods with "self"
On Mar-12, Leopold Toetsch wrote: > Steve Fink <[EMAIL PROTECTED]> wrote: > > The attached patch should remove all of the conflicts, and replace > > them with a single shift/reduce conflict that appears to be a bug in > > the actual grammar, namely: > > > x = x . x > > Ah yes. Of course. Thanks a lot, applied. But how should the two interpretations of x.x be resolved? Is that concatenation or method calling?
Re: Methods and IMCC
On Mar-12, Dan Sugalski wrote: > At 9:49 AM +0100 3/12/04, Leopold Toetsch wrote: > >Dan Sugalski wrote: > > > >>Calling a method: > >> > >> object.variable(pararms) > > > >Do we need the more explicit pcc_call syntax too: > > > > .pcc_begin > > .arg x > > .meth_call PObj, ("meth" | PMeth ) [, PReturnContinuation ] > > .result r > > .pcc_end > > Sure. Or we could make it: > >.pcc_begin >.arg x >.object y >.meth_call "foo" >.result r >.pcc_end > > to make things simpler. I vote yes -- until we add AST input to imcc, making the args and invocant be line-oriented makes code generation easier for the Perl6 compiler, at least. (Although I might do it the 1st way anyway, just because I spend so much time staring at generated code.) But I had to stare at the ".object" for a second before I realized you weren't just giving the type of another arg -- would it be better to use ".invocant"?
Re: [BUG] can not call methods with "self"
On Mar-11, Leopold Toetsch wrote: > Jens Rieks <[EMAIL PROTECTED]> wrote: > > > attached is a patch to t/pmc/object-meths.t that adds a test that is > > currently failing because IMCC rejects code like self."blah"() > > Yep. It produces reduce/reduce conflicts. Something's wrong with > precedence. I'd be glad if someone can fix it. The attached patch should remove all of the conflicts, and replace them with a single shift/reduce conflict that appears to be a bug in the actual grammar, namely: x = x . x can be parsed as x = x . x VAR '=' VAR '.' VAR target '=' var '.' var assignment or x = x . x VAR '=' VAR '.' VAR target '=' target ptr target target '=' the_sub target '=' sub_call assignment Personally, I'd probably also rename 'target' to 'lhs', and 'var' (and its variants) to 'rhs'. But maybe that's just me. Oh, and 'lhs' is available because this patch eliminates it. I didn't try the test mentioned, though. Index: imcc/imcc.y === RCS file: /cvs/public/parrot/imcc/imcc.y,v retrieving revision 1.125 diff -u -r1.125 imcc.y --- imcc/imcc.y 11 Mar 2004 16:37:56 - 1.125 +++ imcc/imcc.y 12 Mar 2004 08:33:49 - @@ -272,7 +272,7 @@ %type key keylist _keylist %type vars _vars var_or_i _var_or_i label_op %type pasmcode pasmline pasm_inst -%type pasm_args lhs +%type pasm_args %type targetlist arglist %token VAR %token LINECOMMENT @@ -784,7 +784,7 @@ { $$ = MK_I(interp, cur_unit, "bxor", 3, $1, $3, $5); } | target '=' var '[' keylist ']' { $$ = iINDEXFETCH(interp, cur_unit, $1, $3, $5); } - | var '[' keylist ']' '=' var + | target '[' keylist ']' '=' var { $$ = iINDEXSET(interp, cur_unit, $1, $3, $6); } | target '=' NEW classname COMMA var { $$ = iNEW(interp, cur_unit, $1, $4, $6, 1); } @@ -850,9 +850,9 @@ if ($1->set != 'P') fataly(1, sourcefile, line, "Sub isn't a PMC"); } - | lhs ptr IDENTIFIER { cur_obj = $1; $$ = mk_sub_address($3); } - | lhs ptr STRINGC{ cur_obj = $1; $$ = mk_const($3, 'S'); } - | lhs ptr target { cur_obj = $1; $$ = $3; } + | target ptr IDENTIFIER { 
cur_obj = $1; $$ = mk_sub_address($3); } + | target ptr STRINGC{ cur_obj = $1; $$ = mk_const($3, 'S'); } + | target ptr target { cur_obj = $1; $$ = $3; } ; ptr:POINTY { $$=0; } @@ -916,11 +916,6 @@ | reg ; -lhs: - VAR/* duplicated because of reduce conflict */ - | reg - ; - vars: /* empty */ { $$ = NULL; } | _vars { $$ = $1; } @@ -933,7 +928,7 @@ _var_or_i: var_or_i { regs[nargs++] = $1; } - | lhs '[' keylist ']' + | target '[' keylist ']' { regs[nargs++] = $1; keyvec |= KEY_BIT(nargs); @@ -952,8 +947,7 @@ ; var: - VAR - | reg + target | const ;
CVS update warning
. . . P docs/pmc/subs.pod cvs server: internal error: unsupported substitution string -kCOPY U docs/resources/parrot.small.png U docs/resources/perl-styles.css cvs server: internal error: unsupported substitution string -kCOPY U docs/resources/up.gif . . . Should those perhaps be -kb or -ko? My version of CVS certainly doesn't know COPY, nor have I ever heard of it.
Re: Objects: Now or forever (well, for a while) hold your peace
On Feb-19, Dan Sugalski wrote: > At 7:30 PM -0500 2/18/04, Simon Glover wrote: > > One really pedantic comment: wouldn't it make sense to rename the > > fetchmethod op to fetchmeth, for consistency with callmeth, tailcallmeth > > etc? > > Good point. I'll change that, then. Do you really want to repeat C's infamous "creat" (mis)spelling? Admittedly, it is not very ambiguous in this case, because there are already so many letters, but still, two extra letters is a small price to pay...
Re: Re: RT Cleanup
Andrew Dougherty wrote: On Wed, 4 Feb 2004, Steve Fink wrote: On Feb-02, Andrew Dougherty wrote: [EMAIL PROTECTED] 19184 languages/perl6/t/rx/call test error 1 years Keep this one open. The tests still fail. How recently did you check? I committed a reimplementation of perl6 regexes about a week ago. The above test still failed, but only due to a parrot memory corruption bug, and I committed something else the next day that coincidentally sidestepped the bug on my machine. It's probably a different bug than #19184, but here's what I just got for cd languages/perl6 make test (This is for perl5.00503, Solaris 8/SPARC, Sun Workshop compiler) Try cd languages/perl6 ./perl6 --force-grammar -e 1 # don't worry if it fails make test Except I never do 'make test' because, as you noticed, it takes forever to run. Use ./perl6 --test instead. (Or, in this case, maybe just ./perl6 --test t/rx/*.t) The slowest part of the perl6 compiler is simply loading in the Parse::RecDescent parser. That line loads it in once and reuses it for all the tests. I think nobody ever changed 'make test' to use it because if one test kills the process, then all remaining tests fail too. But perhaps I should have make test print at the very end: Hey, that took forever, didn't it? Maybe you should try using ./perl6 --test instead, as documented in [I forget where, and can't look it up right now]. t/rx/basic..Read on closed filehandle at P6C/TestCompiler.pm line 71. Use of uninitialized value at ../../lib/Parrot/Test.pm line 87. # Failed test (t/rx/basic.t at line 7) # got: 'error:imcc:main: Error reading source file t/rx/basic_1.pasm. # ' # expected: 'ok 1 # ok 2 # ok 3 # ok 4 # ok 5 # ok 6 # ok 7 # ok 8 # ok 9 # ' Odd... I'll take a look tonight, thanks. Finally, is it just me, or do these tests take a long time for everyone? Today, it took 21 minutes to run the perl6 test suite. 
While I appreciate the value of a comprehensive test suite, I wonder if there might be some way to speed things up a bit (apart from buying a faster machine, of course!) Oh, and the perl6 test suite is *far* from comprehensive. It's just slow.
Re: RT Cleanup
On Feb-02, Andrew Dougherty wrote: > [EMAIL PROTECTED] 19184 languages/perl6/t/rx/call test error > 1 years > > Keep this one open. The tests still fail. How recently did you check? I committed a reimplementation of perl6 regexes about a week ago. The above test still failed, but only due to a parrot memory corruption bug, and I committed something else the next day that coincidentally sidestepped the bug on my machine. I would find it easy to believe that it is still happening on other machines. Could you give it a try and let me know? All perl6 tests should be passing right now.
IMC returning ints
I did a cvs update, and it looks like imcc doesn't properly return integers anymore from nonprototyped routines. Or maybe it never did, and the switchover from nonprototype being the default to prototyped is what triggered it (because I had to add some explicit non_prototyped declarations, although I suspect they are incorrect.) Test patch is attached, test case is: .pcc_sub _main $P0 = newsub _L_closure2 $I0 = 17 .pcc_begin non_prototyped .arg $I0 .pcc_call $P0 L_after_call7: .result $I1 .pcc_end after_call: print "returned " print $I1 print "\n" end .end .pcc_sub _L_closure2 non_prototyped .param int value .pcc_begin_return .return value .pcc_end_return .end ? imcc/tc Index: imcc/t/syn/pcc.t === RCS file: /cvs/public/parrot/imcc/t/syn/pcc.t,v retrieving revision 1.31 diff -p -u -b -r1.31 pcc.t --- imcc/t/syn/pcc.t20 Jan 2004 01:50:47 - 1.31 +++ imcc/t/syn/pcc.t20 Jan 2004 01:58:28 - @@ -1,6 +1,6 @@ #!perl use strict; -use TestCompiler tests => 34; +use TestCompiler tests => 36; ## # Parrot Calling Conventions @@ -96,6 +96,60 @@ CODE 10 20 30 +OUT + +output_is(<<'CODE', <<'OUT', "non-prototyped int return"); +.pcc_sub _main + $P0 = newsub _L_closure2 +$I0 = 17 + .pcc_begin non_prototyped + .arg $I0 + .pcc_call $P0 +L_after_call7: + .result $I1 + .pcc_end +after_call: +print "returned " +print $I1 +print "\n" +end +.end + +.pcc_sub _L_closure2 non_prototyped + .param int value +.pcc_begin_return +.return value +.pcc_end_return +.end +CODE +returned 17 +OUT + +output_is(<<'CODE', <<'OUT', "prototyped int return"); +.pcc_sub _main + $P0 = newsub _L_closure2 +$I0 = 17 + .pcc_begin prototyped + .arg $I0 + .pcc_call $P0 +L_after_call7: + .result $I1 + .pcc_end +after_call: +print "returned " +print $I1 +print "\n" +end +.end + +.pcc_sub _L_closure2 prototyped + .param int value +.pcc_begin_return +.return value +.pcc_end_return +.end +CODE +returned 17 OUT ##
Re: cvs commit: parrot/imcc/t/syn pcc.t
On Jan-15, Melvin Smith wrote: > At 11:20 AM 1/15/2004 +0100, Leopold Toetsch wrote: > >Melvin Smith <[EMAIL PROTECTED]> wrote: > >> > >> For some reason 1 test in pcc.t is failing (the nci call) > > > >Off by one error caused by: > > > >> -for (j = 0; j < 4; j++) { > > > >> +for (set = 0; set < REGSET_MAX; set++) { > > > >As most loops inside Parrot use the former scheme, I'd rather keep it > >then switching to an (it seems) error prone variant "set <= REGSET_MAX" > > I like my version because it is self-documenting. I think these are #defines, but for enums I always use the pattern: enum { CLR_BLUE, CLR_RED, CLR_VOMIT_GREEN, NUM_COLORS/* or CLR_COUNT, or CLR_ENTRIES, or ... */ }; for (i = 0; i < NUM_COLORS; i++) ... So how about a REGSET_SIZE?
Memory corruption
Here's another obnoxious test case. I started to try to strip it down, but it starts working again if I even delete nonsense lines from a subroutine that is never called. And I'm working on something else and not at all in the mood to re-learn how to debug parrot internals. It turns out that I don't get the crash when running JITted, so I think I'll just do that for now. So, in case anyone is curious (hi leo!), attached is a 56KB (<9KB gzipped) imc file. It crashes on a memcpy inside compact_pool (triggered by new_hash). b->buflen is obviously corrupted. Using -G to disable garbage collection (does that work?) doesn't seem to help matters at all. Deleting the __setup sub at the end of the file makes the problem go away. (Note that __setup is never actually called, and the body of the routine is irrelevant other than its length.) dead.imc.gz Description: GNU Zip compressed data
Re: [RFC] IMCC pending changes request for comments
On Dec-02, Melvin Smith wrote: > > 1) Currently typenames are not checked except with 'new ' I would vote for no aliases at all. I propagated the existing uses of ".local object" in the Perl6 compiler and introduced several more uses, but that was only because I wasn't sure at the time whether it was intended to (now or in the future) have slightly different semantics. It wasn't, I'm pretty sure. So I'll switch Perl to using 'pmc' if you make the change. > 2) In the absence of some sort of return instruction, subs > currently just > run off the end of their code and continue merrily. This > feature really > isn't useful as far as I can see because it is not supported to > assume > any ordering between compilation unit, which a sub _is_. > > It is easier to just assume a sub returns by the active > convention. > > I'd like to just be able to write void subs as: > > .sub _foo >print "hello\n" > .end Do you really want to generate the extra unused code at the end of all the subroutines? I don't think you want to autodetect whether the code is guaranteed to return. How about adding a return value specification if you want this generated automatically: .sub _gorble () returns (int r) r = 5 .end .sub _foo () returns void print "hello\n" .end (This assumes you're creating implicit locals for return values as well as parameters, as you described in #3.) > 3) Strict prototyping mode shortcut (backwards compatible of > course): > As usual, shortcuts are for hand-hackers, but its easier to > debug either way. > >.sub _baz (pmc p, int i) > ... >.end > >Same as: > >.sub _baz protoyped > .param pmc p > .param int i > ... >.end Sounds good to me; debugging the Perl6 sub calling stuff would have been easier if I didn't have to read so much code to figure out what was going on.
Re: Another minor task for the interested
On Nov-21, Dan Sugalski wrote: > > I was mostly thinking that some step or other in the Makefile has a > dependency on that file, and some other step creates it, but the > dependency's not explicit. I'd like to find the step(s) that require it > and make it a dependency for them, then add in a dependency for the file > for whatever step actually creates it, so things get ordered properly. It > should (hopefully) be straightforward, but... I have other evidence of dependency entanglement -- fairly often, I do a 'make; make test' (which I think is equivalent to 'make test' both theoretically and practically), and I'll have a bunch of tests fail. Doing a 'make clean; make test' fixes the failures. (Ok, sometimes it requires a re-Configure.pl too, but that's another issue.) There is a known dependency gap due to the recursive invocation of classes/Makefile, but I don't think that is causing either of these problems. Random idea for this problem and the zillions of similar problems people face all the time with make: it would be cool to patch ccache so that it reports its cache misses. And just to be anal, add a flag saying 'for this run, do not evict things from the cache to restrict space usage.' Then, when you do a 'make' that doesn't remake enough, you could do make clean export CC="ccache --report-misses=/tmp/misses.txt --no-evictions gcc" make and you could look at the first miss to see an example of something that needed to be rebuilt, but did not have a dependency triggering it. (I can see at least one way in which this scheme is still not guaranteed to be correct -- there's a reasonable chance you would hit in the cache from an unrelated compile, and thus fail to see the first missing dependency.)
Re: Do my debugging for me? (memory corruption)
On Nov-21, Leopold Toetsch wrote: > Steve Fink <[EMAIL PROTECTED]> wrote: > > I'm staring at a crash > > > I'll attach the 5KB compressed .imc file (25KB uncompressed; PIR code > > Its really good, to have such short code snippets, that clearly show, > where a bug is coming from ;) Anyway, it was again me causing this bug - > sorry. > > Fixed and updated the comment which I didn't understand when removing > it. You're awesome. Thank you. I didn't boil that down to a small test case because I felt that it was necessary to preserve the full context, so that people could adequately appreciate the entirety of the problem with a complete historical and cultural perspective. A stripped-down test case would conceal the intention behind the code and forever prevent future historians from... Yeah, okay. It was late and I was lazy. Thanks again.
Do my debugging for me? (memory corruption)
I'm staring at a crash, my eyes are glazing over, and I need sleep. So I was wondering if anyone would be interested in taking a look at a .imc file that is giving me a seg fault while marking a hash in a gc run triggered by a hash PMC allocation. Or at least tell me whether it's seg faulting on your machine too. I'll attach the 5KB compressed .imc file (25KB uncompressed; PIR code is redundant!) It's generated from the following Perl6 code, but you'd need my local changes in order to regenerate it: rule thing() { <[a-z]>+ } rule parens() { { print "entering parens\n"; } \( [ | | \s ]* \) { print "leaving parens\n"; } } sub main() { my $t = "()()(((blah blah () blah))(blah))"; my $t2 = $t _ ")"; print "ok 8\n" if $t2 !~ /^ $/; } (the "entering/leaving parens" printouts have no effect on the bug; they're just remnants of my flailing.) If you run with --gc-debug, it dies a little earlier, but in what appears to be the same op. Hopefully, Steve hash_bug.imc.gz Description: GNU Zip compressed data
Re: Calling conventions. Again
I'm getting a little confused about what we're arguing about. I will take a stab at describing the playing field, so people can correct me where I'm wrong: Nonprototyped functions: these are simpler. The only point of contention here is whether args should be passed in P5..P15, overflowing into P3; or just everything in P3. Dan has stated at least once that he much prefers the P5..P15, and there hasn't been much disagreement, so I'll assume that that's the way it'll be. Prototyped functions: there are a range of possibilities. 1. Everything gets PMC-ized and passed in P3. (Oops, I wasn't going to mention this. I did because Joe Wilson seemed to be proposing this.) No arg counts. 2. Everything gets PMC-ized and passed in P5..P15+P3. Ix is an arg count for the number of args passed in P5..P15. P3 is empty if argcount <= 11 (so you have to completely fill P5..P15 before putting stuff in P3.) 3. Same as above, but you can start overflowing into P3 whenever you want. Mentioned for completeness. Not gonna happen. In fact, anything above this point ain't gonna happen. 4. PMCs get passed in P5..P15+P3, ints get passed in I5..I15+P3, etc. Ix is a total argument count (number of non-overflowed PMC args + number of non-overflowed int args + ...). Arguments are always ordered, so it is unambiguous which ones were omitted in a varargs situation. I think this is what Leo is arguing for. 5. PMCs get passed in P5..P15+P3, ints get passed in I5..I15+P3, etc. Ix is the number of non-overflowed PMC args, Iy is the number of non-overflowed int args, etc. I think this is what Dan is arguing for. 6. PMCs get passed in Px..P15+P3, ints get passed in I5..I15+P4, etc. Ix is the number of non-overflowed PMC args, Iy is the number of non-overflowed int args, etc. I made this one up; see below. 
Given that all different types of arguments get overflowed into the same array (P3) in #4 and #5, #4 makes some sense -- if you want to separate out the types, then perhaps it should be done consistently with both argument counts _and_ overflow arrays. That's what #6 would be. Note that it burns a lot of PMC registers. The other question is how much high-level argument passing stuff (eg, default values) should be crammed in. The argument against is that it will bloat the interface and slow down calling. The argument for is that it increases the amount of shared semantics between Parrot-hosted languages. An example of how default values could be wedged in is to say that any PMC parameter can be passed a Null PMC, which is the signal to use the default value (which would need to be computed in the callee, remember), or die loudly if the parameter is required. Supporting optional integer, numeric, or string parameters would be trickier. Or disallowed. Hopefully I got all that right.
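As a sanity check on scheme #5 above, here is a toy Python model of the register assignment it implies (register numbering and the single shared overflow array P3 are as described in the message; the code itself, and treating counts as a simple per-type dict, are my own illustrative choices):

```python
# Toy model of scheme #5: each type gets its own register file
# (P5..P15, I5..I15, S5..S15, N5..N15), with a separate count per
# type, and overflow args of every type spilling (PMC-ized) into the
# single array P3.

def assign_registers(args):
    """args: list of (type, value) pairs, type in {'P', 'I', 'S', 'N'}."""
    regs = {}
    counts = {t: 0 for t in "PISN"}    # non-overflowed count per type
    overflow = []                      # this models P3
    next_reg = {t: 5 for t in "PISN"}  # first argument register is x5
    for typ, val in args:
        if next_reg[typ] <= 15:
            regs["%s%d" % (typ, next_reg[typ])] = val
            next_reg[typ] += 1
            counts[typ] += 1
        else:
            overflow.append(val)       # would be PMC-ized into P3
    return regs, counts, overflow

regs, counts, overflow = assign_registers(
    [("P", "pmc0"), ("I", 42), ("I", 7), ("S", "hi")])
print(regs)     # {'P5': 'pmc0', 'I5': 42, 'I6': 7, 'S5': 'hi'}
print(counts)   # {'P': 1, 'I': 2, 'S': 1, 'N': 0}
```

Scheme #4 would collapse `counts` into a single total; scheme #6 would give each type its own overflow array instead of sharing P3.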
Re: [CVS ci] hash compare
On Nov-12, Leopold Toetsch wrote: > I've committed a change that speeds up hash_compare considerably[1], > when comparing hashes with mixed e.g. ascii and utf8 encodings. I read the patch, and thought that we'll also have a lot of ($x eq $y) and ($x ne $y) statements that this won't accelerate -- couldn't this be done as another string vtable entry instead of being specific to hash_compare? It seems like you're able to do the optimizations when string_compare isn't purely because string_compare needs to test ordering, not just equality. Am I missing something (as usual), or would this be better done by adding a string_equal?
Re: #define version obj.version is a kind of generic thing
On Oct-27, Leopold Toetsch wrote: > Arthur Bergman <[EMAIL PROTECTED]> wrote: > > > include/parrot/pobj.h:# define version obj.version > > Sorry for that :) We can AFAIK toss the version part of a PObj. Its > almost unused and hardly needed. It could be renamed too inside parrot. I'm the guilty one who added the version field (though back then, it was added to a Buffer object, IIRC). I found it very helpful in debugging GC problems -- I would have a problem where memory was being stomped on, and trace it back to a prematurely freed and reused header. But in order to figure out what was going on, I needed to trace it back to the point of allocation of the header, but headers get reused so frequently that setting a conditional breakpoint on the address of the header was useless. So I added in the version number to provide something to set a breakpoint on. It is very possible that this is no longer useful; I haven't been working on stuff where I have needed it for long enough that the code has mutated significantly. Is it still useful? If not, then go ahead and rip it out.
Re: [BUG] IMCC looking in P3[0] for 1st arg
On Oct-26, Melvin Smith wrote: > At 06:25 PM 10/26/2003 -0800, Steve Fink wrote: > > .pcc_sub _main prototyped > > .pcc_begin_return > > .return 10 > > .return 20 > > .pcc_end_return > > .end > > It is still the same issue. This code explicitly mixes 2 call conventions. > _main is declared as prototyped so it will return 1 in I0 which signals that > it is returning its values in registers in prototyped convention. Your > call explicitly calls in non_prototyped mode which does not generate any > code to check the return convention since you are saying by your code > that you _know_ what the call convention is. Oops, I meant to leave the "prototyped" off of the _main sub. This behaves indistinguishably from declaring it prototyped; it seg faults if called non_prototyped. I believe it's supposed to work when called with either style. Sorry for the bad example. Likewise, if I declare the .pcc_sub to be non_prototyped (so that both the call and declaration are non_prototyped), I get the same error: .sub _main .local Sub myfunc myfunc = newsub _myfunc .pcc_begin non_prototyped .pcc_call myfunc ret: .result $I0 .result $I1 .pcc_end print "Returned " print $I0 print "," print $I1 print "\n" end .end .pcc_sub _myfunc non_prototyped .pcc_begin_return .return 10 .return 20 .pcc_end_return .end % ./perl6 -Rt mini.imc . . . PC=60; OP=38 (invoke_p); ARGS=(P1=RetContinuation=PMC(0x40c04998)) PC=21; OP=1003 (restoretop) PC=22; OP=801 (shift_i_p); ARGS=(I17=0, P3=NULL) Error: '/home/sfink/parrot/parrot -t -r mini.imc ' failed died with signal 11 (SIGSEGV) > However, I see your point. To be orthogonal would suggest that we > implement the same feature for .pcc_call that we do for the .pcc_sub > declaration. If you left off the calling convention to .pcc_call it > would generate code to check for either. Although this would really > bloat the code, it might be wise to support the feature for some > instances. No, sorry, my bad example obscured the issue. 
I was not asking for a .pcc_begin that works in either case; I just want it to be possible to call a subroutine without a prototype and have it successfully return values. True, I would also like to be able to call the same subroutine *with* a prototype in another call site, but that is already implemented. I don't think allowing people to leave off the {non_,}prototyped declaration from .pcc_begin provides anything but superficial syntactic orthogonality; a call site really ought to know whether it can see the prototype of what it's calling or not! (Well... except I sometimes call non_prototyped even when I know the prototype, because pdd03 calling convention prototypes don't handle everything in a Perl6 prototype. But that's irrelevant here.) My brain doesn't seem to be working all that well this weekend... I'll throw in one more thing just because I know a certain Mr. P. Cawley dearly loves people to pile unrelated things into a single thread: could there be a way to expose which continuation to invoke when returning from a routine? In a regex, I'd really like a rule to be invoked with a "success" continuation and a "fail, so backtrack" continuation. And possibly with some more extreme failure continuations for cuts and commits and things. But right now the return continuation in P1 is hidden inside the PCC mechanism. (I guess I could just manually overwrite P1, but that seems like it's working against imcc rather than with it.) I'm faking it for now by returning a boolean status code, but that doesn't really feel like the "right" solution.
Re: [BUG] IMCC looking in P3[0] for 1st arg
On Oct-26, Leopold Toetsch wrote: > Steve Fink <[EMAIL PROTECTED]> wrote: > > I am getting a seg fault when doing a very simple subroutine call with > > IMCC: > > > .sub _main > > newsub $P4, .Sub, _two_of > > $P6 = new PerlHash > > .pcc_begin prototyped > ^^ > > .pcc_sub _two_of non_prototyped >^^ > > You are stating explicitely conflicting call types. That can't work. > When you remove "non_prototyped" in the sub, its prepared to be called > either way and it works. That is working for me now for the parameter passing, but not for return values. The below code seg faults because it is attempting to pry return values out of P3; it works if I switch the line .pcc_begin non_prototyped to .pcc_begin prototyped I'm not sure if this is implemented yet, though. Code follows: .sub __main .local Sub main_sub main_sub = newsub _main .pcc_begin non_prototyped .pcc_call main_sub ret: .result $I0 .result $I1 .pcc_end print "Returned " print $I0 print "," print $I1 print "\n" end .end .pcc_sub _main prototyped .pcc_begin_return .return 10 .return 20 .pcc_end_return .end
Re: [BUG] IMCC looking in P3[0] for 1st arg
On Oct-26, Leopold Toetsch wrote: > Steve Fink <[EMAIL PROTECTED]> wrote: > > I am getting a seg fault when doing a very simple subroutine call with > > IMCC: > > > .sub _main > > newsub $P4, .Sub, _two_of > > $P6 = new PerlHash > > .pcc_begin prototyped > ^^ > > .pcc_sub _two_of non_prototyped >^^ > > You are stating explicitely conflicting call types. That can't work. > When you remove "non_prototyped" in the sub, its prepared to be called > either way and it works. Doh! Thanks, I definitely should have noticed that. Although this does bring up another issue -- should parrot really be seg faulting when it gets an uninitialized (null) PMC? It happens to me quite often. In a way, the current behavior is rather nice, since the errors tend to be more obvious. Then again, that's only because all my test programs die early enough that no incorrect but non-null values have snuck into my registers yet. Also, I wouldn't expect a VM to fall flat on its face from something like this.
[BUG] IMCC looking in P3[0] for 1st arg
I am getting a seg fault when doing a very simple subroutine call with IMCC: .sub _main newsub $P4, .Sub, _two_of $P6 = new PerlHash .pcc_begin prototyped .arg $P6 .arg 14 .pcc_call $P4 after: .pcc_end end .end .pcc_sub _two_of non_prototyped .param PerlHash Sunknown_named3 .param int mode .pcc_begin_return .pcc_end_return .end The problem is that IMCC is checking to see whether the 1st argument is of the correct type (PerlHash), but it looks for the argument in P3[0], when in fact it isn't an overflow arg and so is in P5. P3, in fact, is null and so parrot seg faults. Oddly, if I take away the int parameter (and corresponding argument), it does not crash. But this also seems to remove the typecheck entirely.
Re: [perl #24226] [PATCH] Bad casts in interpreter.c
On Oct-15, Adam Thomason wrote: > # New Ticket Created by "Adam Thomason" > # Please include the string: [perl #24226] > # in the subject line of all future correspondence about this issue. > # http://rt.perl.org/rt2/Ticket/Display.html?id=24226 > > > > IBM VisualAge C 6 complains about some data<->function pointer casts in > interpreter.c: Thanks, applied.
[COMMIT] perl6 sub calling
For those of you not on the CVS list, I just committed a rather large change to the perl6 compiler that implements a subset of the A6 subroutine signature rules. My implementation is rather ad-hoc, but it is a decent representation of my slowly evolving understanding of how this stuff's supposed to work. Eventually, I'd like to rewrite it to be more encapsulated, and make it plug into the parser better. Ooh, and it needs some runtime context stuff, but I don't think that really exists anywhere in perl6 right now. Even better, I'd love to see someone else rewrite it properly. See languages/perl6/t/compiler/call.t for several examples of usage. Briefly, it handles things like: sub f ($a, $b) { ... } sub g ($a, +$b) { ... } sub h ($a, [EMAIL PROTECTED]) { ... } f(1,2); f(a, b => 2); g(a => 1, b => 2); h(1, 2, 3, 4); h(1, 2, [EMAIL PROTECTED], 40); It pretends to handle optional parameters, but if you don't pass in a value to an optional parameter, it reuses whatever happened to be lying around in that register, and there's no way of telling whether the caller specified a value or not. I doubt anyone will actually use any of this for a while, so unless I get change requests I'm planning on leaving this in its current half-baked state for now, and going back to using it for a regular expression engine, which is why I went down this rabbit hole in the first place.
Re: [COMMIT] new IO op 'pioctl'
On Oct-11, Melvin Smith wrote: > At 09:19 AM 10/11/2003 -0700, Steve Fink wrote: > >On Oct-10, Melvin Smith wrote: > >> At 08:31 AM 10/10/2003 -0400, Dan Sugalski wrote: > >> > > >> >I think it's time to start thinking about it. (And I think we need a new > >> >name, but that's because I've always hated 'ioctl' :) > >> > >> :) > >> > >> I also considered iocmd, ioattr and ioset. > >> > >> IPop your favorite into the suggestion box... > > > >How about keyed access to the IO PMC? > > > > set I0, P0[.CMDGETBUFSIZE] > > set P0[.CMDSETBUFSIZE], I0 > > I like that. > > Actually it could look even simpler since we have separate setkeyed > and getkeyed support: > > set IO, P0[.BUFSIZE] > set P0[.BUFSIZE], 8192 Actually, looking at that suggests that perhaps this should be done through the setprop/getprop interface instead, since that seems like a closer semantic fit to what you're doing.
Re: [COMMIT] new IO op 'pioctl'
On Oct-10, Melvin Smith wrote:
> At 08:31 AM 10/10/2003 -0400, Dan Sugalski wrote:
> >
> >I think it's time to start thinking about it. (And I think we need a new
> >name, but that's because I've always hated 'ioctl' :)
>
> :)
>
> I also considered iocmd, ioattr and ioset.
>
> Pop your favorite into the suggestion box...

How about keyed access to the IO PMC?

    set I0, P0[.CMDGETBUFSIZE]
    set P0[.CMDSETBUFSIZE], I0
Re: More fun with argument passing
On Oct-05, Luke Palmer wrote:
> Steve Fink writes:
> > Ok, I'm back to argument passing. I'm starting a new thread because
> > I'm lazy and I have to scroll back too far in my mailer to see the old
> > arg passing thread. :-) And yes, most of this message should also be
> > on -languages.
>
> Which it now is. Although, there are some internals issues, too, so I
> wonder how we can do this. How about, when someone responds to either
> an -internals- or a -language-specific question, he directs it only to
> the appropriate list.

And I guess cross-post one last time when moving a piece from one list to another?

> > I can use names to pass required arguments, but all positional args
> > must precede all named args. So then is this legal:
> >
> >     f(1, 2, b => 1.5)
> >
> > or must all of the positional args referred to by named parameters
> > follow those passed positionally? (There are two orders -- argument
> > order and parameter order. In which of those two must all positionals
> > precede the named params?)
>
> Both. (In parameter order, named-only must come after positionals) So
> f(1, 2, b => 1.5) was correct.

Huh? In argument order, clearly all the positionals precede the named. But if 2 is bound to $c, then they are out of parameter order. Or does that not bind 2 to $c? Are both 2 and 1.5 bound to $b (and resolved as below)?

> > In
> >
> >     sub j($a, ?$b, *%c) { ... }
> >
> > can I actually pass %c after the rest of the params? If I call it with
> >
> >     j(1, $blue => 'red')
> >
> > then does that compile down to
> >
> >     .param 1
> >     .param named_arg_table
> >
> > ? How is the callee supposed to know whether the 2nd param is $b or
> > %c? What if $blue happened to be "b"?
>
> If $blue was 'b', then j would get $b to be 'red'. Run-time positionals
> are another one of those things I don't expect to see all that often
> (but that might be a different story in my code >:-).

Sorry, that was an implementation question, not a language question. From the language level, clearly the effect you want is for 1 to be bound to $a and 'red' to be bound to either $b or %c{$blue}, depending on whether $blue eq "b". The question is in what order the parameters should be passed. Both work, but both cause problems.

> The real problem arises in:
>
>     j(1, 2, $blue => 'red')
>
> If $blue happens to be 'b'. I think the behavior then would be $b gets
> 2, and %h gets { b => 'red' }. In particular, I think it's wrong to
> croak with an error here.

Larry had some discussion of this:

    However, it is erroneous to simultaneously bind a parameter both by
    position and by name. Perl may (but is not required to) give you a
    warning or error about this. If the problem is ignored, the positional
    parameter takes precedence, since the name collision might have come
    in by accident as a result of passing extra arguments intended for a
    different routine. Problems like this can arise when passing optional
    arguments to all the base classes of the current class, for instance.

It's not yet clear how fail-soft we should be here.

Oh, and in discussing this, I'm wondering about one bit of vocabulary: do you bind an argument to a parameter, a parameter to an argument, or do you bind an argument and parameter together? E6 binds arguments to parameters. What if you are binding multiple arguments to a single parameter, as is the case with slurpy params? It doesn't matter, really, but in my documentation and discussion I'd like to be consistent, just because this stuff is already a ways beyond my mental capacity and anything that simplifies things is greatly appreciated!
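The "simultaneously bind a parameter both by position and by name" collision that Larry describes can be seen concretely in Python, which happens to pick the strictest of the policies under discussion: it croaks. A hedged sketch, with `j` only loosely mirroring the `sub j($a, ?$b, *%c)` signature from the thread:

```python
# Python analogue of j($a, ?$b, *%c): b is optional, c slurps extra
# named args. Illustration only, not the Perl6 semantics being debated.

def j(a, b=None, **c):
    return (a, b, c)

print(j(1, blue="red"))   # 'blue' isn't a parameter name: lands in c
print(j(1, b="red"))      # 'b' IS a parameter name: binds to b instead

try:
    j(1, 2, b="red")      # 2 (positional) and 'red' (named) both want b
except TypeError as e:
    print("collision:", e)   # Python's policy: refuse outright
```

Perl6's proposed policy is softer (positional wins, warning optional), but the failing call is exactly the `j(1, 2, $blue => 'red')` case with `$blue eq "b"`.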
More fun with argument passing
Ok, I'm back to argument passing. I'm starting a new thread because I'm lazy and I have to scroll back too far in my mailer to see the old arg passing thread. :-) And yes, most of this message should also be on -languages.

Could somebody tell me where I go wrong: If you have a prototype

    sub f ($a, $b, $c) { ... }

then you should pass $a in P5, $b in P6, etc. So the code will look like:

    .param $a
    .param $b
    .param $c

If you declare a sub without a prototype, it should default to ([EMAIL PROTECTED]). A slurpy array parameter puts its corresponding arguments in a list context, which is the same as a flattening context. This is stated in E6 and S6, though not in A6 as I read it (but it doesn't disagree either.) Let's add a prototype-less sub for use in discussion:

    sub g { ... }

Is there any way to create a prototype that, when called with any number of array variables, would pass all of the arrays as objects? So, for example, f(@x, @y, @z) would do the right thing for exactly three arrays, but couldn't handle f(@x,@y,@z,@w). g(@x, @y, @z) seems to flatten all of them together. I'm sure something like g([EMAIL PROTECTED],[EMAIL PROTECTED],[EMAIL PROTECTED]) would work, but what if I want to do the call without backslashes?

The calls f(1, 2, 3) and g(1, 2, 3) should both generate

    .arg 1
    .arg 2
    .arg 3

...except instead of constant ints, you'd probably need PMCs.

Splatted array params are aliases to the actual args. So

    sub h ([EMAIL PROTECTED]) { @params[1] = 'tuna!'; }
    h($x, $y, $z);

should set $y to 'tuna!'. Would h(@x) set @x[1] to 'tuna!'? If so, then does h(@x, $y) change $y's value depending on the number of elements in @x? It seems that @params is either a magical array where lookups trigger a runtime computation of where that index would be found in the original argument list, or it is an array of references to either variables or pairs, and all of those references are built at runtime immediately when the call is made. (Which rather defeats the default laziness of arrays.) Actually, "proxies" might be a more accurate term. You should be able to pass @params into another function just like any other array, or do

    $gloof = (rand(100) < 30) ? @params : @normal_array;

Or maybe h(@x) does NOT set @x[1] to 'tuna!'?

Ok, the whole aliasing thing was something of a tangent. Back to f() and h(), which are really f($,$,$) and h(*@). I can use names to pass required arguments, but all positional args must precede all named args. So then is this legal:

    f(1, 2, b => 1.5)

or must all of the positional args referred to by named parameters follow those passed positionally? (There are two orders -- argument order and parameter order. In which of those two must all positionals precede the named params?)

In

    sub j($a, ?$b, *%c) { ... }

can I actually pass %c after the rest of the params? If I call it with

    j(1, $blue => 'red')

then does that compile down to

    .param 1
    .param named_arg_table

? How is the callee supposed to know whether the 2nd param is $b or %c? What if $blue happened to be "b"?

If I do it the other way around,

    .param named_arg_table
    .param 1

then at least I can always assume the named args are passed first, and use the arg count to directly determine whether $b was passed or not. But then all Perl6 subroutines would have to take a named arg table as their first argument, by convention, and cross-language calls would need to be aware of this -- even when calling unprototyped. ("Oh, yeah, if you're calling a Perl6 routine you have to pass an empty hashtable as the first param.")

I have a first cut at Perl6 parameter passing. It doesn't do runtime context, the named params are in there but I assume they don't work, and it reflects my earlier misconception that a [EMAIL PROTECTED] array should NOT flatten its arguments. Or, in other words, I compile

    sub g { ... }
    g(10,20,30)

down to

    $P0[0] = 10
    $P0[1] = 20
    $P0[2] = 30
    .arg (empty hash)
    .arg $P0

and

    sub f ($a, $b, $c) { ... }
    f(10,20,30)

down to

    .arg (empty hash)
    .arg 10
    .arg 20
    .arg 30

which means that if you call f($a,$b,$c) without its prototype, then it compiles to the former code, which results in $a getting 3 (the length of the @_ array), while $b and $c get bad values that segfault parrot.

I'm hesitating to manually pull all of the [EMAIL PROTECTED] elements out of P5..P15 + P3[0..] until someone makes me a little more confident that it's the right thing to do. And even then, perhaps it should be handled by some kind of .flattened_param declaration. But neither would handle the magical aliasing I talked about above, if that is required. And it's also slowing down the callee in what could easily be a common case.

The patch for this is mixed in with some other stuff I've been working on, but the relevant part is more or less

    Addcontext.pm |  137 +-
    Builtins.pm   |  411 ++--
    Context.pm    |
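The arrays-as-objects versus flattening distinction in the message above has a direct Python analogue: a slurpy callee sees one flat argument list only when the caller explicitly splats, and otherwise receives the arrays as objects. A sketch, illustration only and not Parrot semantics:

```python
# Slurpy callee, analogous to an unprototyped sub collecting its args.

def g(*params):
    return params

x = [1, 2]
y = [3, 4]

print(g(x, y))      # arrays passed as objects: ([1, 2], [3, 4])
print(g(*x, *y))    # caller-side flattening:   (1, 2, 3, 4)
```

Python makes flattening the caller's explicit choice via `*`; the thread's open question is whether Perl6's list context should impose it implicitly, and if so, who (caller or callee) pays to reassemble the arrays.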
Re: IMCC parsing weirdness
On Sep-28, Steve Fink wrote:
>
> I've attached a diff to languages/imcc/t/syn/pcc.t, but I'm not sure
> if that's the right place for the test.

Oops. Except CVS is being very flaky right now, so the patch hadn't been written to the file before I sent it. Oh well.

I'm committing a fix for the bug, as well as resolving all shift/reduce conflicts via precedence. I'll also commit the test. I'll let someone else move it around if they want.

I didn't bother allowing line and filename comments in the middle of .param sections, although I suppose that might make sense if you're splitting your parameter declarations across multiple lines. Oh well.
IMCC parsing weirdness
I am getting strange behavior from IMCC if the first line after .pcc_sub is a comment. It seems to misinterpret things and ends up emitting a restore_p op that triggers a "No entries on UserStack!" exception. I've attached a diff to languages/imcc/t/syn/pcc.t, but I'm not sure if that's the right place for the test. I'm looking at fixing this now, but the grammar rules relating to this are a bit hairy. I've eliminated the existing shift/reduce conflict by assigning precedence to a dummy rule, but I'm still working on changing the grammar to accept stuff in the parameter list.
Re: Pondering argument passing
On Sep-28, Leopold Toetsch wrote:
> Steve Fink <[EMAIL PROTECTED]> wrote:
>
> > I'm not sure IMCC is going to be able to autogenerate the prototyped
> > and nonprototyped versions of the callee's entry code from the same
> > set of .param declarations.
>
> This is currently implemented as the "unprototyped" case. If a sub does
> expect to be called either prototyped or non-prototyped, the proto
> specifier of the .pcc_sub SHOULD be omitted. The entry code then looks
> at C and does either case.
>
> So we should specify what to do with wrong param counts or wrong
> types. pcc.t has some examples for this (labeled "unproto" or
> "exception").

I was arguing that this isn't enough. We need the set of parameters to really be different in the two cases, so we need two sets of ".param" statements, not just one.

> I think we also need the concept of a default value (for at least
> Pie-Thon's sake - does Perl6 also have default values?)
>
>     sub f($a, $b=1, $c="Default");
>     ...
>     &f(42);

Yes, this is what I was talking about in the big block comment in the sample code at the end of my last message. Perl6 does have them. I don't know whether Perl6 or any other language we want to be nice to has *non-constant* defaults. If so, and if we want direct support for them, then it means we need to evaluate them in the context of the callee. Which is the natural way to do it anyway, but having the caller fill in default values for a prototyped call could work around some of the issues with argument passing that would otherwise require more native support to handle. (I'd prefer the native support, whatever it might be.)

There's also the issue of detecting whether a parameter was passed or not, in order to decide whether to fill in the default value. (See my last message for more discussion of this.)
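One standard way to detect whether a parameter was passed, the problem raised just above, is a unique sentinel default with the real default evaluated in the callee. A Python sketch of the quoted `sub f($a, $b=1, $c="Default")` example; `_UNPASSED` is a made-up marker playing the role of the hypothetical UnpassedArg PMC floated earlier in the thread:

```python
# Sentinel technique for "was this argument actually passed?"
# _UNPASSED is an illustrative stand-in for an UnpassedArg-style marker.

_UNPASSED = object()   # unique object: no caller can accidentally pass it

def f(a, b=_UNPASSED, c=_UNPASSED):
    # Defaults are filled in by the callee, so they could just as well
    # be non-constant expressions evaluated in the callee's environment.
    if b is _UNPASSED:
        b = 1
    if c is _UNPASSED:
        c = "Default"
    return (a, b, c)

print(f(42))           # neither b nor c passed: defaults kick in
print(f(42, None))     # None was genuinely passed, and is NOT mistaken
                       # for "unpassed" -- the sentinel disambiguates
```

This is what caller-filled `undef` cannot do: a sentinel distinguishes "no argument" from "an argument that happens to be undef".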
Re: Pondering argument passing
On Sep-26, Leopold Toetsch wrote:
> Dan Sugalski <[EMAIL PROTECTED]> wrote:
>
> [ splatted function args ]
>
> > ... For this, I think we're
> > going to need a "setp Ix, Py" op which does indirect register addressing.
>
> Done.

Cool, thanks! With it, I've been able to get a bit farther. Found a minor bug where it multiply defines a label if you make two calls.

I'm not sure IMCC is going to be able to autogenerate the prototyped and nonprototyped versions of the callee's entry code from the same set of .param declarations.

    sub f3($a, $b, ?$c)

If that is converted to

    .param $a
    .param $b
    .param $c

then IMCC is going to get very unhappy when you call it without a prototype and only pass it two arguments. If it is converted to

    .param $a
    .param $b
    .param remaining_args
    .local $c
    if remaining_args > 0
        $c = shift remaining_args

then I'll need to pass in the extra remaining_args array in all calls, prototyped or non-. That's not a huge deal, and it fixes this problem. But you'll hit it again if you call &f3 unprototyped with

    &f3(3, b => 4);

Named arguments cause a lot more trouble than that, but I still don't want to go into that yet. But the above example should be enough to demonstrate that one set of .param declarations won't be enough unless we add more metadata to the declarations, and that feels like it might be too perl6-specific. How bad would it be to do:

    .pcc_sub _main
    .pcc_non_prototyped
        # .
        # .
        # .
        # code to place params into the registers that they would
        # be in if called with a prototype. I'm not sure if IMCC
        # can autogenerate any of this or not.
        # .
        # .
        # .
        # IMCC inserts a jump to the main body of the routine
    .pcc_prototyped
        .param _SV_a
        .param _SV_b
        .param _SV_c
        # This is a little strange. The caller knows the prototype,
        # so can pass in an undef if the 3rd argument wasn't given.
        # But the callee may have a different default value in
        # mind than undef, and if it's an expression then it
        # probably needs to be evaluated by the callee in its
        # local environment. We could create a new UnpassedArg PMC,
        # or we could add something to the calling conventions
        # so that you know how many arguments were actually
        # passed.
    .pcc_body
        # subroutine body
    .end

The non-prototyped section would want to use the same _SV_? variables, so perhaps the prototyped section should come first.
Re: CVS checkout hints for the bandwidth-limited
On Sep-26, Leopold Toetsch wrote:
> Filtering the output through a small
> script, that just does something like:
>
>     if ($_ !~ /^cvs server: Updating/) {
>         print $_;
>     }
>
> helps to unclutter update results.

cvs -q will suppress those lines for you.
Re: Pondering argument passing
On Sep-24, Leopold Toetsch wrote:
> No. But you are right. That's the code (/s\$I2/\$I1/) that ".args"
> should produce. Perhaps we should name the directive ".flatten_arg".

Yes, that makes its purpose more clear than calling it ".args".

> Is it supposed to do deep flattening? Do we need ".deeply_flatten_arg"
> too?

It should not deeply flatten, and I didn't see anywhere in A6 that indicated that we ever deeply flatten. There is a ** prefix operator, but it is only "more splattier" than * in that it is required to immediately evaluate its operand (eg, **1..Inf is supposed to do something bad). But I don't even want to think about lazy lists yet, anyway.
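The shallow-versus-deep distinction is quick to sketch: a `.flatten_arg`-style splat should peel exactly one level of nesting, while deep flattening would recurse all the way down. A Python illustration (not Parrot code; `flatten` and `deep_flatten` are made-up helper names):

```python
# Shallow flattening (one level) vs. deep flattening (recursive).
from itertools import chain

def flatten(args):
    # Peels exactly one level, like a single splat should.
    return list(chain.from_iterable(args))

def deep_flatten(args):
    # Recurses into nested lists -- the behavior A6 does NOT call for.
    out = []
    for a in args:
        if isinstance(a, list):
            out.extend(deep_flatten(a))
        else:
            out.append(a)
    return out

nested = [[1, [2, 3]], [4]]
print(flatten(nested))       # inner [2, 3] survives intact
print(deep_flatten(nested))  # everything reduced to scalars
```

The shallow version is the right model for the directive under discussion: the inner `[2, 3]` stays an array object rather than being dissolved into its elements.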
Re: Pondering argument passing
Ah, I reread one of your earlier posts. It appears that you are proposing to pass the arguments in a PerlArray. So flattening is possible. Then what I am saying is that

    sub f($a,$b) { ... }

is going to expect $a to be in P5 and $b to be in P6. In your scheme, $a would be in P5[0] and $b would be in P5[1]. While I personally am not fundamentally opposed to that idea, I believe it's not going to fly because

1. the whole point of using register parameter passing is to avoid exactly this.
2. the existing Perl6 builtin functions could not be given prototypes.
3. other Parrot-hosted languages would not interoperate -- they would need to treat all functions using this calling convention as single-argument functions that took an array.

Or are you saying that this is only used for non-prototyped calls? I believe this directly violates something Dan said. He expects an unprototyped call passing two scalars to pass those scalars in P5 and P6 and have no speed cost as compared to calling the same function with a prototype. Which makes sense, though one could certainly argue about the frequencies of various sorts of calls -- it might be enough to streamline prototyped functions involving no flattening, and not worry about non-prototyped calls, simple or not.