Re: archive search?
On Jan-11, Peter Christopher wrote:
> 
> Last I checked (which was moments ago) it was either (a)
> impossible to, or (b) I couldn't figure out how to, search the parrot
> mailing list archive. I.e. I can't search
> www.nntp.perl.org/group/perl.perl6.internals
> for key words. Could someone give me the heads up on how to search
> this list? If the search can't be done, is there a way that I can
> download the archive en masse, so that I can grep to my heart's
> delight?

Use Google Groups search. Specifically: go to groups.google.com. To
search for "aardvark", enter

    group:perl.perl6.internals aardvark
Re: Parrot & Strong typing
On Dec-01, Dan Sugalski wrote:
> 
> C, for example, is weakly typed. That is, while you tell the system
> that a variable is one thing or another (an int, or a float), you're
> perfectly welcome to treat it as another type. This is *especially*
> true of values you get to via pointers. For example, this snippet
> (and yes, it's a bit more explicit than it needs to be. Cope, you
> pedants :):
> 
>     char foo[4] = "abcd";
>     printf("%i", *(int *)&foo[0]);
> 
> tells the C compiler that foo is a 4 character string with the value
> "abcd", but in the next statement we get a pointer to the start of
> the string, tell the compiler "No, really, this is a pointer to an
> int. Really!" and then dereference it as if the string "abcd" really
> *was* an integer. If C were strongly typed you couldn't do that.

A wholly off-topic comment about how to make use of this: when running
gdb on a C (or better, STL-happy C++) program, it's nice to be able to
set conditional breakpoints:

    b myfile.c:328
    cond 1 somevar == 17

but gdb can often get confused or very slow if you try to do something
similar with a char* value:

    cond 1 strcmp(mystring, "badness") == 0

It goes crazy making a function call every time that breakpoint is
reached. I can get gdb to segfault this way without too much trouble.
So instead, use this trick:

    cond 1 *(int*)mystring == *(int*)"badness"

and it'll go back to doing a simple integer comparison. And it's very
fast about that. Note that because you're using a probably 32-bit
integer, that isn't really looking at the whole string; it'll have
exactly the same effect if you say

    cond 1 *(int*)mystring == *(int*)"badn"

I often use this in combination with std::basic_string types in C++,
since the templatized types and other implementation-dependent
weirdnesses end up making things much harder than if you were using
simple char*'s.
So it would look something like:

    cond 1 *(int*)mystring.data() == *(int*)"badn"

Or maybe it's slightly safer to do this, I dunno:

    cond 1 *(int*)mystring.c_str() == *(int*)"badn"

Sorry for the diversion. Um... if I had to say something on-topic, I'd
point out that Perl's type system isn't complete ("not THAT strong"),
since there are some corners in the language where you can sneak
around it: pack("p"), some system calls, and other things I can't
think of. But maybe the very oddness of those things is evidence that
Perl does indeed have a strong type system. ("Strong" in the
technical, not comparative, sense.) Not that anyone seems to be able
to agree on the exact definition of "strong typing".
Re: [perl #31208] dynclasses/README's instructions fail on OS X
On Nov-10, Will Coleda via RT wrote:
> This is now obsolete, neh?
> 
> > > I hack round this with
> > >
> > >     $ cp dynclasses/foo.dump .
> >
> > Alternatively, change line 609 of pmc2c2.pl to read
> >
> >     unshift @include, ".", "$FindBin::Bin/..", $FindBin::Bin;
> >
> > adding "." to the search path

I believe so. I never used either of those workarounds when I was
messing with that stuff, so at least on the configurations I was using
(i.e., your machine and mine), it wasn't needed.
Re: [perl #32137] stack walking failing to detect pointer in local variable on x86 Linux
This doesn't address the deeper problem, but we could also simplify
the whole function by just doing:

    static size_t find_common_mask(size_t val1, size_t val2)
    {
        size_t mask = ~0;
        size_t diff = val1 ^ val2;

        while (diff & mask)
            mask <<= 1;

        return mask;
    }

Bit twiddling is such fun. And error prone. So I won't commit this;
I'll just attach the patch.

Index: src/dod.c
===
RCS file: /cvs/public/parrot/src/dod.c,v
retrieving revision 1.138
diff -u -r1.138 dod.c
--- src/dod.c	26 Oct 2004 15:01:29 -	1.138
+++ src/dod.c	26 Oct 2004 19:11:48 -
@@ -921,27 +921,13 @@
 static size_t
 find_common_mask(size_t val1, size_t val2)
 {
-    int i;
-    int bound = sizeof(size_t) * 8;
+    size_t mask = ~0;
+    size_t diff = val1 ^ val2;
 
-    /* Shifting a value by its size (in bits) or larger is undefined behaviour.
-       so need an explict check to return 0 if there is no prefix, rather than
-       attempting to rely on (say) 0x << 32 being 0 */
-    for (i = 0; i < bound; i++) {
-        if (val1 == val2) {
-            return ~(size_t)0 << i;
-        }
-        val1 >>= 1;
-        val2 >>= 1;
-    }
-    if (val1 == val2) {
-        assert(i == bound);
-        return 0;
-    }
+    while (diff & mask)
+        mask <<= 1;
 
-    internal_exception(INTERP_ERROR,
-        "Unexpected condition in find_common_mask()!\n");
-    return 0;
+    return mask;
 }
 
 /*
Re: Cross-compiling Parrot
On Oct-17, Dan Sugalski wrote:
> At 9:49 AM -0400 10/17/04, Jacques Mony wrote:
> >Hello,
> >
> >I'm trying to port parrot to the unununium operating system, which
> >uses a modified version of 'diet lib c'. Can anyone tell me if this
> >is actually possible to force the use of this library using the
> >current Configure.pl script or if I will need to change it a lot...
> >or even replace it with my own?
> 
> There's a pretty good bet you're going to have to alter the configure
> script quite a bit, but it shouldn't require a full rewrite. Teaching
> it to read from a pre-done configuration data file would be a good
> place to start, which'd let us feed in the cross-compilation
> settings. (And we could then leverage for the upgrade settings too)

It's not exactly that, but you can set pretty much anything you want
in a config/init/hints/local.pl file.
Re: Python and Perl interop
Perl5 has the notion of contexts, where an expression may behave very differently in string, boolean, or list context. Perl6 intends to expand that notion. What if the whole context notion were moved down into Parrot? Every function call and every MMD dispatch could have an additional context parameter, and 'add' could then behave very differently in Pythonic context as opposed to Perl Boolean context? I guess I'm not actually saying anything here, other than suggesting the possibility of unifying this problem with Perl's context problem.
ICU causing make to fail w/o --prefix=`pwd`
If I just do

    perl Configure.pl
    make

right now, it builds the parrot executable ok but then fails when it
tries to compile the library .imc files. It's looking for the icu data
dir in $(prefix)/blib/lib/2.6.1. It works if I do

    perl Configure.pl --prefix=$(pwd)
    make

or set PARROT_ICU_DATA_DIR, but this seems like an unfriendly default
for developers. I have a similar problem with the search path for
loadable modules.

I think I probably broke this, btw, when I repaired 'make install'. I
had previously bandaged over the problem by defaulting ${prefix} to
the top-level directory. But I'm not sure how to fix it.
Re: dynamically loadable modules
On Oct-08, Andy Dougherty wrote:
> 
> Sorry -- offhand I don't have any sense of any "standard" names, and
> I won't have time till late next week to look at it at all. The most
> important thing is to *DOCUMENT CAREFULLY* exactly what the names are
> and what they mean.
> 
> Whatever names you add, please list them in config/init/data.pl along
> with a nice good long verbose description of exactly what they are.
> See the existing entries for cc, link, and ld for some good examples.
> See ld_shared and ld_shared_flags for some bad examples.

Well, my documentation isn't very verbose, but it's there. I suppose I
should expand it a little bit.

As for ld_shared and ld_shared_flags -- I really couldn't figure out
what the intention was, because the implications of the names didn't
match how they were actually being used. I guessed that ld_shared
would be the command used to link shared libraries, and ld would be
the command to link executables. But they weren't. So I deleted both
and replaced them with a single ld_share_flags. If we need a different
linker for shared libs or loadable modules on some platform, we can
reintroduce it when we encounter the problem.
Re: dynamically loadable modules
Ok, it's in. I did not add the 'cd dynclasses; make' to the default
target; I thought I'd see what regular builds I broke first. Testers
wanted, especially on platforms other than darwin and linux.
Re: dynamically loadable modules
On Oct-07, Dan Sugalski wrote:
> At 9:55 PM +0200 10/7/04, Leopold Toetsch wrote:
> >Steve Fink <[EMAIL PROTECTED]> wrote:
> >
> >> Clearly, I'm not very experienced with dealing with these things
> >> across platforms, so I was hoping somebody (Andy?) might have a
> >> better sense for what these things are called.
> >
> >AOL ;)
> 
> Heh. If Andy doesn't have a good name, let's call them shareable and
> loadable libraries, LD_SHARE_FLAGS and LD_LOAD_FLAGS for the flags,
> and SHARE_EXT and LOAD_EXT for the extensions. (All subject to
> wholesale pitching if there's a better name :)

Well, it's a bit longer than the $(SO) that we have everywhere now,
but it works fine for me. If I can disentangle my patch from some
other stuff that somehow crept in (I've no idea how; I'm using a
virgin tree for this), I'll commit it under those names for now.
dynamically loadable modules
I've been struggling with getting Darwin to build the dynclasses/
stuff, and I seem to have it working now. (Oh, and it fixes the Linux
build too.) It's a fairly large change, and I would like to use
standard naming conventions throughout, but I haven't really found any
convincing, definitive source of terminology.

The issue is that on some platforms, dynamically loadable modules and
shared libraries are different things, although you may know them
under different names (and one of those names is often used for the
other). A dynamically loadable module is something you explicitly load
after startup time, via dlopen() or some similar interface. A shared
library is implicitly loaded at startup time; you can see a list of
these under unix with ldd. Under Linux, they both end in a .so
extension and are built the same way. Under Darwin, shared libraries
end in .dylib and d-l-modules end in whatever you want them to. The
former is compiled with something like -dynamiclib and the latter with
-bundle (or something; I don't remember exactly).

So what I need is names for these. At the moment, I'm mostly using
$(SO) for shared lib extensions, $(DYNMOD) for d-l-modules. The build
flags I generally call $(LD_SHARED) or something with shared for
shared libs, and something like $(LD_DYNMOD_FLAGS) for d-l-modules.

Clearly, I'm not very experienced with dealing with these things
across platforms, so I was hoping somebody (Andy?) might have a better
sense for what these things are called.
Re: [perl #31849] [PATCH] Slightly quieter and more informative compilation
On Oct-06, Leopold Toetsch wrote:
> Andy Dougherty <[EMAIL PROTECTED]> wrote:
> 
> There are some changes e.g. when different CFLAGS settings are used,
> or for compiling classes. When there is a problem with compiling,
> just type another 'make' and you'll get again "Compiling with ...".

I think this may be gmake-specific, but in the past I have used

    ifeq (,$(VERBOSE))
    BREVITY=@
    else
    BREVITY=
    endif

    .c.o:
    	$(BREVITY)$(CC) $(CFLAGS)...

To make it make-version independent, we could do it as

    SUPPRESS=@

    .c.o:
    	$(SUPPRESS)$(CC) $(CFLAGS)...

and then run make as 'make SUPPRESS=' when we wanted to see the
messages.
Re: dynclasses build (was: Towards 0.1.1 - timetable)
On Oct-06, Leopold Toetsch wrote:
> William Coleda <[EMAIL PROTECTED]> wrote:
> > Any chance of getting:
> >
> >     'cd dynclasses; make'
> >
> > working on OS X by then?
> 
> It's broken on Linux too. The problem seems to be that non-existing
> shared libs are used for the final "perl build.pl copy" phase. These
> libs seem to be bundled into lib-*$(SO).

Whoops, I thought that was only after my local changes.

> For OS X, it could be that $LD_SHARED_FLAGS = "-fPIC" is necessary.

Nope, it's something more like "-bundle -undefined suppress". But the
real problem is that we don't yet distinguish between building shared
libraries and dynamically loadable modules. (Linux doesn't make this
distinction.) I'm working on it. Will gave me an account on his Darwin
box, and I have it building and installing. I'm still trying to track
down a problem where it doesn't seem to call the init function. (Or
something. I ran out of time when I got to that point.)
Re: Towards 0.1.1 - timetable
On Oct-05, Leopold Toetsch wrote:
> Wed  6.10. 18:00 GMT - feature freeze
> Sat  9.10.  8:00 GMT - code freeze - no checkins please
> 
> - Parrot 0.1.1 will go out on Saturday.
> - nice release name wanted

0.1.1  - Hydroparrot
0.1.2  - Helioparrot
0.1.3  - Parrolith
0.1.4  - Perylous
0.1.5  - Porn (um... Borot?)
0.1.6  - Carrot
0.1.7  - Nitroparrot
0.1.8  - Parrot Oxide
0.1.9  - Fluoridated Parrot
0.1.10 - Neon Parrot
0.1.11 - Podium
Re: [perl #31850] [PATCH] Remove obsolete files from MANIFEST.generated
On Oct-05, Andy Dougherty wrote:
> 
> This patch removes two files that are no longer generated from
> MANIFEST.generated.

Thanks, applied.
Re: [perl #31849] [PATCH] Slightly quieter and more informative compilation
On Oct-05, Andy Dougherty wrote:
> 
> The following patch makes compilation both slightly quieter and also
> slightly more informative.
> 
> Or, with less "spin", it fixes bad advice I gave previously.
> Specifically, I had previously noted that it's generally helpful if
> the Makefile prints out the commands it is trying to execute so that
> it's easier to track down problems when they fail. When I suggested
> that the '@' be removed from the .pbc.imc rule, I suggested at the
> same time that the '@' also be removed from the compilation step
> above it.
> 
> Alas, that wasn't quite the right place to do it. What's ultimately
> of interest is the command that actually gets called. Accordingly,
> this patch puts the '@' back in front of the invocation of
> tools/dev/cc_flags.pl, but changes the print statement in
> tools/dev/cc_flags.pl to show the actual compilation command being
> issued.
> 
> Again, I have found that information to be useful on numerous
> occasions. Also, considering how noisy the whole ICU build is, I
> think the extra clutter for parrot's sources is not a significant
> additional burden.

Makes sense to me. Applied, thanks.

Tell me again why Andy doesn't have commit privs yet?
Re: [perl #31807] make install not portable
On Oct-02, Nicholas Clark wrote:
> $ make install
> /home/nick/Install/bin/perl5.8.4 tools/dev/install_files.pl --buildprefix=
> --prefix=/home/nick/Install/parrot --exec-prefix=/home/nick/Install/parrot
> --bindir=/home/nick/Install/parrot/bin --libdir=/home/nick/Install/parrot/lib
> --includedir=/home/nick/Install/parrot/include MANIFEST MANIFEST.generated | sh
> 
> We have perl. Which is guaranteed to be on all platforms we build on.
> So why are we making a big list of commands and then feeding them to
> Unix shell? Which isn't going to be on all platforms that we build
> on.

Because long ago, when I implemented 'make install', I only did the
minimum necessary to get RPM building working. And RPMs don't work so
well on Windows, or so I hear.

The "generate a script and pipe it through sh" approach is something I
frequently use when cooking things up quickly, because you can develop
without the pipe, and then when everything looks good, add it on and
know what's going on. It's a nice way to get visibility into what
something's doing. And the script itself uses forward slashes
everywhere, so why bother pretending to be portable?

Not that any of this matters. I've committed fixes to hopefully make
it work portably. Of course, the MANIFEST.generated file is woefully
incomplete, so the files actually installed aren't terribly useful.
Re: Why lexical pads
On Sep-24, Aaron Sherman wrote:
> On Fri, 2004-09-24 at 10:03, KJ wrote:
> 
> > So, my question is, why would one need lexical pads anyway (why are
> > they there)?
> 
> They are there so that variables can be found by name in a lexically
> scoped way. One example, in Perl 5, of this need is:
> 
>     my $foo = 1;
>     return sub { $foo ++ };
> 
> Here, you keep this pad around for use by the anon sub (and anyone
> else who still has access to that lexical scope) to find and modify
> the same $foo every time. In this case it doesn't look like a
> "by-name" lookup, and once optimized, it probably won't be, but
> remember that you are allowed to say:
> 
>     perl -le 'sub x {my $foo = 1; return sub { ${"foo"}++ } }$x=x();print $x->(), $x->(), $x->()'
> 
> Which prints "012" because of the ability to find "foo" by name.

Umm, maybe I'm confused, but I'd say that your example prints "012"
because of the *inability* to find "foo" by name. If it could find
"foo" by name, it would be printing 123. Your snippet is actually
finding the global $main::foo, not the lexical $foo.

But I agree that it is doing a name lookup in the string eval case.
Although if you try it, you get puzzling results:

    perl -le 'sub x {my $foo = 1; return sub { eval q($foo++) } };$x=x();print $x->(), $x->(), $x->()'

prints 012 again. Which confused me, because Perl *can* do named
lookups of lexicals. The problem, apparently, is that it's doing the
lookup but not finding it. If you add in a nonsensical use of $foo to
make sure it sticks around to be found, it works:

    perl -le 'sub x {my $foo = 1; return sub { $foo; eval q($foo++) } };$x=x();print $x->(), $x->(), $x->()'

Now apparently the closure captures the lexical $foo, and thus the
eval is able to find it. On the other hand, your original example
still doesn't work, and I think that's because symbolic references do
not do pad lookups:

    perl -le 'sub x {my $foo = 1; return sub { $foo; ${"foo"}++ } }$x=x();print $x->(), $x->(), $x->()'

still prints 012. Yep.
From perlref:

    Only package variables (globals, even if localized) are visible to
    symbolic references. Lexical variables (declared with my()) aren't
    in a symbol table, and thus are invisible to this mechanism. For
    example:
Re: Compile op and building compilers
On Sep-20, Dan Sugalski wrote:
> 
> Now, the issue is how to actually build a compiler. Right now a
> compiler is a simple thing -- it's a method hanging off the __invoke
> vtable slot of the PMC. I'm not sure I like that, as it seems really,
> really hackish. Hacks are inevitable, of course, but it seems a bit
> early for that. (We ought to at least wait until we do a beta
> freeze...) On the other hand it does make a certain amount of sense
> -- it's a compilation subroutine we're getting, so we ought to invoke
> it, and I can certainly live with that.
> 
> Time to weigh in with opinions, questions, and whatnot. There's not
> much reason to JFDI and make the decisions final, so weigh away and
> we'll just nail it all down on wednesday.

My preference, as I've stated before, is to leave compilers as
invoke-able PMCs -- and further, I think that compilers will sometimes
be coroutines, or return multiple continuations, or play other such
tricks available via C (if appropriate for what they do). Which is
easy if you forget about compilation as being something special, but
instead just say it's invocable and thereby inherit all of the PIR
syntactic sugar for Subs.

On the other hand, that opinion assumes that compilers are used in the
funky ways that I am thinking of, which involve a lot of switching
between languages, using other languages' facilities for implementing
pieces of your language, etc. If Parrot is primarily going to be
mixing languages by having one language call another's libraries, then
I can see some utility in having a separate C op, even if its only
purpose is to explicitly declare that compilers must take only a
single string and produce a callable PMC, and no more. (Though I
wonder if you might sometimes want to use a filename rather than a
string-containing-enormous-chunk-of-code.) Screwballs like me would
then make our languages compiled via a different mechanism, and we
wouldn't play in the same sandbox as "regular" compilers.
However, then we'd need to decide whether those types of compilers
should be registered via compreg, or whether anything registered via
compreg is required to do something meaningful when invoked with a
single string argument containing code (or whatever C ends up doing;
that's just what it does now).

A question: when last we talked about this, you mentioned that you
didn't envision it being useful for compilers to take arguments. I
think you were only talking about configuration, but in any case, what
sorts of mechanisms do you feel are appropriate for setting options,
pointing to libraries or include paths, etc.?

Also, is Parrot supposed to provide a rich enough set of core
functionality that compilers will never need to communicate directly
with the "host" language? As a simple example, say you have an
embedded language that wants to add a new local variable. Parrot has
pads for this purpose, but what if you need to specify some sort of
rich type information or register it with some host language-specific
registry singleton of some sort? I don't know if these sorts of things
are useful, but they're easily within the scope of imagination. :-)
Re: [perl #31682] [BUG] Dynamic PMCS [Tcl]
On Sep-22, Will Coleda wrote:
> ld: /Users/coke/research/parrot/blib/lib/libparrot.dylib is input for
> the dynamic link editor, is not relocatable by the static link editor
> again
> compile foo.c failed (256)
> 
> As for the next error... huh?

Not surprising. What architecture and linker are you using? Does 'make
shared' at the toplevel work for you? If so, can you send the output
of it (so I can see the command it runs)? Or better yet, do you have
an example of a valid link line?

I don't have any remotely interesting systems to test on, so I don't
know how much help I can be, but I'll take a shot.
Re: Problems Re-Implementing Parrot Forth
On Sep-17, Matt Diephouse wrote:
> Having mentally absorbed the forth.pasm code, I thought I'd rewrite
> it in PIR and try to make it a full parrot compiler and, hopefully, a
> bit more approachable. I've already begun on this (see attached
> file). Unfortunately, I've run into a few issues/questions along the
> way.
> 
> In the following, the term "eval" refers to the use of the
> compreg/compile opcodes.
> 
> o In current builds evaling non PASM/PIR code segfaults (or gives a
> bus error on OS X) as a result of the last patch to op/core.ops (See
> #31573). Any plans here?

Leo made the change as a result of a discussion I kicked off, so it's
probably my fault. :-| But more to the point, I abandoned that
particular interface at the same time, and at least until the bug is
fixed, I recommend you do the same. Instead of using the compile op,
just call your PMC as a subroutine:

    $P0 = compreg "forth"
    $P1 = $P0("...forth code...")

Personally, I plan to stick to this interface unless C becomes useful,
but other people (eg, Dan) disagree that my usage is correct.

> o Evaling PIR code requires a named subroutine and the use of the
> 'end' opcode. Someone mentioned on IRC that this might not be the
> desired behavior. Is it?

Leo talked about adding an @ANON attribute to subs, but I don't
believe he has implemented it yet. For now, just use a fixed name
every time you compile; the symbol table entry will be overwritten
every time you compile, but the compilation should return the sub
object for you to keep track of, so you don't need to care about the
symbol table.

> o Calling subroutines from an eval creates a copy of the user stack,
> so all changes are lost (rendering my Forth code unusable). Is this
> behavior correct? If so, how should I go about this?

Dunno. Don't use an Eval? (Use a plain Sub or something instead)

> Any clarifications or statuses, as well as any comments on the PIR
> code, would be very much appreciated.
> I'm more or less stalled at this point until I get some answers/help.
> It would be nice to get this to the point where it can be used to
> test parrot (as well as serve as an example to anyone wanting to
> write a compiler).

I have finally gotten around to committing my example compiler in
languages/regex. Read the README for usage instructions. Or for a
painful account of my attempts to accomplish what you're doing, just
for a different language, see

    http://0xdeadbeef.net/wiki/wiki.pl?FinkBlog

parrot/examples/japh/japh15.pasm is a good source of example code too.
Re: [PATCH] dynamic pmc libraries
On Sep-09, Brent 'Dax' Royal-Gordon wrote:
> 
> Tiny nit: for consistency with other Configure source files, this
> should probably be named dynclasses_pl.in. No big deal, though.

Consistency is good, and you're the authority. Change committed.
Re: [perl #31493] Overlapping memory corruption
On Sep-09, Leopold Toetsch wrote:
> Steve Fink (via RT) wrote:
> 
> >I won't go through all the details of what I looked at (though I'll
> >post them in my blog eventually), but what's happening is that this
> >line (from perlhash.pmc's clone() implementation) is corrupting the
> >flags field:
> >
> >    ((Hash*)PMC_struct_val(dest))->container = dest;
> 
> Ah, yep. PMC_struct_val(dest) doesn't hold the hash yet, it is
> created in hash_clone() only after this line.
> 
> >The problem is that the dest PMC contains a Hash structure in its
> >struct_val field
> 
> No. That's the pointer of the free_list, pointing to the previous PMC
> in that size class.
> Putting above line after the hash_clone() fixes that bug.

Hey, your reason is much better than my reason. Still, why do the
_noinit stuff and duplicate the creation code? Why not just call
pmc_new as in my replacement code?
Re: Semantics for regexes - copy/snapshot
On Sep-09, [EMAIL PROTECTED] wrote:
> On Wed, 8 Sep 2004, Chip Salzenberg wrote:
> 
> > According to [EMAIL PROTECTED]:
> > > So how many stores do we expect for
> > >
> > >     ($a = "xxx") =~ s/a/b/g
> > >
> > > and which of the possible answers would be more useful?
> >
> > I think it depends on C<($a = "aaa") =~ s/a/b/g>.
> 
> I would agree with you in general, but since we're generally after
> speed, surely we want to allow for optimisations such as "don't store
> unless something's changed"; this would also be compatible with the
> boolean context value of s///.

I vote for leaving all of these sorts of cases undefined. Well,
partially defined -- I'd rather we didn't allow

    ($a = "aaa") =~ s/a/b/g

to turn $a into "gawrsh". At the very least, define the exact number
of outputs and stores for "strict aka slow mode", but have an optional
optimization flag that explicitly drops those guarantees. It would
allow for more flexibility in implementations.
Re: [PATCH] dynamic pmc libraries
On Sep-07, Leopold Toetsch wrote:
> Steve Fink <[EMAIL PROTECTED]> wrote:
> 
> > This patch introduces something that feels suspiciously like
> > libtool, despite the fact that libtool has never been very kind to
> > me. But for now I am targeting this only at the dynamic PMC
> > generation problem; this solution could be expanded to ease porting
> > of other parts of the build procedure, but I think other people are
> > already working on that.
> 
> Looks good.
> 
> > I am not committing this patch directly because I know that other
> > people are currently actively working on the dynamic PMC stuff and
> > the build system, and I didn't want to step on anyone's toes.
> 
> So please give it a try.

Ok, it's in. See dynclasses/README for brief usage instructions. I'll
probably be committing a couple of dynamic PMCs soon (when I can
figure out why they're complaining about not having a destroy()
defined, when they don't have the active_destroy flag set.)
Re: TODOish fix ops
On Sep-06, Jens Rieks wrote:
> Leopold Toetsch wrote:
> > So first:
> > - do we keep these opcodes?
> >   If yes some permutations are missing.
> > - if no, we should either not include experimental.ops in the
> >   default opcode set or move it to dynops.
> I have not used them yet, but I think that they can be useful.
> Has anyone else except Leo and Dan used them?

I use them for debugging printouts, when I want to print the status of
something without defining a bunch of labels and contorting the
control flow. I also use them for simple non-short-circuiting ors and
ands. Nothing terribly important or irreplaceable.
[PATCH] dynamic pmc libraries
Mattia Barbon recently implemented the capability to group multiple
dynamic PMCs into a single library. It took me a while, but the
correct way of using it finally percolated through my thick skull.

One remaining problem is that the build process is very
platform-dependent. This patch doesn't fix that, but it does eliminate
the gmake dependency. Another problem is that you have to specifically
write Makefile rules to build your group of dynamic PMCs into a
library, and that is very difficult to do portably.

This patch introduces something that feels suspiciously like libtool,
despite the fact that libtool has never been very kind to me. But for
now I am targeting this only at the dynamic PMC generation problem;
this solution could be expanded to ease porting of other parts of the
build procedure, but I think other people are already working on that.

The patch adds an additional target to config/gen/makefiles.pl:
instead of just converting config/gen/makefiles/dynclasses.in to
dynclasses/Makefile, it also converts
config/gen/makefiles/dynclasses.pl.in to dynclasses/build.pl, and
changes that Makefile to call build.pl to do all the real work. It is
thus able to pick up config/init/data.pl's notions of all of the
${cc}, ${ld}, etc. definitions, but leaves the description of which
PMCs to build with the original Makefile (which probably isn't the
greatest place, but I'm trying to change as little as possible.)

My guess is that this will not immediately cause dynamic PMCs to start
working on the platforms where they do not currently work, but it
should make it easier to get them to work.

It also implements a new pmclass attribute in .pmc files (only
meaningful for dynamic PMCs): C, which will get automatically picked
up by the new dynclasses/build.pl to generate a single shared library
out of all PMCs with the same group tag.
So to implement two new dynamic PMCs 'mylangPmc1' and 'mylangPmc2',
you would:

* Implement the .pmc files, and include 'group mylang' in their
  pmclass lines
* Add mylangPmc1 and mylangPmc2 to config/gen/makefiles/dynclasses.in
* Re-run Configure.pl

That is the same procedure as is currently used to implement
independent dynamic PMCs right now, except for the addition of the
'group mylang' tag.

I am not committing this patch directly because I know that other
people are currently actively working on the dynamic PMC stuff and the
build system, and I didn't want to step on anyone's toes.

Note that build.pl is NOT a general build tool, although it covers
everything needed for the dynclasses/ directory. At the moment, it
doesn't even bother to do dependency analysis for the grouped PMCs,
although it does for all of the rest.

Still, this patch gets stuff working that currently doesn't exist, and
doesn't break anything that currently works AFAIK. I fully expect (and
hope) that it will be replaced by something more general someday. But
I'd rather not wait for that day, having first-hand experience with
how much "fun" it is to get partial linking of dynamic libraries
working on multiple platforms.
Index: config/gen/makefiles.pl
===
RCS file: /cvs/public/parrot/config/gen/makefiles.pl,v
retrieving revision 1.34
diff -u -r1.34 makefiles.pl
--- config/gen/makefiles.pl	19 Jun 2004 09:33:09 -	1.34
+++ config/gen/makefiles.pl	5 Sep 2004 22:28:23 -
@@ -81,6 +81,8 @@
             commentType => '#', replace_slashes => 1);
     genfile('config/gen/makefiles/dynclasses.in', 'dynclasses/Makefile',
             commentType => '#', replace_slashes => 1);
+    genfile('config/gen/makefiles/dynclasses.pl.in', 'dynclasses/build.pl',
+            commentType => '#', replace_slashes => 0);
     genfile('config/gen/makefiles/dynoplibs.in', 'dynoplibs/Makefile',
             commentType => '#', replace_slashes => 1);
     genfile('config/gen/makefiles/parrot_compiler.in', 'languages/parrot_compiler/Makefile',
Index: classes/pmc2c2.pl
===
RCS file: /cvs/public/parrot/classes/pmc2c2.pl,v
retrieving revision 1.16
diff -u -r1.16 pmc2c2.pl
--- classes/pmc2c2.pl	22 Aug 2004 09:15:51 -	1.16
+++ classes/pmc2c2.pl	5 Sep 2004 22:28:24 -
@@ -135,12 +135,6 @@
 Used with C: No C code is generated.
 
-=item C<dynpmc>
-
-The class is a dynamic class. These have a special C
-routine suitable for dynamic loading at runtime. See the F
-directory for an example.
-
 =item C
 
 Classes with this flag get 2 vtables and 2 enums, one pair with
@@ -164,6 +158,18 @@
 library ref
 
+=item C<dynpmc>
+
+The class is a dynamic class. These have a special C
+routine suitable for dynamic loading at runtime. See the F
+directory for an example.
+
+=item C<group>
+
+The class is part of a group of interrelated PMCs that should be
+compiled together into a single shared library of the given name. Only
+valid for dynamic PMCs.
+
 =back
 
 =item 3.
 
@@ -318,7 +324,7 @@
     my $c = shift;
     $$c =~ s
Re: Semantics for regexes
On Sep-01, Dan Sugalski wrote:
> 
> This is a list of the semantics that I see as needed for a regex
> engine. When we have 'em, we'll map them to string ops, and may well
> add in some special-case code for faster access.
> 
> *) extract substring
> *) exact string compare
> *) find string in string
> *) find first character of class X in string
> *) find first character not of class X in string
> *) find boundary between X and not-X
> *) Find boundary defined by arbitrary code (mainly for word breaks)

Huh? What do you mean by "semantics"? The only semantics needed are the minimum necessary to answer the question "is the fred at offset i equal to the fred X?" (Sorry, not sure if fred is actually character or codepoint or whatever; it is probably all of them at different levels.) We also almost certainly need to be able to do character class comparisons, although if you assume that you can always transcode to what the regex was compiled with, then you don't even need that -- instead, you need to be able to convert to something like a difference list of numbered freds. But if we're talking about semantics, then yes, you need the character class manipulation.

Everything else in this list sounds like optimizations to me, and probably not the right optimizations (I don't think it's possible to predict what will be useful yet.) For other things that parrot will be used for, I suspect that the first 3 will be needed.

I'm curious as to how you came up with that list; it seems to imply a particular way of implementing the grammar engine. I would expect all of that, barring certain optimizations, to be done directly with existing pasm instructions. There will be a need for saving a stack of former values of hypothetical variables, which can also be done with pasm ops but might interact with overloaded assignment or something wacky like that.
Re: Proposal for a new PMC layout and more
On Sep-01, Leopold Toetsch wrote:
> Below is a pod document describing some IMHO worthwhile changes. I hope
> I didn't miss some issues that could inhibit the implementation.

Overall, I like it, although I'm sure I haven't thought of all of the repercussions.

The one part that concerns me is the loss of the flags -- flags just seem generally useful for a number of things. In the limit, each flag could be replaced by an equivalent vtable entry that just returned true or false, but that will only work for rarely-used flags because of the extra levels of indirection. I suppose we could also have a large class of PMCs that contained a flag word, with only the primitive PMCs lacking it, but then the flags could not be used without knowing the type of PMC. It all comes down to the specific current and future uses of flags. You've dealt with the GC flags; what about the rest?

The proposal would also expand the size of the vtable a bit, due to the string vtable stuff. I don't know how much that is, percentage-wise. And I suppose that increase is dwarfed by the decrease from eliminating the S variants. (Although that's another part of the proposal that makes me nervous -- will MMD really take care of all of the places where we care that we're going to a string, specifically, rather than any other random PMC type? Strings are a pretty widespread concept throughout the code base, and this is the only highly user-visible part of the change.)

I also view the proposal as being composed of several fairly independent pieces. Something like:

* Merging PMCs and Buffers
* Merging STRINGs and PMCs
* Removing GC-related flags and moving them to GC implementations
* Removing the rest of the flags
* Using Null instead of Undef
* Moving "extra" stuff to before the PMC pointer
* Using Refs to expand PMCs
* Using DOD to remove the Ref indirection
* Shrinking the base PMC size

...and whatever else I forgot. Not all of these are dependent on each other, and they could be implemented separately.
And some are only dependent in the sense that you'll make space or time performance worse until you make the rest of the related changes. You could call those design-dependent, rather than implementation-dependent.
Re: Compile op with return values
On Aug-30, Dan Sugalski wrote:
> I've been watching this thread with some bemusement -- I've got to
> admit, I don't see the problem here.
> 
> I'm not sure what the point of passing in parameters to the
> compilation is. (Not that I don't see the point of having changeable
> settings for compilers, but that's something separate) The interface
> is simple on purpose -- in most cases either there *are* no
> parameters possible (Perl's eval and its equivalent in other
> languages) or there's no reasonable way to know what the parameters
> are (Perl's eval evaluating code of a different language). The syntax
> just isn't there to have them, and is really unlikely to ever
> materialize, so there's little point in putting in parameters to the
> compilation. In those cases where the programmer may know what to
> change, they can tweak any external knobs the compiler module might
> have programmatically.

I understand that perspective, but I guess I'm thinking about embedded compilers somewhat differently. For example, consider a regex compiler. It needs to be able to compile embedded code in whatever the host language is. In fact, it needs to be able to switch back and forth freely between the regex compile and the host language compile, and the compilation of the inner language might need to be tailored to fit into whatever the regex compiler needs. Maybe that's as simple as saying "don't provide a main()", in which case it can be done by having two C<compreg> registration strings for the same language. But you have to get the name of that language into the regex compiler in the first place. (Ok, you might be able to avoid that in this particular case by making the regex compiler into a coroutine, but I don't want to get too caught up in one particular example.) And the compiler needs to be reentrant, for the cases where the language within the regex rule invokes another regex match.
I mention that only to say that you can't just set properties on the PMC returned from C<compreg>, because that PMC will be shared during reentrant calls. You could always clone it and then configure it, I suppose.

Anyway, I'm just trying to come up with situations where compilers need to know more than just the language they're compiling, and especially cases where you want different configuration for every compile. Another example of this would be if your regex syntax involved binding hypothetical variables (or something similar), and the inner language needed to know at compile time which variables had been defined at that point. I'm sure I could come up with workarounds for all of these issues, but I was expecting that much of the usefulness of Parrot would be in mixing together (and nesting) several languages in one program, and it seems like in many cases nested compiles are going to need to communicate nontrivial amounts of information. I'm okay with things if the answer is "don't do that" -- meaning if you need complex cases like this, then forget about C<compile> and do everything with straight subroutines or whatever else -- but I would like to understand the intent of the C<compile> op better, so I can forget about trying to make my stuff fit into its mold if what I'm doing is just different.

> The whole "name for the function I'm compiling" thing isn't an issue
> either, or at least it shouldn't be. The code being compiled is
> implicitly a subroutine -- you don't have to have code that reads:
> 
>    .sub foo_1234423_some_random_text
>    .
>    .
>    .
>    .end
> 
> and go look for 'foo_1234423_some_random_text' in a namespace
> somewhere. Just leave out the .sub/.end (they should be implied) and
> the returned PMC is a sub PMC for your nicely anonymous sub. Which is
> fine, and as it should be.

That would work fine for me.
The current state also works ok since overriding is allowed, but it feels wrong to construct a sub with a specific name and then disavow all knowledge of that name even though it's been registered in some global table. Leo's @ANON implementation of your scheme works great for me (I have no problem wrapping that around my code.)

All this does raise the question of garbage collection for packfile objects; is there any? Both my current day job project and (I'm guessing) mod_perl hope the answer is "yes". :-)
Re: [perl #31268] [PATCH] Dynamic library with multiple PMCs
On Aug-21, Mattia Barbon wrote:
> 
> Hello,
> as promised with this patch:
> 
> pmc2c2 ... --library foo --c pmc1.pmc pmc2.pmc pmc3.pmc ...
> 
> outputs pmcX.c and pmc_pmcX.h as it did before, plus
> foo.c and pmc_foo.h containing a single Parrot_lib_foo_load
> that initializes vtables and MMD dispatch for all the PMCs,
> taking into account circular PMC dependencies in MMD dispatch.

I am trying to use this facility right now, and am encountering problems. For a detailed blow-by-blow account of what I'm doing, see

http://0xdeadbeef.net/wiki/wiki.pl?FinkBlog/SharedLibraryHell

But that's much too verbose, so I'll just describe the problem. I have two PMCs that need to know about each other, match.pmc and matchrange.pmc. (Ok, actually just one needs to know about the other, but whatever.) So I bound them together into a library 'match_group' using your --library flag. But the resulting shared library uses symbols defined in its constituents, and so when parrot tries to load it, it cannot resolve those symbols.

I can fix this by changing the link line that builds match_group.so to explicitly list match.so and matchrange.so, which enters them as NEEDED in the shared library match_group.so. So far so good. But then I still can't get it to work, because parrot is only able to find match_group.so because it explicitly constructs the path runtime/parrot/dynext/LIBRARY.so; when dlopen internally tries to load match.so, it doesn't have that path in its RPATH. This is fixable by adding the path to LD_LIBRARY_PATH, but if you're going to do that, then why bother with the explicit path construction within parrot?

It seems to me that we need to either add the full RPATH to the parrot binary, or teach parrot its absolute path so it can add it at runtime. And I'm not totally sure that the runtime approach will work. Or we could punt and wrap everything for a PMC library into the same .so. But that doesn't seem very nice. How does perl5 handle this?
Also, it seems that dynclasses/Makefile has a few gmake-isms thrown in, which I imagine nobody has complained about because the shared library stuff probably only works on Linux anyway right now. (And in my local version, I added lots more to get the match_group.so linked up correctly.)
Re: Library loading
On Aug-28, Dan Sugalski wrote:
> 
> We dynamically load libraries. Whee! Yay, us. We need a set of
> semantics defined and an API to go with them so we can meaningfully
> and reliably work with them.

Hm. Today I was working with the current implementation of this stuff, and uncovered a bunch of questions. I'm not sure this thread is really the right one for my questions, but it was too timely to pass up.

> 1) Load the shared library from disk
> 
> The equivalent of dlopen. (May well *be* dlopen)

I'm running into problems with loading libraries that are dependent on other (user-provided, not system) libraries. I'm thinking it would be nice to have an interface for setting search paths.

Take my particular case: I have something called match_group.so which contains some undefined symbols from both match.so and matchrange.so. All of these are from dynamic PMCs, so they should be found in runtime/parrot/dynext/. And if I run from the top-level directory, match_group.so *is* found -- but it fails to load, because it can't find its dependencies. match_group.so is found because dynext.c:get_path() explicitly looks through runtime/parrot/dynext, but that doesn't help the implicit loading of match.so. I can get it to work by explicitly setting LD_LIBRARY_PATH to the absolute path of the dynext/ directory, but that's not a good long-term solution.

So there are two problems. One is that parrot looks in a relative directory when it's searching for dynamic libraries. If this were an absolute path, then you could run parrot from some other directory and still find your libraries, but I don't know if parrot knows where it is running from yet. (FindBin, anyone?) The second problem is that parrot must be able to find dependent libraries implicitly, so it can't just manually construct paths and have things work. One solution for this would be to add an API entry to set the search path.
On Unix, I think that means setting LD_LIBRARY_PATH, but I haven't tried that to see if it actually works if you set it while a program is running -- there were some hints on Google that it might fail. If it doesn't work, I don't know of any other way to set the implicit search path programmatically, so perhaps it shouldn't be in the API. :-) Alternatively, we can hardcode the absolute path to the dynext/ directory in the RPATH tag of the dynamic section (for ELF).

> We're also going to want to allow embedding applications to pass in
> handles to existing libraries (so, for example, we don't try and load
> in half a dozen versions of the expat library...) that it's already
> loaded in.

I believe glibc already caches all of the handles and just gives you back the same handle if you ask for it again. So unless other systems don't, this isn't needed.

I'll write another message specifically talking about the problems I am encountering, because the remainder has no bearing on the API.
Re: Compile op with return values
On Aug-27, Leopold Toetsch wrote:
> Steve Fink wrote:
> >On Aug-26, Leopold Toetsch wrote:
> 
> >>.sub @regex_at_foo_imc_line_4711 # e.g.
> 
> >Yes, this illustrates what I was really getting at. My compiler can
> >certainly take a subroutine name (or file and line number, or whatever)
> >to use to generate the code with, but what is the proper way to pass
> >that information in through the compile op?
> 
> I don't know how your compiler generates the code. But you are probably
> concatenating a string of PIR instructions and pass that over to the
> C<compile> opcode.
> Anyway, the identifier you are using for the C<.sub> directive gets
> stored in globals and is the name of the subroutine.

Um... sorry, I was unclear. I am talking about how to get data *into* my compile op. You are describing how to get it *out*, which I am well aware of.

> >... I can just stick it in some
> >register, but it seems like there ought to be some standard-ish way of
> >passing parameters to compilers. Which then makes me wonder why compile
> >is done as an opcode rather than a method invocation on the return value
> >of compreg.
> 
> C<compile> as a method call for the compiler would really be a
> worthwhile extension. But you can provide your own compiler wrapper and
> pass the subroutine name to that function. [1]

Yes, that is a reasonable way of implementing it. I probably wouldn't do it exactly that way, because I'd rather have the generated code be a completely valid PIR snippet on its own -- in your example I would have some orphaned ".param x" lines in what my compile op returned, which would require the surrounding ".sub" to be valid. But it does encapsulate things nicely, and makes it clear how to pass such information in. But that's not my issue.

To stave off confusion, I should mention here that I am not blocked on any of this; I can think of half a dozen ways of doing what I want, now including the one you suggested.
I am now only asking these questions so that I might better write up a FAQ entry, and what I am still unsure about is how to explain the purpose of the compile op, and what the "official" way is to pass in parameters that influence the compiled code.

> >... I see that for Compiler and NCI PMCs, that's exactly what it
> >does, but for anything else it does the Parrot_runops_fromc_args_save
> >thing; couldn't that be somehow exposed separately so that the compile
> >op goes away? My only complaint about C<compile> is that it isn't
> >transparent how to use it, whereas I am comfortable with invoking things
> >and following the calling conventions.
> 
> Well, there isn't much difference. The compile function is called as a
> plain function. A method call would additionally pass C, which you
> can pass as an argument too, if you need it.

Right. So why does the compile op exist?

I assert that many compilers will need some form of additional contextual information in order to properly compile the code they are passed. My one example so far is the key needed to ensure that the returned Sub PMC is associated with a unique name. This isn't a very good example, because it could be done wholly within the compile op and therefore doesn't *really* need any information passed in. However, there are many other examples: include paths, library paths, optimization options, etc.

It seems like there ought to be a standard way of communicating this sort of information to embedded compilers. Perhaps a way that is standard enough that the language to be compiled could be treated as a dynamic parameter -- in which case, all compilers would need to interpret the passed-in contextual information in the same way. We already have one way of communicating information, and that is to use the register calling conventions.
If we really wanted different languages' compilers to be able to interpret the same contextual information (which is nice to have, but hardly a necessity), then we would additionally need to specify a common signature -- perhaps to compile something, you look up an invokable compiler PMC using compreg, then "call" it with a single parameter representing named arguments. Or something.

The question then becomes how to call the compiler. We could use the C<invoke> op, but it would be doing exactly the same thing as the C<compile> op, so why not just use that? If I look at the code, it shows that there sometimes is a difference between the two -- namely, if the compiler PMC is anything other than a Compiler or an NCI, then it calls Parrot_runops_fromc_args_save rather than simply invoking it. Whatever that function does, it must be necessary, so... why? Then at last I will understand why there is a need for a C<compile> separate from C<invoke>.
Re: Compile op with return values
On Aug-26, Leopold Toetsch wrote:
> Steve Fink wrote:
> 
> >I can store some global counter that makes it generate different sub
> >names each time, but that seems a bit hackish given that I don't really
> >want the subroutine to be globally visible anyway; I'm just using one so
> >that I can use PIR's support for handling return values.
> 
> I don't think its hackish. And you might want to keep the compiled regex
> around for later (and repeated) invocation.

I certainly do, but I can always do that by remembering the PMC returned from the compile op myself.

> Visibility is another issue though. You could mangle[1] the subroutine
> name or use a distinct namespace, which might reduce the possibility of
> name collisions.
> 
> leo
> 
> .sub @regex_at_foo_imc_line_4711 # e.g.
> .end

Yes, this illustrates what I was really getting at. My compiler can certainly take a subroutine name (or file and line number, or whatever) to use to generate the code with, but what is the proper way to pass that information in through the compile op? I can just stick it in some register, but it seems like there ought to be some standard-ish way of passing parameters to compilers. Which then makes me wonder why compile is done as an opcode rather than a method invocation on the return value of compreg. I see that for Compiler and NCI PMCs, that's exactly what it does, but for anything else it does the Parrot_runops_fromc_args_save thing; couldn't that be somehow exposed separately so that the compile op goes away? My only complaint about C<compile> is that it isn't transparent how to use it, whereas I am comfortable with invoking things and following the calling conventions.
Re: Compile op with return values
On Aug-22, Leopold Toetsch wrote:
> Steve Fink <[EMAIL PROTECTED]> wrote:
> > I am experimenting with registering my own compiler for the "regex"
> > language, but the usage is confusing. It seems that the intention is
> > that compilers will return a code object that gets invoked, at which
> > time it runs until it hits an C<end> opcode. But what if I want to
> > return some values from the compiled code? I see the following
> > options:
> 
> An Eval object isa Closure. The only difference to a subroutine is IIRC
> that it supports direct jumps out of the evaled code segment via the
> inter-segment branch_cs opcode:
> 
> .
> .
> .
> 
> Thus it should be totally valid to return a Sub or Closure object from
> your compiler. The C<invoke> of these two handles the packfile segment
> switching.
> 
> > $P0 = compreg "pig-latin"
> > $P1 = compile $P0, "eturnray oneay oremay anthay asway assedpay inay"
> > $I0 = $P1(41)
> > print $I0 # Should print out 42
> 
> That's fine.

Ok, that makes sense. I didn't realize the relationship between the Eval PMC and Sub. Thanks.

So then what is the proper way to return a Sub from the compile op? Right now, I always compile to the same subroutine name "_regex", and fetch it using find_global _regex. This works, even when I compile multiple chunks of regex code. I guess whatever is entering things into the global symbol table is overriding the previous definition, and I grab it out immediately thereafter. But is this safe to rely on, or will it later become an error to override a global subroutine?

I can store some global counter that makes it generate different sub names each time, but that seems a bit hackish given that I don't really want the subroutine to be globally visible anyway; I'm just using one so that I can use PIR's support for handling return values.
Compile op with return values
I am experimenting with registering my own compiler for the "regex" language, but the usage is confusing. It seems that the intention is that compilers will return a code object that gets invoked, at which time it runs until it hits an C<end> opcode. But what if I want to return some values from the compiled code? I see the following options:

1) Manually set up the return values in the appropriate registers.

2) Use .pcc_begin_return/.pcc_end_return to set up the return values, but override the P1 return continuation to go to some subroutine that immediately calls C<end>.

3) Use .pcc_begin_return/.pcc_end_return to set up the return values, and have the user of the compiler grab out some magic-named subroutine using C<find_global> and invoke it, rather than directly invoking the return value of the C<compile> op.

4) Make my compiler behave differently from the built-in compilers, by arranging for it to return a Sub object from the C<compile> op rather than an Eval object. Then the caller can directly invoke the return value of C<compile> and get back return values as if it were calling a normal subroutine, because it is.

#1 is painful and extremely error-prone. #2 is a total hack. #3 forces the user of the compiler not only to call things differently depending on which language it's using, but also to know the magic subroutine name. #4 seems to make compilers inconsistent with each other, and I worry that the Eval PMC does more than just running code until it hits an C<end> op.

If my explanation was confusing, then here are some examples.
I expect the user of the compiler to look like this:

  $P0 = compreg "pig-latin"
  $P1 = compile $P0, "eturnray oneay oremay anthay asway assedpay inay"
  $I0 = $P1(41)
  print $I0 # Should print out 42

Option #3 would change this to:

  $P0 = compreg "pig-latin"
  compile $P0, "eturnray oneay oremay anthay asway assedpay inay"
  $P1 = find_global "_pig_latin_eval_block"
  $I0 = $P1(41)
  print $I0 # Should print out 42

Rather than writing out example code for the compiler and generated code to illustrate each of the cases, I think I'll wait to see if this needs clarification first. :-)
Parrot experiences log
As I prepared to dive into a big area of parrot that I'm completely unfamiliar with, I decided to log my travels in hopes of helping out the next poor soul who happens along a similar path. For now, the focus is on converting my toy languages/regex compiler into more of a real Perl6-style rule compiler callable directly by Parrot. I know someone else was blogging his initial experiences slogging through the Parrot source -- that's where I got the idea, though I'm not exactly a newbie around here so I'll be assuming a bit more familiarity with things. (Though I'm pretty out of touch right now, so maybe not...) Oh, right, the url: http://0xdeadbeef.net/wiki/FinkBlog At least for now. That's my server, and my DSL line is flaky, but I wanted to do the minimal amount of setup, and I'm familiar with this particular wiki syntax. And I expect to get bored of writing stuff up before reaching the point at which moving would be worthwhile. Yes, I'm aware of the wiki at http://www.vendian.org/parrot/wiki/bin/view.cgi/Main/WebHome but I couldn't find an appropriate place to put this there. Also, I wanted something I can get to while off-line, and something I can molest more easily to make writing the log easier. I've modified my wiki a fair amount, and have gotten dependent on some of the shortcuts.
Re: [PATCH] Match PMC
Oh, and here's my test code for the Match PMC. This is really just a copy of t/pmc/perlhash.t (since the Match PMC is supposed to behave like a hash for the most part), but with one added test case at the end showing how this would be used to store and retrieve hypotheticals. Index: t/pmc/match.t === RCS file: t/pmc/match.t diff -N t/pmc/match.t --- /dev/null 1 Jan 1970 00:00:00 - +++ t/pmc/match.t 17 Aug 2004 17:28:17 - @@ -0,0 +1,1256 @@ +#! perl + +# Copyright: 2001-2003 The Perl Foundation. All Rights Reserved. +# $Id: match.t,v 1.44 2004/04/19 12:15:22 leo Exp $ + +=head1 NAME + +t/pmc/match.t - Match Objects + +=head1 SYNOPSIS + + % perl t/pmc/match.t + +=head1 DESCRIPTION + +Tests the C PMC. Does standard hashtable testing. Then tests +various aspects of retrieving ranges of the input string. + +Probably ought to do nested match objects too. + +=cut + +use Parrot::Test tests => 34; +use Test::More; + +output_is(< "a", b => [undef, undef] } + +clone P1, P0 +set P0["c"], 4 +set P3, P0["b"] +set P3, 3 +set P0["b"], P3 +set P1["a"], "A" + +# P0 = { a => "a", b => [undef, undef, undef], c => 4 } +# P1 = { a => "A", b => [undef, undef] } + +set S0, P0["a"] +eq S0, "a", ok1 +print "not " +ok1: +print "ok 1\n" + +set P5, P0["b"] +set I0, P5 +eq I0, 3, ok2 +print "not " +ok2: +print "ok 2\n" + +set I0, P0["c"] +eq I0, 4, ok3 +print "not " +ok3: +print "ok 3\n" + +set S0, P1["a"] +eq S0, "A", ok4 +print "not " +ok4: +print "ok 4\n" + +set P5, P1["b"] +set I0, P5 +eq I0, 2, ok5 +print "not (" +print I0 +print ") " +ok5: +print "ok 5\n" + +# XXX: this should return undef or something, but it dies instead. 
+# set P3, P0["c"] +# unless P3, ok6 +# print "not " +# ok6: +# print "ok 6\n" + end +CODE +ok 1 +ok 2 +ok 3 +ok 4 +ok 5 +OUTPUT + +output_is(<<'CODE', <\n" +ret + +subtest: +print "subrule/" +print S0 +print ":" +set S0, P1["subrule";S0] +isnull S0, report_null +print S0 +print "\n" +ret +subreport_null: +print "\n" +ret +CODE +empty_at_start: +empty_at_middle: +whole:the full input string +full:full +no_start: +no_end: +regular_key:boring old value +subrule/empty_at_start: +subrule/empty_at_middle: +subrule/whole:the full input string +subrule/full:full +subrule/no_start: +subrule/no_end: +subrule/regular_key:boring old value +Direct access to start,end for 'full': 4,7 +OUTPUT + +1;
[PATCH] Match PMC
I needed to create a Match PMC object for holding the match groups (parenthesized expressions and capturing rules) from a regex match. Unfortunately, it works by using another new PMC type, the MatchRange PMC, to signal that an element of its hashtable should be interpreted specially (as a substring of the input string). One PMC knowing about another currently means they need to be static PMCs, not dynamic. (AFAIK)

So this is the patch of what I am currently using. I cannot guarantee it will actually be useful for any other regex implementors, so I feel uncomfortable committing it myself. (OTOH, if someone needs something different, they can just add it as a different name.) The point is, this is something I need for my stuff and the future of languages/regex is with some version of it, so I can't commit those changes without this. Although I fully expect the Match PMC will need to be substantially beefed up to become a full grammar object (or something...), this is base functionality that it needs to start with.

With these two PMCs, I can construct a match object containing the hypotheticals $1, $2, etc., as well as a full parse tree comprised of nested match objects. This does *not* handle saving and restoring previous hypothetical values, as is needed in the case of (a)+b -- in my compiler, that is handled by the compiled engine code.

Index: classes/match.pmc
===
RCS file: classes/match.pmc
diff -N classes/match.pmc
--- /dev/null	1 Jan 1970 00:00:00 -
+++ classes/match.pmc	17 Aug 2004 17:02:01 -
@@ -0,0 +1,205 @@
+/*
+Copyright: 2004 The Perl Foundation. All Rights Reserved.
+$Id$
+
+=head1 NAME
+
+classes/match.pmc - Match object for rules
+
+=head1 DESCRIPTION
+
+This is a match object for holding hypothetical variables, the input string,
+etc.
+
+For now, it is really just proof-of-concept code, and I fully expect
+anyone who reads this to hurl. Violently.
+
+=head2 Functions
+
+=over 4
+
+=cut
+
+*/
+
+#include
+#include "parrot/parrot.h"
+
+STRING * hash_get_idx(Interp *interpreter, Hash *hash, PMC *key);
+
+static STRING* make_hash_key(Interp* interpreter, PMC * key)
+{
+    if (key == NULL) {
+        internal_exception(OUT_OF_BOUNDS,
+            "Cannot use NULL key for Match!\n");
+        return NULL;
+    }
+    return key_string(interpreter, key);
+}
+
+static STRING* match_range(Interp* interp, PMC* self, PMC* range)
+{
+    STRING* input_key = const_string(interp, "!INPUT");
+    Hash* hash = (Hash*) PMC_struct_val(self);
+    HashBucket *b;
+    STRING* input;
+    int start, end;
+
+    b = hash_get_bucket(interp, hash, input_key);
+    if (!b) {
+        internal_exception(1, "Match: input string not set");
+        return NULL;
+    }
+
+    input = VTABLE_get_string(interp, (PMC*) b->value);
+    /* These could both be converted to grab UVal_int directly, but
+     * I'll leave it like this for now because it'll test the vtable
+     * members. */
+    start = VTABLE_get_integer_keyed_int(interp, range, 0);
+    end = VTABLE_get_integer_keyed_int(interp, range, 1);
+
+    if (start == -2 || end == -2 || end < start - 1)
+        return NULL;
+    else
+        return string_substr(interp, input, start, end - start + 1, NULL, 0);
+}
+
+static STRING* fetch_string(Interp* interp, PMC* matchobj, PMC* val)
+{
+    if (val->vtable->base_type == enum_class_MatchRange) {
+        return match_range(interp, matchobj, val);
+    } else {
+        return VTABLE_get_string(interp, val);
+    }
+}
+
+static INTVAL fetch_integer(Interp* interp, PMC* matchobj, PMC* val)
+{
+    if (val->vtable->base_type == enum_class_MatchRange) {
+        STRING* valstr = match_range(interp, matchobj, val);
+        return string_to_int(interp, valstr);
+    } else {
+        return VTABLE_get_integer(interp, val);
+    }
+}
+
+pmclass Match extends PerlHash {
+
+/*
+
+=item C<get_string_keyed_str>
+
+=cut
+
+*/
+
+    STRING* get_string_keyed_str (STRING* key) {
+        PMC* value;
+        Hash* hash = (Hash*) PMC_struct_val(SELF);
+        HashBucket *b = hash_get_bucket(INTERP, hash, key);
+        if (b == NULL) {
+            /* XXX Warning: use of uninitialized value */
+            /* return VTABLE_get_string(INTERP, undef); */
+            return NULL;
+        }
+        return fetch_string(INTERP, SELF, (PMC*) b->value);
+    }
+
+/*
+
+=item C<get_string_keyed>
+
+Returns the string value for the element at C<*key>.
+
+=cut
+
+*/
+
+    STRING* get_string_keyed (PMC* key) {
+        PMC* valpmc;
+        STRING* keystr;
+        HashBucket *b;
+        Hash *hash = PMC_struct_val(SELF);
+        PMC* nextkey;
+
+        switch (PObj_get_FLAGS(key) & KEY_type_FLAGS) {
+        case KEY_integer_FLAG:
+            /* called from iterator with an integer idx in key */
+            /* BUG! This will iterate through the input string as
+             * well as all of the real values. */
+            if (hash->key_type == Hash_key_type_int) {
Re: [perl #31128] Infinite loop in key_string
Oh, and while I have my fingers crossed, I may as well throw in the original test patch as well. I'll let these messages go to hell together.

Urk! Except I used stupid filenames, and swapped the attachments. So this attachment is actually the patch. Need more sleep.

? src/py_func.str
Index: src/key.c
===================================================================
RCS file: /cvs/public/parrot/src/key.c,v
retrieving revision 1.51
diff -u -r1.51 key.c
--- src/key.c	8 Jul 2004 10:19:11 -	1.51
+++ src/key.c	17 Aug 2004 17:00:08 -
@@ -357,6 +357,10 @@
     case KEY_pmc_FLAG | KEY_register_FLAG:
         reg = interpreter->pmc_reg.registers[PMC_int_val(key)];
         return VTABLE_get_string(interpreter, reg);
+    case KEY_integer_FLAG:
+        return string_from_int(interpreter, PMC_int_val(key));
+    case KEY_integer_FLAG | KEY_register_FLAG:
+        return string_from_int(interpreter, interpreter->int_reg.registers[PMC_int_val(key)]);
     default:
     case KEY_pmc_FLAG:
         return VTABLE_get_string(interpreter, key);
[PATCH] Re: [perl #31128] Infinite loop in key_string
I don't know what's eating my mail, but evidently the attachment never made it out.

I tracked down this particular problem and fixed it for the actual case I was using, which was not a PerlHash at all but rather my own custom Match PMC for use in regexes. The attached patch resolves the exact symptom I was seeing, but actually doesn't fix the problem in either the PerlHash or the Match case, for different reasons. For PerlHash, P0["foo";3] seems to be interpreted as an iterator access? I hope there's some other way of indicating that. For my Match PMC, I needed to avoid the whole conversion to string anyway. Still, I won't commit this patch directly, because I have only recently delved into the latest incarnation of the keyed code, and it scares me.

Oh boy. What are the odds of this message actually making it out?

Index: t/pmc/perlhash.t
===================================================================
RCS file: /cvs/public/parrot/t/pmc/perlhash.t,v
retrieving revision 1.44
diff -r1.44 perlhash.t
22c22
< use Parrot::Test tests => 34;
---
> use Parrot::Test tests => 35;
693a694,709
> output_is(<< 'CODE', << 'OUTPUT', "Getting PMCs from string;int compound keys");
>     new P0, .PerlHash
>     new P1, .PerlHash
>     new P2, .PerlInt
>     set P2, 4
>     set P1[0], P2
>     set P0["a"], P1
>     set I0, P0["a";0]
>     print "Four is "
>     print I0
>     print "\n"
>     end
> CODE
> Four is 4
> OUTPUT
> 
Re: The new Perl 6 compiler pumpking
Sorry if this is a repeat, but I didn't get my own mail back, so I think I may have had sending problems.

On Aug-09, Patrick R. Michaud wrote:
> 
> Luke Palmer and I started work on the grammar engine this past week.
> It's a wee bit too early in the process for us to be making any
> promises about when people might be seeing releases and the like.
> But I think he and I are in agreement that we'd like to have a grammar
> engine substantially completed (at least to the level of being able
> to "bootstrap" a Perl 6 compiler) within the next 3-4 months.

I have one of those too, in languages/regex. The parser is crap, and the implementation of the translation has probably seen a few too many rounds of evolution, but I believe the output is the Right Way To Do It and is relatively easy to extend to support the rest of the Perl6 features. Okay, maybe it's only One Of The Right Ways To Do It. I'd be interested in knowing what your output looks like, and encourage you -- if it's close enough -- to steal my rewrite rules wholesale. (Actually, the parser used in the languages/perl6 directory is pretty nice; the one in languages/regex that I use for testing is crap.)

See languages/regex/docs/regex.pod for an introductory document on this style of rule implementation. I wrote it a while back, and have added at least one more fundamental concept to what I laid out there, but it should be more or less valid nonetheless.

I left off working on it long ago to concentrate on beefing up the Perl6 compiler enough to make it an interesting host language, but recently I've (locally) added match objects and a parse tree. I have rules, but I was holding off on grammars until someone braver and smarter than I had beaten on Parrot's object support a bit more. The various cuts are straightforward additions that I've already sketched out, and everything else is gravy.

Heh. And I'm really wishing I had picked Jako or one of the other languages/* as a sample host language...
[PATCH] Re: register allocation
On Aug-07, Leopold Toetsch wrote: > Sean O'Rourke <[EMAIL PROTECTED]> wrote: > > [EMAIL PROTECTED] (Leopold Toetsch) writes: > >> The interference_graph size is n_symbols * n_symbols * > >> sizeof(a_pointer). This might already be too much. > >> > >> 2) There is a note in the source code that the interference graph could > >> be done without the N x N graph array. Any hints welcome (Angel Faus!). > > > It looks like the way things are used in the code, you can use an > > adjacency list instead of the current adjacency matrix for the graph. > > Yeah. Or a bitmap. An adjacency list would definitely be much smaller, but I'd be concerned that it would slow down searches too much. I think the bitmap might be worth a try just to see how much the size matters. Since this is an n^2 issue, splitting out the four different register types could help -- except that I'd guess that most code with excessive register usage probably uses one type of register much more than the rest. Anyway, I've attached a patch that uses bitmaps instead of SymReg*'s, which should give a factor of 32 size reduction. I've only tested it by doing a 'make test' and verifying that the several dozen test failures are the same before and after (I don't think things are actually that broken; I think the make system is), but for all I know it's not even calling the code. That's what you get when I only have a two hour hacking window and I've never looked at the code before. > Or still better, create the interference graph per basic block. > Should be much smaller then. Huh? Is register allocation done wholly within basic blocks? I thought the point of the graph was to compute interference across basic blocks. I guess I should go and actually read the code. 
Index: imcc/reg_alloc.c
===================================================================
RCS file: /cvs/public/parrot/imcc/reg_alloc.c,v
retrieving revision 1.14
diff -u -r1.14 reg_alloc.c
--- imcc/reg_alloc.c	23 Apr 2004 14:09:33 -	1.14
+++ imcc/reg_alloc.c	7 Aug 2004 07:11:08 -
@@ -41,7 +41,7 @@
 static void compute_du_chain(IMC_Unit * unit);
 static void compute_one_du_chain(SymReg * r, IMC_Unit * unit);
 static int interferes(IMC_Unit *, SymReg * r0, SymReg * r1);
-static int map_colors(int x, SymReg ** graph, int colors[], int typ);
+static int map_colors(IMC_Unit *, int x, int * graph, int colors[], int typ);
 #ifdef DO_SIMPLIFY
 static int simplify (IMC_Unit *);
 #endif
@@ -58,12 +58,46 @@
 
 /* XXX FIXME: Globals: */
 static IMCStack nodeStack;
-static SymReg** interference_graph;
-/*
-static SymReg** reglist;
-*/
+static int* interference_graph;
 static int n_symbols;
 
+static int* ig_get_word(int i, int j, int N, int* graph, int* bit_ofs)
+{
+    int bit = i * N + j;
+    *bit_ofs = bit % sizeof(*graph);
+    return &graph[bit / sizeof(*graph)];
+}
+
+static void ig_set(int i, int j, int N, int* graph)
+{
+    int bit_ofs;
+    int* word = ig_get_word(i, j, N, graph, &bit_ofs);
+    *word |= (1 << bit_ofs);
+}
+
+static void ig_clear(int i, int j, int N, int* graph)
+{
+    int bit_ofs;
+    int* word = ig_get_word(i, j, N, graph, &bit_ofs);
+    *word &= ~(1 << bit_ofs);
+}
+
+static int ig_test(int i, int j, int N, int* graph)
+{
+    int bit_ofs;
+    int* word = ig_get_word(i, j, N, graph, &bit_ofs);
+    return *word & (1 << bit_ofs);
+}
+
+static int* ig_allocate(int N)
+{
+    // size is N*N bits, but we don't want to allocate a partial
+    // word, so round up to the nearest multiple of sizeof(int).
+    int need_bits = N * N;
+    int num_words = (need_bits + sizeof(int) - 1) / sizeof(int);
+    return (int*) calloc(num_words, sizeof(int));
+}
+
 /* imc_reg_alloc is the main loop of the allocation algorithm. It operates
  * on a single compilation unit at a time.
  */
@@ -446,6 +480,12 @@
 
 /* creates the interference graph between the variables.
  *
+ * data structure is a 2-d array 'interference_graph' where row/column
+ * indices represent the same index in the list of all symbols
+ * (unit->reglist) in the current compilation unit. The value in the
+ * 2-d array interference_graph[i][j] is the symbol unit->reglist[j]
+ * itself.
+ *
 * two variables interfere when they are alive at the
 * same time
 */
@@ -461,7 +501,7 @@
 
     /* Construct a graph N x N where N = number of symbolics.
      * This piece can be rewritten without the N x N array
      */
-    interference_graph = calloc(n_symbols * n_symbols, sizeof(SymReg*));
+    interference_graph = ig_allocate(n_symbols);
     if (interference_graph == NULL)
         fatal(1, "build_interference_graph","Out of mem\n");
     unit->interference_graph = interference_graph;
@@ -475,8 +515,8 @@
             if (!unit->reglist[y]->first_ins)
                 continue;
             if (interferes(unit, unit->reglist[x], unit->reglist[y])) {
-                interference_graph[x*n_symbols+y] = unit->reglist[y];
-                interference_graph[y*n_symbols+x] = unit->reglist[x];
+
Re: Regexp::Parser v0.02 on CPAN (and Perl 6 regex question)
On Jul-04, Jeff 'japhy' Pinyan wrote: > I want to make sure I haven't misunderstood anything. *What* purpose will > my module that will be able to parse Perl 6 regexes into a tree be? You > must be aware that I have no power Damian does not possess, and I cannot > translate *all* Perl 6 regexes to Perl 5 regexes. All I can promise is a > tree structure and limited (albeit correct) translation to Perl 5. In general, it could perhaps be used as a piece of the implementation of Perl6-style regexes for any Parrot-hosted language. Personally, I could see using it with the current prototype perl6 compiler to take over the parsing whenever a regex is seen. The resulting tree structure would then be translated into a languages/regex-style tree, and from there converted into PIR instructions. The translation step could perhaps be skipped if your parser uses some extensible factory-like pattern so that I could produce my preferred regex tree nodes directly -- or if I converted my regex compiler to use your tree nodes as their native representation. In order to get the parsing correct, however, I would need the ability to call back into my native perl6 parser when you encounter perl6 code during your parse -- and perhaps call you again within that code. I don't know if this is in the scope of what you were planning for your parser; now I'm wondering if you were intending to write something akin to Perl6::Rules in that it translates Perl6 rules into perl5-edible chunks, and all this business of reentrancy and external parsing callouts is not at all what you're interested in dealing with. If so, then I would still find a use for it in providing a better Perl6-style regex parser for languages/regex. It would be used mostly for testing, but eventually I hope to get around to plugging languages/regex into Parrot as a directly-callable compiler. This has the same reentrancy etc. issues for the host language, but then they're that language's author's problem, not mine. 
:-) I'll go download Regexp::Parser now, just so I'm not speculating quite so much.
Re: Perl 6 regex parser
On Jun-27, Jeff 'japhy' Pinyan wrote: > On Jun 27, Steve Fink said: > > >On Jun-26, Jeff 'japhy' Pinyan wrote: > >> I am currently completing work on an extensible regex-specific parsing > >> module, Regexp::Parser. It should appear on CPAN by early July (hopefully > >> under my *new* CPAN ID "JAPHY"). > >> > >> Once it is completed, I will be starting work on writing a subclass that > >> matches Perl 6 regexes, Regexp::Perl6 (or Perl6::Regexp, or > >> Perl6::Regexp::Parser). I think this might be of some use to the Perl 6 > >> dev crew, but I'm not sure how. > > > >Sounds interesting, but I'm a bit confused about what it is. Clearly, > >it parses regexes, but what is the output? A parse tree? Tables and/or > >code that implement a matching engine for that regex? PIR? A training > >regimen that can be used to condition a monkey to push a "yes" or "no" > >button whenever you give it a banana with an input string inscribed on > >it? > > It creates a tree structure, not identical but similar to the array of > nodes Perl uses internally. Ah, good. Then I am interested. When I manage to find some time for hacking again, I'll graft it onto languages/regex as a replacement for the ridiculous parser I have there now. languages/regex is meant to be a language-independent regex engine, and has a silly stub parser to get basic stuff into it for testing. languages/perl6 uses the engine too, but provides its own parser. But nobody's done anything with that parser since Sean O'Rourke stopped working on it (admittedly, he implemented a surprisingly large portion of the syntax), and it'd be great to be working with something that's maintained. (But mostly I like the idea of using a language-independent front-end with a language-independent backend.) I should just look at the code, but I'm wondering what you do with language-specific constructs. Embedded code, for example. How do you find the end of it? 
And will you be supporting things like Perl6's / $x := (a*b) / where '$x' is a language-dependent variable name syntax?
Re: Perl 6 regex parser
On Jun-26, Jeff 'japhy' Pinyan wrote: > I am currently completing work on an extensible regex-specific parsing > module, Regexp::Parser. It should appear on CPAN by early July (hopefully > under my *new* CPAN ID "JAPHY"). > > Once it is completed, I will be starting work on writing a subclass that > matches Perl 6 regexes, Regexp::Perl6 (or Perl6::Regexp, or > Perl6::Regexp::Parser). I think this might be of some use to the Perl 6 > dev crew, but I'm not sure how. Sounds interesting, but I'm a bit confused about what it is. Clearly, it parses regexes, but what is the output? A parse tree? Tables and/or code that implement a matching engine for that regex? PIR? A training regimen that can be used to condition a monkey to push a "yes" or "no" button whenever you give it a banana with an input string inscribed on it? If it just parses the regex, then I would be interested in it for both languages/regex and languages/perl6. If it does more, then you're on your own, because it's been difficult enough to graft the current regex engine onto the languages/perl6 code; I have no problems with someone doing the same for a different engine, but I'm not going to be the one! Also, I find that regex stack overflows can sometimes trigger the monkey to begin wildly throwing feces, and I've no desire to experience that again.
Re: Simple trinary ops?
On Jun-16, Dan Sugalski wrote:
> At 8:24 PM +0200 6/16/04, Leopold Toetsch wrote:
> >Dan Sugalski <[EMAIL PROTECTED]> wrote:
> >> I'm wondering if it'd be useful enough to be worthwhile to have
> >> non-flowcontrol min/max ops. Something like:
> >
> >> min P1, P2, P3
> >> max P1, P2, P3
> >
> >Which cmp operation of the three we have? I smell opcode bloat.
> 
> Yeah, I've already given up on it. :)

    ### min P1, P2, P3 ###
    isgt I0, P2, P3
    choose P1, I0, P2, P3
Re: $ENV{ICU_DATA_DIR}
On May-31, Nicholas Clark wrote: > On Sat, May 29, 2004 at 11:03:12PM -0700, Steve Fink wrote: > > > +/* DEFAULT_ICU_DATA_DIR is configured at build time, or it may be > > + set through the $ICU_DATA_DIR environment variable. Need a way > > + to specify this via the command line as well. */ > > +data_dir = Parrot_getenv("ICU_DATA_DIR", &free_data_dir); > > Is ICU_DATA_DIR something the ICU folks define? Or something we define? > And if the latter, shouldn't it be called PARROT_ICU_DATA_DIR? Fair enough. The rename has been committed.
Q: MMD splice
The Perl6 compiler often appends one array onto another. Right now, it does this by iterating over the source array. I would like to just use the C<splice> op, but I am now getting a mixture of PerlArrays (from Perl6) and Arrays (from C), and the C<splice> vtable entry only works if the types of the arrays are identical.

This sounded to me like a perfect application of MMD: I could define splice(PerlArray,PerlArray), splice(PerlArray,Array), and splice(other, other). But my initial foray into the MMD code raised enough questions that I was hoping I could borrow a few clues from someone who's been paying more attention to things. Is this a reasonable thing to do with the current MMD setup? Where should I register my routines? The MMD stuff I looked at seemed intended only for binary C operators; splice needs to dispatch on only its two PMC entries, but has a function signature that takes a few other parameters (integer offset and count).
Re: $ENV{ICU_DATA_DIR}
On May-30, Leopold Toetsch wrote: > Steve Fink <[EMAIL PROTECTED]> wrote: > > > Anyone mind if I commit this? > > The patch is fine. > > > ... One thing I'm not sure of, though -- I > > try to behave myself and use Parrot_getenv rather than a plain > > getenv(), but I'm not convinced the API is complete -- Parrot_getenv > > saves back a boolean saying whether to free the returned string or > > not, but what should I call to free it? > > It's for Win32. config/gen/platform/win32/env.c uses > C, so it should be C. Thanks, patch applied. I went to add documentation on Parrot_getenv, but found that it was already there in platform_interface.h. Doh!
$ENV{ICU_DATA_DIR}
Anyone mind if I commit this? One thing I'm not sure of, though -- I try to behave myself and use Parrot_getenv rather than a plain getenv(), but I'm not convinced the API is complete -- Parrot_getenv saves back a boolean saying whether to free the returned string or not, but what should I call to free it? I could call Parrot_free_memalign, but who said anything about alignment? Perhaps Parrot_getenv should simply return the string, and we should have a platform-specific Parrot_freeenv that either frees the string or is a no-op?

Whatever. For now, I'm just calling free() if Parrot_getenv() tells me to. Which it probably never will.

Index: src/string.c
===================================================================
RCS file: /cvs/public/parrot/src/string.c,v
retrieving revision 1.202
diff -u -r1.202 string.c
--- src/string.c	25 May 2004 08:34:24 -	1.202
+++ src/string.c	30 May 2004 05:55:29 -
@@ -243,9 +243,18 @@
 string_init(Parrot_Interp interpreter)
 {
     size_t i;
-    /* DEFAULT_ICU_DATA_DIR is configured at build time. Need a way to
-       specify this at runtime as well. */
-    string_set_data_directory(DEFAULT_ICU_DATA_DIR);
+    char *data_dir;
+    int free_data_dir = 0;
+
+    /* DEFAULT_ICU_DATA_DIR is configured at build time, or it may be
+       set through the $ICU_DATA_DIR environment variable. Need a way
+       to specify this via the command line as well. */
+    data_dir = Parrot_getenv("ICU_DATA_DIR", &free_data_dir);
+    if (data_dir == NULL)
+        data_dir = DEFAULT_ICU_DATA_DIR;
+    string_set_data_directory(data_dir);
+    if (free_data_dir)
+        free(data_dir);
 
     /* encoding_init();
        chartype_init();
Re: First draft, IO & event design
On May-25, Dan Sugalski wrote: > At 10:31 AM +0200 5/25/04, Leopold Toetsch wrote: > >Dan Sugalski <[EMAIL PROTECTED]> wrote: > >> An unsolicited event, on the other hand, is one that parrot generates > >> as the result of something happening external to itself, or as the > >> result of some recurring event happening. Signals and GUI events, for > >> example, are unsolicted as are recurring timer events. > > > >I don't think that there is much difference between these two types of > >events. You don't get signals if you don't do the appropriate sigaction > >call. You ask the OS for an one-shot timer or for a recurring one, so > >you'll get one or more events. That's all known. > > The difference there is that a solicited event is one you have asked > for *and* received an event/request object for, so you can identify > the request/event as it makes its way through a stream. You can't do > that with the unsolicited ones, since you don't know they exist until > they've shown up. Perhaps that's a better thing to use to describe them, then. I understand the intuitive difference between expected/unexpected or solicited/unsolicited, but upon closer examination that particular difference gets really fuzzy. Perhaps registered/unregistered? You really want to say that you have a handle ("event"? "object"?) associated with your solicited event, but I don't know how to turn that into an adjective. Preallocated? Prepared? Identified? Labeled? Named? Tracked? Hey, that last one might work.
Re: compiler-faq
On May-29, Brent 'Dax' Royal-Gordon wrote: > William Coleda wrote: > >=head2 How do I generate a sub call with a variable-length parameter > >list in PIR? > > > >This is currently not trivial. > ... > >=head2 How do I retrieve the contents of a variable-length parameter > >list being passed to me? > > > >The easiest way to do this is to use the C opcode to take a > >variable > >number of PMC arguments and wrap them in an C PMC. > > I may just be an idiot, but why can't someone just write C > (or somesuch) as the complement of C? You mean .flatten_arg? It's not an op, it's a PIR directive, but it sounds like what the question is looking for. (And an op would make it faster.)
Re: PARROT_API, compiler and linker flags (was TODO: Linker magic step for configure)
On May-15, Jeff Clites wrote: > > When linking against ("using") a static library version of ICU, we need > a C++-aware linker (because ICU contains C++ code); with a > dynamic-library version of ICU presumably we wouldn't. I don't know if this applies here, but there is a good reason to use a C++-compatible linker even if you aren't including any C++ code. By default, many C linkers will not allow C++ exceptions to propagate through their stack frames. Unwinding the stack for an exception requires some additional information stored in the stack frames. I had to compile my own version of perl for my day job code, since we are writing in C++ and embedding the Perl interpreter, and if C++ calls perl calls C++ throws an exception, then I need the outer C++ try block to catch the exception.
Re: P6C: Parser Weirdness
Top-down and bottom-up are not mutually exclusive. At least not completely. But self-modifying parsers are *much* easier to do with top-down than bottom-up, because the whole point of bottom-up is that you can analyze the grammar at "compile" (parser generation) time, and propagate the knowledge throughout the rule engine. A simple example is /fish|job|petunia/. Rather than trying to match /fish/ and upon failure, trying /job/, and as a last resort /petunia/, you could do a dispatch table on the first letter and never have to fall back to anything. In the case of /fish|job|/, however, you can't guarantee that FIRST() will always be "p". You could generate both parsers and then use a notification or dependency system to pick which to use, but done naively you end up with an exponential number of parsers and the logic is likely to be both slow and a bug magnet. You could compile the parser assuming everything is final and then at runtime regenerate the entire parser if anything changes. Which wouldn't work for long, but perhaps you could break the grammar down into components that are individually compiled bottom-up, but coordinated top-down. Then you could limit the scope of recompiles in the bottom-up components, and not need recompiles in the top-down structure. The decisions of where to break things down would be quite similar to a regular compiler's inlining decisions. It may be that grammars are just too recursive for this to help much, though. I suspect a slight variant of the above may work best. Rather than doing a full-out LALR(1) parser for the bottom-up components, you'd do a somewhat more naive but still table-driven (shift/reduce) parser, carefully limiting what it is assuming about the FIRST() etc. of the rules within it. That should limit the impact of changes, and simplify the logic of what needs to be done differently when a change is detected. 
On May-11, Matt Fowles wrote: > > Perhaps Perl 6 grammars should provide an is parsed trait that > allows one to specify which type of parsing to use, then we could > dictate that the default behavior for parsing or perl itself is > shift reduce parsing rather than recursive descent. Optimization hints could also be very helpful, or we could even default to a total recursive-descent parser and only attempt bottom-up precomputation if the grammar author specifically says it's ok. The main problem being that people will say lots of things if it makes their code faster, without having any idea what it actually means.
Re: P6C: Parser Weirdness
On May-10, Joseph Ryan wrote: > > The Parse::RecDescent in parrot/lib is a hacked version that removes > a bunch of stuff (tracing code, iirc) from the outputted grammer so > that it runs many orders faster than the regular version. Or, to > put it another way, it increases P6C's runspeed from "infuriating" > to "slow" :) I think I've been told that at least once before, and forgotten. Is there a good place on the wiki FAQ where this could be inserted?
Re: [PATCH: P6C] update calling convention
On May-09, Allison Randal wrote: > > BTW, should I keep working on P6C? As A12 has just come out P6C may be > > heavily under construction, and I don't want to be in the way... > > Please do. I'm working on a first rough implementation of classes, but > it shouldn't interfere with general patches. I am very slowly working on a set of changes that both your patch and Allison's last patch interfere with -- but that's purely my problem, because I still don't have the time to finish it off enough to commit anytime soon. So please go ahead and work away; I'll deal with merging my own mess eventually. Is anyone other than the three of us currently working on P6C at all? Just curious.
Re: P6C: Parser Weirdness
On May-09, Abhijit A. Mahabal wrote: > On Sat, 8 May 2004, Abhijit A. Mahabal wrote: > > > I was writing a few tests for the P6 parser and ran into a weird problem. > > If I have the following in a file in languages/perl6, it works as > > expected: > > [...] > > > Now, if I put exactly the same contents in a file in > > languages/perl6/t/parser, then I get the following error: > > > Okay, I traced the problem to a "use FindBin" in P6C::Parser.pm. Is it > okay to change > > use lib $FindBin::Bin/../../lib; > > to > > use lib ../../lib; > > or is there a good reason not to? Neither of those seems right to me. The first keys off of the position of the binary, which could be anywhere with respect to the library module you're in; the second is relative to whatever the current directory is while you're running the script. I would think that something like use File::Basename qw(dirname); use lib dirname($INC{"P6C/Parser.pm"})."/../../../../lib"; (untested and probably not quite the right path) would be better. Or perhaps it should be ripped out entirely, and any script using P6C::Parser should be required to set the lib path correctly? It partly depends on whether we want to ensure that P6C::Parser preferentially uses the Parse::RecDescent from parrot/lib rather than a system-provided one. Which probably is not the case? > The current version makes it necessary > to put all files that "use P6C::Parser" in the same directory, and the > change would allow: > > perl t/parser/foo.t > > to work. Just make sure cd t; perl parser/foo.t works too.
Re: Q: status of IntList
On Apr-21, Leopold Toetsch wrote: > Is IntList used outside of some tests? > Can we rename it to IntvalArray? Yes, it is used in the languages/regex compiler (at least when embedded in Perl6, but IIRC in all cases.) But yes, go ahead and rename it.
Re: [CVS ci] cpu specfic config
On Apr-24, Leopold Toetsch wrote: > I've extended the config system by CPU specific functionality: > - new config step config/gen/cpu.pl is run after jit.pl > - this step probes for config/gen/cpu/$cpu/auto.pl and runs it if present > > For i386 we have: > - a new tool as2c.pl, which creates hopefully platform independent C > code from a gcc source file > - memcpy_mmx*.c > - and i386/auto.pl which runs this file as a test and sets config vars > > Next step will be to incorporated these files in platform code. And then we will have a natural choice for the next code name. Parrot will be the fastest bird in existence, but still won't quite fly, so let's call it a Cheeto (cheetah + dodo).
Re: ICU data file location issues
On Apr-14, Jeff Clites wrote:
> For Unix platforms at least, you should be able to do this:
> 
> executablePath = isAbsolute($0) ? dirname($0) : cwd().dirname($0)

Nope.

    sub executablePath {
        return dirname($0) if isAbsolute($0);
        return cwd().dirname($0) if hasSlash($0);
        foreach dir in $PATH {
            return $dir if -x "$dir/$0";
        }
        return "bastard process";
    }

which is why on Linux I give up on portability and say:

    return readlink("/proc/self/exe");

(ok, to match that'd need to be dirname(readlink(...)))
Re: [RESEND] [PATCH] Interpreter PMC
On Apr-09, Will Coleda wrote: > Subject: [perl #16414] [PATCH] Interpreter PMC > Created: 2004-04-09 02:59:29 > Content: There is now a ParrotInterpreter class which seems to provide > most of this functionality > - Is there anything you feel is still missing, or can we resolve the > call? Seems good enough. I resolved the ticket.
Re: Windows tinder builds
On Mar-26, Dan Sugalski wrote: > The VS/.NET build works fine, though three of the tests fail for odd > reasons. Those look like potential test harness errors. > > The cygwin build sorta kinda works OK, but the link fails because of > a missing _inet_pton. I seem to remember this cropping up in the past > and I thought we'd gotten it fixed, but apparently not. > > Anyway, these are set for hourly builds at half-hour offsets, so if > you check in any significant changes it'd be advisable to take a look > at the results. For those that don't know, all the tinderbox info is > web-accessable at > http://tinderbox.perl.org/tinderbox/bdshowbuild.cgi?tree=parrot A couple of weeks back, I also beefed up the error parsing stuff a bit on my Parrot tinderbox summarizer at http://foxglove.dnsalias.org/parrot/ so that it should give a good "what's broken and why" summary at a glance. Note that because I only run the summarizer once an hour at 17 minutes past the hour, it is not the right place to go if you're watching to see if your commit broke anything. For that, see the official URL above.
Re: Safety and security
On Mar-24, Dan Sugalski wrote: > At 12:36 PM +1100 3/24/04, [EMAIL PROTECTED] wrote: > >On 24/03/2004, at 6:38 AM, Dan Sugalski wrote: > > > >This is a question without a simple answer, but does Parrot provide > >an infrastructure so that it would be possible to have > >proof-carrying[1] Parrot bytecode? > > In the general sense, no. The presence of eval and the dynamic nature > of the languages we're looking at pretty much shoots down most of the > provable bytecode work. Unfortunately. ? I'm not sure if I understand why. (Though I should warn that I did not read the referenced paper; my concept of PCC comes from reading a single CMU paper on it a couple of years ago.) My understanding of PCC is that it freely allows any arbitrarily complex code to be run, as long as you provide a machine-interpretable (and valid) proof of its safety along with it. Clearly, eval'ing arbitrary strings cannot be proved to be safe, so no such proof can be provided (or if it is, it will discovered to be invalid.) But that just means that you have to avoid unprovable constructs in your PCC-boxed code. Eval'ing a specific string *might* be provably safe, which means that we should have a way for an external (untrusted) compiler to not only produce bytecode, but also proofs of the safety of that bytecode. We'd also need, of course, the trusted PCC-equipped bytecode loader to verify the proof before executing the bytecode. (And we'd need that anyway to load in and prove the initial bytecode anyway.) This would largely eliminate one of the main advantages of PCC, namely that the expensive construction of a proof need not be paid at runtime, only the relatively cheap proof verification. But if it is only used for small, easily proven eval's, then it could still make sense. The fun bit would be allowing the eval'ed code's proof to reference aspects of the main program's proof. But perhaps the PCC people have that worked out already? 
Let me pause a second to tighten the bungee cord attached to my desk -- all this handwaving, and I'm starting to lift off a little. The next step into crazy land could be allowing the proofs to express detailed properties of strings, such that they could prove that a particular string could not possibly compile down to unsafe bytecode. This would only be useful for very restricted languages, of course, and I'd rather floss my brain with diamond-encrusted piano wire than attempt to implement such a thing, but I think it still serves as a proof of concept that Parrot and PCC aren't totally at odds. Back to reality. I understand that many of Parrot's features would be difficult to prove, but I'm not sure it's fundamentally any more difficult than most OO languages. (I assume PCC allows you to punt on proofs to some degree by inserting explicit checks for unprovable properties, since then the guarded code can make use of those properties to prove its own safety.)
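The trust structure being described above can be sketched concretely. The following toy Python model is not real PCC (real certificates are formal safety proofs over the actual code, and all names here are invented for illustration); it only models the shape of the protocol: the untrusted compiler ships bytecode plus a certificate that is expensive to build but cheap to check, and the trusted loader verifies the certificate before executing anything.

```python
# Toy model of the proof-carrying-code loading protocol.
# The "proof" here is just a claimed list of opcodes used; real PCC
# certificates are formal safety proofs, but the division of labor is
# the same: expensive to construct, cheap to verify.

UNSAFE_OPS = {"syscall", "poke_memory"}

def compile_with_proof(source_ops):
    """Untrusted side: produce bytecode plus a safety certificate."""
    bytecode = list(source_ops)
    proof = sorted(set(bytecode))      # expensive analysis would go here
    return bytecode, proof

def load(bytecode, proof):
    """Trusted side: verify the certificate, then run."""
    if set(proof) != set(bytecode):    # proof must describe this bytecode
        raise ValueError("proof does not match bytecode")
    if UNSAFE_OPS & set(proof):        # and must establish safety
        raise ValueError("bytecode is provably unsafe")
    return list(bytecode)              # "execute" it

bytecode, proof = compile_with_proof(["push", "add", "print"])
print(load(bytecode, proof))           # → ['push', 'add', 'print']
try:
    load(["syscall"], ["syscall"])
except ValueError as e:
    print(e)                           # → bytecode is provably unsafe
```

The point of the sketch is the one made in the message: an eval'd string can participate as long as its (untrusted) compiler can also emit a certificate, with the loader's cheap verification standing between compilation and execution.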
Re: imcc concat and method syntax
On Mar-13, Luke Palmer wrote: > luka frelih writes: > > >But how should the two interpretations of x.x be resolved? Is that > > >concatenation or method calling? > > > > currently, the pir line > > S5 = S5 . 'foo' > > produces > > error:imcc:object isn't a PMC > > > > concatenation with . seems to be gone > > i cannot think of a good replacement for it > > Well, Perl 6 uses ~ . I think that would be a fair replacement: > > S5 = S5 ~ 'foo' That currently means binary xor in imcc, so if we changed it we'd break compatibility with current compilers and scripts. OTOH, it sounds like I already broke it by changing the outcome of the ambiguous x.x interpretation -- oops. I can change it back with precedence games, but would rather not exert the effort, since I think using ~ is a better way to go. (Barring other better ideas, that is.) I tend to use the 'concat' op in my own code anyway. So I'll abide by Leo's or Melvin's ruling.
Re: [BUG] can not call methods with "self"
On Mar-12, Leopold Toetsch wrote: > Steve Fink <[EMAIL PROTECTED]> wrote: > > The attached patch should remove all of the conflicts, and replace > > them with a single shift/reduce conflict that appears to be a bug in > > the actual grammar, namely: > > > x = x . x > > Ah yes. Of course. Thanks a lot, applied. But how should the two interpretations of x.x be resolved? Is that concatenation or method calling?
Re: Methods and IMCC
On Mar-12, Dan Sugalski wrote: > At 9:49 AM +0100 3/12/04, Leopold Toetsch wrote: > >Dan Sugalski wrote: > > > >>Calling a method: > >> > >> object.variable(pararms) > > > >Do we need the more explicit pcc_call syntax too: > > > > .pcc_begin > > .arg x > > .meth_call PObj, ("meth" | PMeth ) [, PReturnContinuation ] > > .result r > > .pcc_end > > Sure. Or we could make it: > >.pcc_begin >.arg x >.object y >.meth_call "foo" >.result r >.pcc_end > > to make things simpler. I vote yes -- until we add AST input to imcc, making the args and invocant be line-oriented makes code generation easier for the Perl6 compiler, at least. (Although I might do it the 1st way anyway, just because I spend so much time staring at generated code.) But I had to stare at the ".object" for a second before I realized you weren't just giving the type of another arg -- would it be better to use ".invocant"?
Re: [BUG] can not call methods with "self"
On Mar-11, Leopold Toetsch wrote: > Jens Rieks <[EMAIL PROTECTED]> wrote: > > > attached is a patch to t/pmc/object-meths.t that adds a test that is > > currently failing because IMCC rejects code like self."blah"() > > Yep. It produces reduce/reduce conflicts. Something's wrong with > precedence. I'd be glad if someone can fix it. The attached patch should remove all of the conflicts, and replace them with a single shift/reduce conflict that appears to be a bug in the actual grammar, namely: x = x . x can be parsed as x = x . x VAR '=' VAR '.' VAR target '=' var '.' var assignment or x = x . x VAR '=' VAR '.' VAR target '=' target ptr target target '=' the_sub target '=' sub_call assignment Personally, I'd probably also rename 'target' to 'lhs', and 'var' (and its variants) to 'rhs'. But maybe that's just me. Oh, and 'lhs' is available because this patch eliminates it. I didn't try the test mentioned, though. Index: imcc/imcc.y === RCS file: /cvs/public/parrot/imcc/imcc.y,v retrieving revision 1.125 diff -u -r1.125 imcc.y --- imcc/imcc.y 11 Mar 2004 16:37:56 - 1.125 +++ imcc/imcc.y 12 Mar 2004 08:33:49 - @@ -272,7 +272,7 @@ %type key keylist _keylist %type vars _vars var_or_i _var_or_i label_op %type pasmcode pasmline pasm_inst -%type pasm_args lhs +%type pasm_args %type targetlist arglist %token VAR %token LINECOMMENT @@ -784,7 +784,7 @@ { $$ = MK_I(interp, cur_unit, "bxor", 3, $1, $3, $5); } | target '=' var '[' keylist ']' { $$ = iINDEXFETCH(interp, cur_unit, $1, $3, $5); } - | var '[' keylist ']' '=' var + | target '[' keylist ']' '=' var { $$ = iINDEXSET(interp, cur_unit, $1, $3, $6); } | target '=' NEW classname COMMA var { $$ = iNEW(interp, cur_unit, $1, $4, $6, 1); } @@ -850,9 +850,9 @@ if ($1->set != 'P') fataly(1, sourcefile, line, "Sub isn't a PMC"); } - | lhs ptr IDENTIFIER { cur_obj = $1; $$ = mk_sub_address($3); } - | lhs ptr STRINGC{ cur_obj = $1; $$ = mk_const($3, 'S'); } - | lhs ptr target { cur_obj = $1; $$ = $3; } + | target ptr IDENTIFIER { 
cur_obj = $1; $$ = mk_sub_address($3); } + | target ptr STRINGC{ cur_obj = $1; $$ = mk_const($3, 'S'); } + | target ptr target { cur_obj = $1; $$ = $3; } ; ptr:POINTY { $$=0; } @@ -916,11 +916,6 @@ | reg ; -lhs: - VAR/* duplicated because of reduce conflict */ - | reg - ; - vars: /* empty */ { $$ = NULL; } | _vars { $$ = $1; } @@ -933,7 +928,7 @@ _var_or_i: var_or_i { regs[nargs++] = $1; } - | lhs '[' keylist ']' + | target '[' keylist ']' { regs[nargs++] = $1; keyvec |= KEY_BIT(nargs); @@ -952,8 +947,7 @@ ; var: - VAR - | reg + target | const ;
CVS update warning
. . . P docs/pmc/subs.pod cvs server: internal error: unsupported substitution string -kCOPY U docs/resources/parrot.small.png U docs/resources/perl-styles.css cvs server: internal error: unsupported substitution string -kCOPY U docs/resources/up.gif . . . Should those perhaps be -kb or -ko? My version of CVS certainly doesn't know COPY, nor have I ever heard of it.
Re: Objects: Now or forever (well, for a while) hold your peace
On Feb-19, Dan Sugalski wrote: > At 7:30 PM -0500 2/18/04, Simon Glover wrote: > > One really pedantic comment: wouldn't it make sense to rename the > > fetchmethod op to fetchmeth, for consistency with callmeth, tailcallmeth > > etc? > > Good point. I'll change that, then. Do you really want to repeat C's infamous "creat" (mis)spelling? Admittedly, it is not very ambiguous in this case, because there are already so many letters, but still, two extra letters is a small price to pay...
Re: Re: RT Cleanup
Andrew Dougherty wrote: On Wed, 4 Feb 2004, Steve Fink wrote: On Feb-02, Andrew Dougherty wrote: [EMAIL PROTECTED] 19184 languages/perl6/t/rx/call test error 1 years Keep this one open. The tests still fail. How recently did you check? I committed a reimplementation of perl6 regexes about a week ago. The above test still failed, but only due to a parrot memory corruption bug, and I committed something else the next day that coincidentally sidestepped the bug on my machine. It's probably a different bug than #19184, but here's what I just got for cd languages/perl6 make test (This is for perl5.00503, Solaris 8/SPARC, Sun Workshop compiler) Try cd languages/perl6 ./perl6 --force-grammar -e 1 # don't worry if it fails make test Except I never do 'make test' because, as you noticed, it takes forever to run. Use ./perl6 --test instead. (Or, in this case, maybe just ./perl6 --test t/rx/*.t) The slowest part of the perl6 compiler is simply loading in the Parse::RecDescent parser. That line loads it in once and reuses it for all the tests. I think nobody ever changed 'make test' to use it because if one test kills the process, then all remaining tests fail too. But perhaps I should have make test print at the very end: Hey, that took forever, didn't it? Maybe you should try using ./perl6 --test instead, as documented in [I forget where, and can't look it up right now]. t/rx/basic..Read on closed filehandle at P6C/TestCompiler.pm line 71. Use of uninitialized value at ../../lib/Parrot/Test.pm line 87. # Failed test (t/rx/basic.t at line 7) # got: 'error:imcc:main: Error reading source file t/rx/basic_1.pasm. # ' # expected: 'ok 1 # ok 2 # ok 3 # ok 4 # ok 5 # ok 6 # ok 7 # ok 8 # ok 9 # ' Odd... I'll take a look tonight, thanks. Finally, is it just me, or do these tests take a long time for everyone? Today, it took 21 minutes to run the perl6 test suite. 
While I appreciate the value of a comprehensive test suite, I wonder if there might be some way to speed things up a bit (apart from buying a faster machine, of course!) Oh, and the perl6 test suite is *far* from comprehensive. It's just slow.
Re: RT Cleanup
On Feb-02, Andrew Dougherty wrote: > [EMAIL PROTECTED] 19184 languages/perl6/t/rx/call test error > 1 years > > Keep this one open. The tests still fail. How recently did you check? I committed a reimplementation of perl6 regexes about a week ago. The above test still failed, but only due to a parrot memory corruption bug, and I committed something else the next day that coincidentally sidestepped the bug on my machine. I would find it easy to believe that it is still happening on other machines. Could you give it a try and let me know? All perl6 tests should be passing right now.
IMC returning ints
I did a cvs update, and it looks like imcc doesn't properly return integers anymore from nonprototyped routines. Or maybe it never did, and the switchover from nonprototype being the default to prototyped is what triggered it (because I had to add some explicit non_prototyped declarations, although I suspect they are incorrect.) Test patch is attached, test case is: .pcc_sub _main $P0 = newsub _L_closure2 $I0 = 17 .pcc_begin non_prototyped .arg $I0 .pcc_call $P0 L_after_call7: .result $I1 .pcc_end after_call: print "returned " print $I1 print "\n" end .end .pcc_sub _L_closure2 non_prototyped .param int value .pcc_begin_return .return value .pcc_end_return .end ? imcc/tc Index: imcc/t/syn/pcc.t === RCS file: /cvs/public/parrot/imcc/t/syn/pcc.t,v retrieving revision 1.31 diff -p -u -b -r1.31 pcc.t --- imcc/t/syn/pcc.t20 Jan 2004 01:50:47 - 1.31 +++ imcc/t/syn/pcc.t20 Jan 2004 01:58:28 - @@ -1,6 +1,6 @@ #!perl use strict; -use TestCompiler tests => 34; +use TestCompiler tests => 36; ## # Parrot Calling Conventions @@ -96,6 +96,60 @@ CODE 10 20 30 +OUT + +output_is(<<'CODE', <<'OUT', "non-prototyped int return"); +.pcc_sub _main + $P0 = newsub _L_closure2 +$I0 = 17 + .pcc_begin non_prototyped + .arg $I0 + .pcc_call $P0 +L_after_call7: + .result $I1 + .pcc_end +after_call: +print "returned " +print $I1 +print "\n" +end +.end + +.pcc_sub _L_closure2 non_prototyped + .param int value +.pcc_begin_return +.return value +.pcc_end_return +.end +CODE +returned 17 +OUT + +output_is(<<'CODE', <<'OUT', "prototyped int return"); +.pcc_sub _main + $P0 = newsub _L_closure2 +$I0 = 17 + .pcc_begin prototyped + .arg $I0 + .pcc_call $P0 +L_after_call7: + .result $I1 + .pcc_end +after_call: +print "returned " +print $I1 +print "\n" +end +.end + +.pcc_sub _L_closure2 prototyped + .param int value +.pcc_begin_return +.return value +.pcc_end_return +.end +CODE +returned 17 OUT ##
Re: cvs commit: parrot/imcc/t/syn pcc.t
On Jan-15, Melvin Smith wrote: > At 11:20 AM 1/15/2004 +0100, Leopold Toetsch wrote: > >Melvin Smith <[EMAIL PROTECTED]> wrote: > >> > >> For some reason 1 test in pcc.t is failing (the nci call) > > > >Off by one error caused by: > > > >> -for (j = 0; j < 4; j++) { > > > >> +for (set = 0; set < REGSET_MAX; set++) { > > > >As most loops inside Parrot use the former scheme, I'd rather keep it > >then switching to an (it seems) error prone variant "set <= REGSET_MAX" > > I like my version because it is self-documenting. I think these are #defines, but for enums I always use the pattern: enum { CLR_BLUE, CLR_RED, CLR_VOMIT_GREEN, NUM_COLORS/* or CLR_COUNT, or CLR_ENTRIES, or ... */ }; for (i = 0; i < NUM_COLORS; i++) ... So how about a REGSET_SIZE?
Memory corruption
Here's another obnoxious test case. I started to try to strip it down, but it starts working again if I even delete nonsense lines from a subroutine that is never called. And I'm working on something else and not at all in the mood to re-learn how to debug parrot internals. It turns out that I don't get the crash when running JITted, so I think I'll just do that for now. So, in case anyone is curious (hi leo!), attached is a 56KB (<9KB gzipped) imc file. It crashes on a memcpy inside compact_pool (triggered by new_hash). b->buflen is obviously corrupted. Using -G to disable garbage collection (does that work?) doesn't seem to help matters at all. Deleting the __setup sub at the end of the file makes the problem go away. (Note that __setup is never actually called, and the body of the routine is irrelevant other than its length.) dead.imc.gz Description: GNU Zip compressed data
Re: [RFC] IMCC pending changes request for comments
On Dec-02, Melvin Smith wrote: > > 1) Currently typenames are not checked except with 'new ' I would vote for no aliases at all. I propagated the existing uses of ".local object" in the Perl6 compiler and introduced several more uses, but that was only because I wasn't sure at the time whether it was intended to (now or in the future) have slightly different semantics. It wasn't, I'm pretty sure. So I'll switch Perl to using 'pmc' if you make the change. > 2) In the absence of some sort of return instruction, subs > currently just > run off the end of their code and continue merrily. This > feature really > isn't useful as far as I can see because it is not supported to > assume > any ordering between compilation unit, which a sub _is_. > > It is easier to just assume a sub returns by the active > convention. > > I'd like to just be able to write void subs as: > > .sub _foo >print "hello\n" > .end Do you really want to generate the extra unused code at the end of all the subroutines? I don't think you want to autodetect whether the code is guaranteed to return. How about adding a return value specification if you want this generated automatically: .sub _gorble () returns (int r) r = 5 .end .sub _foo () returns void print "hello\n" .end (This assumes you're creating implicit locals for return values as well as parameters, as you described in #3.) > 3) Strict prototyping mode shortcut (backwards compatible of > course): > As usual, shortcuts are for hand-hackers, but its easier to > debug either way. > >.sub _baz (pmc p, int i) > ... >.end > >Same as: > >.sub _baz protoyped > .param pmc p > .param int i > ... >.end Sounds good to me; debugging the Perl6 sub calling stuff would have been easier if I didn't have to read so much code to figure out what was going on.
Re: Another minor task for the interested
On Nov-21, Dan Sugalski wrote: > > I was mostly thinking that some step or other in the Makefile has a > dependency on that file, and some other step creates it, but the > dependency's not explicit. I'd like to find the step(s) that require it > and make it a dependency for them, then add in a dependency for the file > for whatever step actually creates it, so things get ordered properly. It > should (hopefully) be straightforward, but... I have other evidence of dependency entanglement -- fairly often, I do a 'make; make test' (which I think is equivalent to 'make test' both theoretically and practically), and I'll have a bunch of tests fail. Doing a 'make clean; make test' fixes the failures. (Ok, sometimes it requires a re-Configure.pl too, but that's another issue.) There is a known dependency gap due to the recursive invocation of classes/Makefile, but I don't think that is causing either of these problems. Random idea for this problem and the zillions of similar problems people face all the time with make: it would be cool to patch ccache so that it reports its cache misses. And just to be anal, add a flag saying 'for this run, do not evict things from the cache to restrict space usage.' Then, when you do a 'make' that doesn't remake enough, you could do make clean export CC="ccache --report-misses=/tmp/misses.txt --no-evictions gcc" make and you could look at the first miss to see an example of something that needed to be rebuilt, but did not have a dependency triggering it. (I can see at least one way in which this scheme is still not guaranteed to be correct -- there's a reasonable chance you would hit in the cache from an unrelated compile, and thus fail to see the first missing dependency.)
Re: Do my debugging for me? (memory corruption)
On Nov-21, Leopold Toetsch wrote: > Steve Fink <[EMAIL PROTECTED]> wrote: > > I'm staring at a crash > > > I'll attach the 5KB compressed .imc file (25KB uncompressed; PIR code > > Its really good, to have such short code snippets, that clearly show, > where a bug is coming from ;) Anyway, it was again me causing this bug - > sorry. > > Fixed and updated the comment which I didn't understand when removing > it. You're awesome. Thank you. I didn't boil that down to a small test case because I felt that it was necessary to preserve the full context, so that people could adequately appreciate the entirety of the problem with a complete historical and cultural perspective. A stripped-down test case would conceal the intention behind the code and forever prevent future historians from... Yeah, okay. It was late and I was lazy. Thanks again.
Do my debugging for me? (memory corruption)
I'm staring at a crash, my eyes are glazing over, and I need sleep. So I was wondering if anyone would be interested in taking a look at a .imc file that is giving me a seg fault while marking a hash in a gc run triggered by a hash PMC allocation. Or at least tell me whether it's seg faulting on your machine too. I'll attach the 5KB compressed .imc file (25KB uncompressed; PIR code is redundant!) It's generated from the following Perl6 code, but you'd need my local changes in order to regenerate it: rule thing() { <[a-z]>+ } rule parens() { { print "entering parens\n"; } \( [ | | \s ]* \) { print "leaving parens\n"; } } sub main() { my $t = "()()(((blah blah () blah))(blah))"; my $t2 = $t _ ")"; print "ok 8\n" if $t2 !~ /^ $/; } (the "entering/leaving parens" printouts have no effect on the bug; they're just remnants of my flailing.) If you run with --gc-debug, it dies a little earlier, but in what appears to be the same op. Hopefully, Steve hash_bug.imc.gz Description: GNU Zip compressed data
Re: Calling conventions. Again
I'm getting a little confused about what we're arguing about. I will take a stab at describing the playing field, so people can correct me where I'm wrong: Nonprototyped functions: these are simpler. The only point of contention here is whether args should be passed in P5..P15, overflowing into P3; or just everything in P3. Dan has stated at least once that he much prefers the P5..P15, and there hasn't been much disagreement, so I'll assume that that's the way it'll be. Prototyped functions: there are a range of possibilities. 1. Everything gets PMC-ized and passed in P3. (Oops, I wasn't going to mention this. I did because Joe Wilson seemed to be proposing this.) No arg counts. 2. Everything gets PMC-ized and passed in P5..P15+P3. Ix is an arg count for the number of args passed in P5..P15. P3 is empty if argcount <= 11 (so you have to completely fill P5..P15 before putting stuff in P3.) 3. Same as above, but you can start overflowing into P3 whenever you want. Mentioned for completeness. Not gonna happen. In fact, anything above this point ain't gonna happen. 4. PMCs get passed in P5..P15+P3, ints get passed in I5..I15+P3, etc. Ix is a total argument count (number of non-overflowed PMC args + number of non-overflowed int args + ...). Arguments are always ordered, so it is unambiguous which ones were omitted in a varargs situation. I think this is what Leo is arguing for. 5. PMCs get passed in P5..P15+P3, ints get passed in I5..I15+P3, etc. Ix is the number of non-overflowed PMC args, Iy is the number of non-overflowed int args, etc. I think this is what Dan is arguing for. 6. PMCs get passed in Px..P15+P3, ints get passed in I5..I15+P4, etc. Ix is the number of non-overflowed PMC args, Iy is the number of non-overflowed int args, etc. I made this one up; see below. 
Given that all different types of arguments get overflowed into the same array (P3) in #4 and #5, #4 makes some sense -- if you want to separate out the types, then perhaps it should be done consistently with both argument counts _and_ overflow arrays. That's what #6 would be. Note that it burns a lot of PMC registers. The other question is how much high-level argument passing stuff (eg, default values) should be crammed in. The argument against is that it will bloat the interface and slow down calling. The argument for is that it increases the amount of shared semantics between Parrot-hosted languages. An example of how default values could be wedged in is to say that any PMC parameter can be passed a Null PMC, which is the signal to use the default value (which would need to be computed in the callee, remember), or die loudly if the parameter is required. Supporting optional integer, numeric, or string parameters would be trickier. Or disallowed. Hopefully I got all that right.
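As a sanity check on scheme #5 above, here is a toy Python model of the register assignment it implies (register numbering and the single shared overflow array P3 are as described in the message; the code itself, and treating counts as a simple per-type dict, are my own illustrative choices):

```python
# Toy model of scheme #5: each type gets its own register file
# (P5..P15, I5..I15, S5..S15, N5..N15), with a separate count per
# type, and overflow args of every type spilling (PMC-ized) into the
# single array P3.

def assign_registers(args):
    """args: list of (type, value) pairs, type in {'P', 'I', 'S', 'N'}."""
    regs = {}
    counts = {t: 0 for t in "PISN"}    # non-overflowed count per type
    overflow = []                      # this models P3
    next_reg = {t: 5 for t in "PISN"}  # first argument register is x5
    for typ, val in args:
        if next_reg[typ] <= 15:
            regs["%s%d" % (typ, next_reg[typ])] = val
            next_reg[typ] += 1
            counts[typ] += 1
        else:
            overflow.append(val)       # would be PMC-ized into P3
    return regs, counts, overflow

regs, counts, overflow = assign_registers(
    [("P", "pmc0"), ("I", 42), ("I", 7), ("S", "hi")])
print(regs)     # {'P5': 'pmc0', 'I5': 42, 'I6': 7, 'S5': 'hi'}
print(counts)   # {'P': 1, 'I': 2, 'S': 1, 'N': 0}
```

Scheme #4 would collapse `counts` into a single total; scheme #6 would give each type its own overflow array instead of sharing P3.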
Re: [CVS ci] hash compare
On Nov-12, Leopold Toetsch wrote: > I've committed a change that speeds up hash_compare considerably[1], > when comparing hashes with mixed e.g. ascii and utf8 encodings. I read the patch, and thought that we'll also have a lot of ($x eq $y) and ($x ne $y) statements that this won't accelerate -- couldn't this be done as another string vtable entry instead of being specific to hash_compare? It seems like you're able to do the optimizations when string_compare isn't purely because string_compare needs to test ordering, not just equality. Am I missing something (as usual), or would this be better done by adding a string_equal?
Re: #define version obj.version is a kind of generic thing
On Oct-27, Leopold Toetsch wrote: > Arthur Bergman <[EMAIL PROTECTED]> wrote: > > > include/parrot/pobj.h:# define version obj.version > > Sorry for that :) We can AFAIK toss the version part of a PObj. Its > almost unused and hardly needed. It could be renamed too inside parrot. I'm the guilty one who added the version field (though back then, it was added to a Buffer object, IIRC). I found it very helpful in debugging GC problems -- I would have a problem where memory was being stomped on, and trace it back to a prematurely freed and reused header. But in order to figure out what was going on, I needed to trace it back to the point of allocation of the header, but headers get reused so frequently that setting a conditional breakpoint on the address of the header was useless. So I added in the version number to provide something to set a breakpoint on. It is very possible that this is no longer useful; I haven't been working on stuff where I have needed it for long enough that the code has mutated significantly. Is it still useful? If not, then go ahead and rip it out.
Re: [BUG] IMCC looking in P3[0] for 1st arg
On Oct-26, Melvin Smith wrote: > At 06:25 PM 10/26/2003 -0800, Steve Fink wrote: > > .pcc_sub _main prototyped > > .pcc_begin_return > > .return 10 > > .return 20 > > .pcc_end_return > > .end > > It is still the same issue. This code explicitly mixes 2 call conventions. > _main is declared as prototyped so it will return 1 in I0 which signals that > it is returning its values in registers in prototyped convention. Your > call explicitly calls in non_prototyped mode which does not generate any > code to check the return convention since you are saying by your code > that you _know_ what the call convention is. Oops, I meant to leave the "prototyped" off of the _main sub. This behaves indistinguishably from declaring it prototyped; it seg faults if called non_prototyped. I believe it's supposed to work when called with either style. Sorry for the bad example. Likewise, if I declare the .pcc_sub to be non_prototyped (so that both the call and declaration are non_prototyped), I get the same error: .sub _main .local Sub myfunc myfunc = newsub _myfunc .pcc_begin non_prototyped .pcc_call myfunc ret: .result $I0 .result $I1 .pcc_end print "Returned " print $I0 print "," print $I1 print "\n" end .end .pcc_sub _myfunc non_prototyped .pcc_begin_return .return 10 .return 20 .pcc_end_return .end % ./perl6 -Rt mini.imc . . . PC=60; OP=38 (invoke_p); ARGS=(P1=RetContinuation=PMC(0x40c04998)) PC=21; OP=1003 (restoretop) PC=22; OP=801 (shift_i_p); ARGS=(I17=0, P3=NULL) Error: '/home/sfink/parrot/parrot -t -r mini.imc ' failed died with signal 11 (SIGSEGV) > However, I see your point. To be orthogonal would suggest that we > implement the same feature for .pcc_call that we do for the .pcc_sub > declaration. If you left off the calling convention to .pcc_call it > would generate code to check for either. Although this would really > bloat the code, it might be wise to support the feature for some > instances. No, sorry, my bad example obscured the issue. 
I was not asking for a .pcc_begin that works in either case; I just want it to be possible to call a subroutine without a prototype and have it successfully return values. True, I would also like to be able to call the same subroutine *with* a prototype in another call site, but that is already implemented. I don't think allowing people to leave off the {non_,}prototyped declaration from .pcc_begin provides anything but superficial syntactic orthogonality; a call site really ought to know whether it can see the prototype of what it's calling or not! (Well... except I sometimes call non_prototyped even when I know the prototype, because pdd03 calling convention prototypes don't handle everything in a Perl6 prototype. But that's irrelevant here.) My brain doesn't seem to be working all that well this weekend... I'll throw in one more thing just because I know a certain Mr. P. Cawley dearly loves people to pile unrelated things into a single thread: could there be a way to expose which continuation to invoke when returning from a routine? In a regex, I'd really like a rule to be invoked with a "success" continuation and a "fail, so backtrack" continuation. And possibly with some more extreme failure continuations for cuts and commits and things. But right now the return continuation in P1 is hidden inside the PCC mechanism. (I guess I could just manually overwrite P1, but that seems like it's working against imcc rather than with it.) I'm faking it for now by returning a boolean status code, but that doesn't really feel like the "right" solution.
Re: [BUG] IMCC looking in P3[0] for 1st arg
On Oct-26, Leopold Toetsch wrote: > Steve Fink <[EMAIL PROTECTED]> wrote: > > I am getting a seg fault when doing a very simple subroutine call with > > IMCC: > > > .sub _main > > newsub $P4, .Sub, _two_of > > $P6 = new PerlHash > > .pcc_begin prototyped > ^^ > > .pcc_sub _two_of non_prototyped >^^ > > You are stating explicitely conflicting call types. That can't work. > When you remove "non_prototyped" in the sub, its prepared to be called > either way and it works. That is working for me now for the parameter passing, but not for return values. The below code seg faults because it is attempting to pry return values out of P3; it works if I switch the line .pcc_begin non_prototyped to .pcc_begin prototyped I'm not sure if this is implemented yet, though. Code follows: .sub __main .local Sub main_sub main_sub = newsub _main .pcc_begin non_prototyped .pcc_call main_sub ret: .result $I0 .result $I1 .pcc_end print "Returned " print $I0 print "," print $I1 print "\n" end .end .pcc_sub _main prototyped .pcc_begin_return .return 10 .return 20 .pcc_end_return .end
Re: [BUG] IMCC looking in P3[0] for 1st arg
On Oct-26, Leopold Toetsch wrote: > Steve Fink <[EMAIL PROTECTED]> wrote: > > I am getting a seg fault when doing a very simple subroutine call with > > IMCC: > > > .sub _main > > newsub $P4, .Sub, _two_of > > $P6 = new PerlHash > > .pcc_begin prototyped > ^^ > > .pcc_sub _two_of non_prototyped >^^ > > You are stating explicitely conflicting call types. That can't work. > When you remove "non_prototyped" in the sub, its prepared to be called > either way and it works. Doh! Thanks, I definitely should have noticed that. Although this does bring up another issue -- should parrot really be seg faulting when it gets an uninitialized (null) PMC? It happens to me quite often. In a way, the current behavior is rather nice, since the errors tend to be more obvious. Then again, that's only because all my test programs die early enough that no incorrect but non-null values have snuck into my registers yet. Also, I wouldn't expect a VM to fall flat on its face from something like this.
[BUG] IMCC looking in P3[0] for 1st arg
I am getting a seg fault when doing a very simple subroutine call with IMCC: .sub _main newsub $P4, .Sub, _two_of $P6 = new PerlHash .pcc_begin prototyped .arg $P6 .arg 14 .pcc_call $P4 after: .pcc_end end .end .pcc_sub _two_of non_prototyped .param PerlHash Sunknown_named3 .param int mode .pcc_begin_return .pcc_end_return .end The problem is that IMCC is checking to see whether the 1st argument is of the correct type (PerlHash), but it looks for the argument in P3[0], when in fact it isn't an overflow arg and so is in P5. P3, in fact, is null and so parrot seg faults. Oddly, if I take away the int parameter (and corresponding argument), it does not crash. But this also seems to remove the typecheck entirely.
Re: [perl #24226] [PATCH] Bad casts in interpreter.c
On Oct-15, Adam Thomason wrote: > # New Ticket Created by "Adam Thomason" > # Please include the string: [perl #24226] > # in the subject line of all future correspondence about this issue. > # http://rt.perl.org/rt2/Ticket/Display.html?id=24226 > > > > IBM VisualAge C 6 complains about some data<->function pointer casts in > interpreter.c: Thanks, applied.
[COMMIT] perl6 sub calling
For those of you not on the CVS list, I just committed a rather large change to the perl6 compiler that implements a subset of the A6 subroutine signature rules. My implementation is rather ad-hoc, but it is a decent representation of my slowly evolving understanding of how this stuff's supposed to work. Eventually, I'd like to rewrite it to be more encapsulated, and make it plug into the parser better. Ooh, and it needs some runtime context stuff, but I don't think that really exists anywhere in perl6 right now. Even better, I'd love to see someone else rewrite it properly. See languages/perl6/t/compiler/call.t for several examples of usage. Briefly, it handles things like: sub f ($a, $b) { ... } sub g ($a, +$b) { ... } sub h ($a, [EMAIL PROTECTED]) { ... } f(1,2); f(a, b => 2); g(a => 1, b => 2); h(1, 2, 3, 4); h(1, 2, [EMAIL PROTECTED], 40); It pretends to handle optional parameters, but if you don't pass in a value to an optional parameter, it reuses whatever happened to be lying around in that register, and there's no way of telling whether the caller specified a value or not. I doubt anyone will actually use any of this for a while, so unless I get change requests I'm planning on leaving this in its current half-baked state for now, and going back to using it for a regular expression engine, which is why I went down this rabbit hole in the first place.
Re: [COMMIT] new IO op 'pioctl'
On Oct-11, Melvin Smith wrote: > At 09:19 AM 10/11/2003 -0700, Steve Fink wrote: > >On Oct-10, Melvin Smith wrote: > >> At 08:31 AM 10/10/2003 -0400, Dan Sugalski wrote: > >> > > >> >I think it's time to start thinking about it. (And I think we need a new > >> >name, but that's because I've always hated 'ioctl' :) > >> > >> :) > >> > >> I also considered iocmd, ioattr and ioset. > >> > >> IPop your favorite into the suggestion box... > > > >How about keyed access to the IO PMC? > > > > set I0, P0[.CMDGETBUFSIZE] > > set P0[.CMDSETBUFSIZE], I0 > > I like that. > > Actually it could look even simpler since we have separate setkeyed > and getkeyed support: > > set IO, P0[.BUFSIZE] > set P0[.BUFSIZE], 8192 Actually, looking at that suggests that perhaps this should be done through the setprop/getprop interface instead, since that seems like a closer semantic fit to what you're doing.
Re: [COMMIT] new IO op 'pioctl'
On Oct-10, Melvin Smith wrote:
> At 08:31 AM 10/10/2003 -0400, Dan Sugalski wrote:
> >
> >I think it's time to start thinking about it. (And I think we need a new
> >name, but that's because I've always hated 'ioctl' :)
>
> :)
>
> I also considered iocmd, ioattr and ioset.
>
> Pop your favorite into the suggestion box...

How about keyed access to the IO PMC?

    set I0, P0[.CMDGETBUFSIZE]
    set P0[.CMDSETBUFSIZE], I0
Re: More fun with argument passing
On Oct-05, Luke Palmer wrote:
> Steve Fink writes:
> > Ok, I'm back to argument passing. I'm starting a new thread because
> > I'm lazy and I have to scroll back too far in my mailer to see the old
> > arg passing thread. :-) And yes, most of this message should also be
> > on -languages.
>
> Which it now is. Although, there are some internals issues, too, so I
> wonder how we can do this. How about, when someone responds to either
> an -internals- or a -language-specific question, he directs it only to
> the appropriate list.

And I guess cross-post one last time when moving a piece from one list to another?

> > I can use names to pass required arguments, but all positional args
> > must precede all named args. So then is this legal:
> >
> >     f(1, 2, b => 1.5)
> >
> > or must all of the positional args referred to by named parameters
> > follow those passed positionally? (There are two orders -- argument
> > order and parameter order. In which of those two must all positionals
> > precede the named params?)
>
> Both. (In parameter order, named-only must come after positionals) So
> f(1, 2, b => 1.5) was correct.

Huh? In argument order, clearly all the positionals precede the named. But if 2 is bound to $c, then they are out of parameter order. Or does that not bind 2 to $c? Are both 2 and 1.5 bound to $b (and resolved as below)?

> > In
> >
> >     sub j($a, ?$b, *%c) { ... }
> >
> > can I actually pass %c after the rest of the params? If I call it with
> >
> >     j(1, $blue => 'red')
> >
> > then does that compile down to
> >
> >     .param 1
> >     .param named_arg_table
> >
> > ? How is the callee supposed to know whether the 2nd param is $b or
> > %c? What if $blue happened to be "b"?
>
> If $blue was 'b', then j would get $b to be 'red'. Run-time positionals
> are another one of those things I don't expect to see all that often
> (but that might be a different story in my code >:-).

Sorry, that was an implementation question, not a language question. From the language level, clearly the effect you want is for 1 to be bound to $a and 'red' to be bound to either $b or %c{$blue}, depending on whether $blue eq "b". The question is in what order the parameters should be passed. Both work, but both cause problems.

> The real problem arises in:
>
>     j(1, 2, $blue => 'red')
>
> If $blue happens to be 'b'. I think the behavior then would be $b gets
> 2, and %h gets { b => 'red' }. In particular, I think it's wrong to
> croak with an error here.

Larry had some discussion of this:

    However, it is erroneous to simultaneously bind a parameter both by
    position and by name. Perl may (but is not required to) give you a
    warning or error about this. If the problem is ignored, the positional
    parameter takes precedence, since the name collision might have come
    in by accident as a result of passing extra arguments intended for a
    different routine. Problems like this can arise when passing optional
    arguments to all the base classes of the current class, for instance.

It's not yet clear how fail-soft we should be here.

Oh, and in discussing this, I'm wondering about one bit of vocabulary: do you bind an argument to a parameter, a parameter to an argument, or do you bind an argument and parameter together? E6 binds arguments to parameters. What if you are binding multiple arguments to a single parameter, as is the case with slurpy params? It doesn't matter, really, but in my documentation and discussion I'd like to be consistent, just because this stuff is already a ways beyond my mental capacity and anything that simplifies things is greatly appreciated!
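The "simultaneously bind a parameter both by position and by name" collision that Larry describes can be seen concretely in Python, which happens to pick the strictest of the policies under discussion: it croaks. A hedged sketch, with `j` only loosely mirroring the `sub j($a, ?$b, *%c)` signature from the thread:

```python
# Python analogue of j($a, ?$b, *%c): b is optional, c slurps extra
# named args. Illustration only, not the Perl6 semantics being debated.

def j(a, b=None, **c):
    return (a, b, c)

print(j(1, blue="red"))   # 'blue' isn't a parameter name: lands in c
print(j(1, b="red"))      # 'b' IS a parameter name: binds to b instead

try:
    j(1, 2, b="red")      # 2 (positional) and 'red' (named) both want b
except TypeError as e:
    print("collision:", e)   # Python's policy: refuse outright
```

Perl6's proposed policy is softer (positional wins, warning optional), but the failing call is exactly the `j(1, 2, $blue => 'red')` case with `$blue eq "b"`.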
More fun with argument passing
Ok, I'm back to argument passing. I'm starting a new thread because I'm lazy and I have to scroll back too far in my mailer to see the old arg passing thread. :-) And yes, most of this message should also be on -languages.

Could somebody tell me where I go wrong: If you have a prototype

    sub f ($a, $b, $c) { ... }

then you should pass $a in P5, $b in P6, etc. So the code will look like:

    .param $a
    .param $b
    .param $c

If you declare a sub without a prototype, it should default to ([EMAIL PROTECTED]). A slurpy array parameter puts its corresponding arguments in a list context, which is the same as a flattening context. This is stated in E6 and S6, though not in A6 as I read it (but it doesn't disagree either.) Let's add a prototype-less sub for use in discussion:

    sub g { ... }

Is there any way to create a prototype that, when called with any number of array variables, would pass all of the arrays as objects? So, for example, f(@x, @y, @z) would do the right thing for exactly three arrays, but couldn't handle f(@x,@y,@z,@w). g(@x, @y, @z) seems to flatten all of them together. I'm sure something like g([EMAIL PROTECTED],[EMAIL PROTECTED],[EMAIL PROTECTED]) would work, but what if I want to do the call without backslashes?

The calls f(1, 2, 3) and g(1, 2, 3) should both generate

    .arg 1
    .arg 2
    .arg 3

...except instead of constant ints, you'd probably need PMCs.

Splatted array params are aliases to the actual args. So

    sub h ([EMAIL PROTECTED]) { @params[1] = 'tuna!'; }
    h($x, $y, $z);

should set $y to 'tuna!'. Would h(@x) set @x[1] to 'tuna!'? If so, then does h(@x, $y) change $y's value depending on the number of elements in @x? It seems that @params is either a magical array where lookups trigger a runtime computation of where that index would be found in the original argument list, or it is an array of references to either variables or pairs, and all of those references are built at runtime immediately when the call is made. (Which rather defeats the default laziness of arrays.) Actually, "proxies" might be a more accurate term. You should be able to pass @params into another function just like any other array, or do

    $gloof = (rand(100) < 30) ? @params : @normal_array;

Or maybe h(@x) does NOT set @x[1] to 'tuna!'?

Ok, the whole aliasing thing was something of a tangent. Back to f() and h(), which are really f($,$,$) and h(*@). I can use names to pass required arguments, but all positional args must precede all named args. So then is this legal:

    f(1, 2, b => 1.5)

or must all of the positional args referred to by named parameters follow those passed positionally? (There are two orders -- argument order and parameter order. In which of those two must all positionals precede the named params?)

In

    sub j($a, ?$b, *%c) { ... }

can I actually pass %c after the rest of the params? If I call it with

    j(1, $blue => 'red')

then does that compile down to

    .param 1
    .param named_arg_table

? How is the callee supposed to know whether the 2nd param is $b or %c? What if $blue happened to be "b"?

If I do it the other way around,

    .param named_arg_table
    .param 1

then at least I can always assume the named args are passed first, and use the arg count to directly determine whether $b was passed or not. But then all Perl6 subroutines would have to take a named arg table as their first argument, by convention, and cross-language calls would need to be aware of this -- even when calling unprototyped. ("Oh, yeah, if you're calling a Perl6 routine you have to pass an empty hashtable as the first param.")

I have a first cut at Perl6 parameter passing. It doesn't do runtime context, the named params are in there but I assume they don't work, and it reflects my earlier misconception that a [EMAIL PROTECTED] array should NOT flatten its arguments. Or, in other words, I compile

    sub g { ... }
    g(10,20,30)

down to

    $P0[0] = 10
    $P0[1] = 20
    $P0[2] = 30
    .arg (empty hash)
    .arg $P0

and

    sub f ($a, $b, $c) { ... }
    f(10,20,30)

down to

    .arg (empty hash)
    .arg 10
    .arg 20
    .arg 30

which means that if you call f($a,$b,$c) without its prototype, then it compiles to the former code, which results in $a getting 3 (the length of the @_ array), while $b and $c get bad values that segfault parrot.

I'm hesitating to manually pull all of the [EMAIL PROTECTED] elements out of P5..P15 + P3[0..] until someone makes me a little more confident that it's the right thing to do. And even then, perhaps it should be handled by some kind of .flattened_param declaration. But neither would handle the magical aliasing I talked about above, if that is required. And it's also slowing down the callee in what could easily be a common case.

The patch for this is mixed in with some other stuff I've been working on, but the relevant part is more or less

    Addcontext.pm |  137 +-
    Builtins.pm   |  411 ++--
    Context.pm    |
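The arrays-as-objects versus flattening distinction in the message above has a direct Python analogue: a slurpy callee sees one flat argument list only when the caller explicitly splats, and otherwise receives the arrays as objects. A sketch, illustration only and not Parrot semantics:

```python
# Slurpy callee, analogous to an unprototyped sub collecting its args.

def g(*params):
    return params

x = [1, 2]
y = [3, 4]

print(g(x, y))      # arrays passed as objects: ([1, 2], [3, 4])
print(g(*x, *y))    # caller-side flattening:   (1, 2, 3, 4)
```

Python makes flattening the caller's explicit choice via `*`; the thread's open question is whether Perl6's list context should impose it implicitly, and if so, who (caller or callee) pays to reassemble the arrays.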
Re: IMCC parsing weirdness
On Sep-28, Steve Fink wrote:
>
> I've attached a diff to languages/imcc/t/syn/pcc.t, but I'm not sure
> if that's the right place for the test.

Oops. Except CVS is being very flaky right now, so the patch hadn't been written to the file before I sent it. Oh well.

I'm committing a fix for the bug, as well as resolving all shift/reduce conflicts via precedence. I'll also commit the test. I'll let someone else move it around if they want.

I didn't bother allowing line and filename comments in the middle of .param sections, although I suppose that might make sense if you're splitting your parameter declarations across multiple lines. Oh well.
IMCC parsing weirdness
I am getting strange behavior from IMCC if the first line after .pcc_sub is a comment. It seems to misinterpret things and ends up emitting a restore_p op that triggers a "No entries on UserStack!" exception. I've attached a diff to languages/imcc/t/syn/pcc.t, but I'm not sure if that's the right place for the test. I'm looking at fixing this now, but the grammar rules relating to this are a bit hairy. I've eliminated the existing shift/reduce conflict by assigning precedence to a dummy rule, but I'm still working on changing the grammar to accept stuff in the parameter list.
Re: Pondering argument passing
On Sep-28, Leopold Toetsch wrote:
> Steve Fink <[EMAIL PROTECTED]> wrote:
>
> > I'm not sure IMCC is going to be able to autogenerate the prototyped
> > and nonprototyped versions of the callee's entry code from the same
> > set of .param declarations.
>
> This is currently implemented as the "unprototyped" case. If a sub does
> expect to be called either prototyped or non-prototyped, the proto
> specifier of the .pcc_sub SHOULD be omitted. The entry code then looks
> at C and does either case.
>
> So we should specify what to do with wrong param counts or wrong
> types. pcc.t has some examples for this (labeled "unproto" or
> "exception").

I was arguing that this isn't enough. We need the set of parameters to really be different in the two cases, so we need two sets of ".param" statements, not just one.

> I think we also need the concept of a default value (for at least
> Pie-Thon's sake - does Perl6 also have default values?)
>
>     sub f($a, $b=1, $c="Default");
>     ...
>     &f(42);

Yes, this is what I was talking about in the big block comment in the sample code at the end of my last message. Perl6 does have them. I don't know whether Perl6 or any other language we want to be nice to has *non-constant* defaults. If so, and if we want direct support for them, then it means we need to evaluate them in the context of the callee. Which is the natural way to do it anyway, but having the caller fill in default values for a prototyped call could work around some of the issues with argument passing that would otherwise require more native support to handle. (I'd prefer the native support, whatever it might be.)

There's also the issue of detecting whether a parameter was passed or not, in order to decide whether to fill in the default value. (See my last message for more discussion of this.)
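One standard way to detect whether a parameter was passed, the problem raised just above, is a unique sentinel default with the real default evaluated in the callee. A Python sketch of the quoted `sub f($a, $b=1, $c="Default")` example; `_UNPASSED` is a made-up marker playing the role of the hypothetical UnpassedArg PMC floated earlier in the thread:

```python
# Sentinel technique for "was this argument actually passed?"
# _UNPASSED is an illustrative stand-in for an UnpassedArg-style marker.

_UNPASSED = object()   # unique object: no caller can accidentally pass it

def f(a, b=_UNPASSED, c=_UNPASSED):
    # Defaults are filled in by the callee, so they could just as well
    # be non-constant expressions evaluated in the callee's environment.
    if b is _UNPASSED:
        b = 1
    if c is _UNPASSED:
        c = "Default"
    return (a, b, c)

print(f(42))           # neither b nor c passed: defaults kick in
print(f(42, None))     # None was genuinely passed, and is NOT mistaken
                       # for "unpassed" -- the sentinel disambiguates
```

This is what caller-filled `undef` cannot do: a sentinel distinguishes "no argument" from "an argument that happens to be undef".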
Re: Pondering argument passing
On Sep-26, Leopold Toetsch wrote:
> Dan Sugalski <[EMAIL PROTECTED]> wrote:
>
> [ splatted function args ]
>
> > ... For this, I think we're
> > going to need a "setp Ix, Py" op which does indirect register addressing.
>
> Done.

Cool, thanks! With it, I've been able to get a bit farther. Found a minor bug where it multiply defines a label if you make two calls.

I'm not sure IMCC is going to be able to autogenerate the prototyped and nonprototyped versions of the callee's entry code from the same set of .param declarations.

    sub f3($a, $b, ?$c)

If that is converted to

    .param $a
    .param $b
    .param $c

then IMCC is going to get very unhappy when you call it without a prototype and only pass it two arguments. If it is converted to

    .param $a
    .param $b
    .param remaining_args
    .local $c
    if remaining_args > 0
        $c = shift remaining_args

then I'll need to pass in the extra remaining_args array in all calls, prototyped or non-. That's not a huge deal, and it fixes this problem. But you'll hit it again if you call &f3 unprototyped with

    &f3(3, b => 4);

Named arguments cause a lot more trouble than that, but I still don't want to go into that yet. But the above example should be enough to demonstrate that one set of .param declarations won't be enough unless we add more metadata to the declarations, and that feels like it might be too perl6-specific. How bad would it be to do:

    .pcc_sub _main
    .pcc_non_prototyped
        # .
        # .
        # .
        # code to place params into the registers that they would
        # be in if called with a prototype. I'm not sure if IMCC
        # can autogenerate any of this or not.
        # .
        # .
        # .
        # IMCC inserts a jump to the main body of the routine
    .pcc_prototyped
        .param _SV_a
        .param _SV_b
        .param _SV_c
        # This is a little strange. The caller knows the prototype,
        # so can pass in an undef if the 3rd argument wasn't given.
        # But the callee may have a different default value in
        # mind than undef, and if it's an expression then it
        # probably needs to be evaluated by the callee in its
        # local environment. We could create a new UnpassedArg PMC,
        # or we could add something to the calling conventions
        # so that you know how many arguments were actually
        # passed.
    .pcc_body
        # subroutine body
    .end

The non-prototyped section would want to use the same _SV_? variables, so perhaps the prototyped section should come first.
Re: CVS checkout hints for the bandwidth-limited
On Sep-26, Leopold Toetsch wrote:
> Filtering the output through a small
> script, that just does something like:
>
>     if ($_ !~ /^cvs server: Updating/) {
>         print $_;
>     }
>
> helps to unclutter update results.

cvs -q will suppress those lines for you.
Re: Pondering argument passing
On Sep-24, Leopold Toetsch wrote:
> No. But you are right. That's the code (/s\$I2/\$I1/) that ".args"
> should produce. Perhaps we should name the directive ".flatten_arg".

Yes, that makes its purpose more clear than calling it ".args".

> Is it supposed to do deep flattening? Do we need ".deeply_flatten_arg"
> too?

It should not deeply flatten, and I didn't see anywhere in A6 that indicated that we ever deeply flatten. There is a ** prefix operator, but it is only "more splattier" than * in that it is required to immediately evaluate its operand (eg, **1..Inf is supposed to do something bad). But I don't even want to think about lazy lists yet, anyway.
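The shallow-versus-deep distinction is quick to sketch: a `.flatten_arg`-style splat should peel exactly one level of nesting, while deep flattening would recurse all the way down. A Python illustration (not Parrot code; `flatten` and `deep_flatten` are made-up helper names):

```python
# Shallow flattening (one level) vs. deep flattening (recursive).
from itertools import chain

def flatten(args):
    # Peels exactly one level, like a single splat should.
    return list(chain.from_iterable(args))

def deep_flatten(args):
    # Recurses into nested lists -- the behavior A6 does NOT call for.
    out = []
    for a in args:
        if isinstance(a, list):
            out.extend(deep_flatten(a))
        else:
            out.append(a)
    return out

nested = [[1, [2, 3]], [4]]
print(flatten(nested))       # inner [2, 3] survives intact
print(deep_flatten(nested))  # everything reduced to scalars
```

The shallow version is the right model for the directive under discussion: the inner `[2, 3]` stays an array object rather than being dissolved into its elements.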
Re: Pondering argument passing
Ah, I reread one of your earlier posts. It appears that you are proposing to pass the arguments in a PerlArray. So flattening is possible. Then what I am saying is that

    sub f($a,$b) { ... }

is going to expect $a to be in P5 and $b to be in P6. In your scheme, $a would be in P5[0] and $b would be in P5[1]. While I personally am not fundamentally opposed to that idea, I believe it's not going to fly because

1. the whole point of using register parameter passing is to avoid exactly this.
2. the existing Perl6 builtin functions could not be given prototypes.
3. other Parrot-hosted languages would not interoperate -- they would need to treat all functions using this calling convention as single-argument functions that took an array.

Or are you saying that this is only used for non-prototyped calls? I believe this directly violates something Dan said. He expects an unprototyped call passing two scalars to pass those scalars in P5 and P6 and have no speed cost as compared to calling the same function with a prototype. Which makes sense, though one could certainly argue about the frequencies of various sorts of calls -- it might be enough to streamline prototyped functions involving no flattening, and not worry about non-prototyped calls, simple or not.