Re: Python, Parrot, and lexical scopes
Sam Ruby wrote:

> Consider the following code:
>
>     def f(x): return len(x)
>     for i in [0,1]:
>         print f("foo")
>         f = lambda x: x.upper()

No, don't. Consider the following code instead:

    def f(x): return len(x)
    for i in [0,1]:
        print f("foo")
        len = lambda x: x.upper()

The key difference is the last line. In this example there is only one
definition of f, one that will call whatever function is bound to the
name "len" at the time of the call.

- Sam Ruby
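[Editor's note: the distinction can be run directly in modern Python; a
minimal sketch, collecting the results in a list so the two lookups are
visible side by side. The `results` list is illustrative, not part of
the original example.]

```python
# Rebinding the global name `len` changes what f() calls, because f
# looks `len` up at call time, not at definition time.
def f(x):
    return len(x)

results = []
for i in [0, 1]:
    results.append(f("foo"))
    len = lambda x: x.upper()  # shadows the builtin at module scope

print(results)  # → [3, 'FOO']
# First call finds the builtin len; second call finds the lambda.
```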
Re: Python, Parrot, and lexical scopes
Dan Sugalski wrote:

> At 7:55 AM -0400 10/18/04, Sam Ruby wrote:
>> I've been trying to make sense of Python's scoping in the context of
>> Parrot, and posted a few thoughts on my weblog:
>>
>> http://www.intertwingly.net/blog/2004/10/18/Python-Parrot-and-Lexical-Scopes
>>
>> While I posted it on my weblog for formatting and linking reasons,
>> feel free to respond on the mailing list. Suggestions welcome; in
>> particular, a PIR equivalent to the Perl would be most helpful.
>
> I responded (sorta) on the weblog, but I'll redo it here since it gets
> into some of the fundamental bits of namespaces, which, checking the
> calendar, I see we're scheduled to grovel over again.
>
> The code (for folks playing along at home) is:
>
> scope1.py:
>
>     from scope2 import *
>     print f(), foo
>     foo = 1
>     print f(), foo
>
> and scope2.py:
>
>     foo = 2
>     def f(): return foo
>
> I'll make two assumptions in the explanation here. First, that lexical
> scopes are inappropriate. In this case, they'd do exactly what you
> want, but the problem there is that you can't really have multiple
> files sharing the same scope, so that makes splitting modules into
> multiple files untenable.

In Python, file == module. But in any case, my thought was to arrange
things for Python so that depth zero in the scratchpad stack is always
the Python __builtins__, and depth one is always the module namespace.
The true lexical variables start at depth two. Globals would be a good
place to store the set of loaded modules for quick access.

> Second, that named namespaces are inappropriate. Again, in this case
> they could do what you wanted if scope1.py and scope2.py were in
> different basic namespaces. Modules split across multiple files could,
> with named namespaces (that is, things in the main module have their
> variables in main:, the foo module in foo:, and so forth), work just
> fine, but that's not what we need here, since Python wants unqualified
> names to look up, at runtime, in the current module namespace and then
> the main namespace.

Agreed.
> So, the solution here is to have a chain of overlapping namespaces.
> Each sub or method has a handle on a namespace, which itself has a
> link to the namespace it's occluding, and so on up to the top, basic
> namespace. (If, indeed, we even have one that's universal -- code
> could twiddle with that if it really wanted to.) The namespaces act
> much like the lexical pads do (or would, if they were fully
> functional), only with globally visible names instead.
>
> The nice thing here is that this is transparent to the code -- the
> find_global and store_global ops may have to jump through some hoops
> to do the right thing, but most bytecode won't know it's happening.

I just want to check whether you have any particular semantics in mind
when you say "global" here. In Python, global essentially means module.
Locals tend to obscure globals, so essentially what is desired is a
find_local which does the right thing.

Consider the following code:

    def f(x): return len(x)
    for i in [0,1]:
        print f("foo")
        f = lambda x: x.upper()

The desired result is that the global len function is called once, and
the upper method is called once - via the local lambda function. For
this to work, f(x) needs to be unaware of any local/global distinction.
The find_whatever opcode it uses needs to start with the locals and
follow links until a match is found.

> What we need to do is define and add the ops to add in and remove
> layers of namespaces, and get the packfile format set so that the
> proper layers can be anchored to the sub PMCs when bytecode's loaded
> in from wherever.
>
> -- Dan

- Sam Ruby
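[Editor's note: the chain of overlapping namespaces described above can
be sketched as linked lookup tables, where an unqualified name is
resolved by walking outward until a match is found. The `Namespace`
class, its method names, and the depth layout are illustrative only,
not Parrot's actual API.]

```python
# Sketch of a chain of overlapping namespaces: each scope links to the
# one it occludes; find_whatever walks the chain until a match is found.
class Namespace:
    def __init__(self, parent=None):
        self.names = {}
        self.parent = parent

    def find(self, name):
        scope = self
        while scope is not None:
            if name in scope.names:
                return scope.names[name]
            scope = scope.parent
        raise NameError(name)

    def store(self, name, value):
        self.names[name] = value

builtins = Namespace()                # depth 0: __builtins__
builtins.store("len", len)
module = Namespace(parent=builtins)   # depth 1: module namespace
local = Namespace(parent=module)      # depth 2+: true lexicals

assert local.find("len") is len       # falls through to builtins
module.store("len", str.upper)        # module-level shadowing
assert local.find("len") is str.upper # now resolved at module depth
```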
Re: Cross-compiling Parrot
On Oct-17, Dan Sugalski wrote:
> At 9:49 AM -0400 10/17/04, Jacques Mony wrote:
> >Hello,
> >
> >I'm trying to port parrot to the unununium operating system, which
> >uses a modified version of 'diet libc'. Can anyone tell me if it is
> >actually possible to force the use of this library using the current
> >Configure.pl script, or if I will need to change it a lot... or even
> >replace it with my own?
>
> There's a pretty good bet you're going to have to alter the configure
> script quite a bit, but it shouldn't require a full rewrite. Teaching
> it to read from a pre-done configuration data file would be a good
> place to start, which'd let us feed in the cross-compilation settings.
> (And we could then leverage that for the upgrade settings too.)

It's not exactly that, but you can set pretty much anything you want in
a config/init/hints/local.pl file.
Re: Python, Parrot, and lexical scopes
At 7:55 AM -0400 10/18/04, Sam Ruby wrote:
> I've been trying to make sense of Python's scoping in the context of
> Parrot, and posted a few thoughts on my weblog:
>
> http://www.intertwingly.net/blog/2004/10/18/Python-Parrot-and-Lexical-Scopes
>
> While I posted it on my weblog for formatting and linking reasons,
> feel free to respond on the mailing list. Suggestions welcome; in
> particular, a PIR equivalent to the Perl would be most helpful.

I responded (sorta) on the weblog, but I'll redo it here since it gets
into some of the fundamental bits of namespaces, which, checking the
calendar, I see we're scheduled to grovel over again.

The code (for folks playing along at home) is:

scope1.py:

    from scope2 import *
    print f(), foo
    foo = 1
    print f(), foo

and scope2.py:

    foo = 2
    def f(): return foo

I'll make two assumptions in the explanation here. First, that lexical
scopes are inappropriate. In this case, they'd do exactly what you
want, but the problem there is that you can't really have multiple
files sharing the same scope, so that makes splitting modules into
multiple files untenable.

Second, that named namespaces are inappropriate. Again, in this case
they could do what you wanted if scope1.py and scope2.py were in
different basic namespaces. Modules split across multiple files could,
with named namespaces (that is, things in the main module have their
variables in main:, the foo module in foo:, and so forth), work just
fine, but that's not what we need here, since Python wants unqualified
names to look up, at runtime, in the current module namespace and then
the main namespace.

So, the solution here is to have a chain of overlapping namespaces.
Each sub or method has a handle on a namespace, which itself has a
link to the namespace it's occluding, and so on up to the top, basic
namespace. (If, indeed, we even have one that's universal -- code
could twiddle with that if it really wanted to.) The namespaces act
much like the lexical pads do (or would, if they were fully
functional), only with globally visible names instead.

The nice thing here is that this is transparent to the code -- the
find_global and store_global ops may have to jump through some hoops to
do the right thing, but most bytecode won't know it's happening.

What we need to do is define and add the ops to add in and remove
layers of namespaces, and get the packfile format set so that the
proper layers can be anchored to the sub PMCs when bytecode's loaded in
from wherever.

-- Dan

--it's like this---
Dan Sugalski                       even samurai
[EMAIL PROTECTED]                  have teddy bears and even
                                   teddy bears get drunk
Re: Python, Parrot, and lexical scopes
Sam Ruby wrote:
>> It seems like everything on that page boils down to: all functions
>> are module-scoped closures.
>
> A closer translation: "How do I implement module-scoped closures in
> Parrot?"

OK, I've roughed out an implementation:

  http://intertwingly.net/stories/2004/10/18/scopes.pir
  http://intertwingly.net/stories/2004/10/18/pymodule.pmc

In the process, I've made a large number of assumptions. Undoubtedly,
many of them are wrong.

- Sam Ruby
Re: [ANNOUNCE] Test::Simple 0.49
On Thursday 14 October 2004 10:20 pm, Michael G Schwern wrote:
> It's about freakin' time. Has it really been two years since the last
> stable release? Yes it has.

t\fail-more.t fails on Win32 with an error on test 12. The problem is
related to an unescaped "\" path separator in the test script. An
example of the failure and a patch are included below.

C:\.cpan\build\Test-Simple-0.49>prove -b -v t\fail-more.t
t\fail-more....1..12
ok 1
ok 2
ok 3
ok 4
ok 5
ok 6
ok 7
ok 8
ok 9
ok 10
ok 11 - failing output
not ok 12 - failing errors
#     Failed test (t\fail-more.t at line 84)
#     Tried to use 'Hooble::mooble::yooble'.
#     Error:  Can't locate Hooble/mooble/yooble.pm in @INC (@INC
contains: t/lib blib\arch blib\lib c:/Perl/lib c:/Perl/site/lib .
c:/Perl/lib c:/Perl/site/lib .) at (eval 14) line 2.
# BEGIN failed--compilation aborted at t\fail-more.t line 84.
#     Failed test (t\fail-more.t at line 85)
#     Tried to require 'ALL::YOUR::BASE::ARE::BELONG::TO::US::wibble'.
#     Error:  Can't locate ALL/YOUR/BASE/ARE/BELONG/TO/US/wibble.pm in
@INC (@INC contains: t/lib blib\arch blib\lib c:/Perl/lib
c:/Perl/site/lib . c:/Perl/lib c:/Perl/site/lib .) at (eval 15) line 2.
# Looks like you failed 29 tests of 29.
FAILED test 12
        Failed 1/12 tests, 91.67% okay
Failed Test    Stat Wstat Total Fail  Failed  List of Failed
-------------------------------------------------------------
t\fail-more.t                   12    1   8.33%  12
Failed 1/1 test scripts, 0.00% okay. 1/12 subtests failed, 91.67% okay.

--- ../Test-Simple-0.49/t/fail-more.t  2004-10-14 22:07:33.000
+++ t/fail-more.t                      2004-10-18 20:23:44.0 -0500
@@ -254,7 +254,7 @@
 #     Failed test \\($filename at line 84\\)
 #     Tried to use 'Hooble::mooble::yooble'.
 #     Error:  Can't locate Hooble.* in [EMAIL PROTECTED] .*
-#     BEGIN failed--compilation aborted at $0 line 84.
+#     BEGIN failed--compilation aborted at $filename line 84.
 #     Failed test \\($filename at line 85\\)
 #     Tried to require 'ALL::YOUR::BASE::ARE::BELONG::TO::US::
 #     Error:  Can't locate ALL.* in [EMAIL PROTECTED] .*

This bug has been reported in RT at
http://rt.cpan.org/NoAuth/Bug.html?id=8022

Thanks,

Steve Peters
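[Editor's note: the underlying pitfall generalizes beyond this test: a
Windows path interpolated into a regular expression needs its
backslashes escaped, or sequences like `\f` are parsed as regex
escapes. The Perl idiom is `quotemeta` or `\Q...\E`; here is the same
point as a small Python illustration.]

```python
import re

filename = r"t\fail-more.t"

# Interpolating the raw path into a pattern misparses the backslash:
# `\f` becomes a form-feed escape, so the literal path never matches.
assert re.search(filename, filename) is None

# Escaping the path first makes it match itself literally.
assert re.search(re.escape(filename), filename) is not None
```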
Re: [perl #32035] [PATCH] tests and fixes for Integer and Undef PMCs
> I have started a test script for the Integer PMC. In that process I
> found strangeness in get_string(). set_integer_native() can be
> inherited from the Scalar PMC. For the Undef PMC I fixed an error in
> set_number_native(). A patch is attached. The file t/pmc/integer.t
> is new.

Applied, though the patch didn't have t/pmc/integer.t in it.

-- Dan
Re: [perl #32022] [PATCH] push_* for resizable PMCs
At 5:56 AM -0700 10/17/04, Bernhard Schmalhofer (via RT) wrote:
> this patch adds some relevant 'push' ops to the resizable PMCs,
> described in pdd_17. There are also a couple of POD improvements and
> tests in t/pmc/resizable*.t.

Applied, thanks.

-- Dan
Re: Parrot Forth 0.1
On Mon, 18 Oct 2004 17:31:05 -0500 (CDT), Michel Pelletier
<[EMAIL PROTECTED]> wrote:
> > The second PIR sequence is longer. It will take IMCC more time to
> > compile that than the first example. As the words become less
> > trivial, this will become more true.
>
> But one can't weigh the compile-time overhead against the run-time
> overhead like that. What if your inner loop runs for many days when
> compilation of the NCG code takes only a couple of milliseconds more?

Hence my statement:

> > So it may be programs can fall on either side of the fence of this
> > issue. Building words in terms of other words will give NCG an
> > advantage. But using relatively simple words many times will give
> > direct threading an advantage. But I do believe you when you say
> > that NCG is fastest overall (read, for most programs).

> Well that's a good question because I haven't done it yet. The simple
> hack approach is to pattern-replace back-to-back PUSH/POP pairs right
> out with string search and replace. I estimate this could eliminate
> up to half of the data stack traffic on average, depending on the
> code of course.

Right. I'm not used to thinking without regexes. `index` and `substr`
can be used for this.

> The whole-hog way to do it would be to have a stack data analysis
> algorithm try to cache as much of the stack in P registers as
> possible, spilling when necessary. This could probably eliminate
> almost all stack data traffic except for extremely pathological
> cases.

I'm not /that/ interested in speed.

> The Python interpreter could use this method too to really spank
> CPython, which has implicit stack traffic that cannot be easily
> optimized out.

Here I'd be interested in it. :) But this assumes the Python
implementation will be stack based. Just because CPython is doesn't
mean Parrot Python will be, does it?

> > Furthermore, our two models will behave differently when you
> > redefine a word. Consider this Forth example:
> >
> >     : inc 1 + ;
> >     : print+ inc . ;
> >     : inc 2 + ;
> >
> > Should print+ increment by one or by two? gforth increments by one.
>
> I've made a pretty big mistake so far by calling indirect threading
> direct threading. The lookup/invoke sequence is really indirect
> threading. Direct threading would be if we could somehow compile into
> PIR a literal "pointer" to a sub instead of having to look it up by
> name. I don't think this can be done.

I think it can. (Sorry for being so argumentative. :-) The
get_integer_keyed method gets the address of an Eval PMC at the moment,
IIRC. (This was used for the previous Forth implementation.) GC might
be a problem though. Emulation could be done too, but I doubt that's
desirable.

> In the case of words defined with ":" and ";" even NCG still does
> indirect threading via a lookup and invoke. NCG only inlines CORE
> word definitions, words that are defined in PIR and form the basis
> for all high level words, but the high level words themselves are
> indirect threaded.

Ahh... okay, that makes a bit more sense (or at least it did, I'm not
sure I remember why). It would actually be easier to inline all words
though, as you wouldn't have to treat core words differently (and it
makes it easier to redefine them).

> > I'd be interested in knowing which was the "correct" behavior.
>
> I suspect it is implementation defined, but unfortunately taygeta.com
> is not working for me right now.

Gotcha.

-- matt
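[Editor's note: the "simple hack" peephole pass discussed above -- plain
string search-and-replace of back-to-back push/pop pairs -- can be
sketched in a few lines. The macro names `.PUSH`/`.POP` follow the
pseudo-PIR in this thread; the function itself is illustrative, not
Parakeet's implementation.]

```python
# Hypothetical peephole pass: delete adjacent push/pop pairs from
# generated PIR text, since pushing a value and immediately popping it
# is a no-op for the data stack.
def eliminate_push_pop(pir, pairs=(".PUSH\n.POP\n", ".PUSH2\n.POP2\n")):
    changed = True
    while changed:              # repeat until no pair remains
        changed = False
        for pat in pairs:
            if pat in pir:
                pir = pir.replace(pat, "")
                changed = True
    return pir

code = ".POP\n.NOS = .TOS\n.PUSH2\n.POP2\n.TOS = .TOS * .NOS\n.PUSH\n"
print(eliminate_push_pop(code))
# .POP
# .NOS = .TOS
# .TOS = .TOS * .NOS
# .PUSH
```

This removes the `.PUSH2`/`.POP2` pair flagged as "can be optimized out"
in the NCG example earlier in the thread.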
[perl #32036] [BUG] t/pmc/signal.t fails
# New Ticket Created by Will Coleda
# Please include the string: [perl #32036]
# in the subject line of all future correspondence about this issue.
# <URL: http://rt.perl.org:80/rt3/Ticket/Display.html?id=32036 >

I have a little smoke script I threw together that does a cvs checkout,
config, make, make test:

Failed Test     Stat Wstat Total Fail  Failed  List of Failed
-------------------------------------------------------------
t/pmc/signal.t     1   256     3    1  33.33%  1
4 tests and 52 subtests skipped.
Failed 1/122 test scripts, 99.18% okay. 1/1943 subtests failed, 99.95%
okay.

oolong:~ coke$ uname -a
Darwin oolong 7.5.0 Darwin Kernel Version 7.5.0: Thu Aug 5 19:26:16 PDT
2004; root:xnu/xnu-517.7.21.obj~3/RELEASE_PPC Power Macintosh powerpc

oolong:~ coke$ perl -v
This is perl, v5.8.1-RC3 built for darwin-thread-multi-2level
(with 1 registered patch, see perl -V for more detail)

Seems to work fine with:

oolong:~/research/parrot_24673 coke$ perl t/harness t/pmc/signal.t
t/pmc/signal....ok 1/3 skipped: works standalone but not in test
All tests successful, 1 subtest skipped.
Files=1, Tests=3, 3 wallclock secs ( 1.18 cusr + 0.27 csys = 1.45 CPU)

My machine did happen to be under a bit of a load at the time the test
ran, but that doesn't seem like much of an excuse. =)

Going back through the output of the original harness, I get:

t/pmc/signal...Hangup

#     Failed test (t/pmc/signal.t at line 87)
#          got: 'start
# never
# '
#     expected: 'start
# '
t/pmc/signal...NOK 1# Looks like you failed 1 tests of 3.
t/pmc/signal...dubious
        Test returned status 1 (wstat 256, 0x100)
DIED. FAILED test 1

Regards.
Re: Parrot Forth 0.1
> This still doesn't seem right. The compilation from Forth to PIR only
> happens once, yes. But each time the defined word is used, the PIR
> code, which is injected, must be compiled to bytecode.

Right.

> The second PIR sequence is longer. It will take IMCC more time to
> compile that than the first example. As the words become less
> trivial, this will become more true.

But one can't weigh the compile-time overhead against the run-time
overhead like that. What if your inner loop runs for many days when
compilation of the NCG code takes only a couple of milliseconds more?

> But like you said, this only happens at (a) compile time or (b) at
> the interactive prompt.

Right.

> And optimizing out push/pop combos will speed things up more, though
> I'm not sure how to implement that optimization using PIR.

Well that's a good question because I haven't done it yet. The simple
hack approach is to pattern-replace back-to-back PUSH/POP pairs right
out with string search and replace. I estimate this could eliminate up
to half of the data stack traffic on average, depending on the code of
course.

The whole-hog way to do it would be to have a stack data analysis
algorithm try to cache as much of the stack in P registers as possible,
spilling when necessary. This could probably eliminate almost all stack
data traffic except for extremely pathological cases.

The Python interpreter could use this method too to really spank
CPython, which has implicit stack traffic that cannot be easily
optimized out.

> Furthermore, our two models will behave differently when you redefine
> a word. Consider this Forth example:
>
>     : inc 1 + ;
>     : print+ inc . ;
>     : inc 2 + ;
>
> Should print+ increment by one or by two? gforth increments by one.

I've made a pretty big mistake so far by calling indirect threading
direct threading. The lookup/invoke sequence is really indirect
threading. Direct threading would be if we could somehow compile into
PIR a literal "pointer" to a sub instead of having to look it up by
name. I don't think this can be done.

gforth keeps the old behavior because it uses direct threading: the
pointer never changes inside the compiled body of print+ even though
the word definition later does.

In the case of words defined with ":" and ";" even NCG still does
indirect threading via a lookup and invoke. NCG only inlines CORE word
definitions, words that are defined in PIR and form the basis for all
high level words, but the high level words themselves are indirect
threaded. I should have mentioned this before, but this doesn't
invalidate my previous example:

    : square dup * ;
    : square_to_thousand 1000 0 do i square . loop ;

1000 lookups and invokes are still required to find the high level word
"square" in either indirect threading or NCG (and direct threading
still requires 1000 invokes), but 2000 lookups and invokes are still
eliminated from the inner loop with NCG, because "dup" and "*", which
are core words, are inlined.

Whether or not an old definition is retained if a word is redefined is
a different question. In the case of Parakeet, it will increment by
two, because all high level words are looked up by name at run time via
indirect threading.

> I'd be interested in knowing which was the "correct" behavior.

I suspect it is implementation defined, but unfortunately taygeta.com
is not working for me right now.

-Michel
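[Editor's note: the two redefinition behaviors discussed here can be
modeled in a few lines of Python -- a sketch of the semantics, not of
Parakeet or gforth internals. Indirect threading stores the *name* and
resolves it at run time; direct threading captures a reference to the
current body at compile time.]

```python
# Model of:  : inc 1 + ;   : print+ inc . ;   : inc 2 + ;
words = {}
words["inc"] = lambda n: n + 1

# Indirect threading: print+ compiles to a by-name lookup of "inc" at
# run time, so a later redefinition is picked up.
indirect_printplus = lambda n: words["inc"](n)

# Direct threading: print+ captures a "pointer" to the current body of
# inc at compile time, so the old definition survives redefinition.
captured_inc = words["inc"]
direct_printplus = lambda n: captured_inc(n)

words["inc"] = lambda n: n + 2   # redefine inc

print(indirect_printplus(0))  # 2 -- Parakeet-style, by-name lookup
print(direct_printplus(0))    # 1 -- gforth-style, captured pointer
```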
Re: [perl #32021] [PATCH] fix --tree option of pmc2c.pl
At 5:33 AM -0700 10/17/04, Bernhard Schmalhofer (via RT) wrote:
> this patch fixes the --tree option of classes/pmc2c.pl.

Applied, thanks.

-- Dan
[perl #32035] [PATCH] tests and fixes for Integer and Undef PMCs
# New Ticket Created by Bernhard Schmalhofer
# Please include the string: [perl #32035]
# in the subject line of all future correspondence about this issue.
# <URL: http://rt.perl.org:80/rt3/Ticket/Display.html?id=32035 >

Hi,

I have started a test script for the Integer PMC. In that process I
found strangeness in get_string(). set_integer_native() can be
inherited from the Scalar PMC. For the Undef PMC I fixed an error in
set_number_native(). A patch is attached. The file t/pmc/integer.t is
new.

CU, Bernhard

pmc_integer_20041018.patch
Description: Binary data
Fwd: CPAN Upload: T/TE/TELS/devel/Devel-Size-Report-0.06.tar.gz
Moin,

just a small maintenance release which fixes the bug with the
ref-to-scalar, and polishes the doc/comments/tests.

I plan on adding a few more features that let it handle massive data
structures much better by providing summaries/excluding things. Also
the hash that keeps track of what we have seen so far is fairly
inefficient - slow, uses a lot of memory. Maybe I need to wrap
something up in XS to improve it.

Hope someone finds this useful,

Tels

--  Forwarded Message  --

Subject: CPAN Upload: T/TE/TELS/devel/Devel-Size-Report-0.06.tar.gz
Date: Monday 18 October 2004 21:58
From: PAUSE <[EMAIL PROTECTED]>

The uploaded file

    Devel-Size-Report-0.06.tar.gz

has entered CPAN as

  file: $CPAN/authors/id/T/TE/TELS/devel/Devel-Size-Report-0.06.tar.gz
  size: 13308 bytes
   md5: 77f5ff1f35804799e0f0311d735972fd

No action is required on your part
Request entered by: TELS (Tels)
Request entered on: Mon, 18 Oct 2004 19:56:35 GMT
Request completed: Mon, 18 Oct 2004 19:58:22 GMT

Thanks,
-- paused, v460
Re: Python, Parrot, and lexical scopes
Sam Ruby wrote:
> Unfortunately, these PMCs don't seem to have test cases.

Correction: t/pmc/sub.t

- Sam Ruby
Re: Parrot Forth 0.1
On Mon, 18 Oct 2004 14:17:59 -0500 (CDT), Michel Pelletier
<[EMAIL PROTECTED]> wrote:
> Okay, note that the code I mentioned (the separation of core from
> core words) is not checked in right now, but the version in CVS does
> do NCG.

Noted.

> Using the direct threading model, this does 2000 global lookups and
> subroutine invokes, which in turn do the actual "work" of 1000
> multiplications and the associated stack traffic. The lookups and
> invokes are pure inner-loop overhead.
>
> Using NCG this does 1000 multiplications and the associated stack
> traffic (which can be optimized out for the most part) with no
> lookups or invokes.
>
> The overhead of direct threading vs. NCG does not need to be
> benchmarked; it can be proven by argument: both methods execute the
> same code the same way, but the NCG method does 2000 fewer global
> lookups and invokes.

Indeed. Pardon my ignorance. I hadn't thought things all the way
through.

> The "extra" compiler overhead is trivial, and it only applies to
> compile time, generally when a program is started. At run time (when
> all those lookups and invokes are happening in the direct threading
> case) there is no additional compilation overhead, because a word is
> compiled only once.

This still doesn't seem right. The compilation from Forth to PIR only
happens once, yes. But each time the defined word is used, the PIR
code, which is injected, must be compiled to bytecode.

You said earlier that:

> using direct threading this would result in the execution of:
>
>     find_global $P0, "dup"
>     invoke $P0
>     find_global $P0, "mul"
>     invoke $P0
>
> in NCG it would result in the execution of:
>
>     .POP
>     .NOS = .TOS
>     .PUSH2    # this can be optimized out
>     .POP2     # of NCG, but not direct threading
>     .TOS = .TOS * .NOS
>     .PUSH

The second PIR sequence is longer. It will take IMCC more time to
compile that than the first example. As the words become less trivial,
this will become more true.

But like you said, this only happens at (a) compile time or (b) at the
interactive prompt. And optimizing out push/pop combos will speed
things up more, though I'm not sure how to implement that optimization
using PIR.

So it may be that programs can fall on either side of the fence on this
issue. Building words in terms of other words will give NCG an
advantage. But using relatively simple words many times will give
direct threading an advantage. But I do believe you when you say that
NCG is fastest overall (read, for most programs).

Furthermore, our two models will behave differently when you redefine a
word. Consider this Forth example:

    : inc 1 + ;
    : print+ inc . ;
    : inc 2 + ;

Should print+ increment by one or by two? gforth increments by one.
I'd be interested in knowing which was the "correct" behavior.

-- matt
Re: Parrot Forth 0.1
>> I propose you and I work together to make a totally Forth-language
>> agnostic Forth micro-kernel. This kernel can be very minimalistic:
>> a stack, a machine state hash, and definitions for the words "code",
>> "next", "word", and "'" (tick) all having standard Forth behavior, a
>> simple dictionary and a simple eval loop.
>
> I'll reply to this portion of your email later, when I get time to
> think and to look at the Parakeet code.

Okay, note that the code I mentioned (the separation of core from core
words) is not checked in right now, but the version in CVS does do NCG.

>> Some Parakeet ideas might also be used in your code, for example, it
>> looks to me like your code does direct threading:
>
> ...
>
>> Direct threading is a common Forth implementation technique, but it
>> was most often used because it could be implemented portably in C
>> with only a small bit of asm. For smaller ops like @ !, math ops,
>> and many others, it is more optimal to use direct code generation to
>> "inline" the PIR code itself instead of inlining an invoke to the
>> PIR code compiled as a sub.
>
> ...
>
>> resulting in a lot less overhead for core words. NCG was usually
>> either a commercial feature or rarely seen in Forth because it was
>> non-portable, being written in ASM, and expensive to maintain on
>> multiple platforms. We can kick that problem to the door.
>
> I'm not sure that's right. I did think about putting the code inline
> (and it would be a trivial change to do so), but I'm not convinced it
> would be faster. Yes, you wouldn't have to deal with the overhead
> involved with making subroutine calls, but IMCC would also have to
> re-parse and re-compile the code every time.

But only at compile time or interactive interpretation time, not at
runtime. Consider the code typed into the Forth interpreter:

    2 dup * .

would of course print '4'. You're correct that using NCG this would
require compiling new PIR every time it is typed in, but *only* when
you are working interactively. The time it takes to do this is
infinitesimal compared to the time it takes to type it in.

For an already compiled word being executed, however, NCG is *much*
faster than calling subroutines. Consider:

    : square dup * ;

in pseudo-PIR, given the definitions of dup and *:

    .sub dup
        .POP
        .NOS = .TOS
        .PUSH2
    .end

    .sub mul
        .POP2
        .TOS = .TOS * .NOS
        .PUSH2
    .end

using direct threading this would result in the execution of:

    find_global $P0, "dup"
    invoke $P0
    find_global $P0, "mul"
    invoke $P0

in NCG it would result in the execution of:

    .POP
    .NOS = .TOS
    .PUSH2    # this can be optimized out
    .POP2     # of NCG, but not direct threading
    .TOS = .TOS * .NOS
    .PUSH

Now call this word from a loop:

    : square_to_thousand 1000 0 do i square . loop ;

Using the direct threading model, this does 2000 global lookups and
subroutine invokes, which in turn do the actual "work" of 1000
multiplications and the associated stack traffic. The lookups and
invokes are pure inner-loop overhead.

Using NCG this does 1000 multiplications and the associated stack
traffic (which can be optimized out for the most part) with no lookups
or invokes.

The overhead of direct threading vs. NCG does not need to be
benchmarked; it can be proven by argument: both methods execute the
same code the same way, but the NCG method does 2000 fewer global
lookups and invokes.

The "extra" compiler overhead is trivial, and it only applies to
compile time, generally when a program is started. At run time (when
all those lookups and invokes are happening in the direct threading
case) there is no additional compilation overhead, because a word is
compiled only once.

Almost all other Forths that you may see either direct or indirect
thread; this is not because it is faster (it isn't) or simpler (not
much), but because it is portable and requires little or no asm. If
there were only one assembly language in the world then NCG would be
the *only* way to write a Forth interpreter; threading of any kind
wouldn't make sense.

-Michel
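[Editor's note: the overhead argument for the square_to_thousand
example can be made concrete with a small model -- a sketch only, with
a hypothetical `find_global` that counts by-name lookups the way the
indirect-threaded dispatch would.]

```python
# Indirect threading looks up and invokes "dup" and "*" by name on
# every iteration; NCG inlines their bodies so only the work remains.
lookups = 0

def find_global(words, name):
    global lookups
    lookups += 1          # count each by-name lookup
    return words[name]

words = {"dup": lambda s: s.append(s[-1]),
         "*":   lambda s: s.append(s.pop() * s.pop())}

# Indirect threading: 2 lookups + 2 invokes per iteration.
stack = []
for i in range(1000):
    stack.append(i)
    find_global(words, "dup")(stack)
    find_global(words, "*")(stack)
assert lookups == 2000    # pure inner-loop overhead

# NCG: the same work with the word bodies inlined -- zero lookups.
stack = []
for i in range(1000):
    stack.append(i)
    stack.append(stack[-1])                  # inlined dup
    stack.append(stack.pop() * stack.pop())  # inlined *
```

Both loops leave the same squares on the stack; only the dispatch cost
differs.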
Re: Python, Parrot, and lexical scopes
Aaron Sherman wrote:
> On Mon, 2004-10-18 at 07:55, Sam Ruby wrote:
>> I've been trying to make sense of Python's scoping in the context of
>> Parrot, and posted a few thoughts on my weblog:
>>
>> http://www.intertwingly.net/blog/2004/10/18/Python-Parrot-and-Lexical-Scopes
>
> It seems like everything on that page boils down to: all functions
> are module-scoped closures.

A closer translation: "How do I implement module-scoped closures in
Parrot?"

> Your example:
> [snip]
> Is also useful for context, but I don't think you need the Perl
> translation to explain it.

You elided the reason why I included it:

>> Suggestions welcome; in particular, a PIR equivalent to the Perl
>> would be most helpful.

If I look at the description of the scratchpad opcodes, I don't see all
the pieces that I need (save_context, restore_context, mark_context,
swap_context, etc.).

However, by poking around enough, and with a little bit of dumb luck, I
have stumbled across src/sub.c. The functions it defines aren't used by
any opcodes, but are used by a few PMCs. Those PMCs have delightful
names like continuation, coroutine, and retcontinuation.

So is the preferred approach either to use one of these, or to package
the desired functionality into a pyfunction.pmc?

Unfortunately, these PMCs don't seem to have test cases. Clearly, I'm
fumbling around in the dark. A well placed RTFM (including an
indication of *which* FM) would be most welcome. Until then, I will
continue to ask questions, make observations, and submit patches to
bring the code base in line with where I'm guessing it wants to go -
even if many or most of these get rejected.

- Sam Ruby
Re: Python, Parrot, and lexical scopes
On Mon, 2004-10-18 at 07:55, Sam Ruby wrote:
> I've been trying to make sense of Python's scoping in the context of
> Parrot, and posted a few thoughts on my weblog:
>
> http://www.intertwingly.net/blog/2004/10/18/Python-Parrot-and-Lexical-Scopes

It seems like everything on that page boils down to: all functions are
module-scoped closures.

Your example:

Consider the following scope1.py:

    from scope2 import *
    print f(), foo
    foo = 1
    print f(), foo

and scope2.py:

    foo = 2
    def f(): return foo

The expected output is:

    2 2
    2 1

Is also useful for context, but I don't think you need the Perl
translation to explain it.

-- 
781-324-3772 / [EMAIL PROTECTED] / http://www.ajs.com/~ajs
Re: [Summary] Register stacks again
All~ This feels similar in spirit to the old frame stacks that we used to have. I thought that we moved away from those to single-frame things so that we did not have to perform special logic around continuations. I would feel more comfortable if someone explained both the initial motivation for leaving the chunked system and why this either does not violate that motivation or why that motivation was wrong. Thanks, Matt On Mon, 18 Oct 2004 15:09:53 +0200, Miroslav Silovic <[EMAIL PROTECTED]> wrote: > [full summary snipped; it appears as its own post below]
Re: Parrot Forth 0.1
On Sun, 17 Oct 2004 22:07:11 -0500 (CDT), Michel Pelletier <[EMAIL PROTECTED]> wrote: > This is my first chance to take a look at it but > I'm sorry I've not been able to run it because > I'm on a different machine. I did look at the > code though. Thanks for the feedback. I don't have time to respond to everything right now, but I thought I'd at least send an initial reply. > good chance to analyze it. There are some > differences, like I keep the stack in a > register, you keep yours in a global, and you > store your core words in an "operations" global, > and I use a Parrot lexical pad. I've renamed this global to "words" since the release. Naming it "operations" reflects my Forth naivety. > I have no idea if storing something in a register > is worse than a global. I have certainly had > problems with IMCC stomping my well-known > registers with $P? temp vars. Lexical pads vs > keeping your own hash are probably equivalent, > but perhaps pads have some cross-language > benefit. I think using a global is probably the only way to get around the register stomping. I don't think there would be any cross-language benefit from using lexical pads. In other languages, yes, but since Forth is built around a stack, another language can't very well call a Forth subroutine. It makes more sense to eval something like "1 2 sub .". I'd be interested in knowing how store_global/find_global differ in terms of speed/implementation. The same goes for the lexical opcodes. Dan or Leo? > I propose you and I work together to make a > totally Forth-language agnostic Forth > micro-kernel. This kernel can be very > minimalistic, a stack, a machine-state hash, > and definitions for the words "code", "next", > "word", and "'" (tick) all having standard Forth > behavior, a simple dictionary and a simple eval > loop. I'll reply to this portion of your email later, when I get time to think and to look at the Parakeet code.
> Some Parakeet ideas might also be used in your > code, for example, it looks to me like your code > does direct threading: ... > Direct threading is a common Forth > implementation technique, but it was most often > used because it could be implemented portably in > C with only a small bit of asm. For smaller ops > like @ !, math ops, and many others, it is more > efficient to use direct code generation to > "inline" the PIR code itself instead of > inlining an invoke to the PIR code compiled as > a sub. ... > resulting in a lot less overhead for core words. > NCG was usually either a commercial feature or > rarely seen in Forth because it was > non-portable, being written in ASM, and > expensive to maintain on multiple platforms. > We can kick that problem out the door. I'm not sure that's right. I did think about putting the code inline (and it would be a trivial change to do so), but I'm not convinced it would be faster. Yes, you wouldn't have to deal with the overhead involved with making subroutine calls, but IMCC would also have to re-parse and re-compile the code every time. I really ought to benchmark it (later, when there's more time, I guess), but I wouldn't be surprised if calling the subroutines was faster. I can also investigate using the fastcall pragma since there are no parameters. Of course, I would need to add in the cost of doing a hash lookup as well, so you may be right. Ultimately, I don't care about speed so much for Forth. Maybe I should. I don't really plan on using it, so this is more of an exercise than anything else. It's more important to me that the implementation be clean and readable than for it to be fast. I want there to be a low learning curve. -- matt
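[Editorial sketch] The dispatch strategy under discussion — each core word compiled as a sub, found through the global "words" hash, with an eval loop reading input like "1 2 sub ." — can be modeled in a few lines of Python. This is a hypothetical illustration, not the actual Parrot Forth code; the names stack, words, and forth_eval are invented here.

```python
# Toy model of a Forth eval loop: the data stack lives in a global,
# and each core word is a separate callable looked up in a "words"
# dict -- one indirect call per word, the alternative to inlining.
stack = []

def sub():                 # "sub" as in the "1 2 sub ." example
    b, a = stack.pop(), stack.pop()
    stack.append(a - b)

def dot():                 # "." prints the top of the stack
    print(stack.pop())

words = {"sub": sub, ".": dot}

def forth_eval(source):
    for token in source.split():
        if token in words:
            words[token]() # hash lookup + call, per word executed
        else:
            stack.append(int(token))

forth_eval("1 2 sub")      # leaves -1 on the stack
```

The trade-off Matt raises is visible here: the dict lookup and call per word are the recurring cost that inlined code generation would avoid, at the price of recompiling the inlined body each time.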
Re: [ANNOUNCE] Test::Simple 0.49
On Mon 18 Oct 2004 16:34, "H.Merijn Brand" <[EMAIL PROTECTED]> wrote: > On Fri 15 Oct 2004 05:20, Michael G Schwern <[EMAIL PROTECTED]> wrote: > > It's about freakin' time. Has it really been two years since the last > > stable release? Yes it has. > > > > This is 0.48_02 plus a minor test and MANIFEST fix. > > > > INCOMPATIBILITIES WITH PREVIOUS VERSIONS > > * Threading is no longer automatically turned on. You must turn it on > > before you use > > Test::More if you want it. See BUGS and CAVEATS for info. > > Please consider 0.50 very soon, in which you fix 'err' calls that are an > obvious mistake given defined-or functionality in blead and 5.8.x-dor: That would be too easy to call. Here's a patch ... All tests successful, 2 tests and 7 subtests skipped. Files=46, Tests=290, 11 wallclock secs ( 8.79 cusr + 0.97 csys = 9.76 CPU)

--8<--- TS49_01.diff
diff -r -pu Test-Simple-0.49/t/fail-more.t Test-Simple-0.49_01/t/fail-more.t
--- Test-Simple-0.49/t/fail-more.t 2004-10-15 05:07:33 +0200
+++ Test-Simple-0.49_01/t/fail-more.t 2004-10-18 16:37:22 +0200
@@ -38,7 +38,7 @@ sub ok ($;$) {
 }
-sub main::err ($) {
+sub main::Err ($) {
 my($expect) = @_;
 my $got = $err->read;
@@ -65,7 +65,7 @@ $tb->use_numbers(0);
 # Preserve the line numbers.
 #line 38
 ok( 0, 'failing' );
-err(can('that') failed
@@ -149,7 +149,7 @@ isa_ok(bless([], "Foo"), "Wibble");
 isa_ok(42,"Wibble", "My Wibble");
 isa_ok(undef, "Wibble", "Another Wibble");
 isa_ok([],"HASH");
-err( builder->no_ending(1);
 #line 62
 fail( "this fails" );
-err( <8---

-- H.Merijn Brand  Amsterdam Perl Mongers (http://amsterdam.pm.org/) using perl-5.6.1, 5.8.3, & 5.9.x, and 809 on HP-UX 10.20 & 11.00, 11i, AIX 4.3, SuSE 9.0, and Win2k. http://www.cmve.net/~merijn/ http://archives.develooper.com/[EMAIL PROTECTED]/ [EMAIL PROTECTED] send smoke reports to: [EMAIL PROTECTED], QA: http://qa.perl.org TS.diff Description: Binary data
[Summary] Register stacks again
This is a summary of a private mail conversation between Leo and myself. No, it didn't start by me forgetting to fix Reply-To when trying to post a follow-up on the list. ;)

Essentially we whipped up a GC scheme for collecting the register stacks that doesn't make call/cc-using code, well, unusably slow. In addition to LT's original post on the register stack, here's how to allocate them and clean up after their use. LT, feel free to hit me with wet noodles if I forgot anything.

Terminology:
---

A register frame is an actual frame of 4x16 registers.

A register chunk is a flat chunk of memory containing multiple register frames. It has a water mark that points to where a new frame should be pushed into the chunk. I'm using stack and chunk interchangeably.

Frames are subject to DoD, and chunks are subject to GC. There are plenty of tricks that can prevent GC from happening in many cases (read on for details). DoD is necessary anyway (to retrieve the live PMCs pointed to from the register frames).

A chunk is pinned if GC currently can't copy it over and kill it (read on for details).

Allocation:
---

Register stacks should be allocated in fairly big chunks. However, since there must be at least one active chunk per thread (and per coroutine), choosing anything actually huge will pose a scaling problem.

Frames are allocated from the current stack, simply by advancing the water mark of the currently active chunk. If this causes the water mark to overflow, a new chunk needs to be allocated.

Note that if a continuation has been captured and then invoked, the water mark will not necessarily point at the end of the current frame (since the invoked continuation may keep its registers deeper in the chunk).

Deallocation:
---

The stack frame can only be popped if the current continuation hasn't been captured (from the frame being popped). Here, pop means changing both the frame pointer and the water mark. This amounts to adding a flag to the current frame and bumping the flag if the return continuation gets moved into any directly accessible location. If the frame can't be popped, only the frame pointer should be changed to the caller's.

GC:
---

During DoD, the register frames should be marked (as parts of their chunks). Then the dead frames get dealt with in the following manner:

Remove the trailing dead frames from each chunk (by just lowering the water mark).

If after this the water mark remains high (e.g. past 50% of the chunk) but more than a certain amount of the chunk is salvageable as dead frames (50% seems like a good number again), the chunk should be copied, all the frame pointers fixed up, and then the chunk gets killed. Essentially the chunks are treated as buffers. The water-mark lowering won't help in cases when continuations get invoked in a round-robin manner (wish I could think of some simple Scheme example here that behaves this way) and start littering the chunk with interlaced dead frames.

Two caveats:

The frame pointer of the currently active frames (there can be more than one due to threads) may be kept in a CPU register and can't be fixed up. So the chunk containing the currently active frame is pinned until it either overflows into another chunk or gets freed by popping.

Chunks can contain reverse pointers to the continuations that use their frames. When copying a frame, just go through these reverse pointers and fix the continuations they point to.

Performance:
---

This scheme requires some GC flags in each frame (as well as the reverse pointers). Possibly also next pointers to the frames, if they are not cut to equal size.

Without continuations, this behaves like C function calling. Nothing will read the return continuation, and so the frames will simply get popped from the stack on return.

When continuations begin to prevent popping, the stack will start growing from the last captured continuation (even if it's dead). The water mark will drop in GC if the GC happens to hit while the active frame is way down the stack (i.e. just between two function calls). Otherwise, GC will wait till the chunk overflows (so that the active frame is in a higher chunk) and then will copy the live frames to a newly allocated chunk, compacting several chunks together if possible. Copying can be skipped if the chunk is nearly full of live frames.

I think this about sums it up. Comments, corrections, wet noodles? Miro
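[Editorial sketch] The allocation and deallocation rules in this post can be modeled in miniature. This is an illustrative sketch, not Parrot's actual data structures: Chunk, push_frame, and the "captured" flag are invented stand-ins for the water mark, the frame push, and the return-continuation-was-captured bit described above.

```python
# Toy model: frames are pushed by advancing a chunk's water mark;
# a frame whose return continuation was captured refuses to give
# its space back on pop, leaving reclamation to DoD/GC.
FRAME_SIZE = 4 * 16            # one frame: 4 register types x 16 registers
CHUNK_SIZE = FRAME_SIZE * 8    # arbitrary: a chunk holds up to 8 frames

class Chunk:
    def __init__(self):
        self.water_mark = 0    # where the next frame gets pushed

    def push_frame(self):
        if self.water_mark + FRAME_SIZE > CHUNK_SIZE:
            return None        # overflow: caller must allocate a new chunk
        frame = {"offset": self.water_mark, "captured": False}
        self.water_mark += FRAME_SIZE
        return frame

    def pop_frame(self, frame):
        # Pop lowers the water mark only for the topmost frame whose
        # return continuation was never captured; otherwise only the
        # frame pointer would move and the space stays live for GC.
        on_top = frame["offset"] + FRAME_SIZE == self.water_mark
        if on_top and not frame["captured"]:
            self.water_mark = frame["offset"]
            return True
        return False

chunk = Chunk()
caller = chunk.push_frame()
callee = chunk.push_frame()
chunk.pop_frame(callee)        # plain C-like call/return: space reclaimed
pinned = chunk.push_frame()
pinned["captured"] = True      # someone captured a continuation here
chunk.pop_frame(pinned)        # refused: the frame stays for DoD/GC
```

The GC pass described above is then what lowers the water mark past trailing dead frames, or copies the live frames out when the chunk gets littered with interlaced dead ones.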
Re: [ANNOUNCE] Test::Simple 0.49
On Fri 15 Oct 2004 05:20, Michael G Schwern <[EMAIL PROTECTED]> wrote: > It's about freakin' time. Has it really been two years since the last > stable release? Yes it has. > > This is 0.48_02 plus a minor test and MANIFEST fix. > > INCOMPATIBILITIES WITH PREVIOUS VERSIONS > * Threading is no longer automatically turned on. You must turn it on > before you use > Test::More if you want it. See BUGS and CAVEATS for info. Please consider 0.50 very soon, in which you fix 'err' calls that are an obvious mistake given defined-or functionality in blead and 5.8.x-dor: NOTE: There have been API changes between this version and any older than version 0.48! Please see the Changes file for details. Checking if your kit is complete... Looks good Writing Makefile for Test::Simple cp lib/Test/Simple.pm blib/lib/Test/Simple.pm cp lib/Test/Builder.pm blib/lib/Test/Builder.pm cp lib/Test/More.pm blib/lib/Test/More.pm cp lib/Test/Tutorial.pod blib/lib/Test/Tutorial.pod PERL_DL_NONLAZY=1 /pro/bin/perl "-MExtUtils::Command::MM" "-e" "test_harness(0, 'blib/lib', 'blib/arch')" t/*.t t/00test_harness_check...ok t/bad_plan...ok t/buffer..ok t/Builder.ok t/curr_test...ok t/details.ok t/diag...ok t/eq_set..ok t/exit...ok t/extra...ok t/extra_one...ok t/fail-like...ok t/fail-more...Ambiguous call resolved as CORE::err(), qualify as such or use & at t/fail-more.t line 39. syntax error at t/fail-more.t line 39, near "err" Execution of t/fail-more.t aborted due to compilation errors. t/fail-more...dubious Test returned status 255 (wstat 65280, 0xff00) t/fail...ok t/fail_one...ok t/filehandles.ok t/fork...ok t/harness_active..Ambiguous call resolved as CORE::err(), qualify as such or use & at t/harness_active.t line 63. Ambiguous call resolved as CORE::err(), qualify as such or use & at t/harness_active.t line 73. syntax error at t/harness_active.t line 63, near "err" syntax error at t/harness_active.t line 73, near "err" Execution of t/harness_active.t aborted due to compilation errors.
t/harness_active..dubious Test returned status 255 (wstat 65280, 0xff00) t/has_plan...ok -- H.Merijn Brand
Python, Parrot, and lexical scopes
I've been trying to make sense of Python's scoping in the context of Parrot, and posted a few thoughts on my weblog: http://www.intertwingly.net/blog/2004/10/18/Python-Parrot-and-Lexical-Scopes While I posted it on my weblog for formatting and linking reasons, feel free to respond on the mailing list. Suggestions are welcome; in particular, a PIR equivalent of the Perl would be most helpful. - Sam Ruby
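[Editorial sketch] The scoping behavior the weblog post wrestles with boils down to this: Python resolves an unqualified name in a function body when the function is *called*, checking the module's globals before __builtins__. A minimal single-file version of the effect:

```python
# A free variable in a function body is resolved at each call,
# module globals first, then builtins.
def f(x):
    return len(x)          # "len" is looked up here, at call time

results = []
for i in [0, 1]:
    results.append(f("foo"))
    len = lambda x: x.upper()   # shadow the builtin at module scope

# First call finds the builtin len (-> 3); the second call finds the
# module-level lambda (-> 'FOO').
# results == [3, 'FOO']
```

This is exactly why a lexical-pad-only mapping is awkward: there is only one definition of f, and which "len" it sees depends on the state of the module namespace at call time, not at definition time.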
Re: Problems with 0.1.1 release on x86-64
Sorry for the delay...work interfered with my playing and I had to transfer my CVS repo to my x86-64 machine. I don't know if I'd classify it as "silence thereafter..." as in the summary, but it's pretty close :) Here's the diff against the current CVS. It doesn't mess with the default class that needs the split for the return & exception. Brian Wheeler [EMAIL PROTECTED]

Index: config/auto/jit.pl
===
RCS file: /cvs/public/parrot/config/auto/jit.pl,v
retrieving revision 1.33
diff -u -r1.33 jit.pl
--- config/auto/jit.pl 8 Mar 2004 08:49:05 - 1.33
+++ config/auto/jit.pl 18 Oct 2004 05:25:57 -
@@ -171,9 +171,9 @@
 else {
 Configure::Data->set(
 jitarchname => 'nojit',
- jitcpuarch => 'i386',
- jitcpu => 'I386',
- jitosname => 'nojit',
+ jitcpuarch => $cpuarch,
+ jitcpu => $cpuarch,
+ jitosname => $osname,
 jitcapable => 0,
 execcapable => 0,
 cc_hasjit => '',

Index: config/auto/memalign.pl
===
RCS file: /cvs/public/parrot/config/auto/memalign.pl,v
retrieving revision 1.10
diff -u -r1.10 memalign.pl
--- config/auto/memalign.pl 13 Oct 2004 14:37:59 - 1.10
+++ config/auto/memalign.pl 18 Oct 2004 05:25:57 -
@@ -42,6 +42,13 @@
 Configure::Data->set('malloc_header', 'stdlib.h');
 }

+if (Configure::Data->get('ptrsize') == Configure::Data->get('intsize')) {
+ Configure::Data->set('ptrcast','int');
+ }
+else {
+ Configure::Data->set('ptrcast','long');
+ }
+
 cc_gen('config/auto/memalign/test_c.in');
 eval { cc_build(); };
 unless ($@ || cc_run_capture() !~ /ok/) {

Index: config/auto/memalign/test_c.in
===
RCS file: /cvs/public/parrot/config/auto/memalign/test_c.in,v
retrieving revision 1.4
diff -u -r1.4 test_c.in
--- config/auto/memalign/test_c.in 13 Jul 2003 18:52:37 - 1.4
+++ config/auto/memalign/test_c.in 18 Oct 2004 05:25:57 -
@@ -9,6 +9,6 @@
 int main(int argc, char **argv) {
 void *ptr = memalign(256, 17);
- puts(ptr && ((int)ptr & 0xff) == 0 ? "ok" : "nix");
+ puts(ptr && ((${ptrcast})ptr & 0xff) == 0 ? "ok" : "nix");
 return 0;
 }

Index: config/auto/memalign/test_c2.in
===
RCS file: /cvs/public/parrot/config/auto/memalign/test_c2.in,v
retrieving revision 1.3
diff -u -r1.3 test_c2.in
--- config/auto/memalign/test_c2.in 13 Jul 2003 18:52:37 - 1.3
+++ config/auto/memalign/test_c2.in 18 Oct 2004 05:25:57 -
@@ -20,6 +20,6 @@
 * arbitrary allocation size) */
 int i = posix_memalign(&p, s, 177);
- puts(((int)p & 0xff) == 0 && i == 0 ? "ok" : "nix");
+ puts(((${ptrcast})p & 0xff) == 0 && i == 0 ? "ok" : "nix");
 return i;
 }

On Thu, 2004-10-14 at 06:37, Leopold Toetsch wrote: > Brian Wheeler wrote: > > > > * cast warnings in default.pmc. Changing static int cant_do_method to > > static long cant_do_method makes it compile without warnings, but it's > > not the right fix. > > Better would be to split the return statement and the exception in the > generated code. > > > Below is a patch which fixes the first 3. > > Doesn't apply. Please rediff to current CVS and attach the patch. > > Thanks, > leo
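[Editorial sketch] The memalign.pl hunk in this patch picks a cast type by comparing pointer size to int size at configure time, so that "(int)ptr" never truncates a 64-bit pointer (and never draws a cast-size warning on LP64 platforms). A small Python model of that decision — ctypes stands in for the Configure probe, and ptrcast mirrors the variable the patch introduces:

```python
# Decide which integer type a pointer can be cast to without losing
# bits, the same check the patched memalign.pl performs with
# Configure::Data->get('ptrsize') / get('intsize').
import ctypes

ptr_size = ctypes.sizeof(ctypes.c_void_p)
int_size = ctypes.sizeof(ctypes.c_int)

# On ILP32, int is pointer-sized; on LP64 (e.g. x86-64 Linux) pointers
# are 8 bytes while int is 4, so the probe must emit "(long)ptr".
ptrcast = "int" if ptr_size == int_size else "long"
```

Today the portable choice in the C test files themselves would be intptr_t from <stdint.h>, but in 2004 the configure-time int/long switch was the pragmatic fix.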