Re: continuation enhanced arcs
Piers Cawley [EMAIL PROTECTED] wrote: Leopold Toetsch [EMAIL PROTECTED] writes:

... While S registers hold pointers, they have value semantics.

Is that guaranteed? Because it probably needs to be.

It's the current implementation and tested.

This would restore the register contents to the first state shown above. That is, not only I and N registers would be clobbered; S registers are involved too.

That's correct. What's the problem? Okay, you've created an infinite loop, but what you're describing is absolutely the correct behaviour for a continuation.

Ok. It's a bit mind-twisting but OTOH it's the same as setjmp/longjmp with all implications on CPU registers. C has the volatile keyword to avoid clobbering of a register due to a longjmp.

Above code could only use P registers. Or in other words: I, N, and S registers are almost[1] useless.

No they're not. But you should expect them to be reset if you take a (full) continuation back to them.

The problem I have is: do we know where registers may be reset? For example:

    $I0 = 10
    loop:
    $P0 = shift array
    dec $I0
    if $I0 goto loop

What happens if the array PMC's shift gets overloaded and does some fancy stuff with continuations? My gut feeling is that the loop might suddenly turn into an infinite loop, depending on some code behind the scenes ($I0 might be allocated into the preserved register range or not depending on allocation pressure).

Second: if we don't have a notion that a continuation may capture and restore a register frame, a compiler can hardly use any I, S, N registers because some library code or external function might just restore these registers.

Presumably if foo() doesn't store a full continuation, the restoration just reuses an existing register frame and, if foo has made a full continuation, its return does a restore by copying?

Yes, that should be a reasonable implementation.

leo
Re: Premature pessimization
Sam Ruby [EMAIL PROTECTED] wrote: Leopold Toetsch wrote:

So *all* lookups (complete with the asterisks) does not mean *all* lookups. How about invoke?

Let's first concentrate on simpler stuff like infix operators.

Citing S06: Operators are just subroutines with special names.

That statement is true for Perl. Same statement is true for Python. But the names vary based on the language.

Yes. So let's factor out the common part and have that in Parrot core, usable for Python and Perl and ... The PyInt PMC currently duplicates almost all functionality that *should* be in the Integer PMC. We have first to fix the Integer PMC to do the Right Thing. Then we need some syntax for multiple inheritance in PMCs. The same holds for other PMCs. It was already proposed that we should have a language-neutral Hash PMC.

So given that we have a set of language-neutral PMCs in core that do the right thing, Python or Perl PMCs can inherit a lot of functionality from core PMCs. Language-specific behavior is of course implemented in the specific PMC.

Second: method dispatch. I've looked a bit into PyObject. It seems that you start rolling your own method dispatch. Please don't get me wrong, I'm not criticizing your implementation. It might also be needed for some reasons I'm just overlooking and it's currently needed because core functionality isn't totally finished. Anyway - and please correct me if my assumptions are not true - I'll try to factor out the common part again. You have in PyObject e.g.:

    METHOD PMC* __add__(PMC *value) {
        PMC * ret = pmc_new(INTERP, dynclass_PyObject);
        mmd_dispatch_v_ppp(INTERP, SELF, value, ret, MMD_ADD);
        return ret;
    }

I see six issues with that kind of approach:

* The __add method should be in Parrot core. That's what I've described in the MMD dispatch proposal.
* the method is returning a new PMC. This doesn't follow the signature of Parrot infix MMD operations.
* well, it's dispatching twice. First the __add__ method for PyObjects has to be searched for, then the mmd_dispatch is done.
* it'll very likely not work together with other HLLs. It's a python-only solution.
* rolling your own dispatch still doesn't help, if a metaclass overloads the + operation
* code duplication

So how would I do it:

* prelim: above mentioned core PMC cleanup is done. Inheritance works: a PyInt isa(PyObject, Integer)
* the core PMCs define methods, like your __add__ except that our naming convention is __add. The Python translator needs just a translation table for the common core methods.
* Method dispatch is done at the opcode level. add Px, Py, Pz just does the right thing. It calls the thingy that implements the __add method; being in core or overloaded shouldn't and doesn't matter. If inheritance changes at runtime it just works.

And the other way round: Py.__add(Pz, Px) is the same. Again it doesn't matter if it's a core PMC, a Python PMC or an overloaded PASM/PIR multi sub (or a Python metaclass). The only difference is the changed signature. But that's how Parrot core defines overloaded infix operations.

We have to do that anyway. It's just the correct way to go. (And please no answers WRT efficiency ;-)

leo
Devel::Cover cover command uses too much memory
Hi, I ran the codestriker (http://codestriker.sourceforge.net/) test set using Devel::Cover. The test cases ran over a day and a half and generated a cover_db directory that is 127 megs. Attempting to run the cover command keeps using up all of the available memory, causing cover to be killed by the OS. I have my swap file up to 1 gig, and after two days of the computer swapping its brains out, it still was not enough memory. How much memory is needed for cover to process a 126 meg cover_db? Are there any switches or other tricks I could use to reduce the memory consumption of cover? If somebody is willing to work on the problem, I can zip up the directory and send it to them for testing. I am using cover 0.51, on Debian 3.0. Perl is version 5.6.1. Otherwise it did seem to work ok on my previous smaller test runs.

Lastly, some documentation on how to use it with a normal cgi script would be helpful. The way I finally got it to work was to rename codestriker.pl (the main cgi perl script) to codestriker_test.pl, then write a new codestriker.pl that just does a system call with the Devel::Cover switch. Perl would not let me add it to the #!/usr/bin/perl line at the start of the script. I would be interested in knowing if a cleaner way is possible, as this is kind of lame.

Thanks
Jason.
Re: C implementation of Test::Harness' TAP protocol
On Dec 7, 2004, at 9:25 PM, Andrew Savige wrote:

    /* Horrible hacky thread-unsafe version but no XX */
    ...
    static const char* g_file;
    static unsigned long g_line;

i forgot to mention, the way around the non-thread-safety here is to use thread-local storage. cf. pthread_key_create() and pthread_getspecific().

for a similarly evil trick, the GNU C library defines the global errno like this:

    /* function that fetches the address of the calling thread's errno from TLS */
    int *__errno_location (void);
    #define errno (*__errno_location ())
RE: C implementation of Test::Harness' TAP protocol
Clayton, Nik wrote: You might want to throw it in as an option. I'm going to change Test::More so it no longer mucks with the exit code by default, you'll have to turn this feature on. OK. I'll track changes to Test::Harness, and libtap'll stop doing it when T::H stops. Or, more simply, I'll just document it as something the test author can do for completeness, if they're so inclined, but that it's not mandatory. N
base scalar PMC semantics
First, there was some discussion not too long ago:

    Subject: Numeric semantics for base pmcs
    [1] Subject: Last bits of the basic math semantics

The current Integer PMC doesn't yet follow the results of these threads. Basic behavior of that type is Perl6 or Python semantics, which is: it's basically an arbitrary precision integer, like Python's int/long type after merging. To achieve this functionality it silently morphs results to a Big type capable of doing the arbitrary precision.

The summary in [1] also mentions type coercion:

    10) The destination PMC is responsible for final conversion of the inbound value

E.g. when we have MMD add(PyInt + PyInt)

a) no overflow:

    VTABLE_set_integer_native(interp, dest, the_sum)

The set_integer_native vtable is responsible for converting the dest PMC into a PyInt. For Perl types it'll be PerlInt. And base PMCs use Integer. Strictly following this scheme allows the inheritance of all common functionality.

b) overflow:

    if (self == dest) {
        VTABLE_set_bignum(interp, self, self.intval)
        // redispatch
    }
    else {
        VTABLE_set_bignum(interp, dest, self.intval)
        temp = new dest.type
        VTABLE_set_bignum(interp, temp, self.intval)
        // redispatch
    }

or a similar scheme.

Float and String need the same refactoring, but that's simpler.

To use that functionality we need a better notion for multiple inheritance inside PMCs.

    PerlInt isa (PerlAny, Integer)
    PyInt isa (PyObject, Integer)

Comments?

leo
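The "destination is responsible for the final conversion" rule above can be modelled in a few lines. This is only a sketch under stated assumptions: it is written in Python rather than as PMC code, the class and method names merely echo the vtable names in the message and are not Parrot's actual API, the native integer is assumed to be 32 bits wide, and the redispatch step is collapsed into a single range check.

```python
# Hypothetical model of the scheme; names are illustrative, not Parrot's API.
INT_MIN, INT_MAX = -2**31, 2**31 - 1   # assumption: 32-bit native integer


class Integer:
    """Base scalar: holds a native value, or a 'big' value after overflow."""

    def __init__(self, value=0):
        self.value = value
        self.is_big = False

    def set_integer_native(self, value):
        # The destination keeps its own class (Integer, PyInt, ...),
        # so language-specific subclasses inherit this unchanged.
        self.value = value
        self.is_big = False

    def set_bignum(self, value):
        # Morph the destination to the arbitrary-precision representation.
        self.value = value
        self.is_big = True

    def add(self, other, dest):
        # Python ints never overflow, so the exact sum is computed first
        # and the range check stands in for native overflow detection.
        total = self.value + other.value
        if INT_MIN <= total <= INT_MAX:
            dest.set_integer_native(total)   # case a) no overflow
        else:
            dest.set_bignum(total)           # case b) overflow
        return dest


class PyInt(Integer):
    """Inherits all arithmetic; only language-specific behavior goes here."""
```

With this layout, `PyInt(2**31 - 1).add(PyInt(1), PyInt())` yields a destination that is still a `PyInt` but flagged as big, while small sums stay native.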
Re: Premature pessimization
Ah! Now we are getting somewhere! Leopold Toetsch wrote: Sam Ruby [EMAIL PROTECTED] wrote: Leopold Toetsch wrote: So *all* lookups (complete with the asterisks) does not mean *all* lookups. How about invoke? Let's first concentrate on simpler stuff like infix operators. OK, but the point is that there will always be multiple mechanisms for dispatch. Citing S06: Operators are just subroutines with special names. That statement is true for Perl. Same statement is true for Python. But the names vary based on the language. Yes. So let's factor out the common part and have that in Parrot core, usable for Python and Perl and ... The PyInt PMC currently duplicates almost all functionality that *should* be in the Integer PMC. We have first to fix the Integer PMC to do the Right Thing. Then we need some syntax for multiple inheritance in PMCs. The same holds for other PMCs. It was already proposed that we should have a language-neutral Hash PMC. No question that that is the intended final goal. What you see in the current python dynclasses is not representative of that final goal. So, why have I proceeded in this manner? Two reasons. First, I am not about to make random, unproven changes to the Parrot core until I am confident that the change is correct. Cloning a class temporarily gives me a playground to validate my ideas. Second, I am not going to wait around for Warnocked questions and proposals to be addressed. Now, neither of the above are absolutes. You have seen me make changes to the core - but only when I was relatively confident. And I *have* put on hold trying to reconcile object oriented semantics as this is both more substantial and seemed to be something that was likely to be addressed. Also, while I am not intending to make speculative changes to the core of Parrot, I don't have any objections to anybody making changes on my behalf. If you see some way of refactoring my code, go for it. It isn't mine - it is the community's. 
The one thing I would like to ask is that test cases that currently pass continue to pass. The dynclass unit tests are part of the normal test. Additionally, the tests in languages/parrot have been the ones driving most of my implementation lately. I do realize that that means checking out Pirate. Even though I don't agree with it, I do understand Michal's licensing issues. The reason I am not investing much time in resolving this issue is that Pirate is exactly one source file and could quickly be rewritten using the Perl 6 Grammar engine once that functionality becomes sufficiently complete. So given that we have a set of language-neutral PMCs in core that do the right thing, Python or Perl PMCs can inherit a lot of functionality from core PMCs. Language-specific behavior is of course implemented in the specific PMC. Agreed. One area that will require a bit more thought is error cases. The behavior of integer divide by zero is likely to be different in each language. This could be approached in a number of different ways. One is by cloning such methods, like I have done. Another is to wrap such methods, catch the exception that is thrown, and handle it in a language specific manner. A better approach would be for the core to call out to a method on such error cases. Subclasses could simply inherit the common core behavior and override this one method. It also means that the normal execution path length (i.e., when dividing by values other than zero) is optimal; it is only the error paths that involve extra dispatches. That's an easy case. Overflow is a bit more subtle. Some languages might want to wrap the results (modulo 2**32). Some languages might want an exception. Other languages might want promotion to BigInt. Even if promotion to BigInt were the default behavior, subclasses would still want to override it. In Python's case, promotion to PyLong (which ideally would inherit from and trivially specialize and extend BigInt) would be the desired effect. 
Even this is only one aspect of a more general case: all morphing behavior needs to be overridable by subclasses. I believe that this can be easily handled by the current Parrot architecture by virtue of the fact that destination objects must be created before methods are called, and such destination objects can override morph methods). But it would help the cause if code were written to promote things to Integer instead of PerlInt. Yes, at the moment, I'm guilty of this too. Second: method dispatch. I've looked a bit into PyObject. It seems that you start rolling your own method dispatch. Please don't get me wrong, I'm not criticizing your implementation. It might also be needed for some reasons I'm just overlooking and it's currently needed because core functionality isn't totally finished. I'll address your questions below, but for reference, here is the code that Pirate generates for a=b+c: find_type $I0, 'PyObject' new $P0, $I0
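The error-case idea in the message above (the core calls out to an overridable method, so only the error path pays for an extra dispatch) might look roughly like this. It is a sketch in Python, not PMC code; `on_divide_by_zero`, `floor_divide` and `LenientInt` are hypothetical names invented for the illustration.

```python
class Integer:
    """Stand-in for a language-neutral core integer type."""

    def __init__(self, value):
        self.value = value

    def on_divide_by_zero(self, other):
        # Core default behavior; subclasses override only this hook.
        raise ZeroDivisionError("integer division by zero")

    def floor_divide(self, other):
        if other.value == 0:
            # Only the error path involves the extra method dispatch.
            return self.on_divide_by_zero(other)
        return type(self)(self.value // other.value)


class LenientInt(Integer):
    """Hypothetical language in which dividing by zero yields 0."""

    def on_divide_by_zero(self, other):
        return type(self)(0)
```

`LenientInt` inherits `floor_divide` untouched, so the common path is shared and the language-specific policy lives in one small method.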
Re: Exceptions, sub cleanup, and scope exit
Dan Sugalski [EMAIL PROTECTED] wrote:

    pushmark 12
    popmark 12
    pushaction Psub

I've now implemented these bits. I hope it's correct, specifically, if a return continuation is only captured, the action handler is not run. See t/pmc/exceptions.t

Still missing is the throw opcode. Or better, that exists; just exception creation and the extended attributes like language are missing.

I'm still voting for a more object-ish exception constructor to better accommodate HLLs' different exception usage. E.g.

    e = new PyKeyError   # presumably a constant singleton
    throw e

That ought to be enough for heavily used exceptions and for Perl6 control exceptions. OTOH

    e = new Exception
    setattribute e, message, Pmsg
    setattribute e, language, PLang
    ...
    throw e

constructs a full exception object. Currently it is:

    e[_message] = foo
    e[_error]
    e[_severity]
    ...

And it could be even something like:

    cl = getclass Exception
    e = cl.instantiate(foo, Perl, .error, .severity, ...)

leo
Re: Premature pessimization
Sam Ruby [EMAIL PROTECTED] wrote:

Ah! Now we are getting somewhere!

Yeah. That's the goal.

So, why have I proceeded in this manner? Two reasons.

Fair enough, both.

So given that we have a set of language-neutral PMCs in core that do the right thing, Python or Perl PMCs can inherit a lot of functionality from core PMCs. Language-specific behavior is of course implemented in the specific PMC.

Agreed. One area that will require a bit more thought is error cases.

Yep. But let's just figure that out later. First the basics.

I'll address your questions below, but for reference, here is the code that Pirate generates for a=b+c:

    find_type $I0, 'PyObject'
    new $P0, $I0
    find_lex $P1, 'b'
    find_lex $P2, 'c'
    $P0 = $P1 + $P2
    store_lex -1, 'a', $P0

Good. Now Evil Leo (who can't program in Python ;) writes some piece of code like this:

    $ cat m.py
    class M(type):
        def __new__(meta, name, base, vars):
            cls = type.__new__(meta, name, base, vars)
            cls.__add__ = myadd
            return cls

    def myadd(self, r):
        return 44 - r

    I = M('Int', (int,), {})
    i = I(5)
    print i
    print i + 2

    $ python m.py
    5
    42

What this means is that the __add__ method will not be directly used for either PyInt or PyString objects.

Well, and that's not true, IMHO. See above. It has to be part of Parrot's method dispatch. What if your translator just sees the last 3 lines of the code and M is in some lib? That implies that you either can't translate to $P0 = $P1 + $P2, or that you just translate or alias __add__ to Parrot's __add and let Parrot fiddle around to find the correct method.

* the method is returning a new PMC. This doesn't follow the signature of Parrot infix MMD operations.

Here I do think you are misunderstanding. The __add__ method with precisely that signature and semantics is defined by the Python language specification. It is (somewhat rarely) used directly, and therefore must be supported exactly that way.

    |  __add__(...)
    |      x.__add__(y) <==> x+y

Parrot semantics are that the destination exists. 
But having a look at above myadd, we probably have to adjust the calling conventions for overloaded infix operators, i.e. return the destination value. Or provide both schemes ... dunno. In the general case, looking for reserved method names at compile time doesn't work. __add__ is reserved in Python and corresponds directly to __add in Parrot. I don't think that doesn't work. ... As everything can be overridden, this dispatch must be done at runtime. Exactly and that's what I want to achieve. I personally don't think that performance considerations should be out of bounds in these discussions I've already shown that it's possible to go with fully dynamic dispatch *and* 30% faster for MMD and 70% faster for overloaded operations. First correct and complete, then speed considerations. - Sam Ruby leo
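For readers who want to run the Evil Leo sample from the message above on a current interpreter, here is a Python 3 adaptation (the original is Python 2). The last three lines are an addition that is not in the original mail: they reassign `__add__` after the class exists, which is the "everything can be overridden at runtime" case the thread keeps returning to.

```python
class M(type):
    def __new__(meta, name, bases, namespace):
        cls = super().__new__(meta, name, bases, namespace)
        cls.__add__ = myadd          # installed while the class is being created
        return cls


def myadd(self, r):
    return 44 - r


I = M('Int', (int,), {})
i = I(5)
print(i)          # 5
print(i + 2)      # 42, because + now dispatches to myadd

# Not in the original sample: rebinding __add__ later changes what + does
# for instances that already exist.
I.__add__ = lambda self, r: int(self) * r
print(i + 2)      # 10
```

A translator that only sees `i + 2` has no static way to know which of these functions will run, which is why both sides of the thread agree the dispatch has to happen at runtime.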
Re: Premature pessimization
Leopold Toetsch wrote:

Good. Now Evil Leo (who can't program in Python ;) writes some piece of code like this:

    $ cat m.py
    class M(type):
        def __new__(meta, name, base, vars):
            cls = type.__new__(meta, name, base, vars)
            cls.__add__ = myadd
            return cls

    def myadd(self, r):
        return 44 - r

    I = M('Int', (int,), {})
    i = I(5)
    print i
    print i + 2

    $ python m.py
    5
    42

What this means is that the __add__ method will not be directly used for either PyInt or PyString objects.

Well, and that's not true, IMHO. See above. It has to be part of Parrot's method dispatch. What if your translator just sees the last 3 lines of the code and M is in some lib? That implies that you either can't translate to $P0 = $P1 + $P2, or that you just translate or alias __add__ to Parrot's __add and let Parrot fiddle around to find the correct method.

Here's the part that you snipped that addresses that question:

And there is a piece that I haven't written yet that will do the reverse: if MMD_ADD is called on a PyObject that has not provided such behavior, then any __add__ method provided needs to be called.

* the method is returning a new PMC. This doesn't follow the signature of Parrot infix MMD operations.

Here I do think you are misunderstanding. The __add__ method with precisely that signature and semantics is defined by the Python language specification. It is (somewhat rarely) used directly, and therefore must be supported exactly that way.

    |  __add__(...)
    |      x.__add__(y) <==> x+y

Parrot semantics are that the destination exists. But having a look at above myadd, we probably have to adjust the calling conventions for overloaded infix operators, i.e. return the destination value. Or provide both schemes ... dunno.

Since you provided an Evil Leo sample, let me provide an Evil Sam sample:

    d = {
        '__init__': lambda self,x: setattr(self, 'value', x),
        '__add__': lambda self,x: str(self.value) + str(x.value)
    }

    def dict2class(d):
        class c: pass
        c.__dict__.update(d)
        return c

    c = dict2class(d)
    a=c(2)
    b=c(3)
    print a+b

Things to note:

1) classes are created every time a function is called
2) classes are thin wrappers over a dictionary object

Now, given the above sample, let's revisit the statement that The Python translator needs just a translation table for the common core methods. How, exactly, would that be done? Given that the method name is simply a string... used as a key in a dictionary... with a different parameter signature than the hypothetical Parrot __add method.

That's why I say: In the general case, looking for reserved method names at compile time doesn't work.

__add__ is reserved in Python and corresponds directly to __add in Parrot. I don't think that doesn't work.

__add__ is *not* reserved in Python. There just is some syntactic sugar that provides a shorthand for certain signatures. I am free to define __add__ methods that have zero or sixteen arguments. I won't be able to call such methods with the convenient shorthand, but other than that, they should work.

I personally don't think that performance considerations should be out of bounds in these discussions

I've already shown that it's possible to go with fully dynamic dispatch *and* 30% faster for MMD and 70% faster for overloaded operations. First correct and complete, then speed considerations.

Neither of which match Python semantics. We are going to need a system where classes are anonymous, not global. Where methods are properties that can be added simply by calling the equivalent of set_pmc_keyed.

- Sam Ruby
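The Evil Sam sample above relies on Python 2 classic classes, whose `__dict__` is a plain dictionary. A Python 3 adaptation is sketched below; the one deliberate change is that `dict2class` installs entries with `setattr`, since a Python 3 class `__dict__` is a read-only mapping. The `odd` class at the end is an extra, hypothetical illustration of the claim that `__add__` may have a signature the `+` shorthand cannot use.

```python
d = {
    '__init__': lambda self, x: setattr(self, 'value', x),
    '__add__': lambda self, x: str(self.value) + str(x.value),
}


def dict2class(d):
    class c:
        pass
    for name, attr in d.items():
        setattr(c, name, attr)    # method names arrive as plain strings
    return c


c = dict2class(d)
a = c(2)
b = c(3)
print(a + b)                                # 23 (string concatenation)

# A fresh class object is built on every call.
print(dict2class(d) is dict2class(d))       # False

# An __add__ taking no right-hand operand: callable directly, not via +.
odd = dict2class({'__add__': lambda self: 'called directly'})
o = odd()
print(o.__add__())
try:
    o + 1
except TypeError as exc:
    print('+ failed with', type(exc).__name__)
```

Nothing in the source text of `dict2class` mentions `__add__`; the name only exists as a dictionary key at runtime, which is the point being made against a compile-time translation table.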
Re: S05 question
On Tue, Dec 07, 2004 at 10:36:53PM -0800, Larry Wall wrote: : But somehow I expect that when someone writes (foo) they probably : usually meant («foo»). If we're going to stick with the notion that foo captures and something else doesn't, I'm beginning to think that the other thing isn't «foo» for a couple of reasons. First, if other languages are going to borrow this notation, they're probably not going to buy into the French quotes. Second, I can think of several other possible uses for the French quotes to cure perceived ills such as the (...) vs {...} confusion. Third, it now bothers me to have a ! without a ?. So what if «foo» is instead written ?foo, meaning you only want to evaluate its success. (Unlike !foo, it's not zero-width, but that's just how success/failure works.) So we'd get things like / $bar := [ (?ident) = (\N+) ]* / And people would have to get used to seeing ? as non-capturing assertions: ?before ... ?after ... ?ws ?sp ?null This has a rather Ruby-esque I am a boolean feeling to it. I think I like it. It's pretty easy to type, at least on my keyboard. Now suppose that we extend that I am a boolean feeling to ?{ code } which might take the place of the confusing (...), and make consistent the notion that we always use {...} to invoke real code. : : Or is it that hypotheticals only bind to things captured by parens? : : If so, it might need clarification (or perhaps I'm overlooking the part : : that makes it clear). : : No, I think you just found a blind spot in the design. I think I'm leaning toward the idea that anything in angles that begins alpha is a capture to just the alpha part, so the ? prefix is merely a no-op that happens to make the assertion not start with an alpha. Interestingly, that gives these implicit bindings: after ... $after$` before ...$before $' Thought that's an argument for changing them to pre ... 
and post ..., I suppose, since if users are going to refer to $after in their main program, it doesn't look like a declarative assertion anymore. Another problem we've run into is naming if there are multiple assertions of the same name. If the capture name is just the alpha part of the assertion, then we could allow an optional number, and still recognize it as a ws: ws1 ws2 ws3 Except I can well imagine people wanting numbered rules. Drat. Could force people to say ws_1 if they want that, I suppose. Or we could use some standard delim for that: ws-1 ws-2 ws-3 which is vaguely reminiscent of our version syntax. Indeed, if we had quantifications, you might well want to have wildcards ws-* and let the name be filled in rather than autogenerating a list. But maybe we just stick with lists in that case. For captures of non-alpha assertions, we could say that ? is the same as true (just as with regular operators), and so true-3 +alpha-[aeiou] would capture to $true-3. (And one could always do an explicit binding for a different name.) Actually, I think people would find $match-3 more meaningful than Ctrue-3. I'm still thinking about what «...» might mean, if anything. Bonus points for interpolative and/or word-splitty. Anyway, that's where I am this week/day/hour/minute/second. Larry
Re: S05 question
Larry Wall wrote: Another problem we've run into is naming if there are multiple assertions of the same name. If the capture name is just the alpha part of the assertion, then we could allow an optional number, and still recognize it as a ws: ws1 ws2 ws3 Except I can well imagine people wanting numbered rules. Drat. Could force people to say ws_1 if they want that, I suppose. Or we could use some standard delim for that: ws-1 ws-2 ws-3 which is vaguely reminiscent of our version syntax. Indeed, if we had quantifications, you might well want to have wildcards ws-* and let the name be filled in rather than autogenerating a list. But maybe we just stick with lists in that case. For captures of non-alpha assertions, we could say that ? is the same as true (just as with regular operators), and so true-3 +alpha-[aeiou] would capture to $true-3. (And one could always do an explicit binding for a different name.) Actually, I think people would find $match-3 more meaningful than Ctrue-3. PHP's use of $array[] as push might work for this: true[] +alpha-[aeiou] or @true +alpha-[aeiou] or true=1.. +alpha-[aeiou] or true@ +alpha-[aeiou] I like the idea of being able to continue versus chunk patterns. How do you say This is a continuation of the other thing versus This is a separate thing ? =Austin
Re: S05 question
On Wed, Dec 08, 2004 at 08:19:17AM -0800, Larry Wall wrote: And people would have to get used to seeing ? as non-capturing assertions: ?before ... ?after ... ?ws ?sp ?null This has a rather Ruby-esque I am a boolean feeling to it. I think I like it. It's pretty easy to type, at least on my keyboard. FWIW, for some reason in rule contexts I tend to conflate I am a boolean feelings with zero-width assertion, so that each of those look vaguely to me as though I'm testing a zero-width proposition and not consuming any text. And I still tend to think of '?' in it's zero or one matches or minimal match connotations. Oh well, I suppose I could get used to that. Now suppose that we extend that I am a boolean feeling to ?{ code } which might take the place of the confusing (...), and make consistent the notion that we always use {...} to invoke real code. Hmm, this is nice, however. Another problem we've run into is naming if there are multiple assertions of the same name. If the capture name is just the alpha part of the assertion, then we could allow an optional number, and still recognize it as a ws: ws1 ws2 ws3 Except I can well imagine people wanting numbered rules. Drat. Could force people to say ws_1 if they want that, I suppose. I had been thinking that /ws foo ws bar/ would simply cause $ws to be a list of captured elements, similar to what might happen for $1 in / [ (.*?) , ]* / If someone really needs the contents of the first and second ws, they could do (ws) foo (ws) and get them as $1 and $2. But, seeing this tells me that perhaps (rule) should be used for capturing rules, analogous to the capturing parens, and leave rule to be the non-capturing version. But maybe that's anti-Huffman overall. Maybe the parens could also help for disambiguating (ws) foo (ws) so that we end up with $/ws[1], $/ws[2], etc. 
But then we might have to always subscript our named captures, which is icky, or maybe we'd only make $/ws act like list when there's more than one capturing (ws) in the rule. I dunno. I kinda like (rule) for capturing, but maybe it just doesn't work. Pm
Re: Devel::Cover cover command uses too much memory
On Tue, Dec 07, 2004 at 07:21:09PM -0800, Jason Remillard wrote:

I ran the codestriker (http://codestriker.sourceforge.net/) test set using Devel::Cover. The test cases ran over a day and a half and generated a cover_db directory that is 127 megs. Attempting to run the cover command keeps using up all of the available memory, causing cover to be killed by the OS. I have my swap file up to 1 gig, and after two days of the computer swapping its brains out, it still was not enough memory.

How big is this test suite? How long does it usually take to run? Just trying to get an order-of-magnitude feel here.

Lastly, some documentation on how to use it with a normal cgi script would be helpful. The way I finally got it to work was to rename codestriker.pl (the main cgi perl script) to codestriker_test.pl, then write a new codestriker.pl that just does a system call with the Devel::Cover switch. Perl would not let me add it to the #!/usr/bin/perl line at the start of the script. I would be interested in knowing if a cleaner way is possible, as this is kind of lame.

You just have to say use Devel::Cover in your program. That's what -MDevel::Cover means.

-- Michael G Schwern [EMAIL PROTECTED] http://www.pobox.com/~schwern/ It's Yellowing Laudanum time!
Re: S05 question
Larry Wall writes: If we're going to stick with the notion that foo captures and something else doesn't, I'm beginning to think that the other thing isn't foo for a couple of reasons. I just sat down to say the exact same thing. I'm glad you beat me to it. And people would have to get used to seeing ? as non-capturing assertions: ?before ... ?after ... ?ws ?sp ?null This has a rather Ruby-esque I am a boolean feeling to it. I think I like it. It's pretty easy to type, at least on my keyboard. Yeah, I like it pretty well too. Better than the french quites for sure. Now suppose that we extend that I am a boolean feeling to ?{ code } which might take the place of the confusing (...), and make consistent the notion that we always use {...} to invoke real code. Hmm... I'm just so attached to (...). I find it quite beautiful. It also somehow communicates the feeling you shouldn't be putting side-effects here. I think I'm leaning toward the idea that anything in angles that begins alpha is a capture to just the alpha part, so the ? prefix is merely a no-op that happens to make the assertion not start with an alpha. Interestingly, that gives these implicit bindings: after ... $after$` before ... $before $' I don't quite follow. Wouldn't that mean that these guys would get clobbered if you used lookaheads or lookbehinds in your rules? Or we could use some standard delim for that: ws-1 ws-2 ws-3 which is vaguely reminiscent of our version syntax. Indeed, if we had quantifications, you might well want to have wildcards ws-* and let the name be filled in rather than autogenerating a list. But maybe we just stick with lists in that case. I can imagine this being a lot cleaner if the thing after the dash can be any sort of identifier: ws-indent if ?ws condition ws-comment On the other hand, it could be misleading, since the standard naming of BNF uses dashes instead of underscored. I don't think it should be a big problem though. I'm still thinking about what ... 
might mean, if anything. Bonus points for interpolative and/or word-splitty. Yeah... umm... nope. I got nothin. Luke
Re: S05 question
On Wed, 8 Dec 2004 08:19:17 -0800, Larry Wall [EMAIL PROTECTED] wrote:

    / $bar := [ (<?ident>) = (\N+) ]* /

You know, to be honest I don't know that I want rules in one-liners to capture by default. I certainly want them to capture in rules, though.

And people would have to get used to seeing ? as non-capturing assertions:

    <?before ...>
    <?after ...>
    <?ws>
    <?sp>
    <?null>

This has a rather Ruby-esque I am a boolean feeling to it. I think I like it. It's pretty easy to type, at least on my keyboard.

I like it. It reads to me as "if before ...", "if null". Sounds good.

I think I'm leaning toward the idea that anything in angles that begins alpha is a capture to just the alpha part, so the ? prefix is merely a no-op that happens to make the assertion not start with an alpha. Interestingly, that gives these implicit bindings:

    <after ...>     $after     $`
    <before ...>    $before    $'

Again, I don't see the utility of that in a one-liner. In a grammar, you would create a real rule which would assert <after ...> and capture the result in a reasonable name.

Anyway, that's where I am this week/day/hour/minute/second.

I'm thinking capturing rules should be default in rules, where they're downright useful. Your hour/minute/second comment brings up parsing ISO time:

    grammar ISO8601::DateTime {
        rule year     { \d<4> }
        rule month    { \d<2> }
        rule day      { \d<2> }
        rule hour     { \d<2> }
        rule minute   { \d<2> }
        rule second   { \d<2> }
        rule fraction { \d+ }
        rule date     { <year> -? <month> -? <day> }
        rule time     { <hour> \:? <minute> \:? <second> [\. <fraction>]? }
        rule datetime { <date> T <time> }
    }

For a grammar, that works perfectly! In a one-liner, I'd rather just use:

    $datetime ~~ /$year := (\d+) -? $month := (\d+) -? ./

and specify the vars I want to save directly in my own scope.

Ashley Winters
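As a rough cross-language analogy to the grammar in the message above (this is Python's `re` module, not Perl 6 rules, and the pattern name `ISO_DATETIME` is made up for the illustration), named groups give a similar "every sub-rule captures under its own name" effect. The pattern mirrors the year/month/day/hour/minute/second/fraction pieces with optional separators and makes no attempt at full ISO 8601 validation.

```python
import re

# Each named group plays the role of one capturing sub-rule.
ISO_DATETIME = re.compile(r"""
    (?P<year>\d{4}) -? (?P<month>\d{2}) -? (?P<day>\d{2})
    T
    (?P<hour>\d{2}) :? (?P<minute>\d{2}) :? (?P<second>\d{2})
    (?: \. (?P<fraction>\d+) )?
""", re.VERBOSE)

m = ISO_DATETIME.fullmatch("2004-12-08T08:19:17.25")
print(m.group("year"), m.group("minute"), m.group("fraction"))   # 2004 19 25

# The compact form without separators matches too; the optional group is None.
basic = ISO_DATETIME.fullmatch("20041208T081917")
print(basic.groupdict())
```

Here every group captures whether or not the caller wants it, which is the grammar-style default argued for above; the one-liner style would instead name only the pieces of interest.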
Re: continuation enhanced arcs
Leopold Toetsch [EMAIL PROTECTED] writes: Piers Cawley [EMAIL PROTECTED] wrote: Leopold Toetsch [EMAIL PROTECTED] writes: ... While S registers hold pointers, they have value semantics. Is that guaranteed? Because it probably needs to be. It's the current implementation and tested. This would restore the register contents to the first state shown above. That is, not only I and N registers would be clobbered also S registers are involved. That's correct. What's the problem? Okay, you've created an infinite loop, but what you're describing is absolutely the correct behaviour for a continuation. Ok. It's a bit mind-twisting but OTOH it's the same as setjmp/longjmp with all implications on CPU registers. C has the volatile keyword to avoid clobbering of a register due to a longjmp. Above code could only use P registers. Or in other words: I, N, and S registers are almost[1] useless. No they're not. But you should expect them to be reset if you take a (full) continuation back to them. The problem I have is: do we know where registers may be reset? For example: $I0 = 10 loop: $P0 = shift array dec $I0 if $I0 goto loop What happens if the array PMC's Cshift get overloaded and does some fancy stuff with continuations. My gut feeling is that the loop might suddenly turn into an infinite loop, depending on some code behind the scenes ($I0 might be allocated into the preserved register range or not depending on allocation pressure). Second: if we don't have a notion that a continuation may capture and restore a register frame, a compiler can hardly use any I,S,N registers because some library code or external function might just restore these registers. This is, of course, why so many languages that have full continuations use reference types throughout, even for numbers. And immutable strings...
Python method overloading (was: Premature pessimization)
Sam Ruby [EMAIL PROTECTED] wrote: Leopold Toetsch wrote: Here's the part that you snipped that addresses that question: And there is a piece that I haven't written yet that will do the reverse: if MMD_ADD is called on a PyObject that has not provided such behavior, then an any __add__ method provided needs to be called. Ok. But that would imply that HLL interoperbility isn't really possible. Or just at a minimal surface level. But see below. Since you provided an Evil Leo sample, let me provide an Evil Sam sample: d = { __init__: lambda self,x: setattr(self, value, x), __add__: lambda self,x: str(self.value) + str(x.value) } def dict2class(d): class c: pass c.__dict__.update(d) ^^^ This is the critical part of it. The __dict__ of your class provides the namespace. Setting a key in that namespace (or an attribute of your class with that key) has a special meaning in Python, *if* that key happens to be one of the method names. While the Python people aren't stopping to talk about the clearness of their language, nothing is clear and explicit, when it comes to overloading or metaclasses. Anyway, IMHO, class.__add__ = foo or your example manipulating class.__dict__ (another special attribute name!) is the point, where you can install Parrot semantics WRT method overloading. Now, given the above sample, let's revisit the statement that The Python translator needs just a translation table for the common core methods. We both know that's a simplification :) You've to install the methods of course ... How, exactly, would that be done? Given that the method name is simply a string... used as a key in dictionary... with a different parameter signature than the hypothetical Parrot __add method. The class.__dict__ dictionary is special. Setting an __add__ key too. The combined meaning is overloading. The different signature is a problem, yes - I've already mentioned that. 
And Parrot's __add method is not hypothetical :-) $ grep __add t/pmc/object*.t That's why I say: In the general case, looking for reserved method names at compile time doesn't work. __add__ is reserved in Python and corresponds directly to __add in Parrot. I don't think that doesn't work. __add__ is *not* reserved in Python. Does it matter if the name is actually reserved? The meaning is important. ... There just is some syntatic sugar that provide a shorthand for certain signatures. I am free to define __add__ methods that have zero or sixteen arguments. I won't be able to call such methods with the convenient shorthand, but other than that, they should work. I'd say, if you define an '__add__' method with 16 arguments, Python will throw an exception, if you try to use C+ with an object of that class: TypeError: myadd() takes exactly 16 arguments (2 given) So that's rather hypothetical. And if you always use x.__add__(16 args) Parrot will just run the function. I personally don't think that performance considerations should be out of bounds in these discussions I've already shown that it's possible to go with fully dynamic dispatch *and* 30% faster for MMD and 70% faster for overloaded operations. First correct and complete, then speed considerations. Neither of which match Python semantics. We are going to need a system where classes are anonymous, not global. Why? And how do you find your class then: c = C() ... 3 22 LOAD_NAME1 (C) 25 CALL_FUNCTION0 ... Where methods are properties that can be added simply by calling the equivalent of set_pmc_keyed. Nah. Methods aren't properties, but ... The set_pmc_keyed on __dict__ (or an equivalent setattribute call) of your type system is responsible to create Parrot semantics for method calls :-) - Sam Ruby leo
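The "class.__add__ = foo" point can be tried directly. What follows is a minimal sketch under Python 3, not the 2004 interpreter the thread was written against, so classic-class details may differ. The class Value and the function concat_add are hypothetical names made up for the illustration.

```python
class Value:
    """Hypothetical class, used only for this illustration."""

    def __init__(self, value):
        self.value = value


def concat_add(self, other):
    # Same behaviour as the __add__ lambda in the "Evil Sam" sample:
    # concatenate the string forms of both values.
    return str(self.value) + str(other.value)


a, b = Value(1), Value(2)

# Before anything is installed, the + operator has nothing to dispatch to.
try:
    a + b
    supported_before = True
except TypeError:
    supported_before = False

# Install the method by name at runtime. The name is just a string here,
# yet this assignment is the point where + acquires a meaning for Value.
setattr(Value, "__add__", concat_add)

result = a + b
```

The assignment on the class is the only hook: it is the one place where a runtime could notice that an overload has appeared.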
Re: continuation enhanced arcs
Piers Cawley [EMAIL PROTECTED] wrote:
> Leopold Toetsch [EMAIL PROTECTED] writes:
>> The problem I have is: do we know where registers may be reset? For example:
>>
>>     $I0 = 10
>>   loop:
>>     $P0 = shift array
>>     dec $I0
>>     if $I0 goto loop
>>
>> What happens if the array PMC's C<shift> gets overloaded and does some fancy stuff with continuations? My gut feeling is that the loop might suddenly turn into an infinite loop, depending on some code behind the scenes ($I0 might be allocated into the preserved register range or not depending on allocation pressure).
>>
>> Second: if we don't have a notion that a continuation may capture and restore a register frame, a compiler can hardly use any I,S,N registers because some library code or external function might just restore these registers.
>
> This is, of course, why so many languages that have full continuations use reference types throughout, even for numbers. And immutable strings...

So my conclusion that (in combination with restoring registers to the values of continuation creation) I,S,N registers are almost unusable is correct?

What about my proposal "Lexicals, continuations, and register allocation"? Would that provide proper semantics for continuations?

leo
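The loop concern can be modelled without Parrot at all. The sketch below is a toy simulation in Python, not Parrot code: a dict stands in for the register frame, "capturing a continuation" is assumed to mean taking a shallow copy of that frame, and "invoking" it means copying the values back. The function run_loop and its parameters are hypothetical.

```python
def run_loop(counter_in_register, max_steps=50):
    """Simulate the 'shift array / dec / if' loop from the thread.

    Returns the number of iterations needed to finish, or None if the
    loop is still running after max_steps iterations.
    """
    # "I0" models an I register (a plain value); "P0" models a P register
    # holding a reference to a mutable object, as a PMC would be.
    registers = {"I0": 10, "P0": [10]}
    snapshot = None

    for step in range(1, max_steps + 1):
        # Stand-in for an overloaded shift: the first call captures the
        # register frame, every later call restores that captured frame.
        if snapshot is None:
            snapshot = dict(registers)   # shallow copy: values are copied,
                                         # the list behind "P0" is shared
        else:
            registers = dict(snapshot)

        if counter_in_register:
            registers["I0"] -= 1         # undone by the next restore
            if registers["I0"] == 0:
                return step
        else:
            registers["P0"][0] -= 1      # survives, the object is shared
            if registers["P0"][0] == 0:
                return step

    return None


value_counter = run_loop(counter_in_register=True)
reference_counter = run_loop(counter_in_register=False)
```

Under these assumptions the value counter never reaches zero, while the counter reached through a reference finishes after ten iterations, which is the distinction both posters are pointing at.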
Re: [perl #32545] [PATCH] [TODO] remove Perl dependancy on split opcode
Attached is a patch that changes the split opcode to use an Array instead of a PerlArray. It also updates the documentation to note this. All the tests still pass, and a grep in the languages/ directory shows that no language implementations are affected.

- James

Will Coleda (via RT) wrote:
> # New Ticket Created by Will Coleda
> # Please include the string: [perl #32545]
> # in the subject line of all future correspondence about this issue.
> # URL: http://rt.perl.org:80/rt3/Ticket/Display.html?id=32545
>
> The split opcode currently uses a PerlArray to house its result. It should use a non-language specific class.

? classes/.array.pmc.swp
Index: ops/string.ops
===================================================================
RCS file: /cvs/public/parrot/ops/string.ops,v
retrieving revision 1.28
diff -u -r1.28 string.ops
--- ops/string.ops	28 Sep 2004 11:26:49 -0000	1.28
+++ ops/string.ops	6 Dec 2004 19:16:59 -0000
@@ -561,7 +561,7 @@
 
 =item B<split>(out PMC, in STR, in STR)
 
-Create a new PerlArray PMC $1 by splitting the string $3 with
+Create a new Array PMC $1 by splitting the string $3 with
 regexp $2. Currently implemented only for the empty string $2.
 
 =cut
@@ -589,7 +589,7 @@
 }
 
 op split(out PMC, in STR, in STR) :base_core {
-    PMC *res = $1 = pmc_new(interpreter, enum_class_PerlArray);
+    PMC *res = $1 = pmc_new(interpreter, enum_class_Array);
     STRING *r = $2;
     STRING *s = $3;
     int slen = string_length(interpreter, s);
@@ -599,6 +599,7 @@
         goto NEXT();
     if (string_length(interpreter, r))
         internal_exception(1, "Unimplemented split by regex");
+    VTABLE_set_integer_native(interpreter, res, slen);
     for (i = 0; i < slen; ++i) {
         STRING *p = string_substr(interpreter, s, i, 1, NULL, 0);
         /* TODO first set empty string, then replace */
Re: Python method overloading
Leopold Toetsch wrote: Sam Ruby [EMAIL PROTECTED] wrote: Leopold Toetsch wrote: Here's the part that you snipped that addresses that question: And there is a piece that I haven't written yet that will do the reverse: if MMD_ADD is called on a PyObject that has not provided such behavior, then an any __add__ method provided needs to be called. Ok. But that would imply that HLL interoperbility isn't really possible. Or just at a minimal surface level. But see below. I don't believe that to be the case. If a Perl subroutine were to call a Python function and pass a PerlInt as a parameter, the receiving function should expect to be able to do addition via tha + operator, but should not be expect to find an __add__ method on such objects. Instead, and if they cared to, they could explicitly call the __add method provided. The reverse should also be true, if Python function were to call a Perl subroutine and pass a PyInt as a parameter, the receiving subroutine should expect to be able to do addition via the + operator, but not expect to find an __add method on such objects. Instead, and if they cared to, they could explicitly call the __add__ method provided. I would consider that significant interoperability with only minimal restrictions. While the Python people aren't stopping to talk about the clearness of their language, nothing is clear and explicit, when it comes to overloading or metaclasses. Please don't do that. I am not trying to extoll the virtues of Python, merely trying to implement it. Anyway, IMHO, class.__add__ = foo or your example manipulating class.__dict__ (another special attribute name!) is the point, where you can install Parrot semantics WRT method overloading. Hold that thought. I'll answer this below. Now, given the above sample, let's revisit the statement that The Python translator needs just a translation table for the common core methods. We both know that's a simplification :) You've to install the methods of course ... 
Again, it can't be done exclusively at translation time. It needs to be done at runtime. And if it is done at runtime, it need not be done at translation time at all. More below. How, exactly, would that be done? Given that the method name is simply a string... used as a key in dictionary... with a different parameter signature than the hypothetical Parrot __add method. The class.__dict__ dictionary is special. Setting an __add__ key too. The combined meaning is overloading. The different signature is a problem, yes - I've already mentioned that. And Parrot's __add method is not hypothetical :-) $ grep __add t/pmc/object*.t Here I'll apologize for being unclear. Yes, there is code in the existing object class in support of Perl's S06. What's hypothetical is the presumption that all languages will adopt Perl 6's naming convention for methods. That's why I say: In the general case, looking for reserved method names at compile time doesn't work. __add__ is reserved in Python and corresponds directly to __add in Parrot. I don't think that doesn't work. __add__ is *not* reserved in Python. Does it matter if the name is actually reserved? The meaning is important. It does matter. Python classes are dictionaries of objects, some of which may be functions. You may extract objects from that dictionary and access them later. The meaning in such a scenario is not apparent until well after all interaction with the compile and runtime dictionaries is over. ... There just is some syntatic sugar that provide a shorthand for certain signatures. I am free to define __add__ methods that have zero or sixteen arguments. I won't be able to call such methods with the convenient shorthand, but other than that, they should work. I'd say, if you define an '__add__' method with 16 arguments, Python will throw an exception, if you try to use C+ with an object of that class: If I define an __add__ method with 16 arguments, Python will not throw an exception. 
I've already shown that it's possible to go with fully dynamic dispatch *and* 30% faster for MMD and 70% faster for overloaded operations. First correct and complete, then speed considerations. Neither of which match Python semantics. We are going to need a system where classes are anonymous, not global. Why? And how do you find your class then: c = C() ... 3 22 LOAD_NAME1 (C) 25 CALL_FUNCTION0 $pirate -d c.py ... find_lex $P0, 'C' $P1=$P0() store_lex -1, 'c', $P1 The important part isn't simply in which hash a given class name is looked up in, but that classes themselves in Python are transient objects subject to garbage collection. ... Where methods are properties that can be added simply by calling the equivalent of set_pmc_keyed. Nah. Methods aren't properties, but ... No? Try the following: x = abcdef.find print x('c') The set_pmc_keyed on __dict__ (or an equivalent setattribute
Re: continuation enhanced arcs
Leo~

On Wed, 8 Dec 2004 20:29:00 +0100, Leopold Toetsch [EMAIL PROTECTED] wrote:
> So my conclusion that (in combination with restoring registers to the values of continuation creation) I,S,N registers are almost unusable is correct?

I would disagree. Let me take the above example and work with it a little:

    $I0 = 10
  loop:
    $P0 = shift array
    dec $I0
    if $I0 goto loop

We are (for the moment) assuming that shift array somehow causes a full continuation to be taken and then invokes it in a subsequent call. Then this code would loop forever; however, so would this code, as the second call is returning through the first call's continuation:

    $P0 = shift array
    $P1 = shift array

On the other hand, if every call to shift array took a full continuation, did some stuff, and eventually returned through its return continuation, then neither would loop forever, as every call to shift array would have its own return continuation.

What this means is that care must be taken when you are writing code that you expect to be invoked multiple times. However, if you are a function that on your second invocation returns via the continuation from your first invocation, you should probably expect to be called again, because it happened the first time! If you are expecting other behavior, it is probably because one person wrote the whole chain of calls and had some extra knowledge about the caller. This author may have to be a little wary about value vs reference semantics, but programmers are fairly used to that pitfall by now.

Matt
--
Computer Science is merely the post-Turing Decline of Formal Systems Theory. -???
Re: Python method overloading
Sam Ruby [EMAIL PROTECTED] wrote:

[ snipped - all ok }

> If I define an __add__ method with 16 arguments, Python will not throw an exception.

I didn't write that. I've said: *if* you call it via a + b, Python throws an exception - that one I've shown. Anyway...

> If this is done at runtime, then it need not be done at compile time. ...

Yes. That's the overall conclusion, it seems. It can be done partially at compile time, but it isn't worth the effort to try it, because the languages we are targeting are too dynamic.

> However, it doesn't stop here. Just like methods can be added dynamically by name at runtime, they can be accessed dynamically by name. That means that all method lookups will need to be preceded by a hash lookup. And not just on Python objects, but *all* objects.

... preceded by some kind of lookup, which is defined by class->vtable->find_method() of the responsible metaclass. Be it one or 100 hash lookups in properties, dicts, globals and what not. It doesn't matter. Dot.

> That's why I object to characterizations like dynamic dispatch is 30% faster than
>
> What will ultimately result if it is mandated that all languages adopt Perl6's semantics is that an ADDITIONAL dynamic dispatch will be required to make non-Perl6 functions work.

You are still not getting the principle of the scheme, IMHO. It has nothing to do with Perl6 or any other language, nor with Python. The original subject: premature pessimization strikes back :)

We just do a dynamic lookup at runtime - that's all. The add opcode, for example, calls left->vtable->find_method(), and probably more if the returned result indicates MMD. Eventually one of the find_method calls returns a function that does implement the __add method for the involved types. Or a (possibly user provided) distance function decides which function to call. It doesn't matter.

Then the *runcore* calls the function and *caches* the function pointer. Next time the call is instantaneous, given that the language is able to call a cache invalidation function if method lookup order (for that class) changes.

The call to the invalidation function is possible, even for Python. *Iff* you can roll your own method dispatch, you eventually need to know which method you call. That has to be defined. You can as well call a cache invalidation function, if something changes here (I hope)

> PerlScalar's implementation of the add will know about how to implement Perl 6's multi sub *infix. PyObject won't, but it will know about Python's __meta__ and __init_class__.

If even an add instruction doesn't work outside of one HLL, we can forget any interoperability. __meta__ and what not Python semantics can be added - or not :-) But let's first concentrate on the basics.

> - Sam Ruby

leo
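The lookup-then-cache scheme described above can be sketched in a few lines. This is a toy model in Python and not Parrot's actual design: MethodCache, its find_method and invalidate methods, and the class Num are all hypothetical, and invalidation here only covers the one class it is given (a real scheme would also have to cover subclasses).

```python
class MethodCache:
    """Toy (class, method name) cache with explicit invalidation."""

    def __init__(self):
        self._cache = {}
        self.lookups = 0          # number of slow-path lookups performed

    def find_method(self, cls, name):
        key = (cls, name)
        if key not in self._cache:
            self.lookups += 1
            # Slow path: walk the method resolution order once.
            for klass in cls.__mro__:
                if name in vars(klass):
                    self._cache[key] = vars(klass)[name]
                    break
            else:
                raise AttributeError(name)
        return self._cache[key]

    def invalidate(self, cls):
        # To be called by the language whenever methods of cls change.
        for key in [k for k in self._cache if k[0] is cls]:
            del self._cache[key]


class Num:
    def __init__(self, n):
        self.n = n

    def __add__(self, other):
        return self.n + other.n


cache = MethodCache()
x, y = Num(2), Num(3)

first = cache.find_method(Num, "__add__")(x, y)    # slow path, then cached
second = cache.find_method(Num, "__add__")(x, y)   # served from the cache

# The class changes at runtime; without invalidation the cache is stale.
Num.__add__ = lambda self, other: self.n * other.n
stale = cache.find_method(Num, "__add__")(x, y)

cache.invalidate(Num)
fresh = cache.find_method(Num, "__add__")(x, y)
```

The stale result shows why the scheme depends on the language runtime calling the invalidation function whenever a method is replaced.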
Re: Python method overloading
Leopold Toetsch wrote:
> Sam Ruby [EMAIL PROTECTED] wrote:
>
> [ snipped - all ok }
>
>> If I define an __add__ method with 16 arguments, Python will not throw an exception.
>
> I didn't write that. I've said: *if* you call it via a + b, Python throws an exception - that one I've shown. Anyway...

What you wrote (and snipped) was "I'd say, if you define an '__add__' method with 16 arguments, Python will throw an exception,..."

To which I responded with the above.

> You are still not getting the principle of the scheme, IMHO. It has nothing to do with Perl6 or any other language, nor with Python.

Either that, or I *am* getting the principle of the scheme.

I guess that this is the point where I need to return back to writing code and test cases. Leo - at one point you indicated that you might be interested in helping to factor out the common code again. Please feel free to do so whenever you are ready. All I ask is that you don't break the test cases.

- Sam Ruby

P.S. No fair changing the test cases either. ;-)
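Both halves of the disputed statement can be checked separately. The sketch below assumes Python 3 and uses a three-parameter __add__ as a shorter stand-in for the sixteen-argument case; the class Odd is a hypothetical name.

```python
class Odd:
    # Defining __add__ with an unusual parameter list raises nothing;
    # the definition is an ordinary function stored under that name.
    def __add__(self, x, y):
        return ("added", x, y)


a = Odd()

# Calling the method explicitly with matching arguments simply runs it.
explicit = a.__add__(1, 2)

# Using the + shorthand passes exactly one operand, so the call fails.
try:
    a + 1
    operator_error = None
except TypeError as exc:
    operator_error = exc
```

So the definition and the explicit call succeed, and only the use through + raises TypeError, which is consistent with what each side wrote.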
Re: S05 question
Ashley Winters writes:
> I'm thinking capturing rules should be default in rules, where they're downright useful. Your hour/minute/second comment brings up parsing ISO time:
>
>     grammar ISO8601::DateTime {
>         rule year { \d4 }
>         rule month { \d2 }
>         rule day { \d2 }
>         rule hour { \d2 }
>         rule minute { \d2 }
>         rule second { \d2 }
>         rule fraction { \d+ }
>         rule date { year -? month -? day }
>         rule time { hour \:? minute \:? second [\. fraction]? }
>         rule datetime { date T time }
>     }
>
> For a grammar, that works perfectly!

Yep.

> In a one-liner, I'd rather just use:
>
>     $datetime ~~ /$year := (\d+) -? $month := (\d+) -? ./

Then go ahead and use that. If you're going to use subrules, you can either use the ?subrule form or just the regular old subrule form and ignore the result. There's nothing forcing you to pay attention to those. The number variables only get incremented when you use parentheses. I'd suspect that the return value of a rule only accounts for parenthesized captures as well.

Or are you asking something different than that?

Luke
Anon-repo access for libtap
I've set up anonymous read-only access to my Subversion repo, so anyone that wants to play with libtap easily can now: svn checkout svn://jc.ngo.org.uk/nik/libtap/trunk/ Share and enjoy. N
Re: S05 question
Warning: excessive nitpicking ahead.

Ashley Winters skribis 2004-12-08 10:51 (-0800):
> rule year { \d4 }

\d**{4}

Or, well, \d**{2,4}

> rule month { \d2 }

\d**{2}

> rule date { year -? month -? day }

    rule week { \d**{2} }
    rule yday { \d**{3} }
    rule date { year [ -? [ yday | [ [ Wweek | month ] [ -? day ]? ] ] ]? }  # :)

> rule time { hour \:? minute \:? second [\. fraction]? }

Likewise making parts optional, and . can also be ,.

> rule datetime { date T time }

    rule timezone { Z | [+-] hour [ \:? minute ]? }
    rule datetime { date [ T time timezone? ]? }

And still this isn't a full ISO8601 grammar. But it now covers every notation that I have seen in the wild so far.

A useful source of information, apart from the ISO standard itself, is DateTime-Format-ISO8601.

Juerd
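For comparison, the named-capture idea behind these rules can be tried with named groups in a present-day regex engine. The sketch below uses Python's re module and covers only a subset of the notations discussed above (calendar dates, an optional time with "." or "," before the fraction, an optional timezone); week and ordinal dates are left out. The names ISO_DATETIME and parse are hypothetical, and this is not an attempt at a full ISO 8601 parser.

```python
import re

# Verbose mode lets the pattern be laid out roughly like the grammar:
# one named group per rule, whitespace in the pattern is ignored.
ISO_DATETIME = re.compile(
    r"""
    (?P<year>\d{4}) -? (?P<month>\d{2}) -? (?P<day>\d{2})
    (?: T
        (?P<hour>\d{2}) :? (?P<minute>\d{2}) :? (?P<second>\d{2})
        (?: [.,] (?P<fraction>\d+) )?
        (?P<timezone> Z | [+-] \d{2} (?: :? \d{2} )? )?
    )?
    """,
    re.VERBOSE,
)


def parse(text):
    """Return a dict of the named captures, or None if text does not match."""
    match = ISO_DATETIME.fullmatch(text)
    return match.groupdict() if match else None


full = parse("2004-12-08T10:51:00-08:00")
compact = parse("20041208")
comma_fraction = parse("2004-12-08T10:51:00,5Z")
incomplete = parse("2004-12")
```

Groups that did not take part in the match come back as None, so the caller can ignore the captures it does not care about, much as the thread suggests for unused subrule results.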
Re: Is object representation per class or per object?
On Tue, Dec 07, 2004 at 12:32:50PM -0500, Abhijit Mahabal wrote: : According to S12, it is possible to supply the object layout to bless(), : like so: : : $object = $class.bless(:CREATE[:reprP6opaque] :k1($v1) :k2($v2)) : : But in the section Introspection, layout is a class trait. Does this : mean that classes have a default layout that can be overriden for : individual objects? Er, no. It's probably just a braino. If it works at all, I think it's probably for when the class doesn't specify a layout, or has a meta-layout that can handle multiple layouts. It might not even make sense for that. In general, a class should have a consistent layout. I think I was thinking about the fact that Perl 5's bless can just use whatever data structure you hand it. So maybe $object = $class.bless(:CREATE[:reprP6Hash] :k1($v1) :k2($v2)) is equivalent to $object = $class.bless({}, :k1($v1) :k2($v2)) But mostly I was just looking for an example option to pass to :CREATE. Perhaps :repr is a bit too violent for that. Larry
Re: S05 question
On Wed, Dec 08, 2004 at 11:09:30AM -0700, Patrick R. Michaud wrote: : On Wed, Dec 08, 2004 at 08:19:17AM -0800, Larry Wall wrote: : And people would have to get used to seeing ? as non-capturing assertions: : ?before ... : ?after ... : ?ws : ?sp : ?null : This has a rather Ruby-esque I am a boolean feeling to it. I think : I like it. It's pretty easy to type, at least on my keyboard. : : FWIW, for some reason in rule contexts I tend to conflate : I am a boolean feelings with zero-width assertion, so that each : of those look vaguely to me as though I'm testing a zero-width : proposition and not consuming any text. And I still tend to think of : '?' in it's zero or one matches or minimal match connotations. : Oh well, I suppose I could get used to that. Yes, there are those interferences, which was one of the reasons for removing ? the last time we had it in that position (albeit on the captures rather than the non-captures). I think we'll have to let it set a while to see how it feels in this role. For the purpose of being a non-alpha no-op, any other non-alpha character would do as well, so maybe the I am a boolean feeling is not that useful. : Now suppose that we extend that I am a boolean feeling to : ?{ code } : which might take the place of the confusing (...), and make consistent : the notion that we always use {...} to invoke real code. : : Hmm, this is nice, however. In some ways, and not so nice in others, as Luke pointed out. : Another problem we've run into is naming if there are multiple assertions : of the same name. If the capture name is just the alpha part of the : assertion, then we could allow an optional number, and still recognize : it as a ws: : ws1 ws2 ws3 : Except I can well imagine people wanting numbered rules. Drat. Could : force people to say ws_1 if they want that, I suppose. 
: : I had been thinking that : : /ws foo ws bar/ : : would simply cause $ws to be a list of captured elements, similar to : what might happen for $1 in : : / [ (.*?) , ]* / That's what happens by default whenever there is a name conflict. This would just be a way of giving a rule a long name as well as a short one, much like abscomplex is the long name of abs when dispatched on a complex number, whereas abs is just the set of all abs() multis, if there is such a beastie. : If someone really needs the contents of the first and second ws, they : could do : :(ws) foo (ws) : : and get them as $1 and $2. But, seeing this tells me that perhaps : (rule) should be used for capturing rules, analogous to the : capturing parens, and leave rule to be the non-capturing version. : But maybe that's anti-Huffman overall. Maybe the parens could also : help for disambiguating : :(ws) foo (ws) : : so that we end up with $/ws[1], $/ws[2], etc. But then we might : have to always subscript our named captures, which is icky, or maybe : we'd only make $/ws act like list when there's more than one : capturing (ws) in the rule. : : I dunno. I kinda like (rule) for capturing, but maybe it just : doesn't work. I thought about that a long time, which was part of the reason I also thought about freeing up (...). But it just seems a little icky to mix together the named captures and numbered captures visually if not semantically. It starts not being at all clear which parentheses count and which ones not. Which is perhaps another reason for changing current (...) to ?{...}. We could, I suppose use a subscript inside: ws[0] foo ws[1] ws«first» foo ws«second» but then you'd reference it as $ws[0] $wsfirst which is a gratuitous difference, and suffers the same problem as the parenthese in confusing real arrays/hashes with sorta fake ones. So I think we'll stick with the hyphen names for now, which have the benefit of looking the same and not sending us to bracket heaven. 
ws-1 foo ws-2 ws-first foo ws-second $ws-1 $ws-first Larry
Re: S05 question
On Wed, Dec 08, 2004 at 11:50:51AM -0700, Luke Palmer wrote: : Now suppose that we extend that I am a boolean feeling to : : ?{ code } : : which might take the place of the confusing (...), and make consistent : the notion that we always use {...} to invoke real code. : : Hmm... I'm just so attached to (...). I find it quite beautiful. It : also somehow communicates the feeling you shouldn't be putting : side-effects here. Well, there is that. On the other hand, {...} is usually just as side-effect free. I'm still of two minds about ?{...} vs (...). Course, if we used «...» to interpolate something then «{...}» might interpolate a rule, which would free up {...} for the code assertion. Doesn't have your side-effectlessness feeling, but it is at least symmetrical. : I think I'm leaning toward the idea that anything in angles that : begins alpha is a capture to just the alpha part, so the ? prefix is : merely a no-op that happens to make the assertion not start with an : alpha. Interestingly, that gives these implicit bindings: : : after ... $after$` : before ...$before $' : : I don't quite follow. Wouldn't that mean that these guys would get : clobbered if you used lookaheads or lookbehinds in your rules? The point is that you don't get the $`/$' equivalents unless you explicitly put a lookbehind/lookahead assertion in your pattern: /after .* foo before .*/ That has the benefit of telling the rule engine when it has to worry about saving the prefix/postfix. Not knowing that is part of why we had the sawampersand problem in Perl 5. My other point is that the Perl 6 names of $` and $' fall out naturally if we name the assertions appropriately. Unfortunately, $after and $before don't work as well for variable names as they do for assertion names. Maybe we just have pre and post forms that really mean after .* and before .*. : Or we could use some standard delim for that: : : ws-1 ws-2 ws-3 : : which is vaguely reminiscent of our version syntax. 
Indeed, if we : had quantifications, you might well want to have wildcards ws-* and : let the name be filled in rather than autogenerating a list. But : maybe we just stick with lists in that case. : : I can imagine this being a lot cleaner if the thing after the dash can : be any sort of identifier: : : ws-indent if ?ws condition ws-comment Funny thing, I just wrote that into S05.pod. : On the other hand, it could be misleading, since the standard naming of : BNF uses dashes instead of underscored. I don't think it should be a : big problem though. Me either, since it's difficult to define a rule with a hyphen in the name. And other delimiter candidates run into various problems too. Larry
Re: Pipe dream - Devel::Cover::Regex
On Tue, Dec 07, 2004 at 11:33:54AM -0800, Kevin Scaldeferri wrote:
> I'm wondering if I'm the only one who would love to see Devel::Cover::Regex? Many (most?) perl programs are pretty regex heavy, and if we are honest with ourselves, we have to admit that each regex is actually a program in itself. You can try to throw lots of inputs at it and hope that you were thorough enough, but most of us aren't that good at figuring out all the crazy ways a regex could execute. I think this would be a very useful extension to Devel::Cover, although I imagine that it's pretty tricky to do. Even figuring out how to display the results might be tough to do well.

This is something I mentioned early in the development of Devel::Cover. I think the display should map fairly well into the statements, branches and conditions we have at the moment.

  Atoms map to statements.
  Quantifiers map to branches.
  Alternation maps to conditions.

It won't be quite that simple of course, but I think that should be the basics.

> Occasionally I have fantasies of having enough free time to really dig into the internals of the regex engine and trying to do this, but to be honest I don't really see it happening for me. So, I figure the next best thing is to throw this idea out here and see if anyone else runs with it.

Micheal suggested mjd's Rx might be useful. Jeff Pinyan's Regexp::Parser might also help as a base.

--
Paul Johnson - [EMAIL PROTECTED]
http://www.pjcj.net
svn
Is there a plan at any point to move to an svn repository from cvs? I'd like to work on a patch to move all the perl* pmcs into dynclasses, which would involve quite a bit of file moving, and I'll happily wait for svn if we're going that way, since it'll be smoother.
Re: Test labels
On Mon, Dec 06, 2004 at 10:28:45PM -0600, Andy Lester wrote:
> I think even better than
>
>     ok( $expr, name );
>
> or
>
>     ok( $expr, comment );
>
> is
>
>     ok( $expr, label );
>
> RJBS points out that comment implies not really worth doing, and I still don't like name because it implies (to me) a unique identifier. We also talked about description, but description is just so overloaded.

I prefer name or label to comment. Name does not imply 'unique' for me, just like 'John Smith' is not expected to be a unique name of a person.

Mark

--
http://mark.stosberg.com/
Re: svn
Will~

On Wed, 08 Dec 2004 19:19:07 -0500, William Coleda [EMAIL PROTECTED] wrote:
> Is there a plan at any point to move to an svn repository from cvs? I'd like to work on a patch to move all the perl* pmcs into dynclasses, which would involve quite a bit of file moving, and I'll happily wait for svn if we're going that way, since it'll be smoother.

While I personally like the idea, I think it is unlikely given how much slower svn is on sizable repositories. Of course I have not tried it recently, so maybe that has changed...

All that being said, I am in absolutely no position of authority about this...

Matt
--
Computer Science is merely the post-Turing Decline of Formal Systems Theory. -???
Re: S05 question
On Wed, 8 Dec 2004 16:07:43 -0700, Luke Palmer [EMAIL PROTECTED] wrote: Ashley Winters writes: For a grammar, that works perfectly! Yep. In a one-liner, I'd rather just use: $datetime ~~ /$year := (\d+) -? $month := (\d+) -? ./ Then go ahead and use that. If you're going to use subrules, you can either use the ?subrule form or just the regular old subrule form and ignore the result. There's nothing forcing you to pay attention to those. The number variables only get incremented when you use parentheses. I'd suspect that the return value of a rule only accounts for parenthecized captures as well. I was working on the (possibly misguided) assumption that there's a cost to capturing, and that perhaps agressive capturing isn't worth having on in a one-liner. Some deep part of my mind remembers $` being bad, I think. If there's no consequence to having capture being on, then ignoring it is fine. I don't have a problem with that. As I said before, ?foo reads fine to me. I'm still going to prefer using :=, simply as a good programming practice. My mind sees a big difference between building a parse-tree object and just grepping for some word I want in a string. Within a rule{} block, there is no place except the rule object to keep your data (hypothetically -- haha), so it makes sense to have everything capture unless otherwise specified. There's no such limitation in a regular code block, so I don't see the need. I may change my mind after using $/URI::URLpath_segment[2] Ashley Winters
Re: svn
While I personally like the idea, I think it is unlikely given how much slower svn is on sizable repositories. Of course I have not tried it recently, so maybe that has changed... All that being said, I am in absolutely no position of authority about this... This is, and always has been (since 1.0 at least), a myth. We have always been at war with Apache has moved most of their projects to SVN. It's probably ready. -R
Re: S05 question
On Wed, 8 Dec 2004 16:07:43 -0700, Luke Palmer [EMAIL PROTECTED] wrote: Ashley Winters writes: In a one-liner, I'd rather just use: $datetime ~~ /$year := (\d+) -? $month := (\d+) -? ./ I'm starting to think that this '$year := ' syntax is an obfuscator. We couldn't refer to that capture with $year even inside a regex, right? We should use $year instead. Maybe $year := (\d+) would be less obfuscating.. but it's longer :) (year:= \d+) and [year:= \d+] are somewhat better, IMHO, but I'm not sure if : in := is unambiguous here. But if /year/ and /$year:=.../ both capture to $year, why not make those two more similar? Things like year:\d+ or year[\d+] or year: [\d+] come to mind. Or that (now unused) year [\d+] Then go ahead and use that. If you're going to use subrules, you can either use the <?subrule> form or just the regular old <subrule> form and ignore the result. There's nothing forcing you to pay attention to those. The number variables only get incremented when you use parentheses. I'd suspect that the return value of a rule only accounts for parenthesized captures as well. ..and ignore the result? Hm. What if someone lazy puts $a ~~ /<rule>/ instead of $a ~~ /<?rule>/: would there be any copying overhead after $a = something else (to keep $rule, which he isn't even going to use)? (Some perl5 programmers use (...) where (?:...) would be sufficient, just because they are too lazy to type the extra two characters, and because it's noisier.) <?rule> is better than <rule> for noncapturing behaviour in that sense, but I could imagine those <?ws> everywhere.. um, just moaning.. maybe old, nonswapped behaviour, was better: ws to not capture, ws to capture (I don't think and are appropriate.
Re: svn
On Wed, Dec 08, 2004 at 10:16:21PM -0500, Matt Fowles wrote: While I personally like the idea, I think it is unlikely given how much slower svn is on sizable repositories. Of course I have not tried it recently, so maybe that has changed... If you wish to try out a recent Subversion on some sizable source, there's a mirror of the maint and bleadperl Perforce repositories here: http://svn.clkao.org/svnweb/perl You can pull them out using svn://svn.clkao.org/perl Subversion has improved a lot. I'm using it now. If you do try it, I recommend going straight to 1.1.1 and using fsfs-based repositories. Keep in mind that SVN is slower on checkouts than CVS. However, diff is a purely local operation. And if you're using something like SVK, network traffic isn't much of an issue after the initial mirror. -- Michael G Schwern [EMAIL PROTECTED] http://www.pobox.com/~schwern/ Now we come to that part of the email you've all been waiting for--the end.
RE: C implementation of Test::Harness' TAP protocol
--- Clayton, Nik wrote: Any "Writing thread safe libraries for dummies" texts you could point me at?

I recommend "Programming with POSIX Threads" by David Butenhof.

Re the varargs ok() business, I assume you'll be using some sort of config.h with your libtap library. Any plans on using autoconf or a similar tool? One way around this __VA_ARGS__ portability issue is to let configure work it out and write your code for some sort of VA_ARGS capability. There doesn't appear to be a standard autoconf symbol for this; at least I couldn't find one. Googling for HAVE_VA_ARGS uncovered only two dubious hits. http://gcc.gnu.org/ml/gcc-help/2004-05/msg00181.html asked a question about this issue, but got no response.

I noticed this in an "issue with glib.h" gtk-devel-list thread:

I think we should just use the __STDC_VERSION__ define -- no need for autoconf.

    #if defined __STDC_VERSION__ && __STDC_VERSION__ >= 199901L
    # define g_message(...) g_log (DOM, LOG_MSG, __VA_ARGS__)
    #elif defined __GNUC__
    # define g_message(format_args...) g_log (DOM, LOG_MSG, format_args)
    #else
    ...
    #endif

Finally, the ACE C++ library uses:

    #if defined (__GNUC__) && (__GNUC__ >= 3 || __GNUC_MINOR__ > 95)
    // use GNU __VA_ARGS__ capability
    ...

HTH, /-\
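To make the __STDC_VERSION__ approach concrete for a TAP-style ok(), here is a minimal sketch. It is not taken from libtap: the names tap_count, tap_vok and tap_ok are hypothetical, and the output format is only an approximation of TAP. The idea is to keep the real work in an ordinary variadic function (which any C89 compiler accepts) and let the preprocessor decide only how the ok() wrapper is spelled.

```c
#include <stdarg.h>
#include <stdio.h>

/* All names below (tap_count, tap_vok, tap_ok) are hypothetical and
   chosen for illustration; they are not from any libtap release. */

static int tap_count = 0;   /* number of tests reported so far */

/* Core reporter: takes a va_list, so it needs no variadic macros. */
static int tap_vok(int passed, const char *fmt, va_list ap)
{
    printf("%sok %d - ", passed ? "" : "not ", ++tap_count);
    vprintf(fmt, ap);
    putchar('\n');
    return passed;
}

/* Portable entry point: a plain variadic function. */
static int tap_ok(int passed, const char *fmt, ...)
{
    va_list ap;
    int result;

    va_start(ap, fmt);
    result = tap_vok(passed, fmt, ap);
    va_end(ap);
    return result;
}

#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
/* C99 or later: standard variadic macro. */
# define ok(...) tap_ok(__VA_ARGS__)
#elif defined(__GNUC__)
/* Older GCC: named variadic macro extension. */
# define ok(args...) tap_ok(args)
#else
/* No variadic macros at all: ok is just another name for the function. */
# define ok tap_ok
#endif
```

A call such as ok(x == 2, "x is %d", x) then compiles the same way under all three branches. The function-alias fallback works only because this wrapper adds nothing of its own; a macro that also wanted to pass __FILE__ and __LINE__ would need a different fallback.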