date:20041208

Re: continuation enhanced arcs

2004-12-08 Thread Leopold Toetsch

Piers Cawley [EMAIL PROTECTED] wrote:
 Leopold Toetsch [EMAIL PROTECTED] writes:

  ... While S registers hold pointers, they have
  value semantics.

 Is that guaranteed? Because it probably needs to be.

It's the current implementation and tested.

  This would restore the register contents to the first state shown above.
  That is, not only I and N registers would be clobbered; S registers
  are involved as well.

 That's correct. What's the problem? Okay, you've created an infinite
 loop, but what you're describing is absolutely the correct behaviour for
 a continuation.

Ok. It's a bit mind-twisting but OTOH it's the same as setjmp/longjmp
with all implications on CPU registers. C has the volatile keyword to
avoid clobbering of a register due to a longjmp.

  Above code could only use P registers. Or in other words: I, N, and S
  registers are almost[1] useless.

 No they're not. But you should expect them to be reset if you take a
 (full) continuation back to them.

The problem I have is: do we know where registers may be reset? For
example:

    $I0 = 10
  loop:
    $P0 = shift array
    dec $I0
    if $I0 goto loop

What happens if the array PMC's C<shift> gets overloaded and does some
fancy stuff with continuations? My gut feeling is that the loop might
suddenly turn into an infinite loop, depending on some code behind the
scenes ($I0 might be allocated into the preserved register range or not
depending on allocation pressure).
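
The worry can be made concrete with a toy model. This is a Python sketch, not Parrot code: the Frame and Continuation classes are hypothetical stand-ins, and it assumes a "full" continuation simply snapshots the register frame and copies it back when invoked. Under that assumption a counter held by value in an I register is reset, while a counter reached through a P register keeps its progress, because only the reference is restored.

```python
class Frame:
    """Toy register frame: I registers hold plain ints, P registers hold references."""
    def __init__(self):
        self.I = [0, 0]
        self.P = [None, None]


class Continuation:
    """Hypothetical 'full' continuation: snapshots the frame when created."""
    def __init__(self, frame):
        self.frame = frame
        self.saved_I = list(frame.I)  # ints are copied by value
        self.saved_P = list(frame.P)  # only the references are copied

    def invoke(self):
        # Restore the register contents captured at creation time.
        self.frame.I[:] = self.saved_I
        self.frame.P[:] = self.saved_P


def demo():
    frame = Frame()
    frame.I[0] = 10                  # loop counter in an I register
    frame.P[0] = {"count": 10}       # same counter behind a P register
    k = Continuation(frame)          # e.g. taken inside an overloaded shift
    for _ in range(3):               # three loop iterations
        frame.I[0] -= 1
        frame.P[0]["count"] -= 1
    k.invoke()                       # library code re-enters the continuation
    return frame.I[0], frame.P[0]["count"]


print(demo())  # (10, 7)
```

The I-register counter is back at 10, so a loop driven by it would never terminate; the P-register counter still reads 7.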

Second: if we don't have a notion that a continuation may capture and
restore a register frame, a compiler can hardly use any I,S,N registers
because some library code or external function might just restore these
registers.

 Presumably if foo() doesn't store a full continuation, the restoration
 just reuses an existing register frame and, if foo has made a full
 continuation its return does a restore by copying?

Yes, that should be a reasonable implementation.

leo

Re: Premature pessimization

2004-12-08 Thread Leopold Toetsch

Sam Ruby [EMAIL PROTECTED] wrote:
 Leopold Toetsch wrote:

 So *all* lookups (complete with the asterisks) does not mean *all* lookups.

 How about invoke?

Let's first concentrate on simpler stuff like infix operators.

 Citing S06: Operators are just subroutines with special names.

 That statement is true for Perl.  Same statement is true for Python.
 But the names vary based on the language.

Yes. So let's factor out the common part and have that in Parrot core,
usable for Python and Perl and ...

The PyInt PMC currently duplicates almost all functionality that
*should* be in the Integer PMC. We have first to fix the Integer PMC to
do the Right Thing. Then we need some syntax for multiple inheritance in
PMCs. The same holds for other PMCs. It was already proposed that we
should have a language-neutral Hash PMC.

So given that we have a set of language-neutral PMCs in core that do the
right thing, Python or Perl PMCs can inherit a lot of functionality from
core PMCs. Language-specific behavior is of course implemented in the
specific PMC.

Second: method dispatch. I've looked a bit into PyObject. It seems that
you start rolling your own method dispatch. Please don't get me wrong,
I'm not criticizing your implementation. It might also be needed for
some reasons I'm just overlooking and it's currently needed because core
functionality isn't totally finished.

Anyway - and please correct me if my assumptions are not true - I'll try
to factor out the common part again.

You have in PyObject e.g.:

METHOD PMC* __add__(PMC *value) {
PMC * ret = pmc_new(INTERP, dynclass_PyObject);
mmd_dispatch_v_ppp(INTERP, SELF, value, ret, MMD_ADD);
return ret;
}

I see six issues with that kind of approach:

* The __add method should be in Parrot core. That's what I've
  described in the MMD dispatch proposal.
* the method is returning a new PMC. This doesn't follow the signature
  of Parrot infix MMD operations.
* well, it's dispatching twice. First the __add__ method for
  PyObjects has to be searched for then the mmd_dispatch is done.
* it'll very likely not work together with other HLLs. It's a
  python-only solution.
* rolling your own dispatch still doesn't help, if a metaclass
  overloads the C<+> operation
* code duplication

So how would I do it:

* prelim: above mentioned core PMC cleanup is done. Inheritance works:
  a PyInt isa(PyObject, Integer)
* the core PMCs define methods, like your __add__ except that our
  naming convention is __add. The Python translator needs just
  a translation table for the common core methods.
* Method dispatch is done at the opcode level.

  add Px, Py, Pz

  just does the right thing. It calls the thingy that implements the
  __add method, being in core or overloaded shouldn't and doesn't
  matter. If inheritance changes at runtime it just works.

  And the other way round:

  Py.__add(Pz, Px)

  is the same. Again it doesn't matter, if it's a core PMC, a Python
  PMC or an overloaded PASM/PIR multi sub (or a Python metaclass).
  The only difference is the changed signature. But that's how Parrot
  core defines overloaded infix operations.

We have to do that anyway. It's just the correct way to go.

(And please no answers WRT efficiency ;-)

leo

Devel::Cover cover command uses too much memory

2004-12-08 Thread Jason Remillard

Hi,

I ran the codestriker (http://codestriker.sourceforge.net/) test set
using Devel::Cover. The test cases ran over a day and a half
and generated a cover_db directory that is 127 megs. Attempting to run
the cover command keeps using up all of the available memory causing
cover to be killed by the OS. I have my swap file up to 1 gig, and after
two days of the computer swapping its brains out, it still was not
enough memory. 

How much memory is needed for cover to process a 126 meg cover_db?
Are there any switches or other tricks I could do to reduce the memory
consumption of cover?

If somebody is willing to work on the problem, I can zip up the
directory and send it to them for testing. 

I am using cover 0.51, on Debian 3.0. Perl is version 5.6.1.

Otherwise it did seem to work ok on my previous smaller test runs.

Lastly, some documentation on how to use it with a normal CGI script
would be helpful. The way I finally got it to work was to rename
codestriker.pl (the main cgi perl script), to codestriker_test.pl. 
Write a new codestriker.pl that just does a system call with the 
Devel::Cover switch. Perl would not let me add it to the 
#!/usr/bin/perl line at the start of the script. I would be interested 
in knowing if a cleaner way is possible, as this is kind of lame.

Thanks
Jason.

Re: C implementation of Test::Harness' TAP protocol

2004-12-08 Thread muppet

On Dec 7, 2004, at 9:25 PM, Andrew Savige wrote:
 /* Horrible hacky thread-unsafe version but no XX  */
 ...
 static const char* g_file;
 static unsigned long g_line;

i forgot to mention, the way around the non-thread-safety here is to 
use thread-local storage.
c.f.  pthread_key_create() and pthread_getspecific().

for a similarly evil trick, the GNU C library defines the global errno 
like this:

   /* function that fetches the address of the calling thread's errno
      from TLS */
   int * __errno_location (void);

   #define errno  (*__errno_location())
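
For readers who want to see the idea without pthreads, here is a small Python analogue (an illustration only; libtap itself is C, and set_location/get_location are hypothetical names). threading.local() gives each thread its own copy of the attributes, much as pthread_getspecific() returns a per-thread value for a key.

```python
import threading

_location = threading.local()   # each thread sees its own attributes


def set_location(file, line):
    _location.file = file
    _location.line = line


def get_location():
    # Threads that never called set_location get the defaults.
    return getattr(_location, "file", None), getattr(_location, "line", 0)


def demo():
    set_location("main.t", 1)
    seen = {}

    def worker(name, line):
        set_location(name, line)
        seen[name] = get_location()

    threads = [threading.Thread(target=worker, args=("t%d.t" % i, 10 + i))
               for i in range(3)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # The main thread's values are untouched by the workers.
    return seen, get_location()
```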
--
He's so good, you're gonna rock, and if you don't rock, it's your own 
fault.
  -- kk, describing the perks of having a very good drummer.

RE: C implementation of Test::Harness' TAP protocol

2004-12-08 Thread Clayton, Nik

Clayton, Nik wrote:
 You might want to throw it in as an option.  I'm going to change
 Test::More so it no longer mucks with the exit code by default, you'll
 have to turn this feature on.
 
 OK.  I'll track changes to Test::Harness, and libtap'll stop
 doing it when T::H stops.

Or, more simply, I'll just document it as something the test author can do
for completeness, if they're so inclined, but that it's not mandatory.

N
-- 
11 2 3 4 5 6 77
 0 0 0 0 0 0 05
-- The 75 column-ometer
Not speaking on behalf of my employer.  /bush

base scalar PMC semantics

2004-12-08 Thread Leopold Toetsch

First, there was some discussion not too long ago:
 Subject: Numeric semantics for base pmcs [1]
 Subject: Last bits of the basic math semantics
The current Integer PMC doesn't yet follow the results of these threads.
Basic behavior of that type is Perl6 or Python semantics, which is: it's 
basically an arbitrary precision integer, like Python's int/long type 
after merging. To achieve this functionality it silently morphs results 
to a Big type capable of doing the arbitrary precision.

The summary in [1] also mentions type coercion:
10) The destination PMC is responsible for final conversion of the 
inbound value

E.g. when we have
MMD add(PyInt + PyInt)
a) no overflow: VTABLE_set_integer_native(interp, dest, the_sum)
the set_integer_native vtable is responsible for converting the C<dest> PMC 
into a PyInt. For Perl types it'll be PerlInt. And base PMCs use 
Integer. Following strictly this scheme does allow the inheritance of 
all common functionality.

b) overflow:
   if (self == dest) {
      VTABLE_set_bignum(interp, self, self.intval)
      // redispatch
   }
   else {
      VTABLE_set_bignum(interp, dest, self.intval)
      temp = new dest.type
      VTABLE_set_bignum(interp, temp, self.intval)
      // redispatch
   }
or a similar scheme.

Float and String need the same refactoring, but that's simpler.
To use that functionality we need a better notion for multiple 
inheritance inside PMCs.

  PerlInt isa (PerlAny, Integer)
  PyInt   isa (PyObject, Integer)
Comments?
leo

Re: Premature pessimization

2004-12-08 Thread Sam Ruby

Ah!  Now we are getting somewhere!
Leopold Toetsch wrote:
 Sam Ruby [EMAIL PROTECTED] wrote:
 Leopold Toetsch wrote:

 So *all* lookups (complete with the asterisks) does not mean *all* lookups.

 How about invoke?

 Let's first concentrate on simpler stuff like infix operators.

OK, but the point is that there will always be multiple mechanisms for 
dispatch.

 Citing S06: Operators are just subroutines with special names.

 That statement is true for Perl.  Same statement is true for Python.
 But the names vary based on the language.

 Yes. So let's factor out the common part and have that in Parrot core,
 usable for Python and Perl and ...

 The PyInt PMC currently duplicates almost all functionality that
 *should* be in the Integer PMC. We have first to fix the Integer PMC to
 do the Right Thing. Then we need some syntax for multiple inheritance in
 PMCs. The same holds for other PMCs. It was already proposed that we
 should have a language-neutral Hash PMC.

No question that that is the intended final goal.  What you see in the 
current python dynclasses is not representative of that final goal.

So, why have I proceeded in this manner?  Two reasons.
First, I am not about to make random, unproven changes to the Parrot 
core until I am confident that the change is correct.  Cloning a class 
temporarily gives me a playground to validate my ideas.

Second, I am not going to wait around for Warnocked questions and 
proposals to be addressed.

Now, neither of the above are absolutes.  You have seen me make changes 
to the core - but only when I was relatively confident.  And I *have* 
put on hold trying to reconcile object oriented semantics as this is 
both more substantial and seemed to be something that was likely to be 
addressed.

Also, while I am not intending to make speculative changes to the core 
of Parrot, I don't have any objections to anybody making changes on my 
behalf.  If you see some way of refactoring my code, go for it.  It 
isn't mine - it is the community's.

The one thing I would like to ask is that test cases that currently pass 
continue to pass.  The dynclass unit tests are part of the normal test. 
 Additionally, the tests in languages/parrot have been the ones driving 
most of my implementation lately.

I do realize that that means checking out Pirate.  Even though I don't 
agree with it, I do understand Michal's licensing issues.  The reason I 
am not investing much time in resolving this issue is that Pirate is 
exactly one source file and could quickly be rewritten using the Perl 6 
Grammar engine once that functionality becomes sufficiently complete.

 So given that we have a set of language-neutral PMCs in core that do the
 right thing, Python or Perl PMCs can inherit a lot of functionality from
 core PMCs. Language-specific behavior is of course implemented in the
 specific PMC.

Agreed.  One area that will require a bit more thought is error cases. 
The behavior of integer divide by zero is likely to be different in each 
language.  This could be approached in a number of different ways.  One 
is by cloning such methods, like I have done.  Another is to wrap such 
methods, catch the exception that is thrown, and handle it in a language 
specific manner.

A better approach would be for the core to call out to a method on such 
error cases.  Subclasses could simply inherit the common core behavior 
and override this one method.  It also means that the normal execution 
path length (i.e., when dividing by values other than zero) is optimal, 
it is only the error paths that involve extra dispatches.
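
A minimal Python sketch of that call-out idea, assuming a hook method named on_divide_by_zero (a hypothetical name, as are both classes; nothing here is Parrot API). The normal path costs one comparison, and only the error path goes through the extra, overridable dispatch.

```python
class Integer:
    """Stand-in for a core integer type with an overridable error hook."""
    def __init__(self, value):
        self.value = value

    def on_divide_by_zero(self, dividend):
        # Assumed core default: raise an exception.
        raise ZeroDivisionError("integer division by zero")

    def floor_divide(self, other):
        if other.value == 0:
            # Error path only: one extra dispatch to the hook.
            return self.on_divide_by_zero(self)
        return type(self)(self.value // other.value)


class ZeroIsZeroInt(Integer):
    """Hypothetical language in which n / 0 quietly yields 0."""
    def on_divide_by_zero(self, dividend):
        return type(self)(0)
```

A subclass inherits floor_divide unchanged and overrides just the one hook.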

That's an easy case.  Overflow is a bit more subtle.  Some languages 
might want to wrap the results (modulo 2**32).  Some languages might 
want an exception.  Other languages might want promotion to BigInt.

Even if promotion to BigInt were the default behavior, subclasses would 
still want to override it.  In Python's case, promotion to PyLong (which 
ideally would inherit from and trivially specialize and extend BigInt) 
would be the desired effect.
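
The three overflow policies can be sketched in Python the same way. Everything below is illustrative: the class names, the on_overflow hook, and the 32-bit bounds are assumptions made for the example (Python's own int never overflows, so the range check is simulated).

```python
INT_MIN, INT_MAX = -2**31, 2**31 - 1   # assumed 32-bit native integer range


class BigInt:
    """Stand-in for an arbitrary-precision type."""
    def __init__(self, value):
        self.value = value


class Integer:
    """Stand-in core integer: promotes to BigInt on overflow by default."""
    def __init__(self, value):
        self.value = value

    def on_overflow(self, exact):
        return BigInt(exact)

    def add(self, other):
        exact = self.value + other.value
        if INT_MIN <= exact <= INT_MAX:
            return type(self)(exact)
        return self.on_overflow(exact)     # only the overflow path dispatches


class WrappingInt(Integer):
    """Wraps the result modulo 2**32 (two's-complement style)."""
    def on_overflow(self, exact):
        return type(self)((exact - INT_MIN) % 2**32 + INT_MIN)


class StrictInt(Integer):
    """Raises instead of promoting."""
    def on_overflow(self, exact):
        raise OverflowError("integer overflow")


class PyLong(BigInt):
    """Trivial specialization of BigInt."""


class PyInt(Integer):
    """Promotes to the language's own big type rather than the core one."""
    def on_overflow(self, exact):
        return PyLong(exact)
```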

Even this is only one aspect of a more general case: all morphing 
behavior needs to be overridable by subclasses.  I believe that this can 
be easily handled by the current Parrot architecture by virtue of the 
fact that destination objects must be created before methods are called 
(and such destination objects can override morph methods).  But it would 
help the cause if code were written to promote things to Integer 
instead of PerlInt.  Yes, at the moment, I'm guilty of this too.

 Second: method dispatch. I've looked a bit into PyObject. It seems that
 you start rolling your own method dispatch. Please don't get me wrong,
 I'm not criticizing your implementation. It might also be needed for
 some reasons I'm just overlooking and it's currently needed because core
 functionality isn't totally finished.

I'll address your questions below, but for reference, here is the code 
that Pirate generates for a=b+c:

find_type $I0, 'PyObject'
new $P0, $I0

Re: Exceptions, sub cleanup, and scope exit

2004-12-08 Thread Leopold Toetsch

Dan Sugalski [EMAIL PROTECTED] wrote:

  pushmark 12
  popmark 12

 pushaction Psub

I've now implemented these bits. I hope it's correct, specifically, if a
return continuation is only captured, the action handler is not run.
See t/pmc/exceptions.t

Still missing is the throw opcode. Or rather, that exists; only exception
creation and the extended attributes like language are missing.

I'm still voting for a more object-ish exception constructor to better
accommodate HLLs' different exception usage.

E.g.

  e = new PyKeyError  # presumably a constant singleton
  throw e

That ought to be enough for heavily used exceptions and for Perl6
control exceptions.

OTOH

  e = new Exception
  setattribute e, message, Pmsg
  setattribute e, language, PLang
  ...
  throw e

constructs a full exception object.

Currently it is:

  e[_message] = foo
  e[_error]
  e[_severity]
  ...

And it could be even something like:

  cl = getclass Exception
  e = cl.instantiate(foo, Perl, .error, .severity, ...)

leo

Re: Premature pessimization

2004-12-08 Thread Leopold Toetsch

Sam Ruby [EMAIL PROTECTED] wrote:
 Ah!  Now we are getting somewhere!

Yeah. That's the goal.

 So, why have I proceeded in this manner?  Two reasons.

Fair enough, both.

 So given that we have a set of language-neutral PMCs in core that do the
 right thing, Python or Perl PMCs can inherit a lot of functionality from
 core PMCs. Language-specific behavior is of course implemented in the
 specific PMC.

 Agreed.  One area that will require a bit more thought is error cases.

Yep. But let's just figure that out later. First the basics.

 I'll address your questions below, but for reference, here is the code
 that Pirate generates for a=b+c:

  find_type $I0, 'PyObject'
  new $P0, $I0
  find_lex $P1, 'b'
  find_lex $P2, 'c'
  $P0 = $P1 + $P2
  store_lex -1, 'a', $P0

Good. Now Evil Leo (who can't program in Python ;) writes some piece of
code like this:

$ cat m.py
class M(type):
    def __new__(meta, name, base, vars):
        cls = type.__new__(meta, name, base, vars)
        cls.__add__ = myadd
        return cls

def myadd(self, r):
    return 44 - r

I = M('Int', (int,), {})

i = I(5)
print i
print i + 2

$ python m.py
5
42

 What this means is that the __add__ method will not be directly used for
 either PyInt or PyString objects

Well, and that's not true, IMHO. See above. It has to be part of
Parrot's method dispatch. What if your translator just sees the last 3
lines of the code and M is in some lib? That implies that you either
can't translate to $P0 = $P1 + $P2, or that you just translate or
alias __add__ to Parrot's __add and let Parrot fiddle around to find
the correct method.

 * the method is returning a new PMC. This doesn't follow the signature
   of Parrot infix MMD operations.

 Here I do think you are misunderstanding.  The __add__ method with
 precisely that signature and semantics is defined by the Python language
 specification.  It is (somewhat rarely) used directly, and therefore
 must be supported exactly that way.

 |  __add__(...)
 |  x.__add__(y) == x+y

Parrot semantics are that the destination exists. But having a look at
above myadd, we probably have to adjust the calling conventions for
overloaded infix operators, i.e. return the destination value. Or
provide both schemes ... dunno.

 In the general case, looking for reserved method names at compile
 time doesn't work.

__add__ is reserved in Python and corresponds directly to __add in
Parrot. I don't think that doesn't work.

 ... As everything can be overridden, this dispatch must
 be done at runtime.

Exactly and that's what I want to achieve.

 I personally don't think that performance considerations should be out
 of bounds in these discussions

I've already shown that it's possible to go with fully dynamic dispatch
*and* 30% faster for MMD and 70% faster for overloaded operations. First
correct and complete, then speed considerations.

 - Sam Ruby

leo

Re: Premature pessimization

2004-12-08 Thread Sam Ruby

Leopold Toetsch wrote:
 Good. Now Evil Leo (who can't program in Python ;) writes some piece of
 code like this:

 $ cat m.py
 class M(type):
     def __new__(meta, name, base, vars):
         cls = type.__new__(meta, name, base, vars)
         cls.__add__ = myadd
         return cls

 def myadd(self, r):
     return 44 - r

 I = M('Int', (int,), {})

 i = I(5)
 print i
 print i + 2

 $ python m.py
 5
 42

 What this means is that the __add__ method will not be directly used for
 either PyInt or PyString objects

 Well, and that's not true, IMHO. See above. It has to be part of
 Parrot's method dispatch. What if your translator just sees the last 3
 lines of the code and M is in some lib? That implies that you either
 can't translate to $P0 = $P1 + $P2, or that you just translate or
 alias __add__ to Parrot's __add and let Parrot fiddle around to find
 the correct method.

Here's the part that you snipped that addresses that question:
   And there is a piece that I haven't written yet that will do the
   reverse: if MMD_ADD is called on a PyObject that has not provided
   such behavior, then any __add__ method provided needs to be
   called.
 * the method is returning a new PMC. This doesn't follow the signature
   of Parrot infix MMD operations.

 Here I do think you are misunderstanding.  The __add__ method with
 precisely that signature and semantics is defined by the Python language
 specification.  It is (somewhat rarely) used directly, and therefore
 must be supported exactly that way.

  |  __add__(...)
  |  x.__add__(y) == x+y

 Parrot semantics are that the destination exists. But having a look at
 above myadd, we probably have to adjust the calling conventions for
 overloaded infix operators, i.e. return the destination value. Or
 provide both schemes ... dunno.

Since you provided an Evil Leo sample, let me provide an Evil Sam sample:
  d = {
    '__init__': lambda self,x: setattr(self, 'value', x),
    '__add__':  lambda self,x: str(self.value) + str(x.value)
  }
  def dict2class(d):
    class c: pass
    c.__dict__.update(d)
    return c
  c = dict2class(d)
  a=c(2)
  b=c(3)
  print a+b
Things to note:
  1) classes which are created every time a function is called
  2) classes are thin wrappers over a dictionary object
Now, given the above sample, let's revisit the statement that "The 
Python translator needs just a translation table for the common core 
methods."

How, exactly, would that be done?  Given that the method name is simply 
a string... used as a key in dictionary... with a different parameter 
signature than the hypothetical Parrot __add method.

That's why I say:
 In the general case, looking for reserved method names at compile
 time doesn't work.

 __add__ is reserved in Python and corresponds directly to __add in
 Parrot. I don't think that doesn't work.

__add__ is *not* reserved in Python.  There just is some syntactic sugar 
that provides a shorthand for certain signatures.  I am free to define 
__add__ methods that have zero or sixteen arguments.  I won't be able to 
call such methods with the convenient shorthand, but other than that, 
they should work.
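
Both points can be checked in a few lines. The sketch below is written for current Python 3 rather than the Python 2 of the thread, so dict2class is respelled with type(); the class name Odd and the variable names are made up for the illustration.

```python
def dict2class(d):
    # Python 3 spelling of dict2class: build the class from a plain dict.
    return type("c", (), dict(d))


methods = {
    "__init__": lambda self, x: setattr(self, "value", x),
    "__add__": lambda self, other: str(self.value) + str(other.value),
}
c = dict2class(methods)
joined = c(2) + c(3)            # the infix shorthand finds the dict entry


class Odd:
    def __add__(self):          # legal: takes no right-hand operand at all
        return "called directly"


direct = Odd().__add__()        # works when called by name
try:
    Odd() + 1                   # the shorthand always passes one operand
    shorthand_ok = True
except TypeError:
    shorthand_ok = False

print(joined, direct, shorthand_ok)  # 23 called directly False
```

So the method name is ordinary data in a dictionary, and only the `+` shorthand imposes a particular signature.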

 I personally don't think that performance considerations should be out
 of bounds in these discussions

 I've already shown that it's possible to go with fully dynamic dispatch
 *and* 30% faster for MMD and 70% faster for overloaded operations. First
 correct and complete, then speed considerations.

Neither of which match Python semantics.  We are going to need a system 
where classes are anonymous, not global.  Where methods are properties 
that can be added simply by calling the equivalent of set_pmc_keyed.

- Sam Ruby

Re: S05 question

2004-12-08 Thread Larry Wall

On Tue, Dec 07, 2004 at 10:36:53PM -0800, Larry Wall wrote:
: But somehow I expect that when someone writes (foo) they probably
: usually meant («foo»).

If we're going to stick with the notion that <foo> captures and something
else doesn't, I'm beginning to think that the other thing isn't <«foo»> for
a couple of reasons.  First, if other languages are going to borrow this
notation, they're probably not going to buy into the French quotes.  Second,
I can think of several other possible uses for the French quotes to cure
perceived ills such as the <(...)> vs {...} confusion.  Third, it now
bothers me to have a ! without a ?.  So what if <«foo»> is instead written
<?foo>, meaning you only want to evaluate its success.  (Unlike <!foo>,
it's not zero-width, but that's just how success/failure works.)  So we'd
get things like

/ $bar := [ (<?ident>) = (\N+) ]* /

And people would have to get used to seeing ? as non-capturing assertions:

<?before ...>
<?after ...>
<?ws>
<?sp>
<?null>

This has a rather Ruby-esque "I am a boolean" feeling to it.  I think
I like it.  It's pretty easy to type, at least on my keyboard.

Now suppose that we extend that "I am a boolean" feeling to

<?{ code }>

which might take the place of the confusing <(...)>, and make consistent
the notion that we always use {...} to invoke real code.

: : Or is it that hypotheticals only bind to things captured by parens?
: : If so, it might need clarification (or perhaps I'm overlooking the part
: : that makes it clear).
: 
: No, I think you just found a blind spot in the design.

I think I'm leaning toward the idea that anything in angles that
begins alpha is a capture to just the alpha part, so the ? prefix is 
merely a no-op that happens to make the assertion not start with an
alpha.  Interestingly, that gives these implicit bindings:

after ... $after$`
before ...$before   $'

Though that's an argument for changing them to pre ... and post ...,
I suppose, since if users are going to refer to $after in their main
program, it doesn't look like a declarative assertion anymore.

Another problem we've run into is naming if there are multiple assertions
of the same name.  If the capture name is just the alpha part of the
assertion, then we could allow an optional number, and still recognize
it as a <ws>:

<ws1> <ws2> <ws3>

Except I can well imagine people wanting numbered rules.  Drat.  Could
force people to say <ws_1> if they want that, I suppose.

Or we could use some standard delim for that:

<ws-1> <ws-2> <ws-3>

which is vaguely reminiscent of our version syntax.  Indeed, if we
had quantifications, you might well want to have wildcards <ws-*> and
let the name be filled in rather than autogenerating a list.  But maybe
we just stick with lists in that case.

For captures of non-alpha assertions, we could say that ? is the same
as true (just as with regular operators), and so

true-3 +alpha-[aeiou]

would capture to $true-3.  (And one could always do an explicit binding
for a different name.)

Actually, I think people would find $match-3 more meaningful than
Ctrue-3.

I'm still thinking about what «...» might mean, if anything.  Bonus points
for interpolative and/or word-splitty.

Anyway, that's where I am this week/day/hour/minute/second.

Larry

Re: S05 question

2004-12-08 Thread Austin Hastings

Larry Wall wrote:
 Another problem we've run into is naming if there are multiple assertions
 of the same name.  If the capture name is just the alpha part of the
 assertion, then we could allow an optional number, and still recognize
 it as a ws:

    ws1 ws2 ws3

 Except I can well imagine people wanting numbered rules.  Drat.  Could
 force people to say ws_1 if they want that, I suppose.

 Or we could use some standard delim for that:

    ws-1 ws-2 ws-3

 which is vaguely reminiscent of our version syntax.  Indeed, if we
 had quantifications, you might well want to have wildcards ws-* and
 let the name be filled in rather than autogenerating a list.  But maybe
 we just stick with lists in that case.

 For captures of non-alpha assertions, we could say that ? is the same
 as true (just as with regular operators), and so

    true-3 +alpha-[aeiou]

 would capture to $true-3.  (And one could always do an explicit binding
 for a different name.)

 Actually, I think people would find $match-3 more meaningful than
 Ctrue-3.
 

PHP's use of $array[] as push might work for this:
true[] +alpha-[aeiou]
or
@true +alpha-[aeiou]
or
true=1.. +alpha-[aeiou]
or
true@ +alpha-[aeiou]
I like the idea of being able to continue versus chunk patterns. How 
do you say "This is a continuation of the other thing" versus "This 
is a separate thing"?

=Austin

Re: S05 question

2004-12-08 Thread Patrick R. Michaud

On Wed, Dec 08, 2004 at 08:19:17AM -0800, Larry Wall wrote:
 And people would have to get used to seeing ? as non-capturing assertions:
 ?before ...
 ?after ...
 ?ws
 ?sp
 ?null
 This has a rather Ruby-esque I am a boolean feeling to it.  I think
 I like it.  It's pretty easy to type, at least on my keyboard.

FWIW, for some reason in rule contexts I tend to conflate 
"I am a boolean" feelings with "zero-width assertion", so that each
of those looks vaguely to me as though I'm testing a zero-width 
proposition and not consuming any text.  And I still tend to think of
'?' in its "zero or one matches" or "minimal match" connotations.
Oh well, I suppose I could get used to that.

 Now suppose that we extend that I am a boolean feeling to
 ?{ code }
 which might take the place of the confusing (...), and make consistent
 the notion that we always use {...} to invoke real code.

Hmm, this is nice, however.

 Another problem we've run into is naming if there are multiple assertions
 of the same name.  If the capture name is just the alpha part of the
 assertion, then we could allow an optional number, and still recognize
 it as a ws:
 ws1 ws2 ws3
 Except I can well imagine people wanting numbered rules.  Drat.  Could
 force people to say ws_1 if they want that, I suppose.

I had been thinking that 

/ws foo ws bar/

would simply cause $ws to be a list of captured elements, similar to 
what might happen for $1 in 

/ [ (.*?) , ]* /

If someone really needs the contents of the first and second ws, they
could do

   (ws) foo (ws)

and get them as $1 and $2.  But, seeing this tells me that perhaps
(rule) should be used for capturing rules, analogous to the
capturing parens, and leave rule to be the non-capturing version.
But maybe that's anti-Huffman overall.  Maybe the parens could also
help for disambiguating

   (ws) foo (ws)

so that we end up with $/ws[1], $/ws[2], etc.  But then we might
have to always subscript our named captures, which is icky, or maybe 
we'd only make $/ws act like list when there's more than one 
capturing (ws) in the rule.

I dunno.  I kinda like (rule) for capturing, but maybe it just
doesn't work.

Pm

Re: Devel::Cover cover command uses too much memory

2004-12-08 Thread Michael G Schwern

On Tue, Dec 07, 2004 at 07:21:09PM -0800, Jason Remillard wrote:
 I ran the codestriker (http://codestriker.sourceforge.net/) test set
 using Devel::Cover. The test cases ran over a day and a half
 and generated a cover_db directory that is 127 megs. Attempting to run
 the cover command keeps using up all of the available memory causing
 cover to be killed by the OS. I have my swap file up to 1 gig, and after
 two days of the computer swapping its brains out, it still was not
 enough memory. 

How big is this test suite?  How long does it usually take to run?
Just trying to get an order-of-magnitude feel here.


 Lastly, some documentation on how to use it with a normal CGI script
 would be helpful. The way I finally got it to work was to rename
 codestriker.pl (the main cgi perl script), to codestriker_test.pl. 
 Write a new codestriker.pl that just does a system call with the 
 Devel::Cover switch. Perl would not let me add it to the 
 #!/usr/bin/perl line at the start of the script. I would be interested 
 in knowing if a cleaner way is possible, as this is kind of lame.

You just have to say "use Devel::Cover" in your program.  That's what 
-MDevel::Cover means.


-- 
Michael G Schwern[EMAIL PROTECTED]  http://www.pobox.com/~schwern/
It's Yellowing Laudanum time!

Re: S05 question

2004-12-08 Thread Luke Palmer

Larry Wall writes:
 If we're going to stick with the notion that foo captures and
 something else doesn't, I'm beginning to think that the other thing
 isn't foo for a couple of reasons.

I just sat down to say the exact same thing.  I'm glad you beat me to
it.

 And people would have to get used to seeing ? as non-capturing assertions:
 
 ?before ...
 ?after ...
 ?ws
 ?sp
 ?null
 
 This has a rather Ruby-esque I am a boolean feeling to it.  I think
 I like it.  It's pretty easy to type, at least on my keyboard.

Yeah, I like it pretty well too.  Better than the French quotes for
sure.

 Now suppose that we extend that I am a boolean feeling to
 
 ?{ code }
 
 which might take the place of the confusing (...), and make consistent
 the notion that we always use {...} to invoke real code.

Hmm...  I'm just so attached to (...).  I find it quite beautiful.  It
also somehow communicates the feeling "you shouldn't be putting
side-effects here".

 I think I'm leaning toward the idea that anything in angles that
 begins alpha is a capture to just the alpha part, so the ? prefix is
 merely a no-op that happens to make the assertion not start with an
 alpha.  Interestingly, that gives these implicit bindings:
 
 after ...   $after$`
 before ...  $before   $'

I don't quite follow.  Wouldn't that mean that these guys would get
clobbered if you used lookaheads or lookbehinds in your rules?

 Or we could use some standard delim for that:
 
 ws-1 ws-2 ws-3
 
 which is vaguely reminiscent of our version syntax.  Indeed, if we
 had quantifications, you might well want to have wildcards ws-* and
 let the name be filled in rather than autogenerating a list.  But
 maybe we just stick with lists in that case.

I can imagine this being a lot cleaner if the thing after the dash can
be any sort of identifier:

ws-indent if ?ws condition ws-comment

On the other hand, it could be misleading, since the standard naming of
BNF uses dashes instead of underscores.  I don't think it should be a
big problem though. 

 I'm still thinking about what ... might mean, if anything.  Bonus
 points for interpolative and/or word-splitty.

Yeah... umm... nope.  I got nothin.

Luke

Re: S05 question

2004-12-08 Thread Ashley Winters

On Wed, 8 Dec 2004 08:19:17 -0800, Larry Wall [EMAIL PROTECTED] wrote:
 / $bar := [ (?ident) = (\N+) ]* /

You know, to be honest I don't know that I want rules in one-liners to
capture by default. I certainly want them to capture in rules, though.

 And people would have to get used to seeing ? as non-capturing assertions:
 
 ?before ...
 ?after ...
 ?ws
 ?sp
 ?null
 
 This has a rather Ruby-esque I am a boolean feeling to it.  I think
 I like it.  It's pretty easy to type, at least on my keyboard.

I like it. It reads to me as if before ..., if null. Sounds good.

 I think I'm leaning toward the idea that anything in angles that
 begins alpha is a capture to just the alpha part, so the ? prefix is
 merely a no-op that happens to make the assertion not start with an
 alpha.  Interestingly, that gives these implicit bindings:
 
 after ... $after$`
 before ...$before   $'

Again, I don't see the utility of that in a one-liner. In a grammar,
you would create a real rule which would assert after ... and
capture the result in a reasonable name.

 Anyway, that's where I am this week/day/hour/minute/second.

I'm thinking capturing rules should be default in rules, where they're
downright useful. Your hour/minute/second comment brings up parsing
ISO time:

grammar ISO8601::DateTime {
rule year { \d4 }
rule month { \d2 }
rule day { \d2 }
rule hour { \d2 }
rule minute { \d2 }
rule second { \d2 }
rule fraction { \d+ }

rule date { year -? month -? day }
rule time { hour \:? minute \:? second [\. fraction]? }
rule datetime { date T time }
}

For a grammar, that works perfectly!
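
For readers who want to try the same decomposition outside Perl 6, here is a rough Python sketch (not from the thread). It assumes each named regex group can stand in for one capturing subrule of the grammar above; the names ISO_DATETIME and parse_datetime are made up for illustration, and the pattern covers only the notations the grammar itself covers.

```python
import re

# Hypothetical analogue of the grammar: one named group per subrule.
ISO_DATETIME = re.compile(
    r"(?P<year>\d{4})-?(?P<month>\d{2})-?(?P<day>\d{2})"      # date
    r"T"
    r"(?P<hour>\d{2}):?(?P<minute>\d{2}):?(?P<second>\d{2})"  # time
    r"(?:\.(?P<fraction>\d+))?"                                # optional fraction
)


def parse_datetime(text):
    """Return a dict of the captured fields, or None if text does not match."""
    match = ISO_DATETIME.fullmatch(text)
    return match.groupdict() if match else None


fields = parse_datetime("2004-12-08T10:51:00.25")
print(fields["year"], fields["minute"], fields["fraction"])
```

As in the grammar, every named part is captured whether or not the caller cares about it; the one-liner form below is the opposite trade-off.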

In a one-liner, I'd rather just use:

$datetime ~~ /$year := (\d+) -? $month := (\d+) -? ./

and specify the vars I want to save directly in my own scope.

Ashley Winters

Re: continuation enhanced arcs

2004-12-08 Thread Piers Cawley

Leopold Toetsch [EMAIL PROTECTED] writes:

 Piers Cawley [EMAIL PROTECTED] wrote:
 Leopold Toetsch [EMAIL PROTECTED] writes:

  ... While S registers hold pointers, they have
  value semantics.

 Is that guaranteed? Because it probably needs to be.

 It's the current implementation and tested.

  This would restore the register contents to the first state shown above.
  That is, not only I and N registers would be clobbered also S registers
  are involved.

 That's correct. What's the problem? Okay, you've created an infinite
 loop, but what you're describing is absolutely the correct behaviour for
 a continuation.

 Ok. It's a bit mind-twisting but OTOH it's the same as setjmp/longjmp
 with all implications on CPU registers. C has the volatile keyword to
 avoid clobbering of a register due to a longjmp.

  Above code could only use P registers. Or in other words: I, N, and S
  registers are almost[1] useless.

 No they're not. But you should expect them to be reset if you take a
 (full) continuation back to them.

 The problem I have is: do we know where registers may be reset? For
 example:

 $I0 = 10
   loop:
 $P0 = shift array
 dec $I0
 if $I0 goto loop

 What happens if the array PMC's shift gets overloaded and does some
 fancy stuff with continuations. My gut feeling is that the loop might
 suddenly turn into an infinite loop, depending on some code behind the
 scenes ($I0 might be allocated into the preserved register range or not
 depending on allocation pressure).

 Second: if we don't have a notion that a continuation may capture and
 restore a register frame, a compiler can hardly use any I,S,N registers
 because some library code or external function might just restore these
 registers.

This is, of course, why so many languages that have full continuations
use reference types throughout, even for numbers. And immutable strings...
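
The value-versus-reference point can be simulated in a few lines of Python. This is only a sketch under loud assumptions: a dict stands in for a register frame, a shallow copy stands in for capturing a continuation, and the "continuation" is invoked exactly once so that the example terminates. None of the names correspond to Parrot APIs.

```python
def count_iterations(use_reference):
    """Count loop iterations when a saved frame is restored once mid-loop."""
    # I0 is a plain integer (value semantics); P0 refers to a mutable box.
    frame = {"I0": 3, "P0": [3]}
    # "Capture the continuation": I0 is copied by value, P0 by reference.
    snapshot = dict(frame)
    restored = False
    iterations = 0
    while True:
        iterations += 1
        frame["I0"] -= 1
        frame["P0"][0] -= 1          # mutates the shared box in place
        if not restored and iterations == 2:
            frame = dict(snapshot)   # "invoke the continuation" once
            restored = True
        counter = frame["P0"][0] if use_reference else frame["I0"]
        if counter == 0:
            return iterations


print(count_iterations(False), count_iterations(True))
```

With the integer counter the restore rewinds the loop and it runs 5 times; with the boxed counter the decrements survive the restore and it runs 3 times.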

Python method overloading (was: Premature pessimization)

2004-12-08 Thread Leopold Toetsch

Sam Ruby [EMAIL PROTECTED] wrote:
 Leopold Toetsch wrote:

 Here's the part that you snipped that addresses that question:

 And there is a piece that I haven't written yet that will do the
 reverse: if MMD_ADD is called on a PyObject that has not provided
  such behavior, then any __add__ method provided needs to be
 called.

Ok. But that would imply that HLL interoperability isn't really possible. Or
just at a minimal surface level. But see below.

 Since you provided an Evil Leo sample, let me provide an Evil Sam sample:

d = {
  "__init__": lambda self,x: setattr(self, "value", x),
  "__add__":  lambda self,x: str(self.value) + str(x.value)
}

def dict2class(d):
  class c: pass
  c.__dict__.update(d)
   ^^^

This is the critical part of it. The __dict__ of your class provides the
namespace. Setting a key in that namespace (or an attribute of your class
with that key) has a special meaning in Python, *if* that key happens to
be one of the method names.

While the Python people never stop talking about the clarity of
their language, nothing is clear and explicit when it comes to overloading
or metaclasses.

Anyway, IMHO, class.__add__ = foo or your example manipulating
class.__dict__ (another special attribute name!) is the point, where
you can install Parrot semantics WRT method overloading.
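
A small Python sketch of the runtime hook being discussed (not code from the thread): assigning __add__ on a class after it has been created is enough to turn on + for its instances. The class Money and the function add_money are hypothetical. The sketch uses setattr because in current Python 3 a class __dict__ is a read-only proxy, so the c.__dict__.update(d) form from the sample above is assumed to apply only to the old-style classes of the time.

```python
class Money:
    def __init__(self, value):
        self.value = value


def add_money(self, other):
    """Plain function that will be installed as the + implementation."""
    return Money(self.value + other.value)


a, b = Money(2), Money(3)

# Before installation, + is not supported for Money instances.
try:
    a + b
    supported_before = True
except TypeError:
    supported_before = False

# The assignment is the single point where overloading is switched on.
setattr(Money, "__add__", add_money)  # same as: Money.__add__ = add_money
total = a + b

print(supported_before, total.value)
```

Any translator or runtime that intercepts this assignment sees the overload at the moment it happens, which is the interception point described above.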

 Now, given the above sample, let's revisit the statement that The
 Python translator needs just a translation table for the common core
 methods.

We both know that's a simplification :) You've to install the methods of
course ...

 How, exactly, would that be done?  Given that the method name is simply
 a string... used as a key in dictionary... with a different parameter
 signature than the hypothetical Parrot __add method.

The class.__dict__ dictionary is special. Setting an __add__ key too.
The combined meaning is overloading. The different signature is a
problem, yes - I've already mentioned that. And Parrot's __add method
is not hypothetical :-)

$ grep __add t/pmc/object*.t

 That's why I say:

In the general case, looking for reserved method names at compile
time doesn't work.

 __add__ is reserved in Python and corresponds directly to __add in
 Parrot. I don't think that doesn't work.

 __add__ is *not* reserved in Python.

Does it matter if the name is actually reserved? The meaning is
important.

 ... There just is some syntactic sugar
 that provides a shorthand for certain signatures.  I am free to define
 __add__ methods that have zero or sixteen arguments.  I won't be able to
 call such methods with the convenient shorthand, but other than that,
 they should work.

I'd say, if you define an '__add__' method with 16 arguments, Python
will throw an exception if you try to use + with an object of that
class:

  TypeError: myadd() takes exactly 16 arguments (2 given)

So that's rather hypothetical. And if you always use x.__add__(16
args) Parrot will just run the function.
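
Both halves of this exchange can be checked with a short Python sketch (not from the thread; the class Odd is hypothetical and uses three extra arguments rather than sixteen to stay readable). Defining the method raises nothing, using + raises TypeError, and an explicit call with the full argument list works. The exact wording of the error message differs between Python versions, so the sketch only checks the exception type.

```python
class Odd:
    # Defining __add__ with an unusual signature is accepted without complaint.
    def __add__(self, a, b, c):
        return (a, b, c)


x = Odd()

# The + shorthand passes exactly one argument, so the call fails.
try:
    x + x
    raised = False
except TypeError:
    raised = True

# Calling the method by name with the full argument list works.
explicit = x.__add__(1, 2, 3)

print(raised, explicit)
```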

I personally don't think that performance considerations should be out
of bounds in these discussions

 I've already shown that it's possible to go with fully dynamic dispatch
 *and* 30% faster for MMD and 70% faster for overloaded operations. First
 correct and complete, then speed considerations.

 Neither of which match Python semantics.  We are going to need a system
 where classes are anonymous, not global.

Why? And how do you find your class then:

  c = C()
 ...
  3  22 LOAD_NAME       1 (C)
     25 CALL_FUNCTION   0

 ... Where methods are properties
 that can be added simply by calling the equivalent of set_pmc_keyed.

Nah. Methods aren't properties, but ...

The set_pmc_keyed on __dict__ (or an equivalent setattribute call)
of your type system is responsible to create Parrot semantics for method
calls :-)

 - Sam Ruby

leo

Re: continuation enhanced arcs

2004-12-08 Thread Leopold Toetsch

Piers Cawley [EMAIL PROTECTED] wrote:
 Leopold Toetsch [EMAIL PROTECTED] writes:

 The problem I have is: do we know where registers may be reset? For
 example:

 $I0 = 10
   loop:
 $P0 = shift array
 dec $I0
 if $I0 goto loop

 What happens if the array PMC's shift gets overloaded and does some
 fancy stuff with continuations. My gut feeling is that the loop might
 suddenly turn into an infinite loop, depending on some code behind the
 scenes ($I0 might be allocated into the preserved register range or not
 depending on allocation pressure).

 Second: if we don't have a notion that a continuation may capture and
 restore a register frame, a compiler can hardly use any I,S,N registers
 because some library code or external function might just restore these
 registers.

 This is, of course, why so many languages that have full continuations
 use reference types throughout, even for numbers. And immutable strings...

So my conclusion that (in combination with restoring registers to the
values of continuation creation) I,S,N registers are almost unusable is
correct?

What about my proposal "Lexicals, continuations, and register
allocation"? Would that provide proper semantics for continuations?

leo

Re: [perl #32545] [PATCH] [TODO] remove Perl dependancy on split opcode

2004-12-08 Thread James deBoer

Attached is a patch that changes the split opcode to use an Array 
instead of a PerlArray.

It also updates the documentation to note this.
All the tests still pass, and a grep in the languages/ directory shows 
that no language implementations are affected.

- James
Will Coleda (via RT) wrote:
# New Ticket Created by  Will Coleda 
# Please include the string:  [perl #32545]
# in the subject line of all future correspondence about this issue. 
# URL: http://rt.perl.org:80/rt3/Ticket/Display.html?id=32545 

The split opcode currently uses a PerlArray to house its result. It should 
use a non-language specific class.
 

? classes/.array.pmc.swp
Index: ops/string.ops
===
RCS file: /cvs/public/parrot/ops/string.ops,v
retrieving revision 1.28
diff -u -r1.28 string.ops
--- ops/string.ops	28 Sep 2004 11:26:49 -	1.28
+++ ops/string.ops	6 Dec 2004 19:16:59 -
@@ -561,7 +561,7 @@
 
 =item B<split>(out PMC, in STR, in STR)
 
-Create a new PerlArray PMC $1 by splitting the string $3 with
+Create a new Array PMC $1 by splitting the string $3 with
 regexp $2. Currently implemented only for the empty string $2.
 
 =cut
@@ -589,7 +589,7 @@
 }
 
 op split(out PMC, in STR, in STR) :base_core {
-PMC *res = $1 = pmc_new(interpreter, enum_class_PerlArray);
+PMC *res = $1 = pmc_new(interpreter, enum_class_Array);
 STRING *r = $2;
 STRING *s = $3;
 int slen = string_length(interpreter, s);
@@ -599,6 +599,7 @@
 	goto NEXT();
 if (string_length(interpreter, r))
 	internal_exception(1, "Unimplemented split by regex");
+VTABLE_set_integer_native(interpreter, res, slen);
 for (i = 0; i < slen; ++i) {
 	STRING *p = string_substr(interpreter, s, i, 1, NULL, 0);
 	/* TODO first set empty string, then replace */

Re: Python method overloading

2004-12-08 Thread Sam Ruby

Leopold Toetsch wrote:
Sam Ruby [EMAIL PROTECTED] wrote:
Leopold Toetsch wrote:

Here's the part that you snipped that addresses that question:

   And there is a piece that I haven't written yet that will do the
   reverse: if MMD_ADD is called on a PyObject that has not provided
   such behavior, then any __add__ method provided needs to be
   called.
Ok. But that would imply that HLL interoperability isn't really possible. Or
just at a minimal surface level. But see below.
I don't believe that to be the case.  If a Perl subroutine were to call 
a Python function and pass a PerlInt as a parameter, the receiving 
function should expect to be able to do addition via the + operator, 
but should not expect to find an __add__ method on such objects. 
Instead, and if they cared to, they could explicitly call the __add 
method provided.

The reverse should also be true: if a Python function were to call a Perl 
subroutine and pass a PyInt as a parameter, the receiving subroutine 
should expect to be able to do addition via the + operator, but not 
expect to find an __add method on such objects.  Instead, and if they 
cared to, they could explicitly call the __add__ method provided.

I would consider that significant interoperability with only minimal 
restrictions.

While the Python people never stop talking about the clarity of
their language, nothing is clear and explicit when it comes to overloading
or metaclasses.
Please don't do that.  I am not trying to extol the virtues of Python, 
merely trying to implement it.

Anyway, IMHO, class.__add__ = foo or your example manipulating
class.__dict__ (another special attribute name!) is the point, where
you can install Parrot semantics WRT method overloading.
Hold that thought.  I'll answer this below.
Now, given the above sample, let's revisit the statement that The
Python translator needs just a translation table for the common core
methods.
We both know that's a simplification :) You've to install the methods of
course ...
Again, it can't be done exclusively at translation time.  It needs to be 
done at runtime.  And if it is done at runtime, it need not be done at 
translation time at all.  More below.

How, exactly, would that be done?  Given that the method name is simply
a string... used as a key in dictionary... with a different parameter
signature than the hypothetical Parrot __add method.
The class.__dict__ dictionary is special. Setting an __add__ key too.
The combined meaning is overloading. The different signature is a
problem, yes - I've already mentioned that. And Parrot's __add method
is not hypothetical :-)
$ grep __add t/pmc/object*.t
Here I'll apologize for being unclear.  Yes, there is code in the 
existing object class in support of Perl's S06.  What's hypothetical is 
the presumption that all languages will adopt Perl 6's naming convention 
for methods.

That's why I say:

In the general case, looking for reserved method names at compile
time doesn't work.

__add__ is reserved in Python and corresponds directly to __add in
Parrot. I don't think that doesn't work.

__add__ is *not* reserved in Python.
Does it matter if the name is actually reserved? The meaning is
important.
It does matter.  Python classes are dictionaries of objects, some of 
which may be functions.  You may extract objects from that dictionary 
and access them later.  The meaning in such a scenario is not apparent 
until well after all interaction with the compile and runtime 
dictionaries is over.

... There just is some syntactic sugar
that provides a shorthand for certain signatures.  I am free to define
__add__ methods that have zero or sixteen arguments.  I won't be able to
call such methods with the convenient shorthand, but other than that,
they should work.
I'd say, if you define an '__add__' method with 16 arguments, Python
will throw an exception if you try to use + with an object of that
class:
If I define an __add__ method with 16 arguments, Python will not throw 
an exception.

I've already shown that it's possible to go with fully dynamic dispatch
*and* 30% faster for MMD and 70% faster for overloaded operations. First
correct and complete, then speed considerations.

Neither of which match Python semantics.  We are going to need a system
where classes are anonymous, not global.
Why? And how do you find your class then:
  c = C()
 ...
  3  22 LOAD_NAME       1 (C)
     25 CALL_FUNCTION   0
$pirate -d c.py
...
find_lex $P0, 'C'
$P1=$P0()
store_lex -1, 'c', $P1
The important part isn't simply in which hash a given class name is 
looked up in, but that classes themselves in Python are transient 
objects subject to garbage collection.

... Where methods are properties
that can be added simply by calling the equivalent of set_pmc_keyed.
Nah. Methods aren't properties, but ...
No?  Try the following:
   x = "abcdef".find
   print x('c')
The set_pmc_keyed on __dict__ (or an equivalent setattribute

Re: continuation enhanced arcs

2004-12-08 Thread Matt Fowles

Leo~


On Wed, 8 Dec 2004 20:29:00 +0100, Leopold Toetsch [EMAIL PROTECTED] wrote:
 So my conclusion that (in combination with restoring registers to the
 values of continuation creation) I,S,N registers are almost unusable is
 correct?

I would disagree.  Let me take the above example and work with it a little:

  $I0 = 10
loop:
  $P0 = shift array
  dec $I0
  if $I0 goto loop

We are (for the moment) assuming that shift array somehow causes a
full continuation to be taken and then invokes it in a subsequent
call.  Then this code would loop infinitely; however, so would this code,
as the second call returns through the first call's continuation.

  $P0 = shift array
  $P1 = shift array

On the other hand, if every call to shift array took a full
continuation, did some stuff, and eventually returned through its
own return continuation, then neither would loop infinitely, as every
call to shift array would have its own return continuation.

What this means is that care must be taken when you are writing code
that you expect to be invoked multiple times.  However, if you are a
function that on your second invocation returns via the continuation
from your first invocation, you should probably expect to be called
again because it happened the first time!  If you are expecting other
behavior, it is probably because one person wrote the whole chain of
calls and had some extra knowledge about the caller.  This author may
have to be a little wary about value vs reference semantics, but
programmers are fairly used to that pitfall by now.

Matt
-- 
Computer Science is merely the post-Turing Decline of Formal Systems Theory.
-???

Re: Python method overloading

2004-12-08 Thread Leopold Toetsch

Sam Ruby [EMAIL PROTECTED] wrote:

[ snipped - all ok ]

 If I define an __add__ method with 16 arguments, Python will not throw
 an exception.

I didn't write that. I've said: *if* you call it via a + b, Python
throws an exception - that one I've shown. Anyway...

 If this is done at runtime, then it need not be done at compile time.

... Yes. That's the overall conclusion, it seems. It can be done partially
at compile time, but it isn't worth the effort to try it, because the
languages we are targeting are too dynamic.

 However, it doesn't stop here.  Just like methods can be added
 dynamically by name at runtime, they can be accessed dynamically by
 name.  That means that all method lookups will need to be preceded by a
 hash lookup.  And not just on Python objects, but *all* objects.

... preceded by some kind of lookup, which is defined by
class->vtable->find_method() of the responsible metaclass.  Be it
one or 100 hash lookups in properties, dicts, globals and what not, it
doesn't matter. Dot.

 That's why I object to characterizations like dynamic dispatch is 30%
 faster than  What will ultimately result if it is mandated that all
 languages adopt Perl6's semantics is that an ADDITIONAL dynamic dispatch
 will be required to make non-Perl6 functions work.

You are still not getting the principle of the scheme, IMHO. It has
nothing to do with Perl6 or any other language, nor with Python.

The original subject: premature pessimization strikes back :)

We just do a dynamic lookup at runtime - that's all.

The add opcode, for example, calls left->vtable->find_method(), and probably
more if the returned result indicates MMD. Eventually one of the
find_method calls returns a function that does implement the __add
method for the involved types. Or a (possibly user provided) distance
function decides, which function to call. It doesn't matter.

Then the *runcore* calls the function and *caches* the function pointer.

Next time the call is instantaneous, given that the language is able to
call a cache invalidation function, if method lookup order (for that
class) changes.

The call to the invalidation function is possible, even for Python.
*Iff* you can roll your own method dispatch, you eventually need to
know, which method you call. That has to be defined. You can as well
call a cache invalidation function, if something changes here (I hope)
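
The lookup-then-cache scheme can be sketched in Python as follows. This is an illustration under stated assumptions, not Parrot's runcore: MethodCache, find_method and invalidate are hypothetical names, the slow path simply walks the class's MRO, and invalidation is per class (a real scheme would also have to cover subclasses).

```python
class MethodCache:
    """Cache (class, name) -> function; the owner must invalidate on change."""

    def __init__(self):
        self._cache = {}
        self.lookups = 0  # number of slow-path lookups performed

    def find_method(self, cls, name):
        key = (cls, name)
        if key not in self._cache:
            self.lookups += 1
            # Slow path: whatever lookup the language defines.
            for klass in cls.__mro__:
                if name in vars(klass):
                    self._cache[key] = vars(klass)[name]
                    break
            else:
                raise AttributeError(name)
        return self._cache[key]

    def invalidate(self, cls):
        """Drop every cached entry for cls; call when its methods change."""
        self._cache = {k: v for k, v in self._cache.items() if k[0] is not cls}


class Num:
    def __init__(self, v):
        self.v = v

    def add(self, other):
        return self.v + other.v


cache = MethodCache()
a, b = Num(1), Num(2)
first = cache.find_method(Num, "add")(a, b)    # slow lookup, then cached
second = cache.find_method(Num, "add")(a, b)   # cache hit
Num.add = lambda self, other: "overridden"     # method table changes
stale = cache.find_method(Num, "add")(a, b)    # still the old function
cache.invalidate(Num)
fresh = cache.find_method(Num, "add")(a, b)    # sees the new function
print(first, second, stale, fresh, cache.lookups)
```

The stale result in the middle shows why the invalidation call is not optional: the cache is only correct if the language notifies it whenever method resolution for a class changes.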

 PerlScalar's implementation of the add will know about how to implement
 Perl 6's multi sub *infix.  PyObject won't, but it will know about
 Python's __meta__ and __init_class__.

If even an add instruction doesn't work outside of one HLL, we can
forget any interoperability. __meta__ and what not Python semantics can
be added - or not :-) But let's first concentrate on the basics.

 - Sam Ruby

leo

Re: Python method overloading

2004-12-08 Thread Sam Ruby

Leopold Toetsch wrote:
Sam Ruby [EMAIL PROTECTED] wrote:
[ snipped - all ok ]
If I define an __add__ method with 16 arguments, Python will not throw
an exception.
I didn't write that. I've said: *if* you call it via a + b, Python
throws an exception - that one I've shown. Anyway...
What you wrote (and snipped) was
I'd say, if you define an '__add__' method with 16 arguments, Python
will throw an exception,...
To which I responded with the above.
You are still not getting the principle of the scheme, IMHO. It has
nothing to do with Perl6 or any other language, nor with Python.
Either that, or I *am* getting the principle of the scheme.  I guess 
that this is the point where I need to return back to writing code and 
test cases.

Leo - at one point you indicated that you might be interested in helping 
to factor out the common code again.  Please feel free to do so whenever 
you are ready.  All I ask is that you don't break the test cases.

- Sam Ruby
P.S.  No fair changing the test cases either.  ;-)

Re: S05 question

2004-12-08 Thread Luke Palmer

Ashley Winters writes:
 I'm thinking capturing rules should be default in rules, where they're
 downright useful. Your hour/minute/second comment brings up parsing
 ISO time:
 
 grammar ISO8601::DateTime {
 rule year { \d4 }
 rule month { \d2 }
 rule day { \d2 }
 rule hour { \d2 }
 rule minute { \d2 }
 rule second { \d2 }
 rule fraction { \d+ }
 
 rule date { year -? month -? day }
 rule time { hour \:? minute \:? second [\. fraction]? }
 rule datetime { date T time }
 }
 
 For a grammar, that works perfectly!

Yep. 

 In a one-liner, I'd rather just use:
 
 $datetime ~~ /$year := (\d+) -? $month := (\d+) -? ./

Then go ahead and use that.  If you're going to use subrules, you can
either use the ?subrule form or just the regular old subrule form
and ignore the result.  There's nothing forcing you to pay attention to
those.  The number variables only get incremented when you use
parentheses.  I'd suspect that the return value of a rule only accounts
for parenthesized captures as well.

Or are you asking something different than that?

Luke

Anon-repo access for libtap

2004-12-08 Thread Clayton, Nik

I've set up anonymous read-only access to my Subversion repo, so
anyone that wants to play with libtap easily can now:

svn checkout svn://jc.ngo.org.uk/nik/libtap/trunk/

Share and enjoy.

N

Re: S05 question

2004-12-08 Thread Juerd


Warning: excessive nitpicking ahead.


Ashley Winters skribis 2004-12-08 10:51 (-0800):
 rule year { \d4 }

\d**{4}

Or, well, \d**{2,4}

 rule month { \d2 }

\d**{2}

 rule date { year -? month -? day }

rule week { \d**{2} }
rule yday { \d**{3} }
rule date {
year 
[
-? 
[ 
yday 
|
[ [ Wweek | month ] [ -? day ]? ] 
]
]?
}  # :)

 rule time { hour \:? minute \:? second [\. fraction]? }

Likewise making parts optional, and . can also be ,.


 rule datetime { date T time }

rule timezone { Z | [+-] hour [ \:? minute ]? }

rule datetime { date [ T time timezone? ]? }


And still this isn't a full ISO8601 grammar. But it now covers every
notation that I have seen in the wild so far. A useful source of
information, apart from the ISO standard itself, is
DateTime-Format-ISO8601.


Juerd

Re: Is object representation per class or per object?

2004-12-08 Thread Larry Wall

On Tue, Dec 07, 2004 at 12:32:50PM -0500, Abhijit Mahabal wrote:
: According to S12, it is possible to supply the object layout to bless(), 
: like so:
: 
: $object = $class.bless(:CREATE[:reprP6opaque] :k1($v1) :k2($v2))
: 
: But in the section Introspection, layout is a class trait. Does this 
: mean that classes have a default layout that can be overridden for 
: individual objects?

Er, no.  It's probably just a braino.  If it works at all, I think
it's probably for when the class doesn't specify a layout, or has a
meta-layout that can handle multiple layouts. It might not even make
sense for that.  In general, a class should have a consistent layout.

I think I was thinking about the fact that Perl 5's bless can just use
whatever data structure you hand it.  So maybe

$object = $class.bless(:CREATE[:reprP6Hash] :k1($v1) :k2($v2))

is equivalent to

$object = $class.bless({}, :k1($v1) :k2($v2))

But mostly I was just looking for an example option to pass to :CREATE.
Perhaps :repr is a bit too violent for that.

Larry

Re: S05 question

2004-12-08 Thread Larry Wall

On Wed, Dec 08, 2004 at 11:09:30AM -0700, Patrick R. Michaud wrote:
: On Wed, Dec 08, 2004 at 08:19:17AM -0800, Larry Wall wrote:
:  And people would have to get used to seeing ? as non-capturing assertions:
:  ?before ...
:  ?after ...
:  ?ws
:  ?sp
:  ?null
:  This has a rather Ruby-esque I am a boolean feeling to it.  I think
:  I like it.  It's pretty easy to type, at least on my keyboard.
: 
: FWIW, for some reason in rule contexts I tend to conflate 
: I am a boolean feelings with zero-width assertion, so that each
: of those look vaguely to me as though I'm testing a zero-width 
: proposition and not consuming any text.  And I still tend to think of
: '?' in its zero or one matches or minimal match connotations.
: Oh well, I suppose I could get used to that.

Yes, there are those interferences, which was one of the reasons for
removing ? the last time we had it in that position (albeit on the
captures rather than the non-captures).  I think we'll have to let
it set a while to see how it feels in this role.  For the purpose of
being a non-alpha no-op, any other non-alpha character would do as well,
so maybe the I am a boolean feeling is not that useful.

:  Now suppose that we extend that I am a boolean feeling to
:  ?{ code }
:  which might take the place of the confusing (...), and make consistent
:  the notion that we always use {...} to invoke real code.
: 
: Hmm, this is nice, however.

In some ways, and not so nice in others, as Luke pointed out.

:  Another problem we've run into is naming if there are multiple assertions
:  of the same name.  If the capture name is just the alpha part of the
:  assertion, then we could allow an optional number, and still recognize
:  it as a ws:
:  ws1 ws2 ws3
:  Except I can well imagine people wanting numbered rules.  Drat.  Could
:  force people to say ws_1 if they want that, I suppose.
: 
: I had been thinking that 
: 
: /ws foo ws bar/
: 
: would simply cause $ws to be a list of captured elements, similar to 
: what might happen for $1 in 
: 
: / [ (.*?) , ]* /

That's what happens by default whenever there is a name conflict.  This
would just be a way of giving a rule a long name as well as a short one,
much like abscomplex is the long name of abs when dispatched on a
complex number, whereas abs is just the set of all abs() multis, if
there is such a beastie.

: If someone really needs the contents of the first and second ws, they
: could do
: 
:(ws) foo (ws)
: 
: and get them as $1 and $2.  But, seeing this tells me that perhaps
: (rule) should be used for capturing rules, analogous to the
: capturing parens, and leave rule to be the non-capturing version.
: But maybe that's anti-Huffman overall.  Maybe the parens could also
: help for disambiguating
: 
:(ws) foo (ws)
: 
: so that we end up with $/ws[1], $/ws[2], etc.  But then we might
: have to always subscript our named captures, which is icky, or maybe 
: we'd only make $/ws act like list when there's more than one 
: capturing (ws) in the rule.
: 
: I dunno.  I kinda like (rule) for capturing, but maybe it just
: doesn't work.

I thought about that a long time, which was part of the reason I also
thought about freeing up (...).  But it just seems a little icky
to mix together the named captures and numbered captures visually if
not semantically.  It starts not being at all clear which parentheses
count and which ones not.  Which is perhaps another reason for changing
current (...) to ?{...}.

We could, I suppose use a subscript inside:

ws[0] foo ws[1]
ws«first» foo ws«second»

but then you'd reference it as

$ws[0]
$wsfirst

which is a gratuitous difference, and suffers the same problem as
the parentheses in confusing real arrays/hashes with sorta fake ones.
So I think we'll stick with the hyphen names for now, which have the
benefit of looking the same and not sending us to bracket heaven.

ws-1 foo ws-2
ws-first foo ws-second

$ws-1
$ws-first

Larry

Re: S05 question

2004-12-08 Thread Larry Wall

On Wed, Dec 08, 2004 at 11:50:51AM -0700, Luke Palmer wrote:
:  Now suppose that we extend that I am a boolean feeling to
:  
:  ?{ code }
:  
:  which might take the place of the confusing (...), and make consistent
:  the notion that we always use {...} to invoke real code.
: 
: Hmm...  I'm just so attached to (...).  I find it quite beautiful.  It
: also somehow communicates the feeling you shouldn't be putting
: side-effects here.

Well, there is that.  On the other hand, {...} is usually just as
side-effect free.  I'm still of two minds about ?{...} vs (...).
Course, if we used «...» to interpolate something then «{...}»
might interpolate a rule, which would free up {...} for the code
assertion.  Doesn't have your side-effectlessness feeling, but it is
at least symmetrical.

:  I think I'm leaning toward the idea that anything in angles that
:  begins alpha is a capture to just the alpha part, so the ? prefix is
:  merely a no-op that happens to make the assertion not start with an
:  alpha.  Interestingly, that gives these implicit bindings:
:  
:  after ... $after$`
:  before ...$before   $'
: 
: I don't quite follow.  Wouldn't that mean that these guys would get
: clobbered if you used lookaheads or lookbehinds in your rules?

The point is that you don't get the $`/$' equivalents unless you
explicitly put a lookbehind/lookahead assertion in your pattern:

/after .* foo before .*/

That has the benefit of telling the rule engine when it has to worry
about saving the prefix/postfix.  Not knowing that is part of why
we had the sawampersand problem in Perl 5.

My other point is that the Perl 6 names of $` and $' fall out naturally
if we name the assertions appropriately.  Unfortunately, $after and
$before don't work as well for variable names as they do for assertion
names.  Maybe we just have pre and post forms that really mean after .*
and before .*.

:  Or we could use some standard delim for that:
:  
:  ws-1 ws-2 ws-3
:  
:  which is vaguely reminiscent of our version syntax.  Indeed, if we
:  had quantifications, you might well want to have wildcards ws-* and
:  let the name be filled in rather than autogenerating a list.  But
:  maybe we just stick with lists in that case.
: 
: I can imagine this being a lot cleaner if the thing after the dash can
: be any sort of identifier:
: 
: ws-indent if ?ws condition ws-comment

Funny thing, I just wrote that into S05.pod.

: On the other hand, it could be misleading, since the standard naming of
: BNF uses dashes instead of underscores.  I don't think it should be a
: big problem though. 

Me either, since it's difficult to define a rule with a hyphen in the name.
And other delimiter candidates run into various problems too.

Larry

Re: Pipe dream - Devel::Cover::Regex

2004-12-08 Thread Paul Johnson

On Tue, Dec 07, 2004 at 11:33:54AM -0800, Kevin Scaldeferri wrote:

 I'm wondering if I'm the only one who would love to see 
 Devel::Cover::Regex?  Many (most?) perl programs are pretty regex 
 heavy, and if we are honest with ourselves, we have to admit that each 
 regex is actually a program in itself.  You can try to throw lots of 
 inputs at it and hope that you were thorough enough, but most of us 
 aren't that good at figuring out all the crazy ways a regex could 
 execute.  I think this would be a very useful extension to 
 Devel::Cover, although I imagine that it's pretty tricky to do.  Even 
 figuring out how to display the results might be tough to do well.

This is something I mentioned early in the development of Devel::Cover.
I think the display should map fairly well into the statements, branches
and conditions we have at the moment.  Atoms map to statements.
Quantifiers map to branches.  Alternation maps to conditions.  It won't
be quite that simple of course, but I think that should be the basics.

 Occasionally I have fantasies of having enough free time to really dig 
 into the internals of the regex engine and trying to do this, but to be 
 honest I don't really see it happening for me.  So, I figure the next 
 best thing is to throw this idea out here and see if anyone else runs 
 with it.

Michael suggested mjd's Rx might be useful.  Jeff Pinyan's
Regexp::Parser might also help as a base.

-- 
Paul Johnson - [EMAIL PROTECTED]
http://www.pjcj.net

svn

2004-12-08 Thread William Coleda

Is there a plan at any point to move to an svn repository from cvs?
I'd like to work on a patch to move all the perl* pmcs into dynclasses, which would involve quite a bit of file moving, and I'll happily wait for svn if we're going that way, since it'll be smoother.

Re: Test labels

2004-12-08 Thread Mark Stosberg

 On Mon, Dec 06, 2004 at 10:28:45PM -0600, Andy Lester wrote:
 I think even better than 
 
   ok( $expr, name );
 
 or
 
   ok( $expr, comment );
 
 is
 
   ok( $expr, label );
 
 RJBS points out that "comment" implies "not really worth doing", and I
 still don't like "name" because it implies (to me) a unique identifier.
 We also talked about "description", but "description" is just so
 overloaded.

I prefer "name" or "label" to "comment".

"Name" does not imply 'unique' for me, just like 'John Smith'
is not expected to be a unique name for a person.

Mark

-- 
http://mark.stosberg.com/

Re: svn

2004-12-08 Thread Matt Fowles

Will~


On Wed, 08 Dec 2004 19:19:07 -0500, William Coleda [EMAIL PROTECTED] wrote:
 Is there a plan at any point to move to an svn repository from cvs?
 
 I'd like to work on a patch to move all the perl* pmcs into dynclasses, which 
 would involve quite a bit of file moving, and I'll happily wait for svn if 
 we're going that way, since it'll be smoother.
 

While I personally like the idea, I think it is unlikely given how
much slower svn is on sizable repositories.  Of course I have not
tried it recently, so maybe that has changed...

All that being said, I am in absolutely no position of authority about this...

Matt
-- 
Computer Science is merely the post-Turing Decline of Formal Systems Theory.
-???

Re: S05 question

2004-12-08 Thread Ashley Winters

On Wed, 8 Dec 2004 16:07:43 -0700, Luke Palmer [EMAIL PROTECTED] wrote:
 Ashley Winters writes:
  For a grammar, that works perfectly!
 
 Yep.
 
  In a one-liner, I'd rather just use:
 
  $datetime ~~ /$year := (\d+) -? $month := (\d+) -? ./
 
 Then go ahead and use that.  If you're going to use subrules, you can
 either use the <?subrule> form or just the regular old <subrule> form
 and ignore the result.  There's nothing forcing you to pay attention to
 those.  The number variables only get incremented when you use
 parentheses.  I'd suspect that the return value of a rule only accounts
 for parenthesized captures as well.

I was working on the (possibly misguided) assumption that there's a
cost to capturing, and that perhaps aggressive capturing isn't worth
having on in a one-liner. Some deep part of my mind remembers $`
being bad, I think. If there's no consequence to having capture being
on, then ignoring it is fine. I don't have a problem with that. As I
said before, <?foo> reads fine to me.

I'm still going to prefer using :=, simply as a good programming
practice. My mind sees a big difference between building a parse-tree
object and just grepping for some word I want in a string. Within a
rule{} block, there is no place except the rule object to keep your
data (hypothetically -- haha), so it makes sense to have everything
capture unless otherwise specified. There's no such limitation in a
regular code block, so I don't see the need.

I may change my mind after using $/<URI::URL><path_segment>[2]

Ashley Winters

Re: svn

2004-12-08 Thread Robert Spier

 While I personally like the idea, I think it is unlikely given how
 much slower svn is on sizable repositories.  Of course I have not
 tried it recently, so maybe that has changed...
 All that being said, I am in absolutely no position of authority about this...

This is, and always has been, (since 1.0 at least), a myth.

We have always been at war with 

Apache has moved most of their projects to SVN.  It's probably ready.

-R

Re: S05 question

2004-12-08 Thread Alexey Trofimenko

On Wed, 8 Dec 2004 16:07:43 -0700, Luke Palmer [EMAIL PROTECTED] wrote:
Ashley Winters writes:
In a one-liner, I'd rather just use:
$datetime ~~ /$year := (\d+) -? $month := (\d+) -? ./

I'm starting to think that this '$year := ' syntax is an obfuscator. We
couldn't refer to that capture with $year even inside a regex, right? We
should use $year instead. Maybe $year := (\d+) would be less
obfuscating.. but it's longer :)

(year:= \d+) and [year:= \d+] are somewhat better, IMHO, but I'm not sure
if : in := is unambiguous here.

But if /<year>/ and /$year:=.../ both capture to $year, why not make
those two more similar? Things like year:\d+ or year[\d+] or year:
[\d+] come to mind. Or that (now unused) year [\d+]

Then go ahead and use that.  If you're going to use subrules, you can
either use the <?subrule> form or just the regular old <subrule> form
and ignore the result.  There's nothing forcing you to pay attention to
those.  The number variables only get incremented when you use
parentheses.  I'd suspect that the return value of a rule only accounts
for parenthesized captures as well.

..and ignore the result? Hm. What if someone lazy puts $a ~~
/<rule>/ instead of $a ~~ /<?rule>/: would there be any copying overhead
after $a = something else (to keep $rule, which he isn't even going to
use)?

(Some perl5 programmers use (...) where (?:...) would be sufficient, just
because they are too lazy to type the extra two characters, and because it's
noisier. <?rule> is better than <rule> for noncapturing behaviour in
that sense, but I could imagine those <?ws> everywhere.. um, just
moaning..  maybe old, nonswapped behaviour, was better:  ws to not
capture, ws to capture (I don't think  and  are appropriate.

Re: svn

2004-12-08 Thread Michael G Schwern

On Wed, Dec 08, 2004 at 10:16:21PM -0500, Matt Fowles wrote:
 While I personally like the idea, I think it is unlikely given how
 much slower svn is on sizable repositories.  Of course I have not
 tried it recently, so maybe that has changed...

If you wish to try out a recent Subversion on some sizable source
there's a mirror of the maint and bleadperl Perforce repositories here.
http://svn.clkao.org/svnweb/perl

You can pull them out using
svn://svn.clkao.org/perl

Subversion has improved a lot.  I'm using it now.  If you do try it I
recommend going straight to 1.1.1 and using fsfs based repositories.

Keep in mind that SVN is slower on checkouts than CVS.  However, diff is
a purely local operation.  And if you're using something like SVK, network
traffic isn't much of an issue at all after the initial mirror.


-- 
Michael G Schwern    [EMAIL PROTECTED]    http://www.pobox.com/~schwern/
Now we come to that part of the email you've all been waiting for--the end.

RE: C implementation of Test::Harness' TAP protocol

2004-12-08 Thread Andrew Savige

--- Clayton, Nik wrote: 
 Any "Writing thread safe libraries for dummies" texts you could point
 me at?

I recommend "Programming with POSIX Threads" by David Butenhof.

Re the varargs ok() business, I assume you'll be using some sort of
config.h with your libtap library. Any plans on using autoconf or
similar tool?

One way around this __VA_ARGS__ portability issue is to let
configure work it out and write your code for some sort of
VA_ARGS capability. There doesn't appear to be a standard
autoconf symbol for this, at least I couldn't find one.
Googling for HAVE_VA_ARGS uncovered only two dubious hits.
http://gcc.gnu.org/ml/gcc-help/2004-05/msg00181.html
asked a question about this issue, but no response.

I noticed this in an "issue with glib.h" gtk-devel-list thread:

I think we should just use the __STDC_VERSION__ define -- no need
for autoconf.

#if defined __STDC_VERSION__ && __STDC_VERSION__ >= 199901L
# define g_message(...) g_log (DOM, LOG_MSG, __VA_ARGS__)
#elif defined __GNUC__
# define g_message(format_args...) g_log (DOM, LOG_MSG, format_args)
#else
...
#endif

Finally, ACE C++ library uses:

#if defined (__GNUC__) && (__GNUC__ >= 3 || __GNUC_MINOR__ > 95)
  // use GNU __VA_ARGS__ capability ...
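
To make the idea concrete, here is a rough sketch of how a libtap-style ok() might combine the two checks quoted above. It is only an illustration under stated assumptions: the names tap_vok, tap_ok_at, tap_test_count, tap_fail_count and TAP_HAVE_VARIADIC_MACROS are made up for this example and are not taken from libtap, glib or ACE. The core is an ordinary va_list function, so only the convenience macro depends on the compiler.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical libtap-style counters; the names are illustrative only. */
static unsigned tap_test_count = 0;
static unsigned tap_fail_count = 0;

/* Portable core: takes a va_list, so it needs no variadic macro support. */
static int tap_vok(int passed, const char *file, int line,
                   const char *fmt, va_list ap)
{
    assert(fmt != NULL);
    ++tap_test_count;
    printf("%sok %u - ", passed ? "" : "not ", tap_test_count);
    vprintf(fmt, ap);
    printf("\n");
    if (!passed) {
        ++tap_fail_count;
        /* TAP diagnostics are lines that start with '#'. */
        printf("# Failed test at %s line %d\n", file, line);
    }
    return passed != 0;
}

#if defined __STDC_VERSION__ && __STDC_VERSION__ >= 199901L
/* C99 variadic macro: the format string is part of __VA_ARGS__. */
# define TAP_HAVE_VARIADIC_MACROS 1
# define ok(passed, ...) tap_ok_at((passed), __FILE__, __LINE__, __VA_ARGS__)
#elif defined __GNUC__
/* Older GNU C spelling with a named variadic parameter. */
# define TAP_HAVE_VARIADIC_MACROS 1
# define ok(passed, args...) tap_ok_at((passed), __FILE__, __LINE__, args)
#endif

#ifdef TAP_HAVE_VARIADIC_MACROS
/* Target of the ok() macro: records the caller's file and line. */
static int tap_ok_at(int passed, const char *file, int line,
                     const char *fmt, ...)
{
    va_list ap;
    int result;

    va_start(ap, fmt);
    result = tap_vok(passed, file, line, fmt, ap);
    va_end(ap);
    return result;
}
#else
/* No variadic macros at all: ok() is a plain variadic function,
 * at the cost of losing the caller's file and line. */
static int ok(int passed, const char *fmt, ...)
{
    va_list ap;
    int result;

    va_start(ap, fmt);
    result = tap_vok(passed, "(unknown)", 0, fmt, ap);
    va_end(ap);
    return result;
}
#endif
```

A caller would write ok(x == 42, "x is %d", x); in every configuration. The design choice here is that feature detection only decides whether file and line information is available, not whether the API exists, which may be enough to avoid an autoconf check for this particular issue.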

HTH,
/-\

