S02: generalized quotes and adverbs

2006-05-10 Thread jerry gay

according to S02, under 'Literals', generalized quotes may now take
adverbs. in that section is the following comment:

snip
[Conjectural: Ordinarily the colon is required on adverbs, but the
quote declarator allows you to combine any of the existing adverbial
forms above without an intervening colon:

   quote qw;   # declare a P5-esque qw//
snip

there's trouble if both q (:single) and qq (:double) are allowed
together. how would qqq resolve? i say it makes sense that we get
longest-token matching first, which means it translates to :double
followed by :single.

~jerry
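
The longest-token reading proposed above can be sketched in Python, assuming the short forms are looked up in a fixed table. Only q (:single) and qq (:double) come from the message; the w entry, the table name SHORT_FORMS and the function split_adverbs are hypothetical and exist only to illustrate the matching order.

```python
# Hypothetical table of run-together short forms. Only "q" and "qq" are
# taken from the message above; "w" is an assumed extra entry.
SHORT_FORMS = {"q": ":single", "qq": ":double", "w": ":words"}


def split_adverbs(name, table=SHORT_FORMS):
    """Split a run-together quote name, preferring the longest short form."""
    longest = max(len(key) for key in table)
    adverbs = []
    pos = 0
    while pos < len(name):
        # Try the longest candidate first, then progressively shorter ones.
        for size in range(min(longest, len(name) - pos), 0, -1):
            piece = name[pos:pos + size]
            if piece in table:
                adverbs.append(table[piece])
                pos += size
                break
        else:
            raise ValueError(f"no short form matches at {name[pos:]!r}")
    return adverbs


print(split_adverbs("qqq"))  # [':double', ':single']
print(split_adverbs("qw"))   # [':single', ':words']
```

Under this reading qqq can never mean :single followed by :double; that ordering would need the explicit colon forms.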


Re: S02: generalized quotes and adverbs

2006-05-10 Thread Larry Wall
On Tue, May 09, 2006 at 11:15:24PM -0700, jerry gay wrote:
: according to S02, under 'Literals', generalized quotes may now take
: adverbs. in that section is the following comment:
: 
: snip
: [Conjectural: Ordinarily the colon is required on adverbs, but the
: quote declarator allows you to combine any of the existing adverbial
: forms above without an intervening colon:
: 
:    quote qw;   # declare a P5-esque qw//
: snip
: 
: there's trouble if both q (:single) and qq (:double) are allowed
: together. how would qqq resolve? i say it makes sense that we get
: longest-token matching first, which means it translates to :double
: followed by :single.

That would be one way to handle it.  I'm not entirely convinced that
we have the right adverb set yet though.  I'm still thinking about
turning :n, :q, and :qq into :0, :1, and :2.  I'd like to turn :ww
into something single character as well.  The doubled ones bother me
just a little.

But as it stands, the conjectured quote declarator is kind of lame.
It'd be just about as easy to allow

quote qX :x :y :z;

so you could alias it any way you like.  Or possibly just allow

alias qX q:x:y:z;

or even

qX ::= q:x:y:z;

as a simple, argumentless word macro.  But the relationship
of that to real macros would have to be evaluated.  There's
something to be said for keeping macros a little bit klunky.
On the other hand, if people are going to invent simplified
macro syntax anyway, I'd rather there be some standards.

Larry


Re: Scans

2006-05-10 Thread Markus Laire

On 5/10/06, Austin Hastings [EMAIL PROTECTED] wrote:
> Mark A. Biggar wrote:
> > Use hyper compare ops to select what you want followed by using filter
> > to prune out the unwanted.
> >
> > filter gives you with scan:
> >
> > filter (list [<] @array) @array ==
> > first monotonically increasing run in @array
>
> This seems false. @array = (1 2 2 1 2 3), if I understand you correctly,
> yields (1 2 2 3).


No, it yields (1, 2, 2)

   list [<] @array
==
   list [<] (1, 2, 2, 1, 2, 3)
==
   1,
   1 < 2,
   1 < 2 < 2,
   1 < 2 < 2 < 1,
   1 < 2 < 2 < 1 < 2,
   1 < 2 < 2 < 1 < 2 < 3,
==
   Bool::True, Bool::True, Bool::True, Bool::False, Bool::False, Bool::False

And so
   filter (list [<] @array) @array
would give first 3 elements of @array, i.e. (1, 2, 2)


> > filter (list [<=] @array) @array ==
> > first monotonically non-decreasing run in @array
>
> So @array = (1 0 -1 -2 -1 -3) == (1, -1) is monotonically non-decreasing?


This would give (1, 0, -1, -2)

   list [<=] (1, 0, -1, -2, -1, -3)
==
   1,
   1 <= 0,
   1 <= 0 <= -1,
   1 <= 0 <= -1 <= -2,
   1 <= 0 <= -1 <= -2 <= -1,
   1 <= 0 <= -1 <= -2 <= -1 <= -3
==
   Bool::True, Bool::True, Bool::True, Bool::True, Bool::False, Bool::False

And so
   filter (list [<=] @array) @array
would give first 4 elements of @array, i.e. (1, 0, -1, -2)

--
Markus Laire


Re: A rule by any other name...

2006-05-10 Thread Juerd
Damian Conway skribis 2006-05-10 18:07 (+1000):
  More than that, the current 'rule' and 'regex' can both be used inside
  and outside a grammar. If we were to take the 'sub'/'method' pattern, then
  'rule' should never be allowed outside a grammar,
 I entirely agree.

I don't. While disallowing named methods and rules may be a wise idea
(I'm not sure it is), the anonymous forms are probably very useful to
have around.

my $method = method { ... };
$object.$method(...);


Juerd
-- 
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html 
http://convolution.nl/gajigu_juerd_n.html


Re: Scans

2006-05-10 Thread Markus Laire

In the previous mail I accidentally read [<=] as [>=].

On 5/10/06, Markus Laire [EMAIL PROTECTED] wrote:

> > > filter (list [<=] @array) @array ==
> > > first monotonically non-decreasing run in @array
> >
> > So @array = (1 0 -1 -2 -1 -3) == (1, -1) is monotonically non-decreasing?
>
> This would give (1, 0, -1, -2)


Correction: This would give (1)



> list [<=] (1, 0, -1, -2, -1, -3)
> ==
> 1,
> 1 <= 0,
> 1 <= 0 <= -1,
> 1 <= 0 <= -1 <= -2,
> 1 <= 0 <= -1 <= -2 <= -1,
> 1 <= 0 <= -1 <= -2 <= -1 <= -3
> ==
> Bool::True, Bool::True, Bool::True, Bool::True, Bool::False, Bool::False


Correction:
   Bool::True, Bool::False, Bool::False, Bool::False, Bool::False, Bool::False



> And so
> filter (list [<=] @array) @array
> would give first 4 elements of @array, i.e. (1, 0, -1, -2)


Correction: It would give only first element of @array, i.e. (1)

--
Markus Laire


Re: Scans

2006-05-10 Thread Markus Laire

And here I mis-read < as <=.
Perhaps I should stop fixing, as I'm making too many errors here...

On 5/10/06, Markus Laire [EMAIL PROTECTED] wrote:

> > > filter (list [<] @array) @array ==
> > > first monotonically increasing run in @array
> >
> > This seems false. @array = (1 2 2 1 2 3), if I understand you correctly,
> > yields (1 2 2 3).
>
> No, it yields (1, 2, 2)


Correction: (1, 2)



> list [<] @array
> ==
> list [<] (1, 2, 2, 1, 2, 3)
> ==
> 1,
> 1 < 2,
> 1 < 2 < 2,
> 1 < 2 < 2 < 1,
> 1 < 2 < 2 < 1 < 2,
> 1 < 2 < 2 < 1 < 2 < 3,
> ==
> Bool::True, Bool::True, Bool::True, Bool::False, Bool::False, Bool::False


Correction: Bool::True, Bool::True, Bool::False, Bool::False,
Bool::False, Bool::False



> And so
> filter (list [<] @array) @array
> would give first 3 elements of @array, i.e. (1, 2, 2)


Correction: First 2 elements, i.e. (1, 2)

--
Markus Laire
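
The corrected results in this sub-thread can be checked with a small Python model of the scan. This is a sketch under two assumptions: that the scan of a chained comparison yields the truth value of the whole chain at each prefix, and that filter keeps the elements whose mask entry is true. The names scan_chain and filter_by are hypothetical.

```python
import operator


def scan_chain(op, items):
    """Model `list [op] @array` for a chained comparison operator.

    Entry k is True when `op` holds between every adjacent pair in the
    prefix items[0..k]; the one-element prefix is trivially True.
    """
    results = []
    holds = True
    for k in range(len(items)):
        if k > 0:
            holds = holds and op(items[k - 1], items[k])
        results.append(holds)
    return results


def filter_by(mask, items):
    """Keep the items whose corresponding mask entry is true."""
    return [item for keep, item in zip(mask, items) if keep]


rising = [1, 2, 2, 1, 2, 3]
falling = [1, 0, -1, -2, -1, -3]

print(filter_by(scan_chain(operator.lt, rising), rising))    # [1, 2]
print(filter_by(scan_chain(operator.le, rising), rising))    # [1, 2, 2]
print(filter_by(scan_chain(operator.le, falling), falling))  # [1]
```

Because a chain that has failed once stays false, the mask is always a run of trues followed by falses, which is why the filter returns the first run only.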


Re: Scans

2006-05-10 Thread Markus Laire

On 5/9/06, Jonathan Scott Duff [EMAIL PROTECTED] wrote:

> On Tue, May 09, 2006 at 06:07:26PM +0300, Markus Laire wrote:
> > ps. Should first element of scan be 0-argument or 1-argument case.
> > i.e. should list([+] 1) return (0, 1) or (1)
>
> I noticed this in earlier posts and thought it odd that anyone
> would want to get an extra zero arg that they didn't specify. My
> vote would be that list([+] 1) == (1)  just like [+] 1 == 1


Yes, that was an error on my part. I mis-read the example from Juerd
as giving 0 arguments for first item, while it gives the 0th
argument of an array.

I (now) agree that it doesn't seem to be useful to include the 0-argument case.

--
Markus Laire
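
The behaviour agreed on here, with no 0-argument element at the front of a scan, matches what Python's itertools.accumulate does, so it can serve as a rough model. The wrapper name scan_plus is hypothetical.

```python
import operator
from itertools import accumulate


def scan_plus(items):
    """Running sums starting from the 1-argument case, with no leading 0."""
    return list(accumulate(items, operator.add))


print(scan_plus([1]))        # [1]
print(scan_plus([1, 2, 3]))  # [1, 3, 6]
```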


Re: A rule by any other name...

2006-05-10 Thread Damian Conway

Allison wrote:

> > I've never met anyone who *voluntarily* added
> > the 'p'. ;-)
>
> You've spent too much time in the U.S. ;)

And Australia. I don't know where the silent 'p' comes from but it sure ain't
the New World.




> Picking names that mean what they say is important in Perl. It's why we have
> 'given'/'when' instead of 'switch'/'case'. We don't have to use the same old
> name for things just because everyone else is doing it (even if we started it).
>
> There's nothing about 'regex' that says backtracking enabled.

Sure there is. About 20 years of computing history. Nowadays regex has
virtually nothing to do with regular expressions; it's now just the computing
term for a compact set of instructions for a pattern matching machine.

> But isn't it appealing to stop using an archaic word that has now become
> meaningless?

No. For a start, regex isn't archaic. In fact it's a comparative neologism,
having only recently broken away--both syntactically and semantically--from the
older regular expression. More importantly, the *concept* hasn't become
meaningless at all; indeed it's grown significantly in meaning over the past
decade. And the word regex is now far more strongly associated with that
expanded concept than with the original idea of a regular expression.




> > That's pretty much the Perl 5 argument for using sub for both subroutines
> > and methods, which we've definitively rejected in Perl 6.
>
> Subs and methods have a number of distinguishing characteristics. If the only
> distinction between them was one small characteristic change, I might argue
> against using different keywords there too. (I think the choice of using only
> 'sub' made sense for Perl 5 with its simplistic OO semantics, but Perl 6
> provides more intelligent defaults for methods so the separation makes sense
> here.)

I think you're wrong. I think sub has proved not to be the right choice in
Perl 5 either. As abstractions, methods and subs are very different. In usage,
they're very different. It's only in implementation that they're similar.
Using the same keyword for two constructs that are used--and which act--very
differently was a rare misstep on Larry's part.

And it's those same enormous abstract and pragmatic differences that we need
two keywords to distinguish when it comes to pattern matching. Think about the
trouble we're going to have translating Perl 5 subs to Perl 6 subs or methods,
precisely because of the lack of semantic marking. The designers of Perl 7
won't thank us if we repeat the mistake with regexes and rules.




> Rules inside and outside grammars are the same class. They have the same
> behaviour aside from :ratchet,

And skipping!

> and :ratchet can be set without the keyword change.

But then you've no way of knowing from *local* context which way it defaults
for a given instance.




> More than that, the current 'rule' and 'regex' can both be used inside
> and outside a grammar. If we were to take the 'sub'/'method' pattern, then
> 'rule' should never be allowed outside a grammar,

I entirely agree.

> and 'regex' should either not be allowed inside a 'grammar',
> or should express some distinctive feature
> inside the grammar (like non-inherited or doesn't operate on the match
> object,

The main distinction is that rules are ratcheted and skippy whereas regexes
aren't. But yes, regexes ought not be inherited either.

> but there are better words for those concepts than 'regex').

If you can come up with even one other word that means backtrackable,
non-skippy, and uninherited, in the same way that rule implies ratcheted,
whitespace-skipping, and heritable, then I'd be more than delighted to
consider it.

Personally, I thought regex already fit the bill admirably, since
backtracking, not skipping, and not inheriting is exactly what regexes do in
most current languages (including Perl 5).




> > If we use rule for both kinds of regexes, we force the reader to constantly
> > check surrounding context in order to understand the behaviour of the
> > construct. :-(
>
> Context is a Perlish concept. :)

*Local* context is. Having three fundamental behaviours change because of a
namespace declaration 1000 lines earlier doesn't seem very Perlish to me.

> Making different things different is an important design principle, but so is
> making similar things similar.

I disagree. What we've been doing in Perl 6 is making different things
different, and identical things identical (or, more precisely, consolidating
things that turn out to be identical if you look closely enough).

But regexes and rules aren't identical; merely similar. And making
similar things identical is a *bad* idea in language. IANL(inguist) but
it seems to me that most languages evolve towards making similar things as
different as possible, so that they're not accidentally confused.



> I do like 'term' better.

Me too. :-)

> That really isn't whitespace skipping, though.

Sure it is. Whitespace is just the industry term for anything we politely
ignore.



Re: S02: generalized quotes and adverbs

2006-05-10 Thread Daniel Hulme
 qX ::= q:x:y:z;
 
 as a simple, argumentless word macro.
But would that DWIM when I come to write

qX(stuff, specifically not an adverb argument);

?

-- 
The  rules  of  programming  are  transitory;  only  Tao  is  eternal. 
 Therefore you  must contemplate Tao before you receive  enlightenment.
How will I know when I have received enlightenment?  asked the novice.
Your program will then run correctly, replied the master. 




Re: A rule by any other name...

2006-05-10 Thread Allison Randal
On Wed, 10 May 2006, Damian Conway wrote:
 Allison wrote:
 
 I've never met anyone who *voluntarily* added
 the 'p'. ;-)

You've spent too much time in the U.S. ;)

   and the fact that everyone knows 'regex(p)'
  means regular expression no matter how many times we say it doesn't.
 
 Sure. But almost nobody knows what regular actually means, and of
 those few only a tiny number of pedants actually *care* anymore. So
 does it matter?

Picking names that mean what they say is important in Perl. It's why we have
'given'/'when' instead of 'switch'/'case'. We don't have to use the same old
name for things just because everyone else is doing it (even if we started it).

There's nothing about 'regex' that says backtracking enabled.

 Then don't. I teach regexes all the time and I *never* explain what
 regular means, or why it doesn't apply to Perl (or any other
 commonly used) regexes any more.

But isn't it appealing to stop using an archaic word that has now become
meaningless?

  Maybe 'match' is a better keyword.
 
 I don't think so. Match is a better word for what comes back from
 a regex match (what we currently refer to as a Capture, which is
 okay too).

I agree there. I still prefer 'rule'.

 That's pretty much the Perl 5 argument for using sub for both subroutines
 and methods, which we've definitively rejected in Perl 6.

Subs and methods have a number of distinguishing characteristics. If the only
distinction between them was one small characteristic change, I might argue
against using different keywords there too. (I think the choice of using only
'sub' made sense for Perl 5 with its simplistic OO semantics, but Perl 6
provides more intelligent defaults for methods so the separation makes sense
here.)

Rules inside and outside grammars are the same class. They have the same
behaviour aside from :ratchet, and :ratchet can be set without the keyword
change. More than that, the current 'rule' and 'regex' can both be used inside
and outside a grammar. If we were to take the 'sub'/'method' pattern, then
'rule' should never be allowed outside a grammar, and 'regex' should either not
be allowed inside a 'grammar', or should express some distinctive feature
inside the grammar (like non-inherited or doesn't operate on the match
object, but there are better words for those concepts than 'regex').

 If we use rule for both kinds of regexes, we force the reader to constantly
 check surrounding context in order to understand the behaviour of the
 construct. :-(

Context is a Perlish concept. :)

It's worse to force the writer and reader to distinguish between two keywords
when they don't have a sharp difference in meaning, and when the names of the
two keywords don't provide any clues to what the difference is.

Making different things different is an important design principle, but so is
making similar things similar.

 True. Token is the wrong word for another reason: a token is a
 segmented component of the input stream, *not* a rule for matching
 segmented components of the input stream. The correct term for that is
 terminal. So a suitable keyword might well be term.

I do like 'term' better.

 Whitespace skipping (for suitable values of whitespace) is a critical
 feature of parsers. I'd go so far as to say that it's *the* killer feature of
 Parse::RecDescent.

 What you want is *whitespace* skipping (where comments are a special form of
 whitespace). What you *really* want is whitespace skipping where you get
 to define what constitutes whitespace in each context where whitespace might
 be skipped.

That really isn't whitespace skipping, though. Calling it whitespace skipping
conflates two concepts that are only slightly related. I agree that skipping is
an important feature in parsers.

 But the defining characteristic of a terminal is that you try to match
 it exactly, without being smart about what to ignore. That's why I like the
 fundamental rule/token distinction as it is currently specified.

Can you give me some additional characteristics for 'term' beyond just turn
off :skip? Grammars also need to turn off skipping in rules that aren't
terminals, and the different keyword is entirely inappropriate in those cases.
Since you'd need to use ':!skip' (or whatever syntax) on other rules anyway, it
doesn't make sense to use 'term' anywhere unless it provides some additional
intelligent defaults for terminals.

  I also suggest a new modifier for comment skipping (or skipping in
  general) that's separate from :words, with semantics much closer to
  Parse::RecDescent's 'skip'.
 
 Note, however, that the recursive nature of Parse::RecDescent's skip
 directive is a profound nuisance in practice, because you have to
 remember to turn it off in every one of the terminals.

And in the current form you have to remember to use 'token' for all the
terminals. Not really a significant difference in mental effort.

 In light of all that, perhaps :words could become :skip, which defaults to
 :skip(/<ws>/) but allows

Re: [perl #39072] [BUG] Unable to load_bytecode :multi after PGE.pbc

2006-05-10 Thread Leopold Toetsch

Patrick R.Michaud (via RT) wrote:
# New Ticket Created by  Patrick R. Michaud 
# Please include the string:  [perl #39072]
# in the subject line of all future correspondence about this issue. 
# URL: https://rt.perl.org/rt3/Ticket/Display.html?id=39072 



> I've been unable to get pheme to run on my system, and after
> chromatic and I did some testing tonight we think we've narrowed
> the problem down to an issue with using load_bytecode on files
> containing :multi subs.

Fixed (r12593), thanks for the testcase.

> Pm

leo



Re: [perl #39081] [BUG] (possible bug) multiple calls to __init for subclassed objects

2006-05-10 Thread Leopold Toetsch

Patrick R.Michaud (via RT) wrote:


> If a subclass doesn't define an __init method, then creating
> a new instance of the subclass results in multiple calls to
> the base class __init method.

Fixed, r12594. (__init was searched in parents with find_method, which
also searched parents ...)

> I've added a test for this to t/pmc/objects.t

Thanks, unTODOed.

> Pm

leo




Re: A rule by any other name...

2006-05-10 Thread Ruud H.G. van Tol
Allison Randal schreef:
 Damian:

 Match is a better word for what comes back from
 a regex match (what we currently refer to as a Capture, which is
 okay too).

 I agree there. I still prefer 'rule'.

Maybe matex (mat-ex) for matching expression and, within that,
capex/captex (cap-ex/capt-ex) for capturing expression?

-- 
Groet, Ruud



Re: A rule by any other name...

2006-05-10 Thread Ruud H.G. van Tol
Damian Conway schreef:

 grammar Perl6 is skip(/[<ws>+ | \# <brackets> | \# \N]+/) {
 ...
 }

I think that first + is superfluous.

Doubly so if <ws> already stands for the run of all consecutive
word-separators.

-- 
Groet, Ruud
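
The claim that the inner + is redundant under the outer + can be checked with ordinary Python regexes. This is only an analogy: \s stands in for a single-character whitespace match, the bracketed-comment branch is left out, and the two pattern names are hypothetical.

```python
import re

# Inner "+" inside a group that is itself repeated by an outer "+".
nested = re.compile(r"(?:\s+|#[^\n]*)+")
# Same alternation without the inner "+".
flat = re.compile(r"(?:\s|#[^\n]*)+")

samples = [" ", " \t\n", "# note", "  # note\n  ", "", "x", " x"]

for text in samples:
    # Both patterns accept exactly the same strings; the outer repetition
    # already allows any number of consecutive whitespace characters.
    print(repr(text), bool(nested.fullmatch(text)), bool(flat.fullmatch(text)))
```

The two forms accept the same strings, so the inner quantifier only adds more ways to split the same run of whitespace between iterations.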



Re: A rule by any other name...

2006-05-10 Thread Patrick R. Michaud
On Wed, May 10, 2006 at 06:07:54PM +1000, Damian Conway wrote:
 
 Including :skip(/someotherrule/). Yes, agreed, it's a huge
 improvement. I'd be more comfortable if the default rule to
 use for skipping was named <skip> instead of <ws>.
 (On IRC <sep> was also proposed, but the connection between
 :skip and <skip> is more immediately obvious.)

 Yes, I like <skip> too. I too keep mistakenly reading <ws> as WhiteSpace.

FWIW, I recently noticed in another language
definition the phrase "intertoken space" as being something
that can occur on either side of any token, but not within
a token.  Perhaps some abbreviation or variation of that could
work in place of either <ws> or <skip>.

(Somehow <skip> seems too verbish to me, when the other
subrules we tend to see in a rule tend to be nounish.  Yes, I
know that skip can be a noun as well, it just feels wrong.)

 I'm still utterly convinced my original three-keyword list is the right one 
 (and that the three keywords in it are the right ones too). 

Having played with regex/token/rule in the perl6 grammar a bit
further, as well as looking at a couple of others, I'm finding
regex/token/rule to be fairly natural.  It only becomes unnatural
if I'm trying hard to optimize things -- e.g., by using token instead
of rule to avoid unnecessary calls to <?ws>.  (And it may well turn
out that trying to avoid these calls is a premature or incorrect
optimization anyway -- I won't know until I'm a little farther along
in the grammars I'm working with.)

Pm


Re: A rule by any other name...

2006-05-10 Thread Larry Wall
On Wed, May 10, 2006 at 11:25:26AM +1000, Damian Conway wrote:
: True. Token is the wrong word for another reason: a token is a
: segmented component of the input stream, *not* a rule for matching
: segmented components of the input stream. The correct term for that is
: terminal. So a suitable keyword might well be term.

There are several problems with that.  A small problem is that
term is the same length as rule, and that makes it harder to
tell them apart visually.  A larger problem is that, unfortunately,
term is one of the more heavily overloaded terms (pun intended)
in computing.  Even in Perl 5 culture we use it *heavily* to mean
non-infix.  Calling infix:<*> a term really grates for that reason.

The overloading of token is much milder, and I'd rather take the
core metaphor of token and extend it to the supertoken, because
the intent is the same.  The intent of a token is to present a
simple interface outward.  The same is true for the supertoken.
Structurally a supertoken is rather like an object, insofar as it
has a simple outside and a complicated inside.  That complicated
inside is expressed by the fact that the supertoken calls out to a
subrule.  But the supertoken itself still wants to be treated simply
in its own context, just as any object can be treated as a scalar.
The interface to a postcircumfix requires token parsing on the
outside, despite allowing full expressions on the inside.  But as
with the sub/multi/method distinction, the primary motivation is to
distinguish the outward interface, that is, how they are to be used.

So anyway, I think token is sufficiently close to what we want
it to mean that we can force it to mean that, and it's sufficiently
orphaned that few people are going to complain about impressing it
into forced labor.  And, in fact, the larger cultural meaning of
token implies that it's something simple that represents something
complicated, as in a token of our appreciation.

Larry


[perl #39117] [TODO] Using v?snprintf/strlcpy/strlcat when useful

2006-05-10 Thread via RT
# New Ticket Created by  Leopold Toetsch 
# Please include the string:  [perl #39117]
# in the subject line of all future correspondence about this issue. 
# URL: https://rt.perl.org/rt3/Ticket/Display.html?id=39117 


See also http://use.perl.org/articles/06/05/03/1325204.shtml

19:24 <@leo> Andy: btw - if you got some extra tuits: Using
v?snprintf/strlcpy/strlcat when useful would be also very welcome for
Parrot
19:25 <@leo> strlcpy/strlcat would need a test too, snprintf should
already be in config tests
19:25 <@leo> and we'd need an implementation, if libc doesn't provide
the funcs

leo



Re: [perl #39117] [TODO] Using v?snprintf/strlcpy/strlcat when useful

2006-05-10 Thread Steve Peters
On Wed, May 10, 2006 at 10:30:42AM -0700, Leopold Toetsch wrote:
 # New Ticket Created by  Leopold Toetsch 
 # Please include the string:  [perl #39117]
 # in the subject line of all future correspondence about this issue. 
 # URL: https://rt.perl.org/rt3/Ticket/Display.html?id=39117 
 
 
 See also http://use.perl.org/articles/06/05/03/1325204.shtml
 
 19:24 <@leo> Andy: btw - if you got some extra tuits: Using
 v?snprintf/strlcpy/strlcat when useful would be also very welcome for
 Parrot
 19:25 <@leo> strlcpy/strlcat would need a test too, snprintf should
 already be in config tests
 19:25 <@leo> and we'd need an implementation, if libc doesn't provide
 the funcs
 

I'm taking a look at it.  I should have something working this evening 
for the configs.  Adding the HAS_BLAH's will take some additional time.

Steve Peters
[EMAIL PROTECTED]




Re: A rule by any other name...

2006-05-10 Thread Damian Conway

Larry wrote:


> So anyway, I think token is sufficiently close to what we want
> it to mean that we can force it to mean that, and it's sufficiently
> orphaned that few people are going to complain about impressing it
> into forced labor.


I'm perfectly fine with that. To quote myself out of context:

 But almost nobody knows what [the word] actually means, and of
 those few only a tiny number of pedants actually *care* anymore.
 So does it matter?

;-)

Damian


Re: A rule by any other name...

2006-05-10 Thread Uri Guttman
>>>>> "AR" == Allison Randal [EMAIL PROTECTED] writes:

  AR> Including :skip(/someotherrule/). Yes, agreed, it's a huge
  AR> improvement. I'd be more comfortable if the default rule to use
  AR> for skipping was named <skip> instead of <ws>. (On IRC <sep> was
  AR> also proposed, but the connection between :skip and <skip> is more
  AR> immediately obvious.)

a small point but why not have both <ws> and <skip> be aliased to each
other? i like the skip connection but <ws> is (usually) about skipping
white space which is likely the most commonly skipped text. both names
have value so we should have both. and i think in most cases you won't
see many explicit <skip> or <ws> as they will be implied by the
whitespace in the rule/term/whatever that has skipping enabled.

uri

-- 
Uri Guttman  --  [EMAIL PROTECTED]   http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs    http://jobs.perl.org


Re: A rule by any other name...

2006-05-10 Thread Allison Randal
To summarize a phone call today, the more intelligent defaults we add to 
differently named rule keywords the more comfortable I am with having 
different names. So, here's what we have so far (posted both as an FYI 
and to confirm that we have the coherent solution I think we have):


rule:
- Has :ratchet and :skip turned on by default

- May only be used inside a grammar

- Takes default modifiers (a.k.a. traits) from the grammar in which it 
is defined


- Is inherited by subclasses of a grammar

- The default modifiers can be turned off by :!ratchet and :!skip both 
for individual rules and for an entire grammar (I'd like to see some 
syntax for this)



regex:
- Has no modifiers turned on by default

- May be used inside and outside a grammar

- Inside a grammar, it is not inherited by subclasses of the grammar

- Inside a grammar, it does not take default modifiers from the grammar

- Individual regexen can turn on the :ratchet or :skip modifiers


token:
- Has :ratchet turned on by default

- Is inherited by subclasses of a grammar

- Does not take default modifiers from the grammar

- Individual token rules can turn off the :ratchet modifier with 
:!ratchet, and can turn on :skip


- (I'd still like to see more for token, perhaps some optimizations that 
are possible when you're certain you have a terminal, like cannot call 
subrules)



skip:
- We keep :words as shorthand for :skip(/<ws>/)

- And :skip is shorthand for :skip(/<skip>/)

- To change skipping behavior: a) override <skip> in your grammar, b)
set :skip(/.../) on an individual rule, or c) set 'is skip(/.../)' on a
grammar


- <ws> is optional whitespace, following skippy behavior (and it always
behaves the same no matter what the current :skip pattern is)


- <sp> is a single character of obligatory whitespace

Allison
--
E pur si muove!
-- apocryphally attributed to Galileo Galilei
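
The per-keyword defaults in this summary can be written down as a small lookup table. The Python below is a sketch that records only what the list states; the class, field and function names are hypothetical, and the outside_grammar value for token is left as None because the summary does not say.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class KeywordDefaults:
    ratchet: bool                     # :ratchet on by default?
    skip: bool                        # :skip on by default?
    inherited: bool                   # inherited by subclasses of a grammar?
    takes_grammar_defaults: bool      # picks up the grammar's modifiers?
    outside_grammar: Optional[bool]   # usable outside a grammar? None = not stated


DEFAULTS = {
    "rule": KeywordDefaults(True, True, True, True, False),
    "regex": KeywordDefaults(False, False, False, False, True),
    "token": KeywordDefaults(True, False, True, False, None),
}


def modifiers(keyword, ratchet=None, skip=None):
    """Return (ratchet, skip) after applying per-declaration overrides.

    Passing ratchet=False models :!ratchet; passing skip=True models :skip.
    """
    base = DEFAULTS[keyword]
    return (
        base.ratchet if ratchet is None else ratchet,
        base.skip if skip is None else skip,
    )


print(modifiers("rule"))                 # (True, True)
print(modifiers("rule", ratchet=False))  # (False, True)
print(modifiers("token", skip=True))     # (True, True)
print(modifiers("regex"))                # (False, False)
```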


Re: A rule by any other name...

2006-05-10 Thread Patrick R. Michaud
On Wed, May 10, 2006 at 05:58:57PM -0700, Allison Randal wrote:
 To summarize a phone call today, the more intelligent defaults we add to 
 differently named rule keywords the more comfortable I am with having 
 different names. So, here's what we have so far (posted both as an FYI 
 and to confirm that we have the coherent solution I think we have):
 [...]
 skip:
 - We keep :words as shorthand for :skip(/<ws>/)
 - And :skip is shorthand for :skip(/<skip>/)
 [...]

Please, describe these with <?ws> and <?skip> to make clear their
non-capturing semantic.  :-)

But Allison's message helps me to crystallize what has been
bugging me about the term :skip (and to a lesser extent :words)
in describing what they do.  So, I'll offer my thoughts here
in case anyone wants to pick it up before we go a-changing S05
yet again.  (If no-one picks it up, I'll just wait for S05 to
be updated to whatever is decided and implement that. :-)

Whitespace in regexes and rules is metasyntactic, in that it is
not matched literally.  Effectively what the :w (or :words or
:skip) option does is to change the metasyntactic meaning of
any whitespace found in the regex.  Or, another way of thinking
of it -- as S05 currently stands, 'regex' and 'token' cause
the pattern whitespace to be treated as <?null>, while 'rule'
causes the pattern whitespace to become <?ws>.

So what we're really doing with this option--whatever we
call it--is to specify what the whitespace _in the pattern_
should match.  Somehow :skip and <?skip> don't carry that
meaning for me.

In some sense it seems to me that the correct adverb is
more along the lines of :ws, :white, or :whitespace, in that
it says what to do with the whitespace in the pattern.  It
doesn't have to say anything about whether the pattern's
whitespace is actually matching \s* (although the default
rule for :ws/:white/:whitespace could certainly provide that
semantic).

I can fully see the argument that people will still
confuse :ws and <?ws> with whitespace in the target,
when in reality they specify the meaning of whitespace
in the regex pattern, so :ws might not be the right choice
for the adverb.  But I think that something more closely
meaning "whitespace in the pattern means /this/" would be a
better adverb than :skip.

If someone *really* wants to use <skip>, there's always
:ws(/<?skip>/) (or whatever we choose) which means
"whitespace in the regex matches <?skip>".

 - <sp> is a single character of obligatory whitespace

This one has bugged me since the day I first saw it implemented
in PGE.  We _already_ have \s, <blank>, and <space> to represent
the notion of a whitespace character -- do we really need a
separate <sp> form also?  (An idle thought: perhaps "sp" is
better used as an :sp adverb and a corresponding <?sp> regex?)

Pm
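
The point that the adverb decides what whitespace in the pattern means, rather than what is skipped in the target, can be illustrated with Python's re module. This is a loose sketch: compile_pattern is a hypothetical helper, the atoms are ordinary Python regex fragments, and \s* is a simplified stand-in for the whitespace rule.

```python
import re


def compile_pattern(source, ws_pattern):
    # Split the pattern source on its own whitespace and join the atoms
    # with whatever that whitespace is declared to mean.
    #   ws_pattern = ""      pattern whitespace matches nothing (null reading)
    #   ws_pattern = r"\s*"  pattern whitespace matches optional whitespace
    atoms = source.split()
    return re.compile(ws_pattern.join(atoms))


source = r"let \w+"

as_token = compile_pattern(source, "")      # 'regex'/'token' style reading
as_rule = compile_pattern(source, r"\s*")   # 'rule' style reading (simplified)

print(bool(as_token.fullmatch("letx")))     # True
print(bool(as_token.fullmatch("let x")))    # False
print(bool(as_rule.fullmatch("let   x")))   # True
```

The same source text gives two different matchers; only the meaning assigned to the pattern's whitespace differs.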


Re: A rule by any other name...

2006-05-10 Thread Damian Conway

Allison admirably summarized:


> rule:
>
> regex:
>
> token:
>
> skip:
> - We keep :words as shorthand for :skip(/<ws>/)
>
> - And :skip is shorthand for :skip(/<skip>/)

...where <skip> defaults to <ws>, but is distinct from it (i.e. it can be
redefined independently).

> - To change skipping behavior: a) override <skip> in your grammar, b)
> set :skip(/.../) on an individual rule, or c) set 'is skip(/.../)' on a
> grammar
>
> - <ws> is optional whitespace,

Not quite. <ws> is semi-optional whitespace. More precisely, it's not optional
between two identifier characters:

    token ws { <after \w> \s+ <before \w>
             | <after \w> \s* <before \W>
             | <after \W> \s*
             }

> following skippy behavior (and it always behaves the same no matter
> what the current :skip pattern is)


Damian
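
The semi-optional behaviour of ws described above can be approximated in Python by translating the after/before assertions into lookbehind and lookahead. This is a sketch rather than an exact equivalent: the names WS and joined are hypothetical, and in this translation none of the branches can match at the very start of the input, since every branch needs a preceding character.

```python
import re

# Three branches, in the order given in the message:
#   after a word char, whitespace is required before another word char;
#   after a word char, whitespace is optional before a non-word char;
#   after a non-word char, whitespace is optional.
WS = r"(?:(?<=\w)\s+(?=\w)|(?<=\w)\s*(?=\W)|(?<=\W)\s*)"


def joined(*atoms):
    """Compile the atoms with the semi-optional whitespace between them."""
    return re.compile(WS.join(atoms))


print(bool(joined("foo", "bar").fullmatch("foo bar")))  # True
print(bool(joined("foo", "bar").fullmatch("foobar")))   # False
print(bool(joined("foo", r"\(").fullmatch("foo(")))     # True
print(bool(joined(r"\(", "x").fullmatch("( x")))        # True
```

Between two identifier characters at least one whitespace character is required, so foobar is not read as foo followed by bar, while punctuation may touch its neighbours.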