> On Sun, Mar 08, 2009 at 09:43:17AM +0100, [email protected] wrote:
> =item * ws
>
> Match whitespace between tokens.
>
> =item * space
>
> Match a single whitespace character. Hence C< <ws> > is equivalent to C<
> <space>+ >.

The definitions of <ws> and <space> above are incorrect, or at least
misleading. <ws> matches required whitespace between pairs of word
characters; it's optional whitespace otherwise. The default definition
of <ws> is something like:

    token ws { <?before \w> <?after \w> <!> || \s* }

It's certainly _not_ the case that <ws> is equivalent to <space>+ .
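
For anyone who wants to poke at the difference without a Perl 6
implementation to hand, here is a rough emulation using Python's re
module. It is only a sketch: the names WS, SPACE_PLUS and matches are
made up for illustration, Python's \w and \s are assumed to be close
enough to the Perl 6 classes for ASCII input, and Python's backtracking
engine is not the same as a ratcheting token.

```python
import re

# Emulation of the default <ws>: fail when the current position lies
# between two word characters, otherwise match optional whitespace.
WS = r"(?!(?<=\w)(?=\w))\s*"

# What <space>+ would mean: one or more whitespace characters, always required.
SPACE_PLUS = r"\s+"

def matches(sep, pieces, text):
    """Return True if text is exactly the pieces joined by the separator pattern."""
    pattern = sep.join(re.escape(p) for p in pieces)
    return re.fullmatch(pattern, text) is not None

print(matches(WS, ["foo", "bar"], "foo bar"))               # True
print(matches(WS, ["foo", "bar"], "foobar"))                # False
print(matches(WS, ["foo", "+", "bar"], "foo+bar"))          # True
print(matches(SPACE_PLUS, ["foo", "+", "bar"], "foo+bar"))  # False
```

The third and fourth lines show the point: around a non-word character
the <ws>-style separator is happy with no whitespace at all, while a
<space>+ separator would reject the same input.
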

To make things a bit quicker for people writing custom versions of
<ws> (which may need to include "comment whitespace"), the Parrot
Compiler Toolkit also provides an optimized <ww> rule that matches
only between a pair of word characters. Then the default definition
of <ws> becomes

    token ws { <!ww> \s* }

Grammars can change this to things like:

    token ws { <!ww> [ \s+ || '#' \N* \n ]* }
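
A custom <ws> of this kind can be emulated the same way. Again this is
only a sketch in Python's re, under the assumption that a comment runs
from '#' to the end of the line; WS_COMMENTS and tokens_match are
hypothetical names, not part of any toolkit.

```python
import re

# Hypothetical emulation of a custom <ws>: not between two word characters,
# then any run of whitespace and '#' comments that extend to end of line.
# A single \s inside the repeated group avoids nested-quantifier backtracking.
WS_COMMENTS = r"(?!(?<=\w)(?=\w))(?:\s|#[^\n]*\n)*"

def tokens_match(pieces, text):
    """Return True if text is exactly the pieces separated by WS_COMMENTS."""
    pattern = WS_COMMENTS.join(re.escape(p) for p in pieces)
    return re.fullmatch(pattern, text) is not None

source = "x = # assign\n  1"
print(tokens_match(["x", "=", "1"], source))   # True: the comment counts as whitespace
print(tokens_match(["x", "=", "1"], "x=1"))    # True: whitespace is optional around '='
print(tokens_match(["let", "x"], "letx"))      # False: two words need a separator
```
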
Pm