Simon Cozens wrote:
> 
> On Thu, Apr 12, 2001 at 05:39:12PM -0400, Dan Sugalski wrote:
> > [We have FOO:BAR]
> > While this is reasonably true (and reasonably reasonable) it's not entirely
> > to the point. If we're going to provide a mechanism to define the syntax of
> > a mini-language (or a maxi one, I suppose, though there are probably better
> > ways to do it) then the details of colons and constants and what-have-yous
> > are pretty close to irrelevant.
> 
> No, I don't think so. The whole thing rests on the fact that class FOO knows
> how to parse string BAR. This, from the tokener's point of view, means that
> class FOO has to tell us when string BAR actually *ends*. 

No it doesn't.  There are well-defined rules for when string BAR actually
*ends*, which are followed before FOO ever sees it.


> For complex BAR (and
> complex FOO) this could be, uh, complex. It means that our parser would have
> to call out to other routines - which can presumably be defined in Perl - to
> assist in parsing Perl code. And hey, if BAR can be defined in Perl, it can be
> defined on-the-fly. Oh dear.
> 
> Not impossible by any means, but *by no means* irrelevant.


No. 

Recursive parsing is not needed.  We have the HERE string (the
 here-document), which can include anything in with the rest of the
 code, because the parser only looks for the end-token.  The perl5
 Inline module works that way.
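
To illustrate the end-token idea, here is a minimal sketch in Python
 (not Perl, and not how perl or Inline is actually implemented).  The
 function name, the `<<TAG` opener pattern and the rule that the body
 ends at a line equal to TAG are simplifying assumptions.  The point
 is only that the body is never parsed, just compared line by line
 against the end-token.

```python
import re

# Assumed opener syntax for this sketch: "<<TAG" somewhere on a line.
OPENER = re.compile(r"<<(\w+)")


def extract_here_string(lines):
    """Return (tag, body_lines, remaining_lines) for the first here-string.

    The body is treated as opaque text: each line is only compared
    against the end-token, never tokenized or parsed.
    """
    for i, line in enumerate(lines):
        match = OPENER.search(line)
        if match is None:
            continue
        tag = match.group(1)
        for j in range(i + 1, len(lines)):
            if lines[j] == tag:
                # Keep the opener line, drop the body and the end-token line.
                return tag, lines[i + 1:j], lines[:i + 1] + lines[j + 1:]
        raise ValueError(f"unterminated here-string: missing {tag!r}")
    return None, [], list(lines)
```

Whatever sits between the opener and the end-token, unbalanced
 quotes and curlies included, comes back untouched for class FOO to
 deal with.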

Perl5 can be parsed by classifying everything as a token, whitespace,
 or a literal. Literals have to end the way they start, but that is
 not recursive: interpolation is applied to a quoted literal
 afterwards, and it does not affect what is inside and what is
 outside the literal.
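
As a rough sketch of that three-way split (in Python, and far
 simpler than real Perl lexing), the following classifies a string
 into token, whitespace and literal pieces in one flat pass.  The
 regular expression and the name `classify` are assumptions for
 illustration, and only single- and double-quoted literals are
 handled.

```python
import re

# One flat pass: each piece is a quoted literal, a run of whitespace,
# or a run of anything else (a "token").  No recursion is involved.
PIECE = re.compile(r'''
    (?P<literal>"(?:\\.|[^"\\])*"|'(?:\\.|[^'\\])*')
  | (?P<whitespace>\s+)
  | (?P<token>[^\s"']+)
''', re.VERBOSE | re.DOTALL)


def classify(source):
    """Split source into (kind, text) pairs without looking inside literals."""
    pieces = []
    pos = 0
    while pos < len(source):
        match = PIECE.match(source, pos)
        if match is None:
            # The only way to fail here is a quote with no matching close.
            raise ValueError(f"unterminated literal at offset {pos}")
        pieces.append((match.lastgroup, match.group()))
        pos = match.end()
    return pieces
```

The extent of each literal is fixed by its opening and closing quote
 (with backslash escapes honoured); anything that looks like code or
 interpolation inside it plays no part in deciding where it ends.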

To me the simplest way to proceed, with maximum flexibility, would be
 to offer two types of rewriting systems:

        1:  your system operates from scratch on a string literal,
            like Inline does now.  Any syntax is allowed, as long as
            there is some indicator you can remember to escape when
            it appears within your string.  This is how all     
            8-bit-safe transfer protocols work, except for the ones
            that know the length of their payloads at the beginning. 
            Prefixing literals with character counts would be
            nightmarish and I am NOT suggesting it.

        2:  your system operates on tokenized (but not yet
            interpreted) perl symbols. The only restriction is that
            your curlies have to match.  So we introduce two new
            tokens, the literal curlies -- \{ and \} -- which play
            the same role as \" within a string, in case your
            special token would like to accept pre-tokened (token,
            whitespace, literal) code and agrees with perl's idea of
            how blocking and quoting work.  To parse python using
            this system we'd need to keep the details of whitespace
            around instead of instantly dismissing it.  Or insist
            that language extensions must maintain curly balance.
            It's really a very minor demand, especially since there
            is method 1 (inline-style operation on a quoted literal
            string) to fall back on.
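
Both methods can be sketched briefly.  The Python below is an
illustration under assumptions, not a proposal for the real tokener:
the function names are hypothetical, backslash is assumed as the
escape indicator, and the curly matcher works on raw characters
rather than on an already-tokenized stream.

```python
def quote_payload(payload, end='"', esc="\\"):
    """Method 1: escape the end indicator (and the escape itself)."""
    return payload.replace(esc, esc + esc).replace(end, esc + end)


def unquote_payload(quoted, esc="\\"):
    """Undo quote_payload: an escape makes the next character literal."""
    out = []
    i = 0
    while i < len(quoted):
        if quoted[i] == esc and i + 1 < len(quoted):
            out.append(quoted[i + 1])
            i += 2
        else:
            out.append(quoted[i])
            i += 1
    return "".join(out)


def find_block_end(text, start):
    """Method 2: return the index of the '}' matching text[start] == '{'.

    The literal curlies \\{ and \\} are skipped, so they never affect
    the balance.
    """
    if text[start] != "{":
        raise ValueError("start must point at an opening curly")
    depth = 0
    i = start
    while i < len(text):
        ch = text[i]
        if ch == "\\":
            i += 2  # skip the escaped character, e.g. \{ or \}
            continue
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return i
        i += 1
    raise ValueError("unbalanced curlies")
```

In both cases the end of BAR is found by a fixed rule before class
FOO is ever consulted: method 1 stops at the first unescaped end
indicator, method 2 at the curly that brings the depth back to zero.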




-- 
                      David Nicol 816.235.1187 [EMAIL PROTECTED]
                                            Home of the V-90 modern
