[Readable-discuss] BNF: Productions "head" and "rest". Comments??

David A. Wheeler Fri, 11 Jan 2013 15:25:49 -0800

Okay, here are the last 2 key productions for sweet-expressions, "head" and 
"rest".  The "head" reads an indented line, and handles the first expression in 
a line specially.  If there's more than one expression on a line, it calls 
"rest" to read the rest on that line.


Comments very very welcome.

Note: This doesn't include the calls or productions for the potential "restart" 
semantics, whose fate is currently unknown.  I want to discuss that in a 
separate thread, after the proposal has matured a little.

I've tried to build on all past work. However, the BNF here is slightly more 
complex than some older versions, because I'm trying to be *extremely* specific 
about whitespace (SRFI-49 wasn't, and that caused lots of problems in 
interpretation).  In particular:
// This BNF uses the following slightly complicated pattern in many places:
//   from_n_expr ((hspace+ (stuff /*= val1 */ | empty /*= val2 */ ))
//                | empty                             /*= val2 */ )
// This is an expanded form of this BNF pattern (sans actions):
//   from_n_expr (hspace+ stuff?)?
// Note that this pattern quietly removes horizontal spaces at the
// end of the line correctly; that's important because you can't see them,
// so quietly handling them eliminates a source of hard-to-find and
// unnecessary errors.
// If from_n_expr (etc.) is as greedy as possible (it needs to be),
// we *could* instead accept this simpler BNF pattern:
//   from_n_expr hspace* stuff?
// but while that simpler BNF pattern would correctly accept *good* input,
// it would also accept *incorrect* input like "a(1)q" or other n-expressions
// followed immediately by other n-expressions without intervening whitespace.
// We want to detect such situations as errors, so we'll use the
// more complex (and more persnickety) BNF pattern instead.
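
The difference between the two patterns can be sketched with regular expressions. This is only an illustration, not part of the grammar: `N_EXPR` is a hypothetical toy stand-in for an n-expression (a name with an optional parenthesized argument), "stuff" is reduced to a single further n-expression, and the names `STRICT`, `SIMPLE`, and `accepts` are made up for this example.

```python
import re

# Toy n-expression: a name optionally followed by a parenthesized argument,
# e.g. "a", "q", or "a(1)". An assumption for illustration, not the real rule.
N_EXPR = r"[a-z0-9]+(?:\([^()]*\))?"
HSPACE = r"[ \t]"

# Stricter pattern: from_n_expr (hspace+ stuff?)?
STRICT = re.compile(rf"{N_EXPR}(?:{HSPACE}+(?:{N_EXPR})?)?")

# Simpler pattern: from_n_expr hspace* stuff?
SIMPLE = re.compile(rf"{N_EXPR}{HSPACE}*(?:{N_EXPR})?")


def accepts(pattern, line):
    """Return True if the whole line matches the given pattern."""
    return pattern.fullmatch(line) is not None
```

Under these assumptions both patterns accept "a(1) q" and "a(1)" followed by trailing spaces, but only the simpler pattern accepts "a(1)q"; the stricter one rejects it because nothing separates the two n-expressions.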

I'd rather have the BNF slightly more complex, if the result is easier-to-use 
and avoids surprising, hard-to-find errors.

 --- David A. Wheeler




// The "head" is the production for 1+ n-expressions on one line; it will
// return the list of n-expressions on the line.  If there is one n-expression
// on the line, it returns a list of exactly one item; this makes it
// easy to append to later (if appropriate).  In some cases, we want
// single items to be themselves, not in a list; function monify does this.
// The "head" production never reads beyond the current line
// (except within a block comment), so it doesn't need to keep track
// of indentation, and indentation will NOT change within head.
// Callers can depend on "head" and "rest" *not* changing indentation.
// On entry, all indentation/hspace must have already been read.
// On return, it will have consumed all hspace (spaces and tabs).
// In a non-tokenizing recursive descent parser, "head" and its callees
// must also read the n-expression, determine whether it is special
// (e.g., //, $, #!...!#, abbreviation + hspace), and return a
// distinct value if it is; head and friends operate a lot like a tokenizer
// in that case.
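
The post names monify but does not define it. A minimal sketch of the behavior described above ("single items to be themselves, not in a list"), assuming Python lists stand in for Lisp lists, might look like this:

```python
def monify(items):
    """Return the sole element of a one-item list; otherwise return the input unchanged.

    Assumption: a Python list stands in for a Lisp list; this is a guess at
    the described behavior, not the project's actual definition.
    """
    if isinstance(items, list) and len(items) == 1:
        return items[0]
    return items
```

With this sketch, a head result holding one n-expression collapses to that n-expression, while a longer list passes through untouched.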

head returns [Object v]
 :  PERIOD
    (hspace+
      (n_expr1=n_expr hspace* {$v = list($n_expr1.v);} (n_expr2=n_expr error)?
       | empty  {$v = list(".");} /*= (list '.) */ )
     | empty    {$v = list(".");} /*= (list '.) */ )
 | n_expr_first (
     (hspace+
       (rest2=rest  {$v = cons($n_expr_first.v, $rest2.v);}
        | empty {$v = list($n_expr_first.v);} ))
      | empty   {$v = list($n_expr_first.v);} ) ;

// The "rest" production reads the rest of the expressions on a line
// (the "rest of the head"), after the first expression of the line.
// Like head, it consumes any hspace before it returns.
// The "rest" production is written this way so a non-tokenizing
// implementation can read an expression specially. E.g., if it sees a period,
// read the expression directly and then see if it's just a period.
// Note that unlike the first head expression, block comments and
// datum comments that don't begin a line (after indent) are consumed,
// and abbreviations followed by a space merely apply to the
// next n-expression (not to the entire indented expression).

rest returns [Object v]
  : PERIOD
      (hspace+
        (n_expr1=n_expr hspace* /* improper list. */
           {$v = $n_expr1.v;}
           (n_expr2=n_expr error)?
         | empty {$v = list(".");})
       | empty   {$v = list(".");})
  | scomment hspace* (rest1=rest {$v = $rest1.v;} | empty {$v = null;} )
  | n_expr3=n_expr
      ((hspace+ (rest3=rest {$v = cons($n_expr3.v, $rest3.v);}
                 | empty {$v = list($n_expr3.v);} ))
       | empty           {$v = list($n_expr3.v);} ) ;
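
As a rough illustration of how a non-tokenizing recursive descent reader could follow these two productions, here is a simplified sketch. It is not the reference implementation. It assumes a toy n-expression (any maximal run of non-space, non-tab characters), leaves out scomment and the n_expr_first distinction, and models cons cells as Python tuples with `None` for the empty list. The names `LineParser`, `cons`, and `list_` are hypothetical.

```python
HSPACE = " \t"


def cons(car, cdr):
    """Model a Lisp pair as a 2-tuple."""
    return (car, cdr)


def list_(item):
    """One-element list: a pair whose tail is the empty list (None)."""
    return (item, None)


class LineParser:
    def __init__(self, text):
        self.text = text
        self.pos = 0

    def at_end(self):
        return self.pos >= len(self.text)

    def skip_hspace(self):
        """Consume zero or more spaces/tabs; return how many were consumed."""
        start = self.pos
        while not self.at_end() and self.text[self.pos] in HSPACE:
            self.pos += 1
        return self.pos - start

    def n_expr(self):
        """Toy n-expression: a maximal run of non-hspace characters."""
        start = self.pos
        while not self.at_end() and self.text[self.pos] not in HSPACE:
            self.pos += 1
        if start == self.pos:
            raise SyntaxError("expected an n-expression")
        return self.text[start:self.pos]

    def _after_period(self):
        """Read '(hspace+ (n_expr hspace* | empty) | empty)' after a period.

        Returns the n-expression that follows, or None for a lone period.
        """
        if self.skip_hspace() and not self.at_end():
            item = self.n_expr()
            self.skip_hspace()
            if not self.at_end():
                raise SyntaxError("only one n-expression may follow '.'")
            return item
        return None

    def head(self):
        tok = self.n_expr()
        if tok == ".":
            item = self._after_period()
            return list_(".") if item is None else list_(item)
        if self.skip_hspace() and not self.at_end():
            return cons(tok, self.rest())
        return list_(tok)

    def rest(self):
        tok = self.n_expr()  # read first, then check whether it is a period
        if tok == ".":
            item = self._after_period()
            # ". x" yields x itself, so the caller's cons builds an improper list
            return list_(".") if item is None else item
        if self.skip_hspace() and not self.at_end():
            return cons(tok, self.rest())
        return list_(tok)
```

For example, under these assumptions `LineParser("a b c").head()` yields the nested pairs `("a", ("b", ("c", None)))`, `LineParser("a . b").head()` yields the improper pair `("a", "b")`, and trailing spaces or tabs after the last n-expression are consumed without changing the result.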





