why is this string not recognized? trouble with complex L0 and quotes, I think.

stefan . gottschalk Mon, 19 Feb 2018 11:49:42 -0800

I'm scarcely more than a novice with Marpa, so please forgive me if I'm 
asking for too much or being naive.


A sample of my legacy DSL looks like this:

sublevel: -only "-R[SUNLF]{0,1}\d+\s" -testargsmore -foo
{
  < test { foo } >
}


The *sublevel:* is supposed to be a keyword that introduces a kind of 
statement.  

Following the keyword is a run of nearly arbitrary text (containing 
options and parameters), terminated by a {}-delimited body.  So the overall 
structure is this:

*sublevel:* *<options>*
*{*
    *<more statements>*
*}* 


So, the open curly signals the end of the *<options>* and the start of the 
body.

This legacy DSL allows curlies inside the *<options>* provided they are 
quoted (single or double) or escaped.  So, the curlies in *{0,1}* should 
not be interpreted as special, but taken verbatim.  
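
The quoting rule above can be sketched procedurally.  The following Python 
snippet is only an illustration, not Marpa code: the function name 
find_body_start is hypothetical, and it assumes that a backslash escapes the 
next character both inside and outside quotes, which the description above 
does not spell out.

```python
def find_body_start(text):
    """Return the index of the first '{' that is neither quoted nor escaped.

    Returns -1 if no such curly exists.  Assumption (not confirmed by the
    legacy DSL): a backslash escapes the following character everywhere,
    including inside single or double quotes.
    """
    quote = None  # the quote character we are currently inside, if any
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if quote is not None:
            if ch == quote:
                quote = None  # closing quote
        elif ch in "'\"":
            quote = ch  # opening quote
        elif ch == "{":
            return i  # first special curly: the body starts here
        i += 1
    return -1


sample = (
    'sublevel: -only "-R[SUNLF]{0,1}\\d+\\s" -testargsmore -foo\n'
    "{\n"
    "  < test { foo } >\n"
    "}\n"
)
start = find_body_start(sample)
print(repr(sample[:start]))  # keyword plus options, up to the body's curly
```

With the sample input, the curly inside the double-quoted {0,1} is skipped 
and the scan stops at the curly that opens the body.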

I studied the string grammar listed at 
https://gist.github.com/jddurand/8d3238c22731a85eb890 and used it as a 
guide for the grammar I list below.  In particular, that example taught me 
that L0 rules permit alternative productions and also allow sequences.  The 
portion of my grammar in ALL CAPS was derived from that example.

But I get this "No lexeme found" error when the parser reaches the first 
dquote, and I cannot figure out why!

Setting trace_terminals option
Setting trace_values option
Discarded lexeme L1c1: whitespace
Accepted lexeme L2c1-9 e1: 'sublevel:'; value="sublevel:"
Accepted lexeme L2c1-9 e1: 'sublevel:'; value="sublevel:"
Accepted lexeme L2c10-16 e2: SUBLEVELOPTIONS; value=" -only "
****** FAILED TO PARSE ******
MSG:
Error in SLIF parse: No lexeme found at line 2, column 17
* String before error: \nsublevel: -only\s
* The error was at line 2, column 17, and at character 0x0022 '"', ...
* here: "-R[SUNLF]{0,1}\\d+\\s" -testargsmore -foo\n{\n  <
Marpa::R2 exception at ./marpa_bnf_1.pl line 31.


I would be grateful for any insights.  

I intended my sample input to be interpreted, at some depth of 
productions, as 

*sublevel:* *SublevelOptions*
*{*
*  NamedBlockList*
*}*


and I thought that the SublevelOptions

 -only "-R[SUNLF]{0,1}\d+\s" -testargsmore -foo

would decompose into 

STRING_UNQUOTED = ( -only )
STRING_DQUOTED = ("-R[SUNLF]{0,1}\d+\s")
STRING_UNQUOTED = ( -testargsmore -foo)


and I fail to see why it does not.  Instead, Marpa tells me it doesn't 
know what to do when it sees that dquote.  
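
The expected decomposition can be reproduced outside Marpa as a sanity check 
on the character classes.  The sketch below transcribes the STRING_UNQUOTED, 
STRING_SQUOTED and STRING_DQUOTED rules from the grammar into Python regular 
expressions; the function name decompose is hypothetical, and Python's re 
module is only a stand-in, since regex matching is not the same thing as 
Marpa's L0 lexing and says nothing about why the SLIF parse fails.

```python
import re

# Transcriptions of the L0 rules (an approximation, not Marpa semantics).
# CHAR_UNQUOTED: [^"'\}\{;\\\n] or ES, where ES is a backslash plus [\\'"\{\};]
UNQUOTED = r"""(?:[^"'{};\\\n]|\\[\\'"{};])+"""
# CHAR_INSIDE_SQUOTES: [^'\\] or an escaped single quote
SQUOTED = r"""'(?:[^'\\]|\\')*'"""
# CHAR_INSIDE_DQUOTES: [^"\\\n] or a backslash followed by [^#]
DQUOTED = r'''"(?:[^"\\\n]|\\[^#])*"'''

TOKEN = re.compile(
    f"(?P<STRING_UNQUOTED>{UNQUOTED})"
    f"|(?P<STRING_SQUOTED>{SQUOTED})"
    f"|(?P<STRING_DQUOTED>{DQUOTED})"
)


def decompose(options):
    """Split an options string into (rule name, matched text) pairs."""
    pieces = []
    pos = 0
    while pos < len(options):
        match = TOKEN.match(options, pos)
        if match is None:
            raise ValueError(f"no string rule matches at offset {pos}")
        pieces.append((match.lastgroup, match.group()))
        pos = match.end()
    return pieces


options = r' -only "-R[SUNLF]{0,1}\d+\s" -testargsmore -foo'
for name, text in decompose(options):
    print(f"{name} = ({text})")
```

Under these regex transcriptions the options text splits into an unquoted 
piece, a double-quoted piece, and a second unquoted piece, which is the 
decomposition listed above.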

Below is my full grammar.  

:default ::= action => [name, start, length, values]
lexeme default = latm => 1

File ::= BodyStatements
File ::=

BodyStatements ::= BodyStatement+

BodyStatement ::=
    Sublevel
  | SingleTest


Sublevel ::= ('sublevel:') SublevelOptionsMaybe ('{') BodyStatements ('}')
Sublevel ::= ('sublevel:') SublevelOptionsMaybe ('{') ('}')
SublevelOptionsMaybe ::= SublevelOptions
SublevelOptionsMaybe ::=

SublevelOptions      ::=  SUBLEVELOPTIONS

SUBLEVELOPTIONS             ~ SUBLEVELOPTIONS_STRING+

SUBLEVELOPTIONS_STRING      ~ STRING_UNQUOTED
                            | STRING_SQUOTED
                            | STRING_DQUOTED

STRING_UNQUOTED             ~ CHAR_UNQUOTED+
CHAR_UNQUOTED               ~ [^"'\}\{;\\\n]
CHAR_UNQUOTED               ~ ES

STRING_SQUOTED              ~ SQUOTE STRING_INSIDE_SQUOTES SQUOTE
STRING_INSIDE_SQUOTES       ~ CHAR_INSIDE_SQUOTES*
CHAR_INSIDE_SQUOTES         ~ [^'\\]
CHAR_INSIDE_SQUOTES         ~ [\\] [']
SQUOTE                      ~ [']

STRING_DQUOTED              ~ DQUOTE STRING_INSIDE_DQUOTES DQUOTE
STRING_INSIDE_DQUOTES       ~ CHAR_INSIDE_DQUOTES*
CHAR_INSIDE_DQUOTES         ~ [^"\\\n]
CHAR_INSIDE_DQUOTES         ~ [\\] [^#]
DQUOTE                      ~ ["]

ES                          ~ [\\] [\\'"\{\};]

NamedBlockList ::= NamedBlock+
NamedBlock ::= ArgTag ('{') ArgBodyMaybe ('}')
ArgBodyMaybe ::= ArgBody
ArgBodyMaybe ::=
ArgBody ~ [^\{\}]+
ArgTag ~ [\w]+

SingleTest ::=
    SingleSimpleTest

SingleSimpleTest ::= ('<') NamedBlockList ('>')

# whitespace
:discard ~ whitespace
whitespace ~ [\s]+


 

