Thank you very much for your response!

1) Multiline: The language does have a "_" line-joining character, but the 
grammar wouldn't have to support that - it could be done with a trivial 
preprocessor. Once joined, commands may not span multiple lines.

2) Command/variable upper-case: Commands are always upper case, but there are 
no case restrictions on variables.
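
For what it's worth, the preprocessor from point 1 could be sketched roughly 
as below. This is only an illustration in Python, and it assumes that "_" 
counts as a join only when it is the last non-blank character on a physical 
line; the real rule may differ. The name join_continuations is made up for 
this example.

```python
def join_continuations(source: str, marker: str = "_") -> str:
    """Join physical lines that end with the continuation marker.

    Assumption: the marker only counts when it is the last non-blank
    character on the line. The language's actual rule may be stricter.
    """
    joined = []
    pending = ""
    for line in source.splitlines():
        stripped = line.rstrip()
        if stripped.endswith(marker):
            # Drop the marker and keep accumulating the same logical line.
            pending += stripped[:-len(marker)]
        else:
            joined.append(pending + line)
            pending = ""
    if pending:
        # Dangling continuation at end of input: keep what was collected.
        joined.append(pending)
    return "\n".join(joined)
```

After this pass every statement sits on one physical line, so the later 
stages never have to know that "_" exists.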


So it sounds like there isn't a straightforward grammar or option 
tweak. That's ok. The language has fancy expressions (algebraic expressions, 
function calls, strings, comments, and arrays), but its statement structure is 
extremely simple. The terminators (newline and semicolon) are not allowed 
anywhere except as terminators (no escapes, not in strings, not in comments). 
So, as a practical solution, I should be able to dumb-split a program on 
terminators, look at the first characters of each statement, strip off the 
command or variable assignment part, and parse the rest as an expression, which 
follows more reasonable rules that LATM will like.
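
A rough sketch of that split-and-peel step, in Python purely for 
illustration. The command names in COMMANDS are hypothetical placeholders, and 
the assignment pattern (an identifier followed by a single "=") is an 
assumption about the syntax rather than the language's real rule. The 
expression text that comes back is what would then be handed to the Marpa 
expression grammar.

```python
import re

# Hypothetical command names, for illustration only.
COMMANDS = ("PRINT", "PA")

# Assumed assignment shape: identifier, then a single "=" (not "==").
ASSIGNMENT = re.compile(r"([A-Za-z_]\w*)\s*=(?!=)\s*(.*)", re.DOTALL)


def split_statements(program: str) -> list:
    """Dumb-split on newline and semicolon; drop empty statements."""
    return [s.strip() for s in re.split(r"[;\n]", program) if s.strip()]


def classify(statement: str) -> tuple:
    """Return (kind, head, expression_text) for one statement."""
    m = ASSIGNMENT.fullmatch(statement)
    if m:
        return ("assign", m.group(1), m.group(2))
    # Try longer names first so a short command cannot shadow a longer one.
    for name in sorted(COMMANDS, key=len, reverse=True):
        if statement.startswith(name):
            return ("command", name, statement[len(name):].strip())
    raise ValueError(f"unrecognized statement: {statement!r}")
```

With these placeholder commands, "PAx" splits into the command "PA" and the 
expression "x", which is the case that LATM alone could not separate. Checking 
for an assignment before checking for a command is a design choice in this 
sketch, not something the language is known to require.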

So I guess this falls under the "handled in easier faster ways" approach, which 
should have been obvious but which I failed to think of.

Thank you for your time, and for a great library!

  - Dean


On 12/2/20 4:28 PM, Jeffrey Kegler wrote:
> I'll first describe your immediate problem, then ask a couple Q's.
> 
> The problem: Lexing is LATM -- *Longest* Acceptable Token Matching.  The 
> lexeme priority is a tie breaker, used when tokens are the same length.  When 
> your grammar fails, "PAx" is your longest token, and the only choice at 
> length 3.  "PA" is only 2 chars long, and lexemes of different length are not 
> compared for priority.
> 
> (Btw the reason for this is, as implemented, lexeme priorities can be (and 
> are) tested in a few machine instructions.  If Marpa needed to look at 
> earlier possibilities, the logic gets vastly more complex, efficiency goes 
> out the window, and you get into the territory where the grammar can often be 
> handled in easier faster ways.)
> 
> Now the questions:
> 
> 1.) I notice statements cannot be multiline.  Is that the intent going 
> forward?
> 
> 2.) In the example, commands always begin with a capital letter, variables 
> never do.  Will that continue to be the case?  (If so, it points to an easy, 
> fast solution.)
> 
> Possible solutions, depending, include finding something that distinguishes 
> commands from variables in the lexer; custom lexers; using events to guide 
> custom lexing; and character-by-character lexing, whereby you handle your own 
> whitespace.
> 
