I'll first describe your immediate problem, then ask a couple of questions.

The problem: Lexing is LATM -- *Longest* Acceptable Token Matching. Lexeme priority is only a tie breaker, used when tokens are the same length. When your grammar fails, "PAx" is your longest token, and the only choice at length 3. "PA" is only 2 characters long, and lexemes of different lengths are not compared for priority. So "PAx" is read as a variable, the parser then wants an "=", and what it finds at column 4 is a newline, which is why you get "No lexeme found".
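
To make the rule concrete, here is a toy sketch in plain Perl. It is not Marpa's implementation and it ignores the "acceptable" part of LATM (it does not ask the parser which lexemes are currently expected). The `@lexemes` table, its priority numbers, and the `longest_token` helper are made up for illustration; the only point is the order of the two tests: length first, priority second.

```perl
use strict;
use warnings;

# Hypothetical lexeme table, loosely modeled on the grammar in the question.
# The priorities are illustrative assumptions, not values Marpa assigns.
my @lexemes = (
    { name => 'command',  re => qr/PA|PR/, priority => 2 },
    { name => 'variable', re => qr/\w+/,   priority => 0 },
);

# Return (name, text) of the winning lexeme at $pos, or an empty list.
sub longest_token {
    my ($input, $pos) = @_;
    my $rest     = substr $input, $pos;
    my $best_len = 0;
    my @best;    # every lexeme that matched at the greatest length so far

    for my $lx (@lexemes) {
        my $re = $lx->{re};
        next unless $rest =~ /\A($re)/;
        my $len = length $1;
        if ($len > $best_len) {
            # A strictly longer match discards all shorter candidates,
            # whatever their priority was.
            @best     = ($lx);
            $best_len = $len;
        }
        elsif ($len == $best_len) {
            push @best, $lx;
        }
    }
    return unless @best;

    # Priority is consulted only among candidates of equal length.
    my ($winner) = sort { $b->{priority} <=> $a->{priority} } @best;
    return ($winner->{name}, substr($rest, 0, $best_len));
}

for my $text ('PA x', 'PAx') {
    my ($name, $match) = longest_token($text, 0);
    print "'$text' -> $name '$match'\n";
}
```

With "PA x" both lexemes match "PA" at length 2, so the higher priority picks `command`. With "PAx" the variable match is 3 characters long, so the 2-character command is dropped before priority is ever looked at.
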
(Btw, the reason for this is that, as implemented, lexeme priorities can be (and are) tested in a few machine instructions. If Marpa needed to look at earlier possibilities, the logic would get vastly more complex, efficiency would go out the window, and you would get into territory where the grammar can often be handled in easier, faster ways.)

Now the questions:

1.) I notice statements cannot be multiline. Is that the intent going forward?

2.) In the example, commands always begin with a capital letter, variables never do. Will that continue to be the case? (If so, it points to an easy, fast solution.)

Possible solutions, depending on your answers, include finding something that distinguishes commands from variables in the lexer; custom lexers; using events to guide custom lexing; and character-by-character lexing, whereby you handle your own whitespace.

On Wed, Dec 2, 2020 at 3:27 PM Dean S <[email protected]> wrote:

> Hello, I'm having trouble figuring out how to express my grammar and was
> hoping someone could help. I've tried rewriting various ways and looking
> for some options that might change behavior, but I haven't been able to
> figure it out.
>
> I have a language with variable assignment and simple commands and doesn't
> care about whitespace. So,
>
>     PAx=42 # variable assignment to "PAx"
>
>     PAx # PA command with argument x
>
> I have a grammar, but it insists on spaces after command names. I've tried
> hiding assignment behind a prioritized rule and tried setting the command
> lexeme priority, but I always get parse errors when parsing "PAx". I have a
> simplified grammar which exhibits the issue,
>
>     #!/usr/bin/perl
>     use warnings; use strict; use 5.028;
>
>     use Marpa::R2 8.000000;
>     use Data::Dumper;
>
>     my $grammar = 'Marpa::R2::Scanless::G'->new({ source => \(<<'RULES') });
>     :default ::= action => [name,values]
>     lexeme default = latm => 1
>
>     :start ::= Program
>
>     Program ::= Statement+
>
>     Statement ::= Command terminator
>                || Assign terminator
>
>     Command ::= command arg
>
>     Assign ::= variable equal value
>
>     # Doesn't help: :lexeme ~ command priority => 2
>     command ~ 'PA' | 'PR'
>     arg ~ [\w]+
>     equal ~ '='
>     value ~ [0-9]+
>     variable ~ [\w]+
>     terminator ~ [;\n]
>
>     :discard ~ whitespace
>     whitespace ~ [ \t]+
>     RULES
>
>     # This parses correctly, line 2 is a command, line 3 is assignment.
>     my $ok = <<TEXT;
>     x=23
>     PA x
>     PAx=42
>     TEXT
>
>     say Dumper($grammar->parse(\$ok));
>
>     # I want to generate same tree as above,
>     # but my grammar wants line 2 to be an assignment.
>     my $bad = <<TEXT;
>     x=23
>     PAx
>     PAx=42
>     TEXT
>
>     # Error in SLIF parse: No lexeme found at line 2, column 4
>     say Dumper($grammar->parse(\$bad));
>
> Is there some trick to this? Did I miss something in the documentation?
>
> Any suggestions?
>
> Thanks!
> - Dean
