Re: A grammar for quoted strings with escaped chars
An update on MarpaX::Demo::SampleGrammars. I haven't started it yet, but after huge progress on the original problem over the last couple of days, I may well do it on Sunday. I have at least 5 demos ready. I'll create dirs (in the distro) such as data/nested.strings/, data/quoted.strings/, data/numbers/ or whatever, and have a little script which takes a 'name' parameter such as 'nested.strings'. It'll look in the dir for one *.bnf file and N *.dat files, load the BNF, and run each *.dat independently.

-- 
You received this message because you are subscribed to the Google Groups "marpa parser" group. To unsubscribe from this group and stop receiving emails from it, send an email to marpa-parser+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/d/optout.
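A rough sketch of how the runner script described in this update might look. The function names (find_demo_files, slurp, run_demo) are made up for illustration and are not from any released module. The parser itself is passed in as a code ref, so the directory handling can be tried without Marpa installed; in the real script that code ref would presumably wrap Marpa::R2.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: find the single *.bnf file and all *.dat files
# under "$base_dir/$name", e.g. data/quoted.strings/.
sub find_demo_files {
    my ($base_dir, $name) = @_;
    my $dir = File::Spec->catdir($base_dir, $name);
    opendir(my $dh, $dir) or die "Cannot open $dir: $!";
    my @entries = sort readdir $dh;
    closedir $dh;

    my @bnf = grep { /\.bnf\z/ } @entries;
    my @dat = grep { /\.dat\z/ } @entries;
    die "Expected exactly one *.bnf file in $dir, found " . scalar(@bnf) . "\n"
        if @bnf != 1;

    return (
        File::Spec->catfile($dir, $bnf[0]),
        map { File::Spec->catfile($dir, $_) } @dat,
    );
}

# Read a whole file into a string.
sub slurp {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot read $path: $!";
    local $/;
    my $text = <$fh>;
    close $fh;
    return $text;
}

# Run every *.dat file independently against the one grammar.
# $parse is a caller-supplied code ref taking ($bnf_text, $input_text).
sub run_demo {
    my ($base_dir, $name, $parse) = @_;
    my ($bnf_file, @dat_files) = find_demo_files($base_dir, $name);
    my $bnf = slurp($bnf_file);
    my %result;
    for my $dat_file (@dat_files) {
        my $input = slurp($dat_file);
        my $value;
        # eval, so that one failing input does not stop the others.
        my $ok = eval { $value = $parse->($bnf, $input); 1 };
        $result{$dat_file} = $ok ? $value : "ERROR: $@";
    }
    return \%result;
}
```

Passing the parser as a code ref is only a convenience for testing; the script could just as well build the Marpa grammar once inside run_demo and reuse it for every *.dat file.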
Re: A grammar for quoted strings with escaped chars
I've had a play with your (very fancy) code, but I'm now working purely on an extension of the grammar this thread started with. And the more I play with Marpa the more astonished I am by its power.
Re: A grammar for quoted strings with escaped chars
Sounds good. Of course, the namespace declaration could be inside the include file.
Re: A grammar for quoted strings with escaped chars
Well, perhaps. BTW, I was thinking about the namespace more as an adverb for the :default pseudo-rule and/or lexeme default statements. Or, crazy as it is, wrapping grammars up and exposing them as Perl modules.

On Wed, Sep 24, 2014 at 9:19 AM, Ron Savage wrote:
> Not at first. Consider:
>
> V 1 or 2: :include file_name => /my/grammars/quoted.strings.bnf
>
> V 1 or 2: :include file_name => /my/grammars/quoted.strings.bnf namespace => xyz
Re: A grammar for quoted strings with escaped chars
Not at first. Consider:

V 1 or 2: :include file_name => /my/grammars/quoted.strings.bnf

V 1 or 2: :include file_name => /my/grammars/quoted.strings.bnf namespace => xyz
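For illustration only: until something like this exists in the SLIF itself, the two forms above could be handled by a pre-processor that rewrites the grammar source before Marpa sees it. Everything in this sketch is an assumption: the function names, the "xyz_" prefix convention, and the $load callback standing in for reading the file. The renaming is deliberately naive. It prefixes any bare word that the included text defines on the left of '::=' or '~', so it would also touch such a word inside a quoted literal, a character class or a comment. Includes are not expanded recursively.

```perl
use strict;
use warnings;

# Naive namespacing: prefix every symbol that the included text itself
# defines, i.e. every bare word found on the left of '::=' or '~'.
sub add_namespace {
    my ($text, $ns) = @_;
    my %defined = map { $_ => 1 } $text =~ /^\s*(\w+)\s*(?:::=|~)/mg;
    $text =~ s/\b(\w+)\b/$defined{$1} ? "${ns}_$1" : $1/ge;
    return $text;
}

# Expand ':include' lines in a grammar source.
# $load is a code ref mapping a file name to that file's text, so the
# sketch can be exercised without touching the file system.
sub expand_includes {
    my ($source, $load) = @_;
    my @out;
    for my $line (split /\n/, $source) {
        if ($line =~ /^\s*:include\s+file_name\s*=>\s*(\S+)(?:\s+namespace\s*=>\s*(\w+))?\s*$/) {
            my ($file, $ns) = ($1, $2);
            my $text = $load->($file);
            $text = add_namespace($text, $ns) if defined $ns;
            push @out, split /\n/, $text;
        }
        else {
            push @out, $line;
        }
    }
    return join("\n", @out) . "\n";
}
```

With a namespace of xyz, the including grammar would then refer to the start symbol of the included fragment as xyz_string_token. A robust version would have to parse the BNF properly rather than use a regex.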
Re: A grammar for quoted strings with escaped chars
Implementing ':include' would require namespace support, similar to XML namespaces, to avoid symbol clashes.

On Mon, Sep 22, 2014 at 10:12 AM, Ron Savage wrote:
> I've developed a grammar (with help from various people of course) for
> quoted strings: http://scsys.co.uk:8002/424926
>
> Requirements:
>
> o Strings must be quoted
> o Strings are either single or double quoted
> o The escape character is \
> o If the string is single quoted, internal single quotes must be escaped
> o If the string is double quoted, internal double quotes must be escaped
> o Any other character may be escaped
> o If a character is escaped, the escape character is preserved in the output
> o Empty strings are accepted
>
> ToDo: Make it work with utf8.
>
> Does anyone see problems, or other input strings which should be tested?
>
> Jeffrey: This is one of the plug-in grammars Jean-Damien and I talked
> about recently. Any chance you can implement:
>
> my $source = <<'END_OF_GRAMMAR';
> ...
> :include /my/grammars/quoted.strings.bnf
> ...
> END_OF_GRAMMAR
>
> to include a suitable[*] grammar in situ within a grammar declaration?
>
> [*] Obviously, here that just means the prefix:
>
> :default ::= action => [values]
>
> lexeme default = latm => 1 # Longest Acceptable Token Match.
>
> :start ::= string_token
>
> and the suffix:
>
> # Boilerplate.
>
> :discard ~ whitespace
> whitespace ~ [\s]+
>
> END_OF_GRAMMAR
>
> would not be present in the include file.
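On the question of which other input strings should be tested: one option is to cross-check the grammar against an independent reference. The sketch below encodes the requirements listed in the quoted message as a plain Perl regex. It is not Ron's grammar and does not use Marpa. The function name and the test table are invented here, and stripping the outer quotes from the result is an assumption, since the thread does not say what the grammar returns. It ignores the utf8 ToDo.

```perl
use strict;
use warnings;

# Reference check, written as a plain regex rather than a Marpa grammar.
# Returns the text between the quotes with escape characters preserved,
# or nothing (undef in scalar context) if the input is not a valid
# quoted string.
sub quoted_string_body {
    my ($input) = @_;
    return $1 if $input =~ /\A"((?:[^"\\]|\\.)*)"\z/s;
    return $1 if $input =~ /\A'((?:[^'\\]|\\.)*)'\z/s;
    return;
}

# Candidate test inputs: [ input, 1 if it should be accepted ].
my @cases = (
    [ q{""},        1 ],   # empty double-quoted string
    [ q{''},        1 ],   # empty single-quoted string
    [ q{"a 'b' c"}, 1 ],   # the other kind of quote needs no escape
    [ q{'a \' b'},  1 ],   # escaped single quote inside single quotes
    [ q{"a \" b"},  1 ],   # escaped double quote inside double quotes
    [ q{"a \x b"},  1 ],   # any other character may be escaped
    [ q{abc},       0 ],   # not quoted at all
    [ q{"a " b"},   0 ],   # unescaped internal double quote
    [ q{'a ' b'},   0 ],   # unescaped internal single quote
    [ q{"abc\"},    0 ],   # closing quote is itself escaped
);

for my $case (@cases) {
    my ($input, $expect_ok) = @$case;
    my $ok = defined quoted_string_body($input) ? 1 : 0;
    printf "%-12s %s\n", $input, $ok == $expect_ok ? 'as expected' : 'MISMATCH';
}
```

The same table can be fed to the Marpa grammar; any input where the two disagree is worth a closer look.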
Re: A grammar for quoted strings with escaped chars
Not sure I got it right from the beginning, but my attempt, using a "parameterized" subset of the C grammar, is at https://gist.github.com/jddurand/8d3238c22731a85eb890

JD.
Re: A grammar for quoted strings with escaped chars
OK.
Re: A grammar for quoted strings with escaped chars
On Tue, Sep 23, 2014 at 2:14 AM, Ron Savage wrote:
> Thanx for the link.
>
> 2 of those 3 samples (the 2nd & 3rd) produce ambiguous parses. Is that
> what you find too?

Yes, the code warns about it. I was actually planning to deal with it as part of my current work on ASF-based disambiguation, so the code will be more usable once I've finished. For now it can serve just as an idea/illustration.
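For anyone wanting to check their own grammars for this, one simple way to detect an ambiguous parse is to count the parse trees the recognizer returns. The toy grammar below is made up for the purpose and is not one of the three samples discussed; count_parses is likewise a hypothetical helper. It assumes Marpa::R2 is installed and relies on value() returning undef once the parse trees are exhausted.

```perl
use strict;
use warnings;
use Marpa::R2;

# A deliberately ambiguous toy grammar (not one of the samples above):
# 'aaa' can be split as (a)(aa) or as (aa)(a).
my $dsl = <<'END_OF_GRAMMAR';
:default ::= action => [values]
:start ::= pair
pair ::= item item
item ::= a | a a
a ~ 'a'
END_OF_GRAMMAR

# Count the parse trees for $input under the grammar in $dsl.
sub count_parses {
    my ($dsl, $input) = @_;
    my $grammar = Marpa::R2::Scanless::G->new({ source => \$dsl });
    my $recce   = Marpa::R2::Scanless::R->new({ grammar => $grammar });
    $recce->read(\$input);
    my $count = 0;
    # value() yields one reference per parse tree, then undef.
    $count++ while defined $recce->value();
    return $count;
}

my $n = count_parses($dsl, 'aaa');
print "ambiguous: $n parses\n" if $n > 1;
```

Counting every tree is fine for small inputs like these, but the number of parses can grow very quickly, which is presumably part of the motivation for the ASF-based approach mentioned above.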
Re: A grammar for quoted strings with escaped chars
I think I'll release such samples (I have encountered a few) as MarpaX::Demo::SampleGrammars. It'll be basically a dummy module with the good stuff in scripts/*.pl. I've been thinking about a script collection for many months now. Any other suggestions (module name, code to include)?
Re: A grammar for quoted strings with escaped chars
Thanx for the link.

2 of those 3 samples (the 2nd & 3rd) produce ambiguous parses. Is that what you find too?
Re: A grammar for quoted strings with escaped chars
Thanx for the code. I'll turn it into a program for studying. As per my reply to rns, the Graphviz DOT language is the ultimate target, and my code for that already uses various events. That's not to say events are mandatory. Indeed, the original version of the code only ever had events because I could not write the grammar to handle the strings [1] [2], so I used events to do the string parsing manually. Switching to actions is always a possibility. As always, I'm not limited by your skills/knowledge of these issues, but I am limited by my lack thereof.

[1] Look for the definition of ID: http://www.graphviz.org/content/dot-language

[2] My version: https://metacpan.org/pod/MarpaX::Demo::StringParser#What-is-the-grammar-parsed-by-this-module
Re: A grammar for quoted strings with escaped chars
Thanx for the link. I did not consider that case, since I'm really interested in the Graphviz DOT file format, where quotes, if any, must be double quotes, and internal quotes must be escaped. However, I will examine the code you link to, since every such example is interesting.
Re: A grammar for quoted strings with escaped chars
I've posted some things previously on this topic - but in short, you don't really need to use events to do this. It's possible to do it in a semi-straightforward fashion without a lot of jumping through hoops (just a bunch of rules).

Here's some grammar fragments demonstrating what I'm talking about (this handles single quoted, double quoted, and "quote-like" parsing (e.g. q%%, q||), while efficiently handling simple quoted strings that have no escape sequences but falling back to an escape-aware mode when they're present).

my ($dsl, $grammar) = <<'===';
:default ::= action => [values]
lexeme default = latm => 1

[...]

# Normal, bare, unquoted
value          ::= value_n
value_n        ::= valword_n

# Quoted but not escaped                                   # reassemble action
value          ::= value_qd                                action => val_qd
                 | value_qs                                action => val_qs
                 | value_ql0                               action => val_ql0
                 | value_ql1                               action => val_ql1
value_qd       ::= valword_qd
value_qs       ::= valword_qs
value_ql0      ::= valword_ql0
value_ql1      ::= valword_ql1

# Quoted and escaped                                       # reassemble action
value          ::= (g_quote_d)   value_eqd  (g_quote_d)    action => val_eqd
                 | (g_quote_s)   value_eqs  (g_quote_s)    action => val_eqs
                 | (g_quote_ls0) value_eql0 (g_quote_le0)  action => val_eql0
                 | (g_quote_ls1) value_eql1 (g_quote_le1)  action => val_eql1
value_eqd      ::= valword_eqd*
value_eqs      ::= valword_eqs*
value_eql0     ::= valword_eql0*
value_eql1     ::= valword_eql1*

# Normal, bare, unquoted
valword_n      ~ valword_n_c
valword_n_c    ~ [\w_\@:.\/\*-]+

# Quoted but not escaped
valword_qd     ~ quote_d   valword_qd_c  quote_d
valword_qs     ~ quote_s   valword_qs_c  quote_s
valword_ql0    ~ quote_ls0 valword_ql0_c quote_le0
valword_ql1    ~ quote_ls1 valword_ql1_c quote_le1
valword_qd_c   ~ [^"\\]*
valword_qs_c   ~ [^'\\]*
valword_ql0_c  ~ [^|\\]*
valword_ql1_c  ~ [^%\\]*

# Quoted and escaped
valword_eqd    ~ valword_eqd_c
valword_eqs    ~ valword_eqs_c
valword_eql0   ~ valword_eql0_c
valword_eql1   ~ valword_eql1_c
valword_eqd_c  ~ [^"] | whitespace | escape ["]
valword_eqs_c  ~ [^'] | whitespace | escape [']
valword_eql0_c ~ [^|] | whitespace | escape [|]
valword_eql1_c ~ [^%] | whitespace | escape [%]

# These do translation, but cannot be enabled yet as the expectation is no translation.
# valword_eqd ~ [^\a\b\e\f\r\n\t\\"] | whitespace | escape valword_esc
# valword_eqs ~ [^\a\b\e\f\r\n\t\\'] | whitespace | escape valword_esc
# valword_esc ~ [abefrnt\\"']

# The same base lexemes cannot be directly used by both the lexer and grammar *at the same time*.
# Work around it by providing wrapper lexeme rules for the grammar which end up at the same terminal.
g_quote_d      ~ quote_d
g_quote_s      ~ quote_s
g_quote_ls0    ~ quote_ls0
g_quote_le0    ~ quote_le0
g_quote_ls1    ~ quote_ls1
g_quote_le1    ~ quote_le1

quote_d        ~ ["]
quote_s        ~ [']
quote_ls0      ~ 'q|'
quote_le0      ~ '|'
quote_ls1      ~ 'q%'
quote_le1      ~ '%'
escape         ~ '\'

:discard       ~ whitespace
whitespace     ~ [\s]+
===

# Deescaping table
my $xtab = {
    'eqd'  => { q(\") => qq(") },
    'eqs'  => { q(\') => qq(') },
    'eql0' => { q(\|) => qq(|) },
    'eql1' => { q(\%) => qq(%) },
    #
    # Not presently used.
    # 'eqx' => {
    #     q(\a) => qq(\a),
    #     q(\b) => qq(\b),
    #     q(\e) => qq(\e),
    #     q(\f) => qq(\f),
    #     q(\n) => qq(\n),
    #     q(\r) => qq(\r),
    #     q(\t) => qq(\t),
    #     q(\") => qq("),
    #     q(\') => qq('),
    #     q() => qq(\\),
    # },
};

# Deescaping functions
sub val_eqd  { return [ join '', map +($xtab->{'eqd'}{$_}  || $_), @{$_[1]} ] }
sub val_eqs  { return [ join '', map +($xtab->{'eqs'}{$_}  || $_), @{$_[1]} ] }
sub val_eql0 { return [ join '', map +($xtab->{'eql0'}{$_} || $_), @{$_[1]} ] }
sub val_eql1 { return [ join '', map +($xtab->{'eql1'}{$_} || $_), @{$_[1]} ] }
#sub val_eqx { return [ join '', map +($xtab->{'eqx'}{$_}  || $_), @{$_[1]} ] }

# Dequoting functions
sub val_qd  { return [ substr($_[1]->[0], 1, -1) ] }
sub val_qs  { return [ substr($_[1]->[0], 1, -1) ] }
sub val_ql0 { return [ substr($_[1]->[0], 2, -1) ] }
sub val_ql1 { return [ substr($_[1]->[0], 2, -1) ] }

The "deescape anything back to it's original fo
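As a side note on the four val_eq* subs in the code above: they differ only in which entry of $xtab they consult, so they could be generated from the table. The following is a sketch of that idea, not the poster's code. make_deescaper is a hypothetical name, and the subs are installed under the same names as the originals so they could still be referred to by name in "action => ..." adverbs. As in the original, the action is assumed to receive an array ref of lexemes as its second argument.

```perl
use strict;
use warnings;

# Same de-escaping table as the active entries in the code above.
my $xtab = {
    eqd  => { q(\") => q(") },
    eqs  => { q(\') => q(') },
    eql0 => { q(\|) => q(|) },
    eql1 => { q(\%) => q(%) },
};

# Build the action for one quoting mode. The returned sub ignores its
# first argument (Marpa's per-parse object) and joins the lexemes in the
# second, replacing any lexeme that has an entry in the mode's table.
sub make_deescaper {
    my ($mode) = @_;
    my $map = $xtab->{$mode};
    return sub {
        my (undef, $lexemes) = @_;
        return [ join '', map { exists $map->{$_} ? $map->{$_} : $_ } @$lexemes ];
    };
}

# Install val_eqd, val_eqs, val_eql0 and val_eql1 into this package.
for my $mode (keys %$xtab) {
    no strict 'refs';
    *{"val_$mode"} = make_deescaper($mode);
}

# Quick check outside Marpa: three lexemes, the middle one an escaped quote.
print val_eqd(undef, [ 'a', q(\"), 'b' ])->[0], "\n";
```

Using exists rather than || also avoids a surprise if a table entry ever maps to '0' or the empty string.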
Re: A grammar for quoted strings with escaped chars
Would it make sense to add nested-quote parsing to that grammar? I once ran into a grammar that can parse them to a tree: https://gist.github.com/rns/23ce8639c4ceb87d70c9