On Tue, Nov 25, 2014 at 5:17 PM, Dave Townsend
<[email protected]> wrote:
> I'm having trouble getting Parslet 1.6.2 to parse nested tags dynamically. 
> Here's a baby version of my code:
>
> -------
> require 'parslet'
>
> class SimpleNestParser < Parslet::Parser
>   rule(:id) { match('[a-z]') }
>
>   rule(:tag) {
>     str('<') >> id.capture(:tagname) >> str('>') >>

Here's your problem:

>     tag.maybe >>
>     str('</') >> dynamic { |src,ctx| str(ctx.captures[:tagname]) } >> str('>')
>   }
>
>   root(:tag)
> end
>
> SimpleNestParser.new.parse "<a><b></b></a>"
> -------
>
> I would expect the parse to succeed, but it fails:
>
> -------
> /Library/Ruby/Gems/2.0.0/gems/parslet-1.6.2/lib/parslet/cause.rb:63:in `raise': 
> Failed to match sequence ('<' (:tagname = ID) '>' TAG? '</' dynamic { ... } '>') 
> at line 1 char 13. (Parslet::ParseFailed)
> -------
>
> If I change the dynamic matcher to "match('[ab]')" then the parse does 
> succeed, which makes it seem that I'm not trying to do anything that Parslet 
> can't handle.  (But it's important that I match blocks exactly, so I can't 
> use this as a permanent solution.)
>
> By adding "$stderr.puts ctx.captures[:tagname]" within the dynamic block, I 
> can see that the block is executed four times, and the value is always "b", 
> which seems suspicious. I would have thought to have seen at least one "a" in 
> there.
>
> Am I misunderstanding how to use Parslet in general or the dynamic block in 
> particular?

Just the latter! If you want to parse nested tags like this, you need
to give each nested level its own scope, otherwise :tagname will be
overwritten for each recursion.
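
To make the overwriting concrete, here is a hand-rolled sketch in plain
Ruby. It is not how Parslet is implemented; TinyTagMatcher and its
"scoped:" flag are names I made up for illustration. It matches the same
grammar with one shared captures hash, and optionally saves and restores
that hash around the nested tag, which is roughly the job a scope does.

```ruby
# Illustration only: a tiny recursive-descent matcher, NOT Parslet internals.
# Grammar: tag := '<' id '>' tag? '</' same-id '>'
class TinyTagMatcher
  def initialize(input, scoped:)
    @input = input
    @pos = 0
    @scoped = scoped
    @captures = {} # plays the role of ctx.captures
  end

  # True if the whole input is exactly one (possibly nested) tag.
  def match?
    tag && @pos == @input.length
  end

  private

  # Consume the literal s at the current position, or return false.
  def lit(s)
    return false unless @input[@pos, s.length] == s
    @pos += s.length
    true
  end

  # Consume one lowercase letter and store it as :tagname.
  def capture_id
    ch = @input[@pos]
    return false unless ch && ch.match?(/[a-z]/)
    @captures[:tagname] = ch
    @pos += 1
    true
  end

  # Like tag.maybe: try a nested tag, succeed either way.
  # With scoped: true, the outer captures are restored afterwards.
  def nested
    saved = @captures.dup
    tag
    @captures = saved if @scoped
    true
  end

  def tag
    start = @pos
    ok = lit('<') && capture_id && lit('>') && nested &&
         lit('</') && lit(@captures[:tagname]) && lit('>')
    @pos = start unless ok # backtrack on failure
    ok
  end
end

puts TinyTagMatcher.new("<a><b></b></a>", scoped: false).match?
puts TinyTagMatcher.new("<a><b></b></a>", scoped: true).match?
```

Without the save/restore, the inner tag leaves :tagname set to "b", so
the outer closing tag is compared against "b" instead of "a".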

It works if you wrap the "tag.maybe" line (line 8 of your code) in a
scope block, like this: "scope { tag.maybe }"

You can read more about dynamic parsing and scopes at the bottom of
this tutorial: http://kschiess.github.io/parslet/parser.html

> Thanks for any guidance the list can provide on how to get my grammar working!

An aside to the list: I googled a bit to see if there were any good
HTML/XML parsers (or parsers for any other context-free grammar) I
could show as examples here. I found
https://github.com/kschiess/parslet/blob/master/example/simple_xml.rb
which claims to parse "simple" XML without any complexities, and
although that's vague, I was a bit surprised that it accepts input like
"<a><b></a></b>" or even just "<a></b>", and therefore couldn't be used
as an example to help with the above problem. That example could
probably be improved.

-- 
Tobias V. Langhoff
