I want to parse some Wikipedia pages.
Wikipedia template data looks like this:
{{my template|arg one|arg two|keyword=value}}
In a template definition, you can use variable expansion, like this:
{{{1|default for arg one}}}
I defined my lexer to grab runs of '{' and '}' and return different tokens
depending on the length of the run.
My problem is that I'm hitting cases where a template's name is itself a
variable expansion, resulting in: {{{{{keyword}}}|arg one}}
Those five braces in a row are problematic. My first thought is to return
two tokens when I see them, a two-brace token followed by a three-brace
one, but I can't figure out how to make one lexer rule emit two tokens. My
second thought is to define parser rules that start with a five-brace token,
but that's not so easy, either. Any suggestions on how to fix things?
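
To make the first idea concrete, here is a rough sketch of the splitting I
have in mind, written with plain `re` rather than PLY so it runs on its own.
The token names (OPEN2, OPEN3, CLOSE2, CLOSE3, PIPE, TEXT) and the helpers
`split_run` and `tokenize` are made up for illustration. It assumes that a
run of five opening braces always means "template start, then variable
start", and that a run of five closing braces is the mirror image. That
assumption may not hold for every page.

```python
import re

# One alternative per token class; brace runs are grabbed whole, as in my lexer.
TOKEN_RE = re.compile(
    r"(?P<OPEN>\{+)|(?P<CLOSE>\}+)|(?P<PIPE>\|)|(?P<TEXT>[^{}|]+)"
)

def split_run(kind, run):
    """Turn one brace run into a list of (token_type, value) pairs."""
    n = len(run)
    ch = run[0]
    if n in (2, 3):
        return [(f"{kind}{n}", run)]
    if n == 5:
        # Assumption: "{{{{{" is "{{" then "{{{"; "}}}}}" is "}}}" then "}}".
        sizes = (2, 3) if kind == "OPEN" else (3, 2)
        return [(f"{kind}{s}", ch * s) for s in sizes]
    # Any other length is passed through untouched for the caller to handle.
    return [(f"{kind}_RUN", run)]

def tokenize(text):
    """Tokenize wikitext, splitting five-brace runs into two tokens."""
    tokens = []
    for m in TOKEN_RE.finditer(text):
        kind, value = m.lastgroup, m.group()
        if kind in ("OPEN", "CLOSE"):
            tokens.extend(split_run(kind, value))
        else:
            tokens.append((kind, value))
    return tokens

for tok in tokenize("{{{{{keyword}}}|arg one}}"):
    print(tok)
```

What I don't know is the right way to get the same effect inside PLY. My
guess is a small wrapper object whose token() method queues the extra token
and hands it out on the next call, passed to the parser in place of the real
lexer, but I haven't gotten that to work yet.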
Thanks.