On 09/06/11 00:20, Sam Denton wrote:
I'm wanting to parse some Wikipedia pages.
Wikipedia template data looks like this:  {{my template|arg one|arg 
two|keyword=value}}
In a template definition, you can use variable expansion, like this:  
{{{1|default for arg one}}}
I defined my lexer to grab runs of '{' and '}' and return different tokens 
depending on the length of the run.
My problem is, I'm hitting cases where a template's name is a variable 
expansion, resulting in:  {{{{{keyword}}}|arg one}}

If this is the only way they can be nested, you can use scanner states. Define an 'outside template' scanner state that matches only {{. On encountering {{, switch to an 'inside template' scanner state that matches only {{{. On encountering }}, switch back to the 'outside template' scanner state.
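
To make the state switching concrete, here is a minimal sketch using only Python's re module rather than PLY itself. The state names, token names and the tokenize() helper are all made up for illustration. PLY offers the same mechanism through a `states` declaration and `t.lexer.begin()`, but this sketch does not use it. It assumes templates are never nested inside other templates, and that a variable expansion only appears inside a template.

```python
import re

# One rule list per scanner state. Order matters: inside a template,
# the three-brace patterns must be tried before the two-brace one.
RULES = {
    "outside": [
        ("TEMPLATE_OPEN", re.compile(r"\{\{")),
        ("TEXT", re.compile(r"[^{]+|\{")),
    ],
    "inside": [
        ("VAR_OPEN", re.compile(r"\{\{\{")),
        ("VAR_CLOSE", re.compile(r"\}\}\}")),
        ("TEMPLATE_CLOSE", re.compile(r"\}\}")),
        ("PIPE", re.compile(r"\|")),
        ("TEXT", re.compile(r"[^{}|]+|[{}]")),
    ],
}


def tokenize(text):
    """Return (token_name, lexeme) pairs, switching state on {{ and }}."""
    state, pos, tokens = "outside", 0, []
    while pos < len(text):
        for name, pattern in RULES[state]:
            match = pattern.match(text, pos)
            if match:
                tokens.append((name, match.group()))
                pos = match.end()
                if name == "TEMPLATE_OPEN":
                    state = "inside"    # now {{{ is recognised
                elif name == "TEMPLATE_CLOSE":
                    state = "outside"   # back to matching {{ only
                break
        else:
            raise ValueError("no rule matches at position %d" % pos)
    return tokens


for token in tokenize("{{{{{keyword}}}|arg one}}"):
    print(token)
```

With the problem input, the leading run of five braces is split into a {{ token followed by a {{{ token, because the 'outside' state never sees the three-brace rule.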

An alternative solution would be to use a scannerless parser, which works directly on characters instead of on tokens. However, I am not sure whether any exist for Python.
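
For what it's worth, the scannerless idea can be sketched by hand as a small backtracking recursive-descent parser. This is not an existing library. The function names and the (kind, items) node shape are invented for the example, and it does not split arguments on '|'. It tries a {{{ variable first and falls back to a {{ template if no matching }}} can be found.

```python
def parse_items(text, pos, closer):
    """Parse text and nested constructs up to `closer`.

    Returns (items, position after closer), or None if closer never appears.
    """
    items = []
    while pos < len(text):
        if text.startswith(closer, pos):
            return items, pos + len(closer)
        result = parse_braced(text, pos)
        if result is None:
            # Ordinary character: append it to the current run of text.
            if items and isinstance(items[-1], str):
                items[-1] += text[pos]
            else:
                items.append(text[pos])
            pos += 1
        else:
            node, pos = result
            items.append(node)
    return None


def parse_braced(text, pos):
    """Try a variable, then a template, starting at pos; None if neither fits."""
    for opener, closer, kind in (("{{{", "}}}", "var"), ("{{", "}}", "template")):
        if text.startswith(opener, pos):
            result = parse_items(text, pos + len(opener), closer)
            if result is not None:
                items, end = result
                return (kind, items), end
            # Otherwise backtrack and try the next, shorter opener.
    return None


print(parse_braced("{{{{{keyword}}}|arg one}}", 0))
```

Because the parser decides how to split a brace run from the surrounding structure rather than from the run's length, it copes with more nesting patterns than the two-state scanner, at the cost of backtracking that can get slow on badly unbalanced input.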

Sincerely,
Albert
