I've written a couple of small parsers before, but this one is starting to
get annoying as I try to figure out the "right" way to parse it.
You guys are smart; any suggestions on how to approach this?
Conceptually, it's pretty simple. Django parses the text like this: take
a text string and split it on these delimiters: "{{" and "}}" for
filters, "{%" and "%}" for tags, and "{#" and "#}" for comments. The text
outside of the delimiters is just plain text, and anything is allowed
there except the delimiters themselves. The text inside each pair is
parsed with a different regex-based scheme depending on its type.
Comments throw out their contained text. Filters allow for:
{{ foo|bar }}
{{ foo|bar:"a b c" }}
{{ foo|bar:"a b c"|baz }}
and so on. The first argument must match
"([a-zA-Z0-9_]+\.)*[a-zA-Z0-9]+". Filter arguments are always in quotes
and preserve spaces. Otherwise, spaces are thrown away.
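To make the filter grammar concrete, here is a rough sketch in Python (picked only because Django is Python). The helper name parse_filter_expr is made up for illustration, and I'm assuming each filter takes at most one quoted argument and that quotes can't be escaped, since nothing above says otherwise.

```python
import re

# Identifier pattern copied verbatim from the description above.
IDENT = r'(?:[a-zA-Z0-9_]+\.)*[a-zA-Z0-9]+'
# Leading variable, with surrounding spaces thrown away.
HEAD = re.compile(r'\s*(' + IDENT + r')\s*')
# One filter: |name, optionally followed by :"quoted argument".
# Assumption: at most one argument per filter, no escaped quotes.
FILTER = re.compile(r'\|\s*(' + IDENT + r')(?:\s*:\s*"([^"]*)")?\s*')


def parse_filter_expr(body):
    """Parse the text found between "{{" and "}}" (hypothetical helper)."""
    m = HEAD.match(body)
    if m is None:
        raise ValueError('expected an identifier: %r' % (body,))
    var, pos, filters = m.group(1), m.end(), []
    while pos < len(body):
        m = FILTER.match(body, pos)
        if m is None:
            raise ValueError('bad filter at offset %d in %r' % (pos, body))
        # The argument is None when the filter has no quoted argument.
        filters.append((m.group(1), m.group(2)))
        pos = m.end()
    return var, filters
```

Calling parse_filter_expr(' foo|bar:"a b c"|baz ') gives the variable name plus a list of (filter, argument) pairs, with the spaces inside the quotes kept.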
Tags allow a little more:
{% foo %}
{% foo a b c %}
{% foo "a b c" d "e f" %}
where the quotes preserve spacing. Non-quoted arguments must match the
identifier pattern above.
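The tag grammar can be sketched the same way. Again this is only an illustration: parse_tag is a made-up name, and I'm assuming the tag name itself has to be a bare identifier and that quotes can't be escaped.

```python
import re

# Identifier pattern copied verbatim from the description above.
IDENT = r'(?:[a-zA-Z0-9_]+\.)*[a-zA-Z0-9]+'
# A token is either a double-quoted string or a bare identifier.
TOKEN = re.compile(r'\s*(?:"([^"]*)"|(' + IDENT + r'))')


def parse_tag(body):
    """Parse the text found between "{%" and "%}" (hypothetical helper)."""
    pos, end, tokens = 0, len(body.rstrip()), []
    while pos < end:
        m = TOKEN.match(body, pos)
        if m is None:
            raise ValueError('bad tag argument at offset %d in %r' % (pos, body))
        if m.group(1) is not None:
            tokens.append(('string', m.group(1)))  # spacing preserved
        else:
            tokens.append(('ident', m.group(2)))
        pos = m.end()
    # Assumption: the tag name must be a bare identifier, not a string.
    if not tokens or tokens[0][0] != 'ident':
        raise ValueError('expected a tag name in %r' % (body,))
    return tokens[0][1], [value for _, value in tokens[1:]]
```

With this, parse_tag(' foo "a b c" d "e f" ') returns the tag name and its arguments, with quoted arguments kept as single values.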
It'd be pretty simple to do this with multiple passes of regular
expressions, but I want to see if I can do it in one go with a lexer and
a parser. The problem I'm running into is that while it's quite easy to
parse the tags, filters, and comments, dealing with the text in between
them is a bit of a pain, and I can't figure out how to do it cleanly.
For instance, it'd be nice if the lexer automatically handled the
strings for tags and filters, but then a quoted string in the plain text
also comes back as a string token, so my top-level expression
nonterminal has to check for it and just dump it out again. The same
goes for pipes and identifiers.
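For comparison, the first pass of the multipass version might look roughly like this in Python. The name split_template and the chunk labels are made up, and I'm assuming a delimited section may span lines and that sections never nest.

```python
import re

# One alternative per delimiter pair; non-greedy so each match stops at the
# first closing delimiter. Assumption: sections may span lines (DOTALL) and
# never nest.
CHUNK = re.compile(r'(\{\{.*?\}\}|\{%.*?%\}|\{#.*?#\})', re.DOTALL)
KINDS = {'{{': 'filter', '{%': 'tag', '{#': 'comment'}


def split_template(source):
    """First pass: split into ('text' | 'filter' | 'tag', contents) chunks."""
    chunks = []
    # With a single capturing group, re.split alternates between plain text
    # (even indexes) and delimited sections (odd indexes).
    for i, piece in enumerate(CHUNK.split(source)):
        if i % 2 == 0:
            if piece:
                chunks.append(('text', piece))
        else:
            kind = KINDS[piece[:2]]
            if kind != 'comment':  # comments throw out their text
                chunks.append((kind, piece[2:-2]))
    return chunks
```

A second pass would then hand each 'filter' or 'tag' chunk to its own regex-based parser, which is exactly the two-stage structure I'd like to avoid.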
So what's the best way to do this? Do I need to maintain state between
the lexer and parser to switch what regexes the lexer should be using?
Just have a complicated text processing section? Or do I just need to do
two stages of lexing/parsing?
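To show what I mean by the first option, here is a rough Python sketch of a lexer that keeps its own state and switches regexes when it crosses a delimiter. The token names are made up, comments are dropped inside the lexer, and it isn't tied to any particular lexer or parser generator.

```python
import re

OPEN = re.compile(r'\{\{|\{%|\{#')
CLOSERS = {'{{': '}}', '{%': '%}', '{#': '#}'}
# Regexes used only while inside "{{ ... }}" or "{% ... %}".
INSIDE = re.compile(r'''
    (?P<WS>\s+)
  | (?P<STRING>"[^"]*")
  | (?P<IDENT>(?:[a-zA-Z0-9_]+\.)*[a-zA-Z0-9]+)
  | (?P<PIPE>\|)
  | (?P<COLON>:)
  | (?P<CLOSE>\}\}|%\})
''', re.VERBOSE)


def lex(source):
    """Yield (kind, text) tokens, switching modes at each delimiter."""
    pos = 0
    while pos < len(source):
        # Text mode: everything up to the next opening delimiter is TEXT.
        m = OPEN.search(source, pos)
        if m is None:
            yield ('TEXT', source[pos:])
            return
        if m.start() > pos:
            yield ('TEXT', source[pos:m.start()])
        opener, closer = m.group(), CLOSERS[m.group()]
        pos = m.end()
        if opener == '{#':
            # Comment mode: skip to the closing delimiter, emit nothing.
            end = source.find(closer, pos)
            if end < 0:
                raise ValueError('unterminated comment')
            pos = end + len(closer)
            continue
        yield ('OPEN', opener)
        # Inside mode: strings, identifiers, pipes and colons become tokens.
        while True:
            m = INSIDE.match(source, pos)
            if m is None:
                raise ValueError('unexpected input at offset %d' % pos)
            pos = m.end()
            kind, text = m.lastgroup, m.group()
            if kind == 'WS':
                continue  # spaces outside quotes are thrown away
            if kind == 'CLOSE':
                if text != closer:
                    raise ValueError('mismatched closing delimiter %r' % text)
                yield ('CLOSE', text)
                break
            yield (kind, text[1:-1] if kind == 'STRING' else text)
```

The point of the sketch is that quotes, pipes, and identifiers in the plain text never become tokens, so the grammar only ever sees TEXT between an OPEN and a CLOSE. The cost is that the mode switching lives in hand-written lexer code instead of in the grammar.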
-e
_______________________________________________
Felix-language mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/felix-language