I've written a couple of small parsers before, but with this one I'm 
having trouble figuring out the "right" way to parse. You guys are 
smart; any suggestions on how to approach this?

Conceptually, it's pretty simple. Django parses the text like this: take 
a text string and split it on these delimiters: "{{" and "}}" for 
filters, "{%" and "%}" for tags, and "{#" and "#}" for comments. The 
text outside the delimiters is just plain text, and anything is allowed 
there except the delimiters themselves. The text inside is parsed with a 
different regex depending on the type. Comments throw out their 
contained text; filters allow for:

{{ foo|bar }}
{{ foo|bar:"a b c" }}
{{ foo|bar:"a b c"|baz }}

and so on. The first argument is something matching 
"([a-zA-Z0-9_]+\.)*[a-zA-Z0-9]+". Filter arguments are always in quotes 
and preserve spaces. Otherwise, spaces are thrown away.
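
To pin down what I mean for filters, here is a rough Python sketch of 
parsing the text between "{{" and "}}". The identifier pattern is the 
one quoted above; the names parse_filter_expr, HEAD_RE and FILTER_RE are 
made up for illustration, and I'm assuming a quoted argument never 
contains a double quote.

```python
import re

# Identifier pattern, copied from the description above.
IDENT = r'(?:[a-zA-Z0-9_]+\.)*[a-zA-Z0-9]+'
# Leading variable name, with surrounding spaces thrown away.
HEAD_RE = re.compile(r'\s*(' + IDENT + r')\s*')
# One filter: |name, optionally followed by :"quoted argument".
# Assumption: the argument itself never contains a double quote.
FILTER_RE = re.compile(r'\|\s*(' + IDENT + r')\s*(?::\s*"([^"]*)")?\s*')

def parse_filter_expr(body):
    """Parse the text between {{ and }} into (variable, [(filter, arg), ...])."""
    head = HEAD_RE.match(body)
    if head is None:
        raise ValueError("expected an identifier: %r" % body)
    pos = head.end()
    filters = []
    while pos < len(body):
        m = FILTER_RE.match(body, pos)
        if m is None:
            raise ValueError("unexpected text at offset %d: %r" % (pos, body[pos:]))
        # arg is None when the filter takes no argument.
        filters.append((m.group(1), m.group(2)))
        pos = m.end()
    return head.group(1), filters
```

Spaces inside the quotes survive; everything else between tokens is 
dropped.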

Tags allow a little more:

{% foo %}
{% foo a b c %}
{% foo "a b c" d "e f" %}

where the quotes preserve spacing. Non-quoted arguments must match the 
identifier pattern above.
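
Tags could be handled the same way. This is only a sketch under my own 
assumptions: parse_tag and ARG_RE are hypothetical names, the tag name 
itself must be an unquoted identifier, and quoted arguments contain no 
double quotes.

```python
import re

# Identifier pattern, copied from the description above.
IDENT = r'(?:[a-zA-Z0-9_]+\.)*[a-zA-Z0-9]+'
# One token: either a quoted string (group 1) or an identifier (group 2).
ARG_RE = re.compile(r'\s*(?:"([^"]*)"|(' + IDENT + r'))')

def parse_tag(body):
    """Parse the text between {% and %} into (name, [(kind, text), ...])."""
    body = body.strip()
    pos = 0
    tokens = []
    while pos < len(body):
        m = ARG_RE.match(body, pos)
        if m is None:
            raise ValueError("unexpected text at offset %d: %r" % (pos, body[pos:]))
        if m.group(1) is not None:
            tokens.append(('str', m.group(1)))   # quotes preserve spacing
        else:
            tokens.append(('id', m.group(2)))
        pos = m.end()
    # Assumption: a tag always starts with an unquoted name.
    if not tokens or tokens[0][0] != 'id':
        raise ValueError("tag must start with a name: %r" % body)
    return tokens[0][1], tokens[1:]
```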

It'd be pretty simple to do this with multipass regular expressions, but 
I want to see if I can do it in one go with a lexer and a parser. The 
problem I'm running into is that while it's quite easy to parse the 
tags, filters, and comments, dealing with the text in between them is a 
bit of a pain. I can't figure out how to do it cleanly. For instance, 
it'd be nice if the lexer automatically handled the strings for tags and 
filters, but then it also produces string tokens for quotes in the plain 
text, so my top-level expression nonterminal has to check for those 
strings and just dump them out again. The same goes for the pipes and 
identifiers.
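
For comparison, the first pass of the multipass approach I'm trying to 
avoid would look roughly like this in Python. split_template, BLOCK_RE 
and KINDS are names I made up, and I'm assuming a block never spans a 
line break.

```python
import re

# Any one of the three delimited blocks, matched non-greedily.
# Assumption: a block does not span lines ('.' does not match a newline here).
BLOCK_RE = re.compile(r'(\{\{.*?\}\}|\{%.*?%\}|\{#.*?#\})')
KINDS = {'{{': 'filter', '{%': 'tag', '{#': 'comment'}

def split_template(source):
    """Split a template into ('text' | 'filter' | 'tag', contents) chunks."""
    chunks = []
    # With one capturing group, re.split alternates: text, block, text, ...
    for i, piece in enumerate(BLOCK_RE.split(source)):
        if i % 2 == 0:
            if piece:
                chunks.append(('text', piece))
        else:
            kind = KINDS[piece[:2]]
            if kind != 'comment':          # comments throw out their text
                chunks.append((kind, piece[2:-2]))
    return chunks
```

Each 'filter' or 'tag' chunk would then go through its own regex-based 
parse, which is the second pass.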

So what's the best way to do this? Do I need to maintain state between 
the lexer and parser to switch what regexes the lexer should be using? 
Just have a complicated text processing section? Or do I just need to do 
two stages of lexing/parsing?
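
To make the first option concrete, here is roughly what I mean by 
switching regexes, sketched in Python with the mode kept inside the 
lexer so the parser never has to feed state back. All the names (lex, 
the token kinds, INSIDE_RE) are made up for illustration, and I'm 
assuming strings contain no double quotes.

```python
import re

IDENT = r'(?:[a-zA-Z0-9_]+\.)*[a-zA-Z0-9]+'
OPEN_RE = re.compile(r'\{\{|\{%|\{#')
CLOSERS = {'{{': '}}', '{%': '%}'}
WS_RE = re.compile(r'\s*')
# Rules that apply only inside {{ ... }} and {% ... %}.
INSIDE_RE = re.compile(
    r'(?P<STRING>"[^"]*")|(?P<IDENT>' + IDENT + r')|(?P<PIPE>\|)|(?P<COLON>:)')

def lex(source):
    """Yield (kind, text) tokens, switching rule sets at each delimiter."""
    pos = 0
    closer = None                      # None means "plain text" mode
    while pos < len(source):
        if closer is None:
            # Text mode: everything up to the next opener is one TEXT token.
            m = OPEN_RE.search(source, pos)
            stop = m.start() if m else len(source)
            if stop > pos:
                yield ('TEXT', source[pos:stop])
            if m is None:
                return
            pos = m.end()
            if m.group() == '{#':
                # Comments are skipped entirely.
                end = source.find('#}', pos)
                if end < 0:
                    raise ValueError("unclosed comment")
                pos = end + 2
            else:
                closer = CLOSERS[m.group()]
                yield ('OPEN', m.group())
        else:
            # Inside mode: spaces are thrown away between tokens.
            pos = WS_RE.match(source, pos).end()
            if source.startswith(closer, pos):
                yield ('CLOSE', closer)
                pos += len(closer)
                closer = None
                continue
            m = INSIDE_RE.match(source, pos)
            if m is None:
                raise ValueError("bad token at offset %d" % pos)
            text = m.group()
            yield (m.lastgroup, text[1:-1] if m.lastgroup == 'STRING' else text)
            pos = m.end()
    if closer is not None:
        raise ValueError("unclosed block, expected %s" % closer)
```

With something like this the grammar only ever sees TEXT at the top 
level, and strings, pipes, and identifiers only between OPEN and CLOSE.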

-e
