"Dave" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > So I'm trying to write a CSS preprocessor. > > I want to add the ability to append a selector onto other selectors. > So, given the following code: > ========================================= > #selector { > > { property: value; property: value; } > .other_selector { property: value; property: value; } > > #selector_2 { > > .more_selector { property: value; } > > } > > } > ========================================= > > I want to return the following: > ========================================= > #selector { property: value; property: value; } > #selector .other_selector { property: value; property: value; } > #selector #selector_2 .more_selector { property: value; } > =========================================
Dave -

Since other posters have suggested parsing, here is a pyparsing stab at your
problem. Pyparsing lets you build your grammar from readable construct names,
and it can generate structured parse results. It also has built-in support
for skipping over comments.

This paper describes a prior use of pyparsing to parse CSS style sheets:
http://dyomedea.com/papers/2004-extreme/paper.pdf. Google for "pyparsing CSS"
for some other possible references.

This was really more complex than I expected. The grammar was not difficult,
but the recursive routine was trickier than I thought it would be. Hope this
helps.

Download pyparsing at http://pyparsing.sourceforge.net.

-- Paul

=========================
data = """
#selector {

    { property: value; /* a nasty comment */
      property: value; }
    .other_selector { property: value; property: value; }

    #selector_2 {
        /* another nasty comment */
        .more_selector { property: value; /* still another nasty comment */ }

    }

}
"""

from pyparsing import (Literal, Word, Combine, Group, alphas, nums, alphanums,
                       Forward, ZeroOrMore, cStyleComment)

# define some basic symbols - suppress grouping and delimiting punctuation
# and let grouping do the rest
lbrace = Literal("{").suppress()
rbrace = Literal("}").suppress()
colon = Literal(":").suppress()
semi = Literal(";").suppress()
pound = Literal("#")
dot = Literal(".")

# define identifiers, property pattern, valid property values, and property list
ident = Word(alphas, alphanums + "_")
pound_ident = Combine(pound + ident)
dot_ident = Combine(dot + ident)
prop_value = Word(nums) | Word(alphanums)  # expand this as needed
property_def = Group(ident + colon + prop_value + semi)
prop_list = Group(lbrace + ZeroOrMore(property_def) + rbrace).setResultsName("propList")

# define selector - must use Forward since selector is recursive
selector = Forward()
selector_contents = (
    (prop_list)
    | Group(dot_ident.setResultsName("name") + prop_list)
    | selector
)
selector << Group(
    pound_ident.setResultsName("name")
    + lbrace
    + Group(ZeroOrMore(selector_contents)).setResultsName("contents")
    + rbrace
)

# C-style comments should be ignored
selector.ignore(cStyleComment)

# parse the data - this only works if data *only* contains a single selector
results = selector.parseString(data)

# use pprint to display list - you can navigate the results to construct
# the various selectors
import pprint
pprint.pprint(results[0].asList())
print()

# if scanning through text containing other text than just selectors,
# use scanString, which returns a generator, yielding a tuple
# for each occurrence found
#
# for results, start, end in selector.scanString(cssSourceText):
#     pprint.pprint(results.asList())

# a recursive function to print out the names and property lists
# (namePath is never modified in place, so the list default is safe here)
def printSelector(res, namePath=[]):
    if res.name != "":
        # named node: extend the selector path with this node's name
        subpath = namePath + [res.name]
        if res.contents != "":
            # a '#' selector - recurse into everything nested inside it
            for c in res.contents:
                printSelector(c, subpath)
        elif res.propList != "":
            # a '.' selector with its own property list
            print(" ".join(subpath), "{",
                  " ".join(["%s : %s;" % tuple(p) for p in res.propList]), "}")
        else:
            print(" ".join(subpath), "{",
                  " ".join(["%s : %s;" % tuple(r) for r in res]), "}")
    else:
        # anonymous property list - it belongs to the enclosing selector
        print(" ".join(namePath), "{",
              " ".join(["%s : %s;" % tuple(r) for r in res]), "}")

printSelector(results[0])
=========================

This prints:

['#selector',
 [[['property', 'value'], ['property', 'value']],
  ['.other_selector', [['property', 'value'], ['property', 'value']]],
  ['#selector_2', [['.more_selector', [['property', 'value']]]]]]]

#selector { property : value; property : value; }
#selector .other_selector { property : value; property : value; }
#selector #selector_2 .more_selector { property : value; }

--
http://mail.python.org/mailman/listinfo/python-list
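
P.S. Since the recursive routine was the tricky part, here is a rough sketch of just the flattening step, separated from the parsing. It does not use pyparsing at all. It assumes the parsed input has already been reduced to plain tuples of the form (name, props, children). That node shape and the names flatten_rules and format_rule are made up for this illustration and are not part of pyparsing or of the code above.

```python
# Hypothetical node shape (an assumption for this sketch, not a pyparsing type):
#   (name, props, children)
#   name     - selector text such as "#selector" or ".other_selector"
#   props    - list of (property, value) pairs declared directly on the node
#   children - list of nested nodes of the same shape


def flatten_rules(node, path=()):
    """Yield (selector, props) pairs with all ancestor names prepended."""
    name, props, children = node
    full_path = path + (name,)
    if props:
        # only emit a rule when the node declares properties of its own
        yield " ".join(full_path), props
    for child in children:
        # recurse, carrying the accumulated selector path down
        yield from flatten_rules(child, full_path)


def format_rule(selector, props):
    """Render one flattened rule as a single line of CSS."""
    body = " ".join("%s: %s;" % (prop, value) for prop, value in props)
    return "%s { %s }" % (selector, body)


# The nested example from the original question, written out by hand.
tree = (
    "#selector",
    [("property", "value"), ("property", "value")],
    [
        (".other_selector", [("property", "value"), ("property", "value")], []),
        ("#selector_2", [], [
            (".more_selector", [("property", "value")], []),
        ]),
    ],
)

css_lines = [format_rule(sel, props) for sel, props in flatten_rules(tree)]
print("\n".join(css_lines))
```

Passing the path down as an immutable tuple means each branch of the recursion gets its own copy of the ancestor names, so sibling selectors cannot leak into each other's paths.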