Re: template strings for matching?
Joe Strout wrote:
> Catching up on what's new in Python since I last used it a decade ago,
> I've just been reading up on template strings. These are pretty cool!
> However, just as a template string has some advantages over %
> substitution for building a string, it seems like it would have
> advantages over manually constructing a regex for string matching.
>
> So... is there any way to use a template string for matching? I
> expected something like:
>
>     templ = Template("The $object in $location falls mainly in the $subloc.")
>     d = templ.match(s)
>
> and then d would either be None (if s doesn't match), or a dictionary
> with values for 'object', 'location', and 'subloc'.
>
> But I couldn't find anything like that in the docs. Am I overlooking
> something?

Yeah, it's a bit hard to spot:

http://docs.python.org/library/stdtypes.html#string-formatting-operations

HTH
Tino

-- 
http://mail.python.org/mailman/listinfo/python-list
Re: template strings for matching?
    Joe> templ = Template("The $object in $location falls mainly in the $subloc.")
    Joe> d = templ.match(s)

    Joe> and then d would either be None (if s doesn't match), or a
    Joe> dictionary with values for 'object', 'location', and 'subloc'.

    Joe> But I couldn't find anything like that in the docs. Am I
    Joe> overlooking something?

Nope, you're not missing anything.

Skip
Re: template strings for matching?
    Tino> Yeah, it's a bit hard to spot:
    Tino> http://docs.python.org/library/stdtypes.html#string-formatting-operations

That shows how to use the template formatting as it currently exists.
To my knowledge there is no support for the inverse operation, which is
what Joe asked about: given a string and a format string, assign the
elements of the string which correspond to the template elements to
key/value pairs in a dictionary.

Skip
Re: template strings for matching?
Joe Strout wrote:
> Catching up on what's new in Python since I last used it a decade ago,
> I've just been reading up on template strings. These are pretty cool!
> However, just as a template string has some advantages over %
> substitution for building a string, it seems like it would have
> advantages over manually constructing a regex for string matching.
>
> So... is there any way to use a template string for matching? I
> expected something like:
...

You could use something like this to record the lookups:

>>> class XDict(dict):
...     def __new__(cls,*args,**kwds):
...         self = dict.__new__(cls,*args,**kwds)
...         self.__record = set()
...         return self
...     def _record_clear(self):
...         self.__record.clear()
...     def __getitem__(self,k):
...         v = dict.__getitem__(self,k)
...         self.__record.add(k)
...         return v
...     def _record(self):
...         return self.__record
...
>>> x=XDict()
>>> x._record()
set([])
>>> x=XDict(a=1,b=2,c=3)
>>> x
{'a': 1, 'c': 3, 'b': 2}
>>> '%(a)s %(c)s' % x
'1 3'
>>> x._record()
set(['a', 'c'])

A slight modification would allow your template match function to work
even when some keys were missing in the dict. That would allow you to
see which lookups failed as well.

-- 
Robin Becker
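
The "slight modification" mentioned above is not spelled out in the message. One possible reading, as a minimal Python 3 sketch: the class name `RecordingDict`, the `found` and `missing` attributes, and the `<key?>` placeholder text are all assumptions made for illustration, not part of the original post.

```python
class RecordingDict(dict):
    """dict that records which keys were looked up and which were absent."""

    def __init__(self, *args, **kwds):
        super().__init__(*args, **kwds)
        self.found = set()    # keys that were looked up and present
        self.missing = set()  # keys that were looked up but absent

    def __getitem__(self, key):
        if key in self:
            self.found.add(key)
            return super().__getitem__(key)
        # Absent key: record it and return a visible placeholder
        # (hypothetical choice) instead of raising KeyError.
        self.missing.add(key)
        return "<%s?>" % key


x = RecordingDict(a=1, b=2)
print("%(a)s %(c)s" % x)
print(sorted(x.found), sorted(x.missing))
```

Because `%` formatting with a mapping goes through `__getitem__`, the failed lookup of `c` is recorded rather than aborting the formatting.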
Re: template strings for matching?
Pyparsing makes building expressions with named fields pretty easy.

    from pyparsing import Word, alphas

    wrd = Word(alphas)

    templ = "The" + wrd("object") + "in" + wrd("location") + \
        "stays mainly in the" + wrd("subloc") + "."

    tests = """\
        The rain in Spain stays mainly in the plain.
        The snake in plane stays mainly in the cabin.
        In Hempstead, Haverford and Hampshire hurricanes hardly ever happen.
        """.splitlines()
    for t in tests:
        t = t.strip()
        try:
            match = templ.parseString(t)
            print match.object
            print match.location
            print match.subloc
            print "Fields are: %(object)s %(location)s %(subloc)s" % match
        except:
            print "'" + t + "' is not a match."
        print

Read more about pyparsing at http://pyparsing.wikispaces.com.

-- Paul
Re: template strings for matching?
[EMAIL PROTECTED] wrote:
>     Tino> Yeah, it's a bit hard to spot:
>     Tino> http://docs.python.org/library/stdtypes.html#string-formatting-operations
>
> That shows how to use the template formatting as it currently exists.
> To my knowledge there is no support for the inverse operation, which is
> what Joe asked about. Given a string and a format string assign the
> elements of the string which correspond to the template elements to
> key/value pairs in a dictionary.

??? Can you elaborate? I don't see the problem.

    "%(foo)s" % mapping

just calls get("foo") on mapping, so if you have a dictionary with all
possible values it just works. If you want to do some fancy stuff, just
subclass and change the method call appropriately.

Regards
Tino
Re: template strings for matching?
On Oct 9, 2008, at 7:05 AM, [EMAIL PROTECTED] wrote:

>     Tino> http://docs.python.org/library/stdtypes.html#string-formatting-operations
>
> That shows how to use the template formatting as it currently exists.
> To my knowledge there is no support for the inverse operation, which is
> what Joe asked about. Given a string and a format string assign the
> elements of the string which correspond to the template elements to
> key/value pairs in a dictionary.

Right.

Well, what do y'all think? It wouldn't be too hard to write this for
myself, but it seems like the sort of thing Python ought to have built
in. Right on the Template class, so it doesn't add anything new to the
global namespace; it just makes this class more useful.

I took a look at PEP 3101, which is more of a high-powered string
formatter (as the title says, Advanced String Formatting), and will be
considerably more intimidating for a beginner than Template. So, even
if that goes through, perhaps Template will stick around, and being
able to use it in both directions could be quite handy.

Oh boy! Could this be my very first PEP? :)

Thanks for any opinions,
- Joe
Re: template strings for matching?
Joe Strout wrote:

> Catching up on what's new in Python since I last used it a decade ago,
> I've just been reading up on template strings. These are pretty cool!

I don't think they've gained much traction and expect them to be
superseded by PEP 3101 (see http://www.python.org/dev/peps/pep-3101/ )

> However, just as a template string has some advantages over %
> substitution for building a string, it seems like it would have
> advantages over manually constructing a regex for string matching.
>
> So... is there any way to use a template string for matching? I
> expected something like:
>
>     templ = Template("The $object in $location falls mainly in the $subloc.")
>     d = templ.match(s)
>
> and then d would either be None (if s doesn't match), or a dictionary
> with values for 'object', 'location', and 'subloc'.
>
> But I couldn't find anything like that in the docs. Am I overlooking
> something?

I don't think so. Here's a DIY implementation:

    import re

    def _replace(match):
        word = match.group(2)
        if word == "$":
            return "[$]"
        return "(?P<%s>.*)" % word

    def extract(template, text):
        r = re.compile(r"([$]([$]|\w+))")
        r = r.sub(_replace, template)
        return re.compile(r).match(text).groupdict()

    print extract("My $$ is on the $object in $location...",
                  "My $ is on the biggest bird in the highest tree...")

As always with regular expressions I may be missing some corner cases...

Peter
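
Two corner cases of the kind hinted at above: a field name that appears twice makes `re.compile` fail with a "redefinition of group name" error, and a non-matching text makes `.match(text)` return None, so `.groupdict()` raises AttributeError. A minimal Python 3 sketch of one way around both follows. It is not Peter's code: the helper name `template_to_regex` is made up for illustration, a repeated field is assumed to mean "the same text again", and `$$` is deliberately left out.

```python
import re


def template_to_regex(template):
    """Compile a $-template; a repeated field becomes a backreference."""
    seen = set()
    parts = []
    # With one capture group, re.split alternates literal text (even
    # indexes) and field names (odd indexes).
    pieces = re.split(r"\$([A-Za-z_]\w*)", template)
    for i, piece in enumerate(pieces):
        if i % 2 == 0:
            parts.append(re.escape(piece))        # literal text
        elif piece in seen:
            parts.append("(?P=%s)" % piece)       # same value as before
        else:
            seen.add(piece)
            parts.append("(?P<%s>.+?)" % piece)   # first occurrence
    return re.compile("".join(parts) + r"\Z")


def extract(template, text):
    """Return a dict of field values, or None if text does not match."""
    m = template_to_regex(template).match(text)
    return m.groupdict() if m else None


print(extract("$a is $a", "rose is rose"))
print(extract("$a is $a", "rose is tulip"))
```

Escaping the literal pieces with `re.escape` also means that a `.` in the template only matches a real dot.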
Re: template strings for matching?
    Tino> ??? can you elaborate? I don't see the problem.

    Tino> "%(foo)s" % mapping

Joe wants to go in the other direction. Using your example, he wants a
function which takes a string and a template string and returns a dict.
Here's a concrete example:

    s = "My dog has fleas"
    fmt = "My $pet has $parasites"
    d = fmt_extract(fmt, s)
    assert d['pet'] == 'dog'
    assert d['parasites'] == 'fleas'

Skip
template strings for matching?
Catching up on what's new in Python since I last used it a decade ago,
I've just been reading up on template strings. These are pretty cool!
However, just as a template string has some advantages over %
substitution for building a string, it seems like it would have
advantages over manually constructing a regex for string matching.

So... is there any way to use a template string for matching? I
expected something like:

    templ = Template("The $object in $location falls mainly in the $subloc.")
    d = templ.match(s)

and then d would either be None (if s doesn't match), or a dictionary
with values for 'object', 'location', and 'subloc'.

But I couldn't find anything like that in the docs. Am I overlooking
something?

Thanks,
- Joe
Re: template strings for matching?
Wow, this was harder than I thought (at least for a rusty Pythoneer
like myself). Here's my stab at an implementation. Remember, the goal
is to add a match method to Template which works like
Template.substitute, but in reverse: given a string, if that string
matches the template, then it should return a dictionary mapping each
template field to the corresponding value in the given string.

Oh, and as one extra feature, I want to support a .greedy attribute on
the Template object, which determines whether the matching of fields
should be done in a greedy or non-greedy manner.

    #!/usr/bin/python

    from string import Template
    import re

    def templateMatch(self, s):
        # start by finding the fields in our template, and building a map
        # from field position (index) to field name.
        posToName = {}
        pos = 1
        for item in self.pattern.findall(self.template):
            # each item is a tuple where item 1 is the field name
            posToName[pos] = item[1]
            pos += 1

        # determine if we should match greedy or non-greedy
        greedy = False
        if self.__dict__.has_key('greedy'):
            greedy = self.greedy

        # now, build a regex pattern to compare against s
        # (taking care to escape any characters in our template that
        # would have special meaning in regex)
        pat = self.template.replace('.', '\\.')
        pat = pat.replace('(', '\\(')
        pat = pat.replace(')', '\\)')  # there must be a better way...

        if greedy:
            pat = self.pattern.sub('(.*)', pat)
        else:
            pat = self.pattern.sub('(.*?)', pat)
        p = re.compile(pat)

        # try to match this to the given string
        match = p.match(s)
        if match is None: return None
        out = {}
        for i in posToName.keys():
            out[posToName[i]] = match.group(i)
        return out


    Template.match = templateMatch

    t = Template("The $object in $location falls mainly in the $subloc.")
    print t.match( "The rain in Spain falls mainly in the train." )

This sort-of works, but it won't properly handle $$ in the template,
and I'm not too sure whether it handles the ${fieldname} form, either.
Also, it only escapes '.', '(', and ')' in the template... there must
be a better way of escaping all characters that have special meaning to
RegEx, except for '$' (which is why I can't use re.escape). Probably
the rest of the code could be improved too. I'm eager to hear your
feedback.

Thanks,
- Joe
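
One possible answer to the escaping question, as a minimal Python 3 sketch rather than a finished design: walk the template with `Template.pattern.finditer` and apply `re.escape` only to the literal text between placeholders, so `$` never reaches `re.escape`. The function name `template_match` and its `greedy` keyword are made up for illustration; the group names `escaped`, `named` and `braced` are the ones `string.Template` defines for its pattern. Repeated field names are not handled here.

```python
import re
from string import Template


def template_match(template, text, greedy=False):
    """Return a dict of field values if text matches template, else None."""
    wildcard = ".*" if greedy else ".*?"
    source = template.template
    parts = []
    pos = 0
    for m in template.pattern.finditer(source):
        parts.append(re.escape(source[pos:m.start()]))  # literal text
        pos = m.end()
        name = m.group("named") or m.group("braced")
        if m.group("escaped") is not None:
            # "$$" in the template stands for one literal "$"
            parts.append(re.escape(template.delimiter))
        elif name is None:
            raise ValueError("invalid placeholder at offset %d" % m.start())
        else:
            parts.append("(?P<%s>%s)" % (name, wildcard))
    parts.append(re.escape(source[pos:]))  # trailing literal text
    m = re.fullmatch("".join(parts), text, re.DOTALL)
    return m.groupdict() if m else None


t = Template("The $object in $location falls mainly in the $subloc.")
print(template_match(t, "The rain in Spain falls mainly in the train."))
print(template_match(Template("$a-$b"), "x-y-z", greedy=True))
```

Taking `greedy` as an argument instead of an attribute on the Template avoids the `__dict__` check; whether that is nicer than a `.greedy` attribute is a matter of taste.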
Re: template strings for matching?
On Oct 9, 5:20 pm, Joe Strout [EMAIL PROTECTED] wrote:
> Wow, this was harder than I thought (at least for a rusty Pythoneer
> like myself). Here's my stab at an implementation.
>
> [snip]
>
> This sort-of works, but it won't properly handle $$ in the template,
> and I'm not too sure whether it handles the ${fieldname} form, either.
> Also, it only escapes '.', '(', and ')' in the template... there must
> be a better way of escaping all characters that have special meaning to
> RegEx, except for '$' (which is why I can't use re.escape). Probably
> the rest of the code could be improved too. I'm eager to hear your
> feedback.
>
> Thanks,
> - Joe

How about something like:

    import re

    def placeholder(m):
        if m.group(1):
            return "(?P<%s>.+)" % m.group(1)
        elif m.group(2):
            return "\\$"
        else:
            return re.escape(m.group(3))

    regex = re.compile(r"\$(\w+)|(\$\$)")

    t = "The $object in $location falls mainly in the $subloc."
    print regex.sub(placeholder, t)
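
As posted, the regex defines only two groups, so the `re.escape(m.group(3))` branch is never reached and the literal text (including the final `.`) passes through unescaped. A minimal Python 3 sketch of what the third alternative might look like; the `([^$]+)` group is an assumption about the intent, not something stated in the message, and a lone `$` that starts no field is still not handled.

```python
import re

# Third alternative (an assumed completion): any run of non-"$" text,
# so that literal template text reaches the re.escape branch.
regex = re.compile(r"\$(\w+)|(\$\$)|([^$]+)")


def placeholder(m):
    if m.group(1):
        return "(?P<%s>.+)" % m.group(1)  # $name -> named group
    elif m.group(2):
        return r"\$"                      # $$ -> literal dollar sign
    else:
        return re.escape(m.group(3))      # literal text, escaped


t = "The $object in $location falls mainly in the $subloc."
pattern = regex.sub(placeholder, t)
print(pattern)
print(re.match(pattern, "The rain in Spain falls mainly in the plain.").groupdict())
```

Since the replacement comes from a function, `re.sub` inserts its return value as-is, so the backslashes produced by `re.escape` survive unchanged.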