Torsten Bronger wrote:
> Hi there!
>
> James Stroud writes:
>
>> Torsten Bronger wrote:
>>
>>> I need some help with finding matches in a string that has some
>>> characters which are marked as escaped (in a separate list of
>>> indices). Escaped means that they must not be part of any match.
>>>
>>> [...]
>>
>> You should probably provide examples of what you are trying to do
>> or you will likely get a lot of irrelevant answers.
>
> Example string: u"Hollo", escaped positions: [4]. Thus, the second
> "o" is escaped and must not be found by the regexp searches.
>
> Instead of re.search, I call the function guarded_search(pattern,
> text, offset) which takes care of escaped characters. Thus, while
>
>     re.search("o$", string)
>
> will find the second "o",
>
>     guarded_search("o$", string, 0)
>
> won't find anything. But how to program "guarded_search"?
> Actually, it is about changing the semantics of the regexp syntax:
> "." doesn't mean anymore "any character except newline" but "any
> character except newline and characters marked as escaped". And so
> on, for all syntax elements of regular expressions. Escaped
> characters must spoil any match, however, the regexp machine should
> continue to search for other matches.
>
> Bye,
> Torsten.

You will probably need to implement your own findall, etc., but this
seems to do it for search:

    import re

    def guarded_search(rgx, astring, escaped):
        m = re.search(rgx, astring)
        if m:
            s = m.start()
            e = m.end()
            for i in escaped:
                # m.end() is exclusive, so the match covers s..e-1
                if s <= i < e:
                    m = None
                    break
        return m

Here it is in use:

    py> def guarded_search(rgx, astring, escaped):
    ...   m = re.search(rgx, astring)
    ...   if m:
    ...     s = m.start()
    ...     e = m.end()
    ...     for i in escaped:
    ...       if s <= i < e:
    ...         m = None
    ...         break
    ...   return m
    ...
    py> import re
    py> escaped = [1, 5, 15]
    py> print guarded_search('abc', 'xyzabcxyz', escaped)
    None
    py> print guarded_search('abc', 'xyzxyzabcxyz', escaped)
    <_sre.SRE_Match object at 0x40379720>

James
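
The function above gives up as soon as the first match touches an escaped position, while the original question asks for the search to continue. A rough sketch of one way to do that follows. The `pos` parameter and the `guarded_findall` helper are my own additions for illustration, not something from the thread. It retries from the next start position after a spoiled match, so it still will not find a shorter match at the same start position (e.g. a greedy `o+` that runs into an escaped character); it is not a real change to the regexp semantics.

```python
import re


def guarded_search(pattern, text, escaped, pos=0):
    """Return the first match at or after pos that covers no escaped index.

    Returns None if every candidate match touches an escaped position.
    """
    rgx = re.compile(pattern)
    escaped = set(escaped)
    while pos <= len(text):
        m = rgx.search(text, pos)
        if m is None:
            return None
        # m.end() is exclusive, so the match covers indices start..end-1.
        if not any(i in escaped for i in range(m.start(), m.end())):
            return m
        # Spoiled match: try again from the next possible start position.
        pos = m.start() + 1
    return None


def guarded_findall(pattern, text, escaped):
    """Collect the text of all non-overlapping, unspoiled matches."""
    found = []
    pos = 0
    while True:
        m = guarded_search(pattern, text, escaped, pos)
        if m is None:
            return found
        found.append(m.group(0))
        # Continue after the match; step one character on an empty match.
        pos = m.end() if m.end() > m.start() else m.end() + 1
```

With the example from the question, `guarded_search("o$", "Hollo", [4])` returns None, while `guarded_search("o", "Hollo", [4])` still finds the first "o" at index 1.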