Wolfgang Rohdewald wrote:
Hi,

I want to match a string only if a word (C1 in this example) appears
at most once in it. This is what I tried:

re.match(r'(.*?C1)((?!.*C1))','C1b1b1b1 b3b3b3b3 C1C2C3').groups()
('C1b1b1b1 b3b3b3b3 C1', '')
re.match(r'(.*?C1)','C1b1b1b1 b3b3b3b3 C1C2C3').groups()
('C1',)

but this should not have matched. Why is the .*? behaving greedily
when followed by (?!.*C1)? I would have expected re to evaluate
(.*?C1) first, before proceeding at all.

I also tried:

re.search(r'(.*?C1(?!.*C1))','C1b1b1b1 b3b3b3b3 C1C2C3C4').groups()
('C1b1b1b1 b3b3b3b3 C1',)

with the same problem.

How could this be done?

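You're currently looking for a C1 that's not followed by another one.
When the lookahead fails after the first C1, the regex engine backtracks
and lets the lazy .*? consume more text until it reaches a C1 that
qualifies, which is the last one in the string. The solution is to check
first whether there are two: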

>>> re.match(r'(?!.*?C1.*?C1)(.*?C1)','C1b1b1b1 b3b3b3b3 C1C2C3').groups()

Traceback (most recent call last):
  File "<pyshell#3>", line 1, in <module>
    re.match(r'(?!.*?C1.*?C1)(.*?C1)','C1b1b1b1 b3b3b3b3 C1C2C3').groups()
AttributeError: 'NoneType' object has no attribute 'groups'
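
The AttributeError above just means that re.match returned None, i.e. the
string was rejected because it contains C1 twice. As a minimal sketch, the
same pattern can be wrapped in a small helper that tests the match object
before calling a method on it. The helper name match_single_c1 and the
second test string are made up for illustration. Note that '.' does not
match newlines by default, so multi-line input would presumably need
re.DOTALL.

```python
import re

# Pattern from the reply: the negative lookahead rejects the string up
# front if two occurrences of C1 can be found anywhere after the start.
AT_MOST_ONE_C1 = re.compile(r'(?!.*?C1.*?C1)(.*?C1)')

def match_single_c1(text):
    """Return the text up to and including C1, or None.

    None is returned when C1 occurs more than once, and also when it
    does not occur at all, because the pattern itself requires one C1.
    (Hypothetical helper, not part of the re module.)
    """
    m = AT_MOST_ONE_C1.match(text)
    return m.group(1) if m else None

print(match_single_c1('C1b1b1b1 b3b3b3b3 C1C2C3'))  # two C1s: None
print(match_single_c1('a1b1 C1C2C3'))               # one C1: 'a1b1 C1'
```

Checking the result with "if m" avoids the traceback when there is no match.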
--
http://mail.python.org/mailman/listinfo/python-list