"Christos Georgiou" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > On Thu, 26 Jan 2006 16:26:57 GMT, rumours say that "Roger L. Cauvin" > <[EMAIL PROTECTED]> might have written: > >>"Christos Georgiou" <[EMAIL PROTECTED]> wrote in message >>news:[EMAIL PROTECTED] > >>> On Thu, 26 Jan 2006 14:09:54 GMT, rumours say that "Roger L. Cauvin" >>> <[EMAIL PROTECTED]> might have written: > >>>>Say I have some string that begins with an arbitrary sequence of >>>>characters >>>>and then alternates repeating the letters 'a' and 'b' any number of >>>>times, >>>>e.g. >>>> >>>>"xyz123aaabbaabbbbababbbbaaabb" >>>> >>>>I'm looking for a regular expression that matches the first, and only >>>>the >>>>first, sequence of the letter 'a', and only if the length of the >>>>sequence >>>>is >>>>exactly 3. >>>> >>>>Does such a regular expression exist? If so, any ideas as to what it >>>>could >>>>be? >>> >>> Is this what you mean? >>> >>> ^[^a]*(a{3})(?:[^a].*)?$ >> >>Close, but the pattern should allow "arbitrary sequence of characters" >>that >>precede the alternating a's and b's to contain the letter 'a'. In other >>words, the pattern should accept: >> >>"xayz123aaabbab" >> >>since the 'a' between the 'x' and 'y' is not directly followed by a 'b'. >> >>Your proposed pattern rejects this string. > > 1. > > (a{3})(?:b[ab]*)?$ > > This finds the first (leftmost) "aaa" either at the end of the string or > followed by 'b' and then arbitrary sequences of 'a' and 'b'. > > This will also match "aaaa" (from second position on). > > 2. > > If you insist in only three 'a's and you can add the constraint that: > > * let s be the "arbitrary sequence of characters" at the start of your > searched text > * len(s) >= 1 and not s.endswith('a') > > then you'll have this reg.ex. > > (?<=[^a])(a{3})(?:b[ab]*)?$ > > 3. 
>
> If you want to allow for a possible empty "arbitrary sequence of
> characters" at the start and you don't mind search speed
>
> ^(?:.*?[^a])?(a{3})(?:b[ab]*)?$
>
> This should cover you:
>
> >>> import re
> >>> s="xayzbaaa123aaabbab"
> >>> r=re.compile(r"^(?:.*?[^a])?(a{3})(?:b[ab]*)?$")
> >>> m= r.match(s)
> >>> m.group(1)
> 'aaa'
> >>> m.start(1)
> 11
> >>> s[11:]
> 'aaabbab'
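
For anyone following along, here is a minimal sketch that runs the three quoted patterns side by side on the strings mentioned in this thread. The names PATTERNS and first_aaa_start are made up for this illustration and appear nowhere in the earlier posts. It uses search() for all three patterns, which I am assuming is how the unanchored patterns 1 and 2 were meant to be applied.

```python
import re

# The three patterns quoted above, keyed by their numbers in the post.
# PATTERNS and first_aaa_start are hypothetical names for illustration.
PATTERNS = {
    1: re.compile(r"(a{3})(?:b[ab]*)?$"),
    2: re.compile(r"(?<=[^a])(a{3})(?:b[ab]*)?$"),
    3: re.compile(r"^(?:.*?[^a])?(a{3})(?:b[ab]*)?$"),
}


def first_aaa_start(number, text):
    """Return the start index of group 1 for the numbered pattern, or None."""
    m = PATTERNS[number].search(text)
    return m.start(1) if m else None


if __name__ == "__main__":
    samples = [
        "xyz123aaabbaabbbbababbbbaaabb",  # string from the original question
        "xayz123aaabbab",                 # prefix that itself contains an 'a'
        "xayzbaaa123aaabbab",             # string from the interactive session
        "aaaa",                           # four a's, no prefix
        "aaab",                           # exactly three a's, empty prefix
    ]
    for text in samples:
        print(text, [first_aaa_start(n, text) for n in (1, 2, 3)])
```

On "aaaa", pattern 1 reports a match starting at index 1 while patterns 2 and 3 report none, which is the difference described under point 1. On "aaab", pattern 2 finds nothing because its lookbehind needs at least one preceding character, while pattern 3 matches at index 0.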

Thanks for continuing to follow up, Christos.  Please see my reply to your
other post (in which you applied the test cases).

--
Roger L. Cauvin
[EMAIL PROTECTED] (omit the "nospam_" part)
Cauvin, Inc.
Product Management / Market Research
http://www.cauvin-inc.com
--
http://mail.python.org/mailman/listinfo/python-list