Thanks -- not only for the code, which does almost exactly what I need, but for the reminder (thanks also to Jeremy Bowers for this!) to prefer simple solutions. I was, of course, so tied up in getting my nifty one-liner right that I totally lost sight of how straightforwardly the job could be done; and now that I've got it, I've also got room to tune it. For instance, your code keeps the first "longest" match if several are equal in length; my program will, I think, do slightly better if I keep the last "longest" instead, and changing that required only changing > into >=, which even I can't screw up.
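For anyone following along, here is a minimal sketch of that one-character change, assuming it is applied to the length comparison in the longestMatch function quoted below. The name longestMatchLast is made up for illustration and is not part of the original code.

```python
import re

def longestMatchLast(rx, s):
    """Like longestMatch, but on a tie keep the LAST longest match.

    Hypothetical variant for illustration: the only change from the
    quoted function is >= in place of > in the length comparison.
    """
    start = length = current = 0

    while True:
        m = rx.search(s, current)
        if not m:
            break

        mStart, mEnd = m.span()
        current = mStart + 1

        # >= lets a later match of equal length replace the earlier one
        if (mEnd - mStart) >= length:
            start = mStart
            length = mEnd - mStart

    if length:
        return start, length

    return None, None


pairsRe = re.compile(r'(x[x/])+')
print(longestMatchLast(pairsRe, '//////xx//'))  # (7, 2); the > version gives (6, 2)
```

In '//////xx//' there are two matches of length 2, 'xx' at index 6 and 'x/' at index 7, so the two comparisons pick different starts.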

Thanks to everyone who's helped on this. Makes me wish I were going to pycon.

Charles Hartman
Professor of English, Poet in Residence
http://cherry.conncoll.edu/cohar
http://villex.blogspot.com

Kent Johnson wrote:
It's pretty simple to put re.search() into a loop where subsequent searches start from the character after where the previous one matched. Here is a solution that uses a general-purpose longest match function:

import re

# RE solution
def longestMatch(rx, s):
    ''' Find the longest match for rx in s.
    Returns (start, length) for the match or (None, None) if no match found.
    '''

    start = length = current = 0

    while True:
        m = rx.search(s, current)
        if not m:
            break

        mStart, mEnd = m.span()
        current = mStart + 1

        if (mEnd - mStart) > length:
            start = mStart
            length = mEnd - mStart

    if length:
        return start, length

    return None, None


pairsRe = re.compile(r'(x[x/])+')

for s in [ '/xx/xxx///', '//////xx//' ]:
    print(s, longestMatch(pairsRe, s))
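
The reason for restarting one character after each match's start, rather than after its end, may be easier to see side by side. The following is only a sketch for comparison, with variable names chosen for illustration: re.finditer returns non-overlapping matches, so it can miss a longer match that begins inside an earlier one.

```python
import re

pairs_re = re.compile(r'(x[x/])+')
s = '/xx/xxx///'

# finditer resumes after the END of each match, so matches never overlap.
non_overlapping = [m.span() for m in pairs_re.finditer(s)]

# Restarting one character after each match START also finds matches
# that overlap an earlier one.
overlapping = []
pos = 0
while True:
    m = pairs_re.search(s, pos)
    if not m:
        break
    overlapping.append(m.span())
    pos = m.start() + 1

print(non_overlapping)  # [(1, 3), (4, 8)]
print(overlapping)      # [(1, 3), (2, 8), (4, 8), (5, 7), (6, 8)]
```

Here the longest non-overlapping match has length 4, while the overlapping search finds the length-6 match at (2, 8), which is the one longestMatch reports as (2, 6).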
