On 5/06/2006 2:51 AM, Baoqiu Cui wrote:
> John Machin <[EMAIL PROTECTED]> writes:
>
>> Uh-oh.
>>
>> Try this:
>>
>> >>> pat = re.compile('(?<=abc\n).*?(?=xyz\n)', re.DOTALL)
>> >>> re.sub(pat, '', linestr)
>> 'blahfubarabc\nxyz\nxyzzy'
>
> This regexp still has a problem.  It may remove the lines between two
> lines like 'aaabc' and 'xxxyz' (and also removes the first two 'x's in
> 'xxxyz').
>
> The following regexp works better:
>
> pattern = re.compile('(?<=^abc\n).*?(?=^xyz\n)', re.DOTALL | re.MULTILINE)
>

You are quite correct. Your reply, and the rejoinder below, only add to
the proposition that regexes are not necessarily the best choice for
every text-processing job :-)

Just in case the last line is 'xyz' but is not terminated by '\n':

pattern = re.compile('(?<=^abc\n).*?(?=^xyz$)', re.DOTALL | re.MULTILINE)

Cheers,
John
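
For anyone following along, here is a minimal sketch comparing the three patterns from this thread. The `sample` string is made up for illustration (it is not the original `linestr`), and the variable names are hypothetical. Raw strings are used so that the regex engine interprets `\n` itself, which should be equivalent to the non-raw literals quoted above.

```python
import re

# Hypothetical sample text, made up for illustration (not the thread's linestr).
# Note that the final 'xyz' line has no trailing newline.
sample = "aaabc\nkeep me\nxxxyz\nabc\ndrop 1\ndrop 2\nxyz"

FLAGS = re.DOTALL | re.MULTILINE

# No line anchors: 'abc\n' also matches the tail of 'aaabc\n',
# and 'xyz\n' also matches the tail of 'xxxyz\n'.
loose = re.compile(r"(?<=abc\n).*?(?=xyz\n)", re.DOTALL)

# Anchored to whole lines, but the closing 'xyz' must be followed by '\n'.
anchored = re.compile(r"(?<=^abc\n).*?(?=^xyz\n)", FLAGS)

# Anchored to whole lines; '$' also accepts an unterminated last line.
anchored_eol = re.compile(r"(?<=^abc\n).*?(?=^xyz$)", FLAGS)

loose_result = loose.sub("", sample)
anchored_result = anchored.sub("", sample)
anchored_eol_result = anchored_eol.sub("", sample)

print(repr(loose_result))         # 'keep me' and the leading 'xx' are removed
print(repr(anchored_result))      # unchanged: no 'xyz\n' line follows 'abc'
print(repr(anchored_eol_result))  # only 'drop 1' and 'drop 2' are removed
```

With this sample, only the last pattern removes exactly the lines between the 'abc' line and the 'xyz' line.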