Re: deleting texts between patterns

John Machin Sun, 04 Jun 2006 16:40:42 -0700

On 5/06/2006 2:51 AM, Baoqiu Cui wrote:
> John Machin <[EMAIL PROTECTED]> writes:
> 
>> Uh-oh.
>>
>> Try this:
>>
>> >>> pat = re.compile('(?<=abc\n).*?(?=xyz\n)', re.DOTALL)
>> >>> re.sub(pat, '', linestr)
>> 'blahfubarabc\nxyz\nxyzzy'
> 
> This regexp still has a problem.  It may remove the lines between two
> lines like 'aaabc' and 'xxxyz' (and also removes the first two 'x's in
> 'xxxyz').
> 
> The following regexp works better:
> 
>   pattern = re.compile('(?<=^abc\n).*?(?=^xyz\n)', re.DOTALL | re.MULTILINE)
>


You are quite correct. Your reply, and the rejoinder below, only add
weight to the proposition that regexes are not necessarily the best
choice for every text-processing job :-)
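
For comparison, here is a minimal sketch of a non-regex, line-by-line
alternative. It is not from the thread: the function name delete_between
and its parameters are made up for illustration, and it assumes plain
'\n' line endings. Like the regexes discussed here, it keeps both marker
lines and leaves the text alone when no closing marker follows.

```python
def delete_between(text, start='abc', end='xyz'):
    """Drop the lines between a line equal to `start` and the next line
    equal to `end`, keeping both marker lines (hypothetical helper)."""
    kept = []
    pending = []      # lines seen after a start marker, not yet dropped
    inside = False
    for line in text.splitlines(keepends=True):
        bare = line.rstrip('\n')
        if inside:
            if bare == end:
                pending = []          # closing marker found: discard block
                inside = False
                kept.append(line)
            else:
                pending.append(line)
        else:
            kept.append(line)
            if bare == start:
                inside = True
    kept.extend(pending)  # no closing marker: put the lines back
    return ''.join(kept)


print(repr(delete_between('blah\nabc\ndelete me\nxyz\nend\n')))
# 'blah\nabc\nxyz\nend\n'
```

Because each line is compared as a whole, lines such as 'aaabc' and
'xxxyz' are never mistaken for markers, and a final 'xyz' with no
trailing newline still closes the block.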

Just in case the last line is 'xyz' but is not terminated by '\n':

pattern = re.compile('(?<=^abc\n).*?(?=^xyz$)', re.DOTALL | re.MULTILINE)
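
A quick check of that pattern, as a sketch: the sample strings and the
name `older` (the '\n'-terminated variant quoted above) are made up for
illustration.

```python
import re

flags = re.DOTALL | re.MULTILINE
# Variant quoted above: requires 'xyz' to be followed by a newline.
older = re.compile('(?<=^abc\n).*?(?=^xyz\n)', flags)
# Variant from this message: 'xyz' may also be the unterminated last line.
pattern = re.compile('(?<=^abc\n).*?(?=^xyz$)', flags)

# Illustrative sample data, not from the original thread.
terminated = 'blah\nabc\ndelete me\nand me\nxyz\nend\n'
unterminated = 'abc\ndelete me\nxyz'
lookalikes = 'aaabc\nkeep me\nxxxyz\n'

print(repr(pattern.sub('', terminated)))    # 'blah\nabc\nxyz\nend\n'
print(repr(pattern.sub('', unterminated)))  # 'abc\nxyz'
print(repr(older.sub('', unterminated)))    # unchanged: no '\n' after 'xyz'
print(repr(pattern.sub('', lookalikes)))    # unchanged: markers not whole lines
```

With re.MULTILINE, '$' matches both just before a newline and at the very
end of the string, which is why the single pattern covers both cases.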

Cheers,
John
-- 
http://mail.python.org/mailman/listinfo/python-list
