Re: [2.5] Regex doesn't support MULTILINE?

Gabriel Genellina Sun, 22 Jul 2007 01:36:19 -0700

On Sun, 22 Jul 2007 01:56:32 -0300, Gilles Ganault <[EMAIL PROTECTED]>
wrote:


> Incidentally, as far as using re alone is concerned, it appears that
> re.MULTILINE isn't enough to get "." to match newlines: re.DOTALL
> must be added.
>
> Problem is, when I add re.DOTALL, the search takes less than a second
> for a 500KB file... and about 1 min 30 s for a file that's 1MB, with both
> files holding similar contents.
>
> Why such a huge difference in performance?
>
> pattern = "<span class=.?defaut.?>(\d+:\d+).*?</span>"

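Try to avoid ".*" and ".+" (even the non-greedy forms). With re.DOTALL
they can run across newlines, so a failed match may scan a long way
through the file before giving up. In this case, I think you want the
scan to stop when it reaches the closing </span> or any other tag, so use
[^<]* instead. A negated character class also matches newlines, so
re.DOTALL is no longer needed.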

BTW, it's better to use a raw string for the pattern, so that backslash
sequences like \d reach the regex engine untouched: pattern =
r"...\d+..."
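
Putting both suggestions together, a minimal sketch could look like the
following. The sample text in `html` is made up for illustration (it is not
from your file), and I'm assuming the span body never contains a nested tag:

```python
import re

# Raw string, so \d is passed to the regex engine as written.
# [^<]* stops at the next "<" and matches newlines without re.DOTALL.
pattern = re.compile(r"<span class=.?defaut.?>(\d+:\d+)[^<]*</span>")

# Hypothetical sample input, only for illustration.
html = (
    '<span class="defaut">12:34 some\ntext</span>\n'
    '<span class="defaut">05:06</span>'
)

# findall returns the captured group (the time) for every match.
times = pattern.findall(html)
print(times)
```

If the span can contain other tags, [^<]* will refuse to match across them,
which is probably what you want here, but check it against your real data.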

-- 
Gabriel Genellina

-- 
http://mail.python.org/mailman/listinfo/python-list
