Hi again Yitzhak,
On Tuesday 25 May 2010 13:20:38 Yitzhak Wiener wrote:
> The file is indeed large, ~6MB.
> I added print lines before/after each line and found that the only line
> that consumes more than 1 second was: " match = re.search(pattern, txt,
> re.S) ", it consumed ~5 minutes!
>
There are a few things to try.
First is the easiest: In the pattern, change the two occurrences of ".*" to
".*?". In this case, this should not change the result, but -- especially if
most of the file is after the text you're looking for -- it should improve
timings (".*" means "the longest possible string of characters", and ".*?"
means "the shortest possible"; I'm assuming there is only one possible string,
being both longest and shortest; why this should change the timing is left as
an exercise).
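To see the two forms side by side, here is a minimal sketch. The real pattern and file were not quoted in this thread, so the text, the START/END markers and both patterns below are made up for illustration; only the ".*" versus ".*?" difference matters.

```python
import re
import timeit

# Made-up stand-in for the real file: a short wanted section
# followed by a long tail of unrelated text.
txt = "header\nSTART\nwanted line\nEND\n" + "filler line\n" * 1000

# Hypothetical patterns; the actual pattern is an assumption.
greedy = re.compile(r"START(.*)END", re.S)
lazy = re.compile(r"START(.*?)END", re.S)

m_greedy = greedy.search(txt)
m_lazy = lazy.search(txt)

# With only one END in the text, both forms capture the same string.
same = m_greedy.group(1) == m_lazy.group(1)
print("same result:", same)

# Rough timing comparison; the numbers depend on the machine and the data.
print("greedy:", timeit.timeit(lambda: greedy.search(txt), number=100))
print("lazy:  ", timeit.timeit(lambda: lazy.search(txt), number=100))
```

Running the same comparison on your real pattern and file should tell you quickly whether the change is worth it.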
If this doesn't help, perhaps you can find a smart way to cut the file into
smaller pieces before the search.
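One possible way to do the cutting, as a rough sketch: find a fixed marker with a plain substring search and run the regex only on a slice starting there. The function name, the marker, the window size and the sample text are all assumptions of mine, not something from your code.

```python
import re


def search_near_marker(txt, pattern, marker, window=10_000):
    """Run the regex only on a slice of txt starting at the first marker.

    Hypothetical helper: marker and window are illustrative choices.
    """
    start = txt.find(marker)  # plain substring search, no regex involved
    if start == -1:
        return None
    chunk = txt[start:start + window]
    return re.search(pattern, chunk, re.S)


# Made-up text: the wanted section is buried between two runs of filler.
filler = "filler line\n" * 1000
txt = filler + "START\nwanted line\nEND\n" + filler

match = search_near_marker(txt, r"START(.*?)END", "START")
if match:
    print(repr(match.group(1)))
```

Note that the match positions (match.start() and friends) are relative to the slice, not to the whole file, and that the window has to be large enough to contain the whole text you're after.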
Have fun,
Shai.