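It's not the usual question mark you find in regular expressions (e.g. "ab?c" matches "abc" and "ac"). When a '?' comes right after a quantifier such as '*' or '+', it is a modifier that makes that quantifier non-greedy: '*' on its own matches as much as possible, while '*?' matches as little as possible.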
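To make the difference concrete, here is a small sketch. The sample string and the patterns are made up for illustration; they are not the pattern or the file from your script.

```python
import re

# Hypothetical sample text, not the data discussed in this thread.
txt = "<a>first</a> filler <a>second</a>"

# '?' after a literal character: the preceding item is optional.
optional = [bool(re.fullmatch(r"ab?c", s)) for s in ("abc", "ac", "abbc")]

# Greedy: '.*' grabs as much as it can, then backs off until '</a>' fits.
greedy = re.search(r"<a>.*</a>", txt, re.S).group(0)

# Non-greedy: '.*?' grabs as little as it can, growing until '</a>' fits.
lazy = re.search(r"<a>.*?</a>", txt, re.S).group(0)

print(optional)  # [True, True, False]
print(greedy)    # <a>first</a> filler <a>second</a>
print(lazy)      # <a>first</a>
```

With re.S the greedy '.*' first runs all the way to the end of the text and only then backtracks towards the match, which is probably where the extra time went when most of a large file comes after the text you are looking for.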
On Wed, May 26, 2010 at 19:47, Yitzhak Wiener <[email protected]> wrote:

> Hi Shai,
>
> It worked. Thanks.
> Adding the '?' after the '*' solved the time problem. I found it in the
> python documentation but didn't really understand the logic of that. Why
> it has effect? The '*' is before so it should still be greedy according
> to logic!? Shouldn't it?
>
>
> Best Regards,
> Yitzhak
>
> -----Original Message-----
> From: Shai Berger [mailto:[email protected]]
> Sent: Tuesday, May 25, 2010 1:42 PM
> To: Yitzhak Wiener
> Cc: [email protected]
> Subject: Re: [Python-il] [python-il]location in file
>
> Hi again Yitzhak,
>
> On Tuesday 25 May 2010 13:20:38 Yitzhak Wiener wrote:
> > The file is indeed large, ~6MB.
> > I added print lines before/after each line and found that the only line
> > that consumes more than 1 second was:
> > " match = re.search(pattern, txt, re.S) ", it consumed ~5 minutes!
> >
>
> There are few things to try.
>
> First is the easiest: In the pattern, change the two occurrences of ".*" to
> ".*?". In this case, this should not change the result, but -- especially if
> most of the file is after the text you're looking for -- it should improve
> timings (".*" means "the longest possible string of characters", and ".*?"
> means "the shortest possible"; I'm assuming there is only one possible string,
> being both longest and shortest; why this should change the timing is left as
> an exercise).
>
> If this doesn't help, perhaps you can do some smart cutting of the file to
> pieces before the search.
>
> Have fun,
> Shai.
>
> ______________________________________________________________________
> DSP Group, Inc. automatically scans all emails and attachments using
> MessageLabs Email Security System.
> _____________________________________________________________________
>
> _______________________________________________
> Python-il mailing list
> [email protected]
> http://hamakor.org.il/cgi-bin/mailman/listinfo/python-il
