On Thu, 18 Dec 2008 05:51:33 -0800, Emanuele D'Arrigo wrote:

> I've written the code below to test the differences in performance ...
>
> ## TIMED FUNCTIONS
> startTime = time.clock()
> for i in range(0, numberOfRuns):
>     re.match(pattern, longMessage)
> patternMatchingTime = time.clock() - startTime
...
You probably don't need to re-invent the wheel. See the timeit module.

In my opinion, the best idiom for timing small code snippets is:

from timeit import Timer
t = Timer("func(arg)", "from __main__ import func, arg")
time_taken = min(t.repeat(number=N))/N

where N will depend on how patient you are, but probably shouldn't be
less than 100. For small enough code snippets, the default of 1000000 is
recommended.

For testing re.match, I didn't have enough patience for one million
iterations, so I used ten thousand. My results were:

>>> t1 = Timer("re.match(pattern, longMessage)",
...     "from __main__ import pattern, re, compiledPattern, longMessage")
>>> t2 = Timer("compiledPattern.match(longMessage)",
...     "from __main__ import pattern, re, compiledPattern, longMessage")
>>> t1.repeat(number=10000)
[3.8806509971618652, 3.4309241771697998, 4.2391560077667236]
>>> t2.repeat(number=10000)
[3.5613579750061035, 2.725193977355957, 2.936690092086792]

which were typical over a few runs. That suggests that even with no
effort made to defeat caching, using pre-compiled patterns is
approximately 20% faster than re.match(pattern). However, over 100,000
iterations that advantage falls to about 10%. Given that each run took
about 30 seconds, I suspect that the results are being contaminated by
some other factor, e.g. networking events or other processes running in
the background.

But whatever is going on, 10% or 20%, pre-compiled patterns are slightly
faster even with caching -- assuming of course that you don't count the
time taken to compile them in the first place.


-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list
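
P.S. For anyone who wants to try the comparison themselves, a
self-contained version might look something like the sketch below. The
pattern and longMessage values are made-up placeholders (they don't come
from the original post), so substitute whatever you are actually
matching:

from timeit import Timer
import re

# Placeholder test data -- swap in your own pattern and message.
pattern = r"(\w+) sat on the (\w+)"
longMessage = "the cat sat on the mat " * 1000
compiledPattern = re.compile(pattern)

setup = "from __main__ import re, pattern, compiledPattern, longMessage"
t1 = Timer("re.match(pattern, longMessage)", setup)
t2 = Timer("compiledPattern.match(longMessage)", setup)

N = 10000  # raise this if you have the patience
print("re.match:              %.8f sec/call" % (min(t1.repeat(number=N)) / N))
print("compiledPattern.match: %.8f sec/call" % (min(t2.repeat(number=N)) / N))

Taking the minimum of the repeats follows the same idea as the idiom
above: the fastest run is the one least disturbed by whatever else the
machine happens to be doing.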