Jeffrey C. Jacobs <timeho...@users.sourceforge.net> added the comment:
</lurk>

Re: timings

Thanks for the info, John. First of all, I really like those tests. Could you please submit a patch or other document so that we can combine them into the Python test suite? The test suite can be run as part of 'make test' (IIRC there is also a way to run JUST the 2 re test suites, though I've had a senior moment about how). It includes built-in timing output for some of the tests, though I don't recall which ones are timed: standard cases or pathological (rare) ones. Either way, we should include some timings of a standard nature in the test suite to make Matthew's and any other developer's work easier. So, John, if you are not familiar with the test suite, I can look into adding the specific cases you've developed so that we get more representative timings.

Remember, though, that within a single instance, at least in the existing engine, the re compiler caches recent compiles, so repeatedly compiling an expression flattens the overhead in a single run to one compile plus lookups, whereas your tests recompile at each test. (I'm not sure what timeit is doing: if it invokes a new instance of Python each time, it recompiles each time; if it reuses the instance, it compiles only once.) Having not looked at Matthew's regex code recently (nice name, BTW), I don't know whether it also contains a compiled expression cache; if it doesn't, adding one might help timings. Originally, the cache stored ~100 entries and cleared itself when full; I have a modification which increases this to 256 (IIRC) and removes only the 128 oldest to prevent thrashing at the boundary, which I think is better if only for a particular pathological case.

In any case, don't despair at these numbers, Matthew: you have a lot of time and potentially a lot of ways to make your engine faster by the time 1.7 alpha is coined.
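
To make the caching caveat above concrete: timeit.timeit runs its statement repeatedly inside one interpreter process, so a bare re.compile call in the timed code should be served from re's cache after the first iteration. Below is a minimal sketch, assuming Python 3; the pattern and function names are made up for illustration, and re.purge() is used only to force a recompile on every call.

```python
import re
import timeit

# Hypothetical pattern, chosen only for illustration.
PATTERN = r"(\w+)@(\w+)\.com"


def compile_cached():
    # After the first call, re.compile returns the cached pattern object.
    return re.compile(PATTERN)


def compile_uncached():
    # re.purge() empties re's cache, so the next compile does the full work.
    re.purge()
    return re.compile(PATTERN)


if __name__ == "__main__":
    n = 1000
    print("cached:  ", timeit.timeit(compile_cached, number=n))
    print("uncached:", timeit.timeit(compile_uncached, number=n))
```

Timing both variants side by side separates the cost of the cache lookup from the cost of a real compile; the absolute numbers will depend on the machine and the Python version.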
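
The eviction policy described above (hold up to 256 entries, drop only the 128 oldest when full) can be sketched roughly as follows. This is a toy illustration, not the actual modification or re's internals: the class name, its methods, and the use of dict insertion order (Python 3.7+) as the measure of "oldest" are all assumptions.

```python
import re


class PatternCache:
    """Toy cache of compiled patterns (hypothetical; not re's internals)."""

    def __init__(self, max_size=256, evict_count=128):
        self.max_size = max_size
        self.evict_count = evict_count
        self._entries = {}  # dicts preserve insertion order in Python 3.7+

    def compile(self, pattern, flags=0):
        key = (pattern, flags)
        cached = self._entries.get(key)
        if cached is not None:
            return cached
        if len(self._entries) >= self.max_size:
            # Evict only the oldest entries instead of clearing everything.
            for old_key in list(self._entries)[:self.evict_count]:
                del self._entries[old_key]
        # re.compile stands in for the real compile step here.
        compiled = re.compile(pattern, flags)
        self._entries[key] = compiled
        return compiled

    def has(self, pattern, flags=0):
        return (pattern, flags) in self._entries

    def __len__(self):
        return len(self._entries)


cache = PatternCache()
for i in range(257):
    cache.compile("a{%d}" % (i + 1))
print(len(cache))  # 129: the 128 oldest entries were dropped, then one added
```

With a clear-when-full policy, the 257th distinct pattern would empty the cache completely; here the newer half survives the overflow.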
But also be forewarned: knowing what I know about the current re engine and what it is further capable of, I don't think your regex will be replacing re in 1.7 if it isn't at least as fast as the existing engine on some standard set of agreed-upon tests, no matter how many features you add. I have no doubt that, with a little extra monkey grease, we could implement all the new features in the existing engine. I don't want to have to reinvent the wheel, of course, and if Matthew's engine can pick up some speed, everybody wins! So keep up the good work, Matthew; it's greatly appreciated!

Thanks all!

Jeffrey.

<lurk>

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue2636>
_______________________________________