Steven D'Aprano wrote:

> While I'm gratified that my prediction was so close to the results I
> found, I welcome any suggestions to better/faster/more efficient code.

more things to try:

code tweaks:

- Factor the creation of the regular expression out of the tests: "re.escape" and "re.compile" are relatively expensive, and neither throw-away code (using the module-level RE functions) nor production code would end up calling both for every string.
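For example (a minimal sketch; the character set and the entity-style replacement are illustrative, not taken from the benchmark in the thread), the expensive calls move to module level and only the sub() call is paid per string:

```python
import re

# illustrative set of special characters, not the thread's benchmark data
SPECIALS = "&<>\""
PATTERN = re.compile("[%s]" % re.escape(SPECIALS))

def escape_re(s, _sub=PATTERN.sub):
    # re.escape() and re.compile() already ran, once, at import time;
    # each call now pays only for the substitution itself
    return _sub(lambda m: "&#%d;" % ord(m.group(0)), s)
```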

- Same with the translation table for "translate".
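In Python 3 terms that looks like the sketch below (str.maketrans replaces the old string.maketrans; the mapping itself is made up for illustration). The table is built once, and translate() then does a single C-level pass per string:

```python
# built once, outside any timing loop (illustrative mapping)
TABLE = str.maketrans({"&": "&amp;", "<": "&lt;", ">": "&gt;"})

def escape_translate(s):
    # only this call is timed per string; the table cost is amortized
    return s.translate(TABLE)
```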

- Use Unicode strings instead of byte strings (we're moving towards 3.0, after all).
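The byte/Unicode split matters most for translate. In Python 3 terms (a forward-looking sketch, since the thread predates 3.0): bytes.translate() takes a 256-entry byte table, while str.translate() takes a mapping from code points, which is a rather different code path:

```python
# byte strings: a fixed 256-byte lookup table
byte_table = bytes.maketrans(b"ab", b"AB")
assert b"abc".translate(byte_table) == b"ABc"

# Unicode strings: a dict keyed by code point
uni_table = {ord("a"): "A", ord("b"): "B"}
assert "abc".translate(uni_table) == "ABc"
```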

test data variations:

- Try dropping the number of actual replacements and see what happens -- if you're escaping user-provided data (e.g. for HTML), it's not that unlikely that you'll end up doing only a few replacements per string, or none at all.

- Also try shorter and longer strings ("human-sized" data is often provided in shorter chunks than 216 characters per string; the typical size and distribution depends on your actual application, of course).

Unicode will affect translate more than the others; the last two variations will most likely affect in-replace instead (that approach gets faster the shorter the strings are and the fewer replace calls you actually end up making).
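The in-replace approach can be sketched like this (the special-character set is again illustrative): a cheap containment test skips the replace entirely when a character never occurs, which is why it wins on short strings with few or no replacements:

```python
def escape_replace(s, specials="&<>"):
    # '&' must be handled first, since the later replacements
    # introduce new '&' characters into the string
    for ch in specials:
        if ch in s:                 # cheap scan; usually a miss
            s = s.replace(ch, "&#%d;" % ord(ch))
    return s
```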

Finally, if you want the sub-lambda form to look better, try a task that inserts a character (e.g. a backslash) before or after each special character, using either a template string or a lambda.
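For instance (a sketch; the set of special characters is made up), backslash-escaping lets the same pattern be driven by either a replacement template or a callable:

```python
import re

SPECIALS = re.compile(r'([&<>"])')   # illustrative special set

def escape_template(s):
    # \1 in the template refers to the captured character;
    # r"\\\1" is a literal backslash followed by the group
    return SPECIALS.sub(r"\\\1", s)

def escape_lambda(s):
    # same result, spelled as a callable
    return SPECIALS.sub(lambda m: "\\" + m.group(1), s)
```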

</F>

--
http://mail.python.org/mailman/listinfo/python-list
