Ian Kelly:

Micro-benchmarks like the ones you have been reporting are *useful*
when it comes to determining what operations can be better optimized,
but they are not *important* in and of themselves.  What is important
is that actual, real-world programs are not significantly slowed by
these kinds of optimizations.  Until you can demonstrate that real
programs are adversely affected by PEP 393, there is not in my opinion
any regression that is worth worrying over.

The problem with only responding to issues with real-world programs is that real-world programs are complex and their performance issues are often difficult to diagnose. See, for example, scons, which is written in Python and has not been able to overcome its performance problems over several years. (http://www.electric-cloud.com/blog/2010/07/21/a-second-look-at-scons-performance/)

Bottom-up performance work has the advantage that a narrow focus area can be more easily analyzed and tested, and it can produce widely applicable benefits.

The choice of comparison for the script wasn't arbitrary. Comparison is one of the main building blocks of higher-level code. Sorting, for example, depends strongly on comparison performance: a sort performs many comparisons, so any decrease in comparison speed is multiplied when applied to sorting.
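To give a rough feel for that multiplication, here is a minimal sketch that counts how many comparisons list.sort() makes. The CountingStr class and count_comparisons function are names made up for this illustration, and the random 12-digit strings are stand-in data, not the file paths discussed below. Counting goes through a str subclass, so it shows the number of comparisons, not the speed of the built-in str path.

```python
import random

class CountingStr(str):
    """Hypothetical str subclass that counts the __lt__ calls made by list.sort()."""
    calls = 0

    def __lt__(self, other):
        CountingStr.calls += 1
        return str.__lt__(self, other)

def count_comparisons(n, seed=0):
    """Sort n pseudo-random strings and return the number of comparisons used."""
    rng = random.Random(seed)
    items = [CountingStr("%012d" % rng.randrange(10**12)) for _ in range(n)]
    CountingStr.calls = 0   # ignore anything counted before the sort
    items.sort()
    return CountingStr.calls

for n in (1000, 10000):
    print(n, "strings:", count_comparisons(n), "comparisons")
```

On random input the count grows roughly in proportion to n*log2(n), so a million-element sort pays for any per-comparison slowdown many millions of times.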

It's unfortunate that stringbench.py does not contain any comparison or sorting tests.

Sorting a list of a million strings (all the file paths on a particular computer) went from 0.4 seconds with Python 3.2 to 0.78 seconds with 3.3, so we're out of the 'not noticeable by humans' range. Perhaps this is still a 'micro-benchmark'; I'd just like to avoid adding email access to get this over the threshold.

Here's some code. Replace the "if 1" with "if 0" on subsequent runs to avoid the costly file system walk.

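import os, time
from os.path import join
paths = []
if 1:
    # Walk the whole drive once and cache the result in filelist.txt.
    for root, dirs, files in os.walk('c:\\'):
        for name in files:
            paths.append(join(root, name))
    # UTF-8 so that paths outside the default code page can be written.
    with open("filelist.txt", "w", encoding="utf-8") as f:
        f.write("\n".join(paths))
else:
    # Later runs: reload the cached list instead of walking the drive.
    with open("filelist.txt", "r", encoding="utf-8") as f:
        paths = f.read().split("\n")
print(len(paths))
# Time only the sort itself, not the walk or the file I/O.
timeStart = time.time()
paths.sort()
timeEnd = time.time()
print("Time taken=", timeEnd - timeStart)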
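A single timing like the one above can be noisy. One possible variation, sketched here under assumptions, sorts a fresh copy several times and keeps the best time. The best_sort_time helper and the generated sample paths are hypothetical stand-ins; in practice you would pass in the unsorted paths list loaded from filelist.txt.

```python
import random
import time

def best_sort_time(items, repeats=5):
    """Return the fastest of several timed sorts, each on a fresh copy of items."""
    best = None
    for _ in range(repeats):
        work = list(items)          # copy, so every run sorts unsorted data
        start = time.time()
        work.sort()
        elapsed = time.time() - start
        if best is None or elapsed < best:
            best = elapsed
    return best

# Stand-in data: made-up path-like strings, not a real directory listing.
rng = random.Random(0)
sample = ["c:\\dir%05d\\file%07d.txt" % (rng.randrange(50000), rng.randrange(10**7))
          for _ in range(100000)]
print("Best of 5 =", best_sort_time(sample))
```

time.time() is used rather than a newer timer so that the same script runs unchanged on both 3.2 and 3.3.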

   Neil
--
http://mail.python.org/mailman/listinfo/python-list
