Re: performance critical Python features

2011-06-23 Thread Chris Angelico
On Fri, Jun 24, 2011 at 2:58 AM, Eric Snow ericsnowcurren...@gmail.com wrote:
 So, which are the other pieces of Python that really need the heavy
 optimization and which are those that don't?  Thanks.


Things that are executed once (imports, class/func definitions) and
things that primarily wait for user input don't need to be optimized.
Things that get executed millions of times a second MAY need to be
optimized.
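
A quick way to check whether something really runs on that scale is to measure it with the stdlib's timeit module (the class and attribute names below are just illustrative):

```python
import timeit

# Time one million attribute lookups -- the kind of operation that
# only matters when it sits inside a hot loop.
setup = "class C: x = 1\nobj = C()"
elapsed = timeit.timeit("obj.x", setup=setup, number=1_000_000)
print(f"1e6 attribute lookups: {elapsed:.3f}s")
```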

ChrisA
(The keyword MAY is to be interpreted as per RFC 2119.)
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: performance critical Python features

2011-06-23 Thread Steven D'Aprano
On Fri, 24 Jun 2011 04:00:17 +1000, Chris Angelico wrote:

 On Fri, Jun 24, 2011 at 2:58 AM, Eric Snow ericsnowcurren...@gmail.com
 wrote:
 So, which are the other pieces of Python that really need the heavy
 optimization and which are those that don't?  Thanks.


 Things that are executed once (imports, class/func definitions) and

You can't assume that either of those things are executed once. Consider 
this toy example:

def outer(a, b):
    def inner(x):
        return (x*a - b)*(x*b - a) - 1
    return inner(b**2 - a**2)

results = [outer(a, b) for (a, b) in coordinate_pairs()]

The function definition for inner gets executed repeatedly, inside a 
tight loop.

Fortunately Python does optimize this case. The heavy lifting (parsing 
the source of inner, compiling a code object) is done once, when outer is 
defined, and the only work done at runtime is assembling the pieces into 
a function object, which is fast.
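
You can see this compile-once behaviour directly in CPython: the compiled body of inner is stored as a constant inside outer's own code object, built when outer is defined, and each call just wraps it in a fresh function object. A small sketch:

```python
import types

def outer(a, b):
    def inner(x):
        return (x*a - b)*(x*b - a) - 1
    return inner(b**2 - a**2)

# inner's compiled body is a constant of outer, created at def time;
# executing "def inner" at call time only builds a function object
# around this pre-compiled code.
inner_codes = [c for c in outer.__code__.co_consts
               if isinstance(c, types.CodeType)]
print(inner_codes[0].co_name)   # "inner"
```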

Similarly, imports are so expensive that it makes sense to optimize them. 
A single line like import module requires the following work:

- expensive searches of the file system, looking for a module.py file 
  or a module/__init__.py package, possibly over a slow network or 
  inside zip files;
- once found, parse the file;
- compile it;
- execute it, which could be arbitrarily expensive;
- and which may require any number of new imports.

Again, imports are already optimized in Python: firstly, once a module 
has been imported the first time, the module object is cached in 
sys.modules so that subsequent imports of that same module are much 
faster: it becomes little more than a name lookup in a dict. Only if that 
fails does Python fall back on the expensive import from disk.
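
The cache is an ordinary dict and easy to observe (json here is just a convenient stdlib module to demonstrate with):

```python
import sys

import json                 # full import machinery (or already cached)
import json as json2        # repeat import: little more than a dict lookup

# Both names are bound to the one cached module object; the module
# body is not executed a second time.
print(json2 is sys.modules["json"])   # True
```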

Secondly, Python tries to cache the compiled code in a .pyc or .pyo file, 
so that parsing and compiling can be skipped next time you import from 
disk (unless the source code changes, naturally).
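
You can ask the import machinery where that cached bytecode would live ("mymodule.py" below is a hypothetical filename, and the interpreter tag in the result varies by Python version):

```python
import importlib.util

# Maps a source path to its bytecode cache path, e.g. something like
# __pycache__/mymodule.cpython-312.pyc depending on your interpreter.
print(importlib.util.cache_from_source("mymodule.py"))
```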

And even so, importing is still slow. That's the primary reason why 
Python is not suitable for applications where you need to execute lots of 
tiny scripts really fast: each invocation of the interpreter requires a 
whole lot of imports, which are slow the first time.

(Still, Python's overhead at startup time is nowhere near as expensive as 
that of Java... but Java is faster once started up.)


-- 
Steven


Re: performance critical Python features

2011-06-23 Thread Chris Angelico
On Fri, Jun 24, 2011 at 10:07 AM, Steven D'Aprano
steve+comp.lang.pyt...@pearwood.info wrote:
 On Fri, 24 Jun 2011 04:00:17 +1000, Chris Angelico wrote:

 On Fri, Jun 24, 2011 at 2:58 AM, Eric Snow ericsnowcurren...@gmail.com
 wrote:
 So, which are the other pieces of Python that really need the heavy
 optimization and which are those that don't?  Thanks.


 Things that are executed once (imports, class/func definitions) and

 You can't assume that either of those things are executed once. Consider
 this toy example:

Sure. I was talking in generalities; of course you can do expensive
operations frequently. If you wanted to, you could do this:

radius=5
circum=0
for i in range(10,1000):
    c=radius*calculate_pi_to_n_decimals(i)
    if c>circum: circum=c

Calculates the highest possible circumference of a circle of that
radius. Does this mean we now have to optimize the pi calculation
algorithm so it can be used in a tight loop? Well, apart from the fact
that this code is moronic, no. All you need to do is cache. (Although
I guess in a way that's an optimization of the algorithm. It's the
same optimization as is done for imports.)
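
That kind of caching doesn't even need to be hand-rolled; the stdlib's functools.lru_cache memoizes a function's results (the expensive() body here is a stand-in for any costly computation):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def expensive(n):
    # stand-in for an expensive computation
    return sum(i*i for i in range(n))

expensive(1000)   # computed once
expensive(1000)   # served straight from the cache
print(expensive.cache_info())
```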

But generally speaking, functions are called more often than they're
defined, especially when we're talking about tight loops. And while
your example could be written without the repeated definition:

def outer(a, b):
    x=b**2 - a**2
    return (x*a - b)*(x*b - a) - 1

results = [outer(a, b) for (a, b) in coordinate_pairs()]

(at least, I think this is the same functionality), if inner() were
recursive, that would be different. But recursive inner functions
aren't nearly as common as write-once-call-many functions.

ChrisA