Mark Shannon <m...@hotpy.org> wrote:
Dag Sverre Seljebotn wrote:
from numpy import sin
# assume sin is a Python callable and that NumPy decides to support
# our spec to also support getting a "double (*sinfuncptr)(double)".

# Our mission: avoid having the user manually import "sin" from C,
# but allow just using the NumPy object and still be fast.

# define a function to integrate
cpdef double f(double x):
    return sin(x * x) # guess on signature and use "fastcall"!

# the integrator
def integrate(func, double a, double b, int n):
    cdef double s = 0
    cdef double dx = (b - a) / n
    cdef int i  # typed loop index so Cython can generate a plain C loop
    for i in range(n):
        # This is also a fastcall, but can be cached so doesn't
        # matter...
        s += func(a + i * dx)
    return s * dx

integrate(f, 0, 1, 1000000)

There are two problems here:

 - The "sin" global can be reassigned (monkey-patched) between each call
to "f", with no way for "f" to know. Even "sin" itself could do the
reassignment. So you'd need to check for reassignment to do caching...
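The hazard is easy to show in plain Python; the module below is hypothetical, standing in for numpy in the example above (any module-level name behaves the same way):

```python
import math
import types

# Hypothetical module standing in for "numpy" in the example above.
mod = types.ModuleType("fakemod")
mod.sin = math.sin

def f(x):
    # Resolves mod.sin on *every* call. A cached fast-path would
    # have to detect the reassignment below to stay correct.
    return mod.sin(x * x)

before = f(2.0)      # uses math.sin
mod.sin = math.cos   # monkey-patch the module global
after = f(2.0)       # silently uses math.cos now

assert before == math.sin(4.0)
assert after == math.cos(4.0)
```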

Since Cython allows static typing, why not just declare that func can
treat sin as if it can't be monkey-patched?

If you want to manually declare stuff, you can always use a C function pointer too...
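For comparison, here is a sketch of that manual route in pure Python, using ctypes to grab C's sin as a raw "double (*)(double)" pointer (library lookup details are platform-dependent; the fallback name assumes glibc on Linux, and in Cython you would cdef the pointer type directly):

```python
import ctypes
import ctypes.util
import math

# Load the C math library; "libm.so.6" is a glibc/Linux fallback.
libm = ctypes.CDLL(ctypes.util.find_library("m") or "libm.so.6")

# Declare the signature: double sin(double)
c_sin = libm.sin
c_sin.restype = ctypes.c_double
c_sin.argtypes = [ctypes.c_double]

# The pointer is bound once, up front: no per-call name lookup,
# but also no way to pick up a monkey-patched replacement.
assert abs(c_sin(1.0) - math.sin(1.0)) < 1e-12
```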

Moving the load of a global variable out of the loop does seem to be a
rather obvious optimisation, if it were declared to be legal.

In case you didn't notice, there were no global-variable loads inside the loop...

You can keep chasing this, but there are *always* cases where such optimizations don't apply (and you need to save the situation by manual typing).

Anyway: we should really discuss Cython on the Cython list. If my motivating example wasn't good enough for you, there's really nothing I can do.

Some rough numbers:

 - The tp_flags hack adds about 2 ns of overhead (something similar
with a metaclass; the problem there is more how to synchronize that
metaclass across multiple 3rd-party libraries)

Does your approach handle subtyping properly?

Not really.


 - Dict lookup 20 ns

Did you time _PyType_Lookup() ?

No, I didn't get around to it yet (thanks for pointing it out). (Though the GIL requirement is an issue for Cython too.)
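For what it's worth, the dict-lookup figure is easy to re-measure with timeit (numbers are machine-dependent; the point is only the order of magnitude relative to the ~35 ns sin call):

```python
import timeit

d = {"sin": object()}
n = 1_000_000

# Cost of one plain dict hit, in nanoseconds
per_lookup_ns = timeit.timeit("d['sin']", globals={"d": d}, number=n) / n * 1e9

# Cost of one call to math.sin, for comparison
per_sin_ns = timeit.timeit("sin(0.5)", setup="from math import sin", number=n) / n * 1e9

print(f"dict lookup: {per_lookup_ns:.1f} ns, sin call: {per_sin_ns:.1f} ns")
```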

 - The sin function is about 35 ns, and "f" is probably only 2-3 ns;
there could very easily be multiple such functions, defined in
different modules, chained together in order to build up a formula.


Such micro-timings are meaningless, because the working set often tends
to fit in the hardware cache. A level 2 cache miss can take 100s of
cycles.

I find this sort of response arrogant -- do you know the details of every use case for a programming language under the sun?

Many Cython users are scientists, and in scientific computing in particular you *really* have the whole range of problems and working sets. Honestly.

In some codes you only really care about the speed of the disk controller. In other cases you can spend *many seconds* working almost only in L1 or perhaps L2 cache (for instance when integrating ordinary differential equations in a few variables, which is not entirely different in nature from the example I posted). Then, those many seconds are replicated many million times for different parameters on a large cluster, and a 2x speedup translates directly into large amounts of saved money.

Also, with numerical codes you block up the problem so that loads to L2 are amortized over sufficient FLOPs (when you can).

Every time Cython becomes able to do stuff more easily in this domain, people thank us that they didn't have to dig up Fortran but can stay closer to Python.

Sorry for going off on a rant. I find that people will give well-meant advice about performance, but that advice is often just generalizing from computer programs in entirely different domains (web apps?), and sweeping generalizations have a way of giving the wrong answer.

Dag
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev