> What's the size of the performance difference between Cython and a C
> Extension Module? Obviously this depends on how well each is written, and
> how many concessions to maintainability are made, but just as rough
> estimates?
> If it's not a teensy difference, what is the main thing that accounts for
> that difference? Is it perhaps the implicit memory management?
>
I don't think this is something you could give a clear-cut answer to without knowing what you were doing in your C extension module, but I'd say the answer should be something along the lines of "probably negligible, unless you were going to perform crazy black magic in your C extension." And even if you were going to do something crazy -- you could always create a first pass with Cython, and then tweak the parts where the speed was that essential.

For instance, if you're planning on interfacing with a C library, and creating/manipulating some C/C++ data structures, then you should get basically the same performance (unless you got tricked into converting back and forth between C and Python, but avoiding that should fall under the "how well each is written" bit in your question).

If you're planning on mostly working with Python objects in a fairly safe way, then the answer is probably "there's still a small gap, but we're working on closing it fast." There are definitely situations where we have enough information to do things like avoid typechecks and generate tighter code, but I don't think we always take advantage of everything that's available. We also do a lot of things you probably wouldn't take the time to do yourself -- for instance, we generate custom argument parsing code, which some people have definitely seen some performance improvements from.

However, there's one class of things you could do directly in a C extension that we don't, which is "cheat": you could write a function that takes an arbitrary Python object, but decide that you "know" it will only ever get called with a dict, and directly dispatch to various PyDict_* macros, or access parts of the struct directly. Or you could decide some object doesn't get reused, and just side-effect it instead of allocating a new object to store a result, that kind of thing.
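
To make the second kind of cheat a bit more concrete, here is a rough sketch in plain Python rather than C. It is only an analogy for the trade-off, not what a real extension module would look like, and the names scaled_copy and scale_in_place are made up for illustration.

```python
def scaled_copy(values, factor):
    """Safe version: allocate a new list and leave the input untouched."""
    return [v * factor for v in values]


def scale_in_place(values, factor):
    """'Cheating' version: assume nobody else uses `values` and overwrite it."""
    for i in range(len(values)):
        values[i] *= factor
    return values


original = [1, 2, 3]
alias = original                      # some other code still holds this list

safe = scaled_copy(original, 10)      # new list; original and alias unchanged
fast = scale_in_place(original, 10)   # no new list, but alias changes as well
```

The in-place version skips an allocation, but it is only correct while the "nobody else uses this object" assumption holds; here the holder of alias sees its data change underneath it.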
You could definitely create some faster, more fragile code this way -- you'd start hitting lots of segfaults if your assumptions were off. (This was the "black magic" I referred to above.)

Maybe that's not directed/helpful enough -- if you give us an idea of what kind of C extension you were thinking about writing, we could probably give a much better answer about why Cython is the right tool for the job. ;)

-cc
_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev
