David Cournapeau, 03.02.2010 14:45:
> On Wed, Feb 3, 2010 at 9:11 PM, Stefan Behnel wrote:
>> Dag Sverre Seljebotn, 03.02.2010 12:20:
>>> We should have an option to split Cython-generated code into reusable
>>> include-files at some point too, and perhaps make use of precompiled
>>> headers.
>> Given that the largest part of code in a Cython generated C file is due to
>> real user code, I doubt that there is much Cython can improve by spitting
>> out multiple files here.
> 
> I don't think that's always true. Cython quickly generates files which
> are about 1 Mb of content. For relatively simple code (with a couple
> of python-visible functions), generated files are 5000 LOC, whereas a
> hand-made file would be much smaller.

What I meant was that the amount of code and the compiler run time are not
due to code that Cython could extract into header files. Cython generates a
lot of (type) special-casing code that you'd never write manually, e.g. for
loops or for function argument extraction. The result is that your
hand-written code using the C-API simply isn't as fast as the code that
Cython generates (except for selected cases where Cython isn't good enough
yet).
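
To make the "function argument extraction" point a bit more concrete, here is
a rough sketch in plain C (the names are made up for illustration, this is
not Cython's actual output): a hand-written wrapper typically funnels
everything through the generic PyArg_ParseTuple() machinery, whereas
generated code can unpack the argument tuple and convert each value with a
dedicated fast path.

    #include <Python.h>

    /* Hand-written version: generic, format-string driven argument parsing. */
    static PyObject *
    add_handwritten(PyObject *self, PyObject *args)
    {
        long a, b;
        if (!PyArg_ParseTuple(args, "ll", &a, &b))  /* format interpreted at runtime */
            return NULL;
        return PyLong_FromLong(a + b);
    }

    /* Sketch of the kind of specialised unpacking a code generator can emit
     * instead: check the tuple size once, then convert each argument with a
     * direct call rather than going through the format-string interpreter. */
    static PyObject *
    add_specialised(PyObject *self, PyObject *args)
    {
        long a, b;
        if (PyTuple_GET_SIZE(args) != 2) {
            PyErr_SetString(PyExc_TypeError, "add() takes exactly 2 arguments");
            return NULL;
        }
        a = PyLong_AsLong(PyTuple_GET_ITEM(args, 0));
        if (a == -1 && PyErr_Occurred()) return NULL;
        b = PyLong_AsLong(PyTuple_GET_ITEM(args, 1));
        if (b == -1 && PyErr_Occurred()) return NULL;
        return PyLong_FromLong(a + b);
    }

Multiply that kind of specialisation by every typed function, loop and
conversion in a module and you get a lot of C code that has no counterpart
in header-extractable boilerplate.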

I would also guess that the compile time is substantially increased by the
number of inline functions that Cython uses for type conversions. They are
obviously used all over the place, and each occurrence needs to be optimised
by the C compiler into exactly the code that is required in its specific
context. So the final amount of code that the C compiler needs to handle is
actually a lot larger than the code that you see in the C file. Header files
won't change that.
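
As a purely illustrative sketch of what such an inline conversion helper
looks like (the name and exact shape are hypothetical, not Cython's real
utility code):

    #include <Python.h>

    /* Converts a Python object to a C long, with a fast path for plain
     * Python 2 ints and a generic fallback.  Being "static inline", its
     * body is expanded at every call site and optimised separately for
     * each surrounding context. */
    static inline long demo_as_long(PyObject *obj, int *err)
    {
        *err = 0;
    #if PY_MAJOR_VERSION < 3
        if (PyInt_CheckExact(obj))           /* fast path: cannot fail */
            return PyInt_AS_LONG(obj);
    #endif
        {
            long value = PyLong_AsLong(obj); /* generic fallback */
            if (value == -1 && PyErr_Occurred())
                *err = 1;
            return value;
        }
    }

    /* Every caller like this one pulls in its own copy of the helper's
     * body for the C compiler to optimise. */
    static long twice(PyObject *obj, int *err)
    {
        return 2 * demo_as_long(obj, err);
    }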

But that's obviously just guessing. If someone is willing to profile gcc
while compiling a Cython-generated file, I'd be very happy to hear the
result.


> I checked on one simple cython
> file in my talkbox scikits, and the actual code is < 1/3 of the
> generated C code.

I agree that there is a relatively large overhead for small Cython files. I
also agree that some of that code is redundant and could be extracted into
header files. What I doubt is that you'd get a major speed-up from
extracting that code. The C compiler would still have to handle it,
regardless of where it came from. Parsing is only a tiny part of what the C
compiler does these days. Optimisation takes several times longer.


>> For most projects where build time actually matters, I bet there's a lot
>> more to gain from getting distutils to build multiple modules in parallel,
>> than from reducing the amount of code that is built in each step.
> 
> Actually, doing so in practice for python extensions is quite
> difficult because of the lack of C awareness of symbols sharing (how
> to access symbols between object files without polluting the resulting
> shared library).
> 
> It took me maybe one full week of work to be able to do that in NumPy,
> to a point where Compile/Test/Debug cycles are a couple of seconds
> instead of ~ 1 minute.

To give some numbers for lxml: both clang and gcc compile the C source file
of the main module in less than 10 seconds without optimisations (empty
CFLAGS/OPT), with clang about two seconds ahead. That's about 160K lines of
C code, generated from 18K lines of Cython code. I do not find ten seconds
an unacceptable overhead in a compile-test cycle for 18K lines of original
source code. When I pass -O3, however, the compile time jumps to some 40
seconds, i.e. more than a factor of four, which is quite noticeable in
comparison.

BTW, I know that this is not directly comparable to other projects that use
several separate modules. I guess there is more overhead if you have a
larger number of small Cython modules. But that should be mitigated by
running partial builds (which you'd certainly do during a compile-test cycle).

Stefan