On Thu, 20 May 2021 at 04:58, Terry Reedy <tjre...@udel.edu> wrote:
>
> On 5/13/2021 4:18 AM, Mark Shannon wrote:
>
> > If a program does 95% of its work in a C++ library and 5% in Python, it
> > can easily spend the majority of its time in Python because CPython is a
> > lot slower than C++ (in general).
>
> I believe the ratio for the sort of numerical computing getting bogus
> complaints is sometimes more like 95% of *time* in compiled C and only,
> say, 5% of *time* in the Python interpreter. So even if the interpreter
> ran instantly, it would make almost no difference -- for such applications.

Not necessarily, because a faster interpreter opens up new options that perhaps don't involve the same C routines. The situation right now is that it is often faster to do more "computation" than needed using efficient C routines than to do precisely what is needed in bare Python. If the bare Python part becomes faster then maybe you don't need the C routine at all.

To give a concrete example, in SymPy I have written a pure Python implementation of typed sparse matrices (this is much faster than the public Matrix class, so don't compare with that). I would like to use the flint library to speed up some of these matrix calculations, and the flint library has a highly optimised C/assembly implementation of dense matrices of arbitrary precision integers. Which of these implementations is faster for e.g. matrix multiplication depends on how sparse the matrix actually is.

If I have a large matrix, say 1000 x 1000, and only 1% of the elements are nonzero, then the pure Python sparse implementation is faster (it can be much faster as the density reduces, since it does not have the same big-O characteristics). On the other hand, for fully dense matrices where all elements are nonzero, the flint implementation is consistently around 100x faster. The break-even point where both implementations take equal time is at about 5% density.

What that means is that for a 1000 x 1000 matrix with 10% of elements nonzero it is faster to ask flint to construct an enormous dense matrix and perform a huge number of arithmetic operations (mostly involving zeros) than it is to use a pure Python implementation that has more favourable asymptotic complexity and theoretically computes the result with 100x fewer arithmetic "operations". In this situation there is a sliding scale: the faster the Python interpreter gets, the less often you benefit from calling the C routine in the first place.

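To make the "100x fewer operations" figure concrete, here is a minimal sketch that counts multiplications rather than measuring time. It is not the SymPy implementation and does not call flint: the dict-of-dicts layout and the names random_sparse, sparse_matmul, to_dense and dense_matmul are hypothetical, chosen only for illustration, and the matrix is kept small (60 x 60) so that it runs quickly.

```python
import random


def random_sparse(n, density, rng):
    """Return an n x n integer matrix as {row: {col: value}}, storing only nonzeros."""
    matrix = {}
    for i in range(n):
        row = {j: rng.randint(1, 9) for j in range(n) if rng.random() < density}
        if row:
            matrix[i] = row
    return matrix


def sparse_matmul(a, b):
    """Multiply two dict-of-dicts matrices; return (product, multiplication count)."""
    product = {}
    mults = 0
    for i, a_row in a.items():
        out_row = {}
        for k, a_ik in a_row.items():
            b_row = b.get(k)
            if b_row is None:
                continue  # row k of b is entirely zero: nothing to do
            for j, b_kj in b_row.items():
                out_row[j] = out_row.get(j, 0) + a_ik * b_kj
                mults += 1
        # Drop any entries that cancelled to zero so only nonzeros are stored.
        out_row = {j: v for j, v in out_row.items() if v != 0}
        if out_row:
            product[i] = out_row
    return product, mults


def to_dense(m, n):
    """Expand a dict-of-dicts matrix into a list of lists, zeros included."""
    return [[m.get(i, {}).get(j, 0) for j in range(n)] for i in range(n)]


def dense_matmul(a, b):
    """Schoolbook dense product: always n**3 multiplications, zeros included."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


n, density = 60, 0.1
rng = random.Random(0)
a = random_sparse(n, density, rng)
b = random_sparse(n, density, rng)

product, sparse_mults = sparse_matmul(a, b)
dense_mults = n ** 3

print(f"sparse multiplications: {sparse_mults}")
print(f"dense multiplications:  {dense_mults}")
print(f"ratio: {dense_mults / sparse_mults:.0f}x")
```

For uniformly random sparsity with density d, the expected sparse count is about n^3 * d^2, so at 10% density the ratio should come out in the region of 100x. The sketch only counts operations; the fact that the dense C routine still wins at that density is down to how much each of those operations costs in the interpreter.
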
Although this is a very specific example, it illustrates something that I see very often: while the efficient C routines can make things "run at the speed of C", you can often end up optimising things to use an approach that would seem inefficient if you were working in C directly. This happens because it works out faster from the perspective of pure Python code that is encumbered by interpreter overhead and has a limited range of C routines to choose from. If the interpreter overhead is lower then the options to choose from are improved.

Also, for many applications it is much easier for the programmer to write an algorithm directly in loops than to come up with a vectorised version based on e.g. numpy arrays. Vectorisation as a way of optimising code is actually work for the programmer. There is another tradeoff here, which is not about C speed vs Python speed but about programmer time vs CPU time. If a straightforward Python implementation is already "fast enough" then you don't need to spend time thinking about how to translate it into something that would possibly run faster (potentially at the expense of code readability).

In the case of SymPy/flint, if the maximum speed gain of flint was only 10x then I might not bother using it at all, to avoid the complexity of having multiple implementations to choose from, external dependencies etc.

--
Oscar
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/BBNEBYO42RAXUM526ZUA65SAQTKCS3QD/
Code of Conduct: http://python.org/psf/codeofconduct/