On Thu, 20 May 2021 at 04:58, Terry Reedy <tjre...@udel.edu> wrote:
>
> On 5/13/2021 4:18 AM, Mark Shannon wrote:
>
> > If a program does 95% of its work in a C++ library and 5% in Python, it
> > can easily spend the majority of its time in Python because CPython is a
> > lot slower than C++ (in general).
>
> I believe the ratio for the sort of numerical computing getting bogus
> complaints is sometimes more like 95% of *time* in compiled C and only,
> say, 5% of *time* in the Python interpreter.  So even if the interpreter
> ran instantly, it would make also no difference -- for such applications.

Not necessarily because if the interpreter is faster then it opens up
new options that perhaps don't involve the same C routines. The
situation right now is that it is often faster to do more
"computation" than needed using efficient C routines rather than do
precisely what is needed in bare Python. If the bare Python part
becomes faster then maybe you don't need the C routine at all.

To give a concrete example, in SymPy I have written a pure Python
implementation of typed sparse matrices (this is much faster than the
public Matrix class so don't compare with that). I would like to use
the flint library to speed up some of these matrix calculations and
the flint library has a highly optimised C/assembly implementation of
dense matrices of arbitrary precision integers. Which of these
implementations is faster for e.g. matrix multiplication depends on
how sparse the matrix actually is. If I have a large matrix say 1000 x
1000 and only 1% of the elements are nonzero then the pure Python
sparse implementation is faster (it can be much faster as the density
reduces since it does not have the same big-O characteristics). On the
other hand for fully dense matrices where all elements are nonzero the
flint implementation is consistently around 100x faster.

The break even point where both implementations take equal time is
around about 5% density. What that means is that for a 1000 x 1000
matrix with 10% of elements nonzero it is faster to ask flint to
construct an enormous dense matrix and perform a huge number of
arithmetic operations (mostly involving zeros) than it is to use a
pure Python implementation that has more favourable asymptotic
complexity and theoretically computes the result with 100x fewer
arithmetic "operations". In this situation there is a sliding scale
where the faster the Python interpreter gets the less often you
benefit from calling the C routine in the first place.

Although this is a very specific example it illustrates something that
I see very often which is that while the efficient C routines can make
things "run at the speed of C" you can often end up optimising things
to use an approach that would seem inefficient if you were working in
C directly. This happens because it works out faster from the
perspective of pure Python code that is encumbered by interpreter
overhead and has a limited range of C routines to choose from. If the
interpreter overhead is less then the options to choose from are
improved.

Also for many applications it is much easier for the programmer to
write an algorithm directly in loops rather than coming up with a
vectorised version based on e.g. numpy arrays. Vectorisation as a way
of optimising code is actually work for the programmer. There is
another tradeoff here which is not about C speed vs Python speed but
about programmer time vs CPU time. If a straight-forward Python
implementation is already "fast enough" then you don't need to spend
time thinking about how to translate that into something that would
possibly run faster (potentially at the expense of code readability).
In the case of SymPy/flint if the maximum speed gain of flint was only
10x then I might not bother using it at all to avoid the complexity of
having multiple implementations to choose from and external
dependencies etc.

--
Oscar
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/BBNEBYO42RAXUM526ZUA65SAQTKCS3QD/
Code of Conduct: http://python.org/psf/codeofconduct/

Reply via email to