Tim Peters <t...@python.org> added the comment:
The test case here is a = (1 << 100000000) - 1, a solid string of 100 million 1 bits. The goal is to convert to a decimal string. Methods: native: str(a) numeral: the Python numeral() function from bpo-3451's div.py after adapting to use the Python divmod_fast() from the same report's fast_div.py. todecstr: from the Python file attached to this report. gmp: str() applied to gmpy2.mpz(a). Timings: native: don't know; gave up after waiting over 2 1/2 hours. numeral: about 5 1/2 minutes. todecstr: under 30 seconds. (*) gmp: under 6 seconds. So there's room for improvement ;-) But here's the thing: I've lost count of how many times someone has whipped up a pure-Python implementation of a bigint algorithm that leaves CPython in the dust. And they're generally pretty easy in Python. But then they die there, because converting to C is soul-crushing, losing the beauty and elegance and compactness to mountains of low-level details of memory-management, refcounting, and checking for errors after every tiny operation. So a new question in this endless dilemma: _why_ do we need to convert to C? Why not leave the extreme cases to far-easier to write and maintain Python code? When we're cutting runtime from hours down to minutes, we're focusing on entirely the wrong end to not settle for 2 minutes because it may be theoretically possible to cut that to 1 minute by resorting to C. (*) I hope this algorithm tickles you by defying expectations ;-) It essentially stands `numeral()` on its head by splitting the input by a power of 2 instead of by a power of 10. As a result _no_ divisions are used. But instead of shifting decimal digits into place, it has to multiply the high-end pieces by powers of 2. That seems insane on the face of it, but hard to argue with the clock ;-) The "tricks" here are that the O(log log n) powers of 2 needed can be computed efficiently in advance of any splitting, and that all the heavy arithmetic is done _in_ the `decimal` module, which implements fancier-than-Karatsuba multiplication and whose values can be converted to decimal strings very quickly. ---------- _______________________________________ Python tracker <rep...@bugs.python.org> <https://bugs.python.org/issue46558> _______________________________________ _______________________________________________ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com