Tim Peters <[email protected]> added the comment:
I don't know that there's a good use case for this. For floating addition,
tree-based summation can greatly reduce total roundoff error, but that's not
true of floating multiplication.
The advantage for prod(range(2, 50001)) doesn't really stem from turning it
into a tree reduction; rather, for that specific input sequence, "the tree"
happens to do a decent job of keeping multiplicands more-or-less balanced in
size (bit length), which eventually allows Karatsuba multiplication to come
into play. If CPython didn't implement Karatsuba multiplication, tree
reduction wouldn't improve speed: if the inputs are in sequence xs, then
regardless of how school-book multiplication is grouped or rearranged, the
time needed is proportional to
sum(a * b for a, b in combinations([x.bit_length() for x in xs], 2))
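For concreteness, the kind of pairwise tree reduction under discussion might
be spelled out like this (just an illustrative sketch, not a proposed API):

    from math import prod  # only for the sanity check at the end

    def tree_prod(xs):
        # Multiply adjacent pairs, halving the list each pass.  For
        # range(2, 50001) this happens to keep multiplicands roughly
        # balanced in bit length, so Karatsuba kicks in on the big
        # partial products.
        xs = list(xs)
        if not xs:
            return 1
        while len(xs) > 1:
            pairs = [a * b for a, b in zip(xs[::2], xs[1::2])]
            if len(xs) % 2:          # odd leftover passes through
                pairs.append(xs[-1])
            xs = pairs
        return xs[0]

    assert tree_prod(range(2, 100)) == prod(range(2, 100))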
So if the real point is to speed large products of integers, a different
approach is wanted: keep track of intermediate products' bit lengths, and at
each step strive to multiply partial products near the same length. This will
reliably get Karatsuba into play if possible, and also caters to the fact
that Karatsuba is most valuable on multiplicands of the same bit length. When
any of that happens from blind tree reduction, it's down to luck.
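One natural way to spell that out (a sketch only; always multiplying the two
smallest partial products is one reading of "near the same length", not the
only one) is a min-heap keyed on bit length:

    import heapq

    def balanced_prod(xs):
        # Repeatedly multiply the two smallest partial products (by bit
        # length), so similarly-sized operands tend to meet each other.
        heap = [(x.bit_length(), x) for x in xs]
        if not heap:
            return 1
        heapq.heapify(heap)
        while len(heap) > 1:
            _, a = heapq.heappop(heap)
            _, b = heapq.heappop(heap)
            p = a * b
            heapq.heappush(heap, (p.bit_length(), p))
        return heap[0][1]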
I've seen decent results from doing that with a fixed, small array A, which
(very roughly speaking) combines "the next" integer `i` to multiply with the
array entry A[i.bit_length().bit_length()] (and continues that with the new
partial product, & so on, until hitting an array slot containing 1).
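Very roughly, and glossing over details (the slot count and the exact cascade
rule here are illustrative, not lifted from real code), something like:

    def bucket_prod(xs):
        # Slot k holds a partial product whose bit length has bit length k.
        A = [1] * 64    # 64 slots covers astronomically large products
        for x in xs:
            while True:
                k = x.bit_length().bit_length()
                if A[k] == 1:
                    A[k] = x        # slot free: park the partial product
                    break
                # Slot occupied: combine, empty the slot, and cascade the
                # (larger) combined product toward higher slots.
                x, A[k] = x * A[k], 1
        # Final pass combines the slots, smallest first.
        result = 1
        for a in A:
            result *= a
        return result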
----------
_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue46868>
_______________________________________