Time complexity of Knuth divide a/b=q approximately the same as complexity of school multiplication b*q and is length(b)*length(q) = length(b) * (length(a) - length(b)). So the new heuristic seems reasonable for me.
On Sat, Nov 23, 2013 at 1:47 AM, Brian Burkhalter < brian.burkhal...@oracle.com> wrote: > Hello, > > We've been doing fairly extensive micro-benchmark testing of the effect of > adjusting the various algorithm crossover thresholds in BigInteger (cf. > https://bugs.openjdk.java.net/browse/JDK-8022181). There'll be another > post on that particular topic in the near future. This testing did however > expose another topic for which an issue is not yet (but likely will be) on > file, i.e., the heuristic for determining when to use Burnikel-Ziegler > divide and conquer division instead of "standard" division. > > At present the test used is > > if divisor_int_length < BZ_THRESHOLD or dividend_int_length < > BZ_THRESHOLD then > use standard division > else > use B-Z division > > (where BZ_THRESHOLD is currently 50). > > Let {N,D} represent a dividend-divisor pair with the int-length of the > former being N and that of the latter being D. The following cases were > among those initially tested: {T,2T}, {T,T+2}, {T,T}, {T+2,T}, and {2T,T} > where T is the BZ threshold. In all but the last case, where the dividend > is twice the length of the divisor, a (sometimes serious) performance > regression was observed. > > The implementation was reviewed along with the original paper [1]. The > implementation is accurate with respect to the paper as had already been > verified some months ago [2]. It was observed however that something > interesting happens for the case where the int-length of the dividend A is > equal to that of the divisor B. For this situation, the algorithm parameter > t (step 5 in algorithm 3 in the paper) becomes 2. This causes A to be > virtually padded out to a 2T-length value [0,A]. The net result is that the > BZ algorithm ends up being applied in effect to a pair {2T,T} which is > significantly slower than using the standard algorithm on the pair {T,T}. > This appears to be primarily due to the algorithm overhead rather than a > length dependent problem. For the {T,T} case, the following recursive call > sequence of significant operations ensues: > > divide > divideBZ > divide2n1n > divide3n2n > divide2n1n > divideKnuth > multiply > subtract > divide3n2n > divide2n1n > divideKnuth > multiply > subtract > add > > Despite the fact that some of the method calls reduce to simple operations > like dividing zero by non-zero, dividing something by itself, or > multiplying something by zero or one, the algorithm overhead causes the > performance to be worse than had this sequence been used: > > divide > divideKnuth > > Through micro-benchmark testing it was discovered that, given the original > BZ_THRESHOLD criterion is met, for some value C the BZ algorithm will start > to outperform the standard algorithm for the case {T+C,T}, i.e., the > dividend is C ints longer than the divisor. Testing supports the value of > the offset C being independent of the value of the BZ threshold. This > suggests that the following heuristic would be more appropriate for this > algorithm: > > if divisor_int_length < BZ_THRESHOLD or dividend_int_length - > divisor_int_length < BZ_OFFSET_THRESHOLD then > use standard division > else > use B-Z division > > Any comments on this alternative heuristic would be appreciated as I would > not be in the least bit surprised to have missed some fine point altogether. > > Thanks, > > Brian > > [1] Christoph Burnikel and Joachim Ziegler, "Fast Recursive Division", > Max-Planck-Institut fuer Informatik Research Report MPI-I-98-1-022, > http://www.mpi-sb.mpg.de/~ziegler/TechRep.ps.gz. > [2] > http://mail.openjdk.java.net/pipermail/core-libs-dev/2013-July/018922.html