Dear Niels, > > page 1: the division instruction is now much faster than before on modern > > processors > > According to https://gmplib.org/~tege/x86-timing.pdf, they're still an > order of magnitute slower than multiplication. E.g 86 vs 3 cycles on > Intel skylake. And in addition, multiplication is piplined, division > likely isn't. > > Do you have any suggestion for better wording than "vastly slower" ?
maybe give the number of cycles and refer to x86-timing.pdf? > > page 3, line 5 of Section 3: <d_1, d_0 0> -> <d_1, d_0, 0> > > I think <d_1, d_0> is right. yes sorry the extra 0 should have been removed (same in the algorithm) > > page 4, line 27: for the upper bound, I agree you can subtract the maximal > > value of the u_0 term in the proof of Theorem 3 from [4], but this gives > > \beta-1, not \beta. > > This is a bit subtle, maybe I should explicitly repeat the proof form > [4] with needed modifications. My argument is that the proof in [4] is > of the form > > \tilde r = ... + u0 < ... + \beta <= c = max (\beta^2 - D, q_0 \beta) > > But if u_0 < \beta is the only thing that lets us conclude that \tilde r > < c, with strict inequality, then there's an off-by-one error here. > > Now there are multiple bounds involved; to get \tilde r = c we'd also > need something like u_1, u_0 and q_0 maximal and d_0 = 0. if you need to save \beta with respect to the proof of [4], yes maybe you need to repeat that proof to explain how you save the extra +1. > > I wonder whether we can get rid of that term, or always subtract -1. > > In the latter case (assuming the proof with the p_0 > 0 extra term is > > correct), > > the only problem would be when p_0 = 0. > > I have looked at that, so far without success. If I remember correctly, > always subtracting -1 can be treated in the analysis as extending the > single-word range for r_0 to 0 <= r_0 <= \beta. But that breaks the > analysis of the R' < 0 case. We have > > {r_1, r_0} >= q_0 \beta > > We want to conclude r_1 >= q_0, but that no longer holds if we allow r_0 > = \beta. To make it work, we'd need to streighten the lower bound > > R' >= q_0 \beta - \beta^2 > > Hmm... but from another look at the proof from [4], seems to give us > strict inequality > > R' > q_0 \beta - \beta^2 > > with no extra effort. So maybe we can do that. I'll have a look. The exhaustive search up to \beta=2^7 says that always subtracting 1 seems to work, and the lower bound even improves to -2\beta < R' instead of -2\beta <= R'. Paul PS: an extra typo in the new version: page 5, lines 4 and 5, d_o should be d_0 _______________________________________________ gmp-devel mailing list gmp-devel@gmplib.org https://gmplib.org/mailman/listinfo/gmp-devel