Re: udiv_qr_3by2 vs divappr

2018-08-28 Thread Niels Möller
paul zimmermann writes:
> if you need to save \beta with respect to the proof of [4], yes maybe you need
> to repeat that proof to explain how you save the extra +1.

I think we can make it work. We have the reciprocal v, and a corresponding "remainder" K = \beta^3 - {d_1, d_0} (\beta + v) in
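For readers who want to experiment with the quantities named above, here is a minimal sketch. It assumes the usual 3/2 reciprocal definition v = floor((\beta^3 - 1) / {d_1, d_0}) - \beta for a normalized divisor (d_1 >= \beta/2); that definition is not stated in this message and is taken as an assumption. The function name and the small word base in the usage note are hypothetical, chosen only for illustration.

```python
def reciprocal_3by2(d1, d0, beta):
    """Return (v, K) for the two-word divisor D = {d1, d0} in base beta.

    Assumption (not from the message): v is the usual 3/2 reciprocal,
    v = floor((beta^3 - 1) / D) - beta, with D normalized.
    K is the "remainder" from the message: K = beta^3 - D * (beta + v).
    """
    if not (beta // 2 <= d1 < beta and 0 <= d0 < beta):
        raise ValueError("divisor must be normalized: beta/2 <= d1 < beta")
    D = d1 * beta + d0
    v = (beta**3 - 1) // D - beta
    K = beta**3 - D * (beta + v)
    return v, K
```

With a toy base such as beta = 16 one can loop over every normalized divisor and observe that v fits in one word and that K lies in the range 1..D, which follows from the floor in the assumed definition of v.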

Re: udiv_qr_3by2 vs divappr

2018-08-28 Thread paul zimmermann
Dear Niels,

it works with r = (u_0 - q d_1 - p_1 - 1) \bmod \beta on line 6 in all cases, assuming it works with -1 replaced by -[p_0 > 0]. We only need to check the case p_0 = 0. p_0 = 0 means that q d_0 is divisible by \beta, i.e., R' is a multiple of \beta. Let still be the two low words

Re: udiv_qr_3by2 vs divappr

2018-08-28 Thread paul zimmermann
Dear Niels,

> > page 1: the division instruction is now much faster than before on modern
> > processors
>
> According to https://gmplib.org/~tege/x86-timing.pdf, they're still an
> order of magnitude slower than multiplication. E.g. 86 vs 3 cycles on
> Intel Skylake. And in addition,