ni...@lysator.liu.se (Niels Möller) writes: > To get going, I've written C implementations of mpn_div_qr_1n_pi1 and > mpn_divf_qr1n_pi1, and made divrem_1 call them.
Below, also an mpn_div_qr_1, using these primitives (and with some inspiration from divrem_1). For return value, I use the type typedef struct { mp_limb_t r; mp_limb_t qh; } gmp_qr_1_t; The order here is a micro-optimization. I expect that on most ABI:s, for the typical code fragment gmp_qr_1_t res; ... res.r = mpn_foo (...); return res; the return value from mpn_foo will already be in the right register, and only qh needs to be moved from some callee-save register to the second return value register. Are there any problems with using a struct return values like here, in terms of performance, portability, style...? If you agree this makes sense, I'm considering checking in the div_qr_1* functions fairly soon (maybe divrem_1 changes can wait until div_qr_1*_pi1 has settled. The interface should allow for additional precomputed values). I also found some old x86_64 assembly code for the new div_qr_1 algorithm. With perfect scheduling, it could run at 10 cycles on opteron (the main cpu of interest at the time). Main loop is 28 instructions, two independent mul, and decoder limited). Regards, /Niels /* mpn_div_qr_1 -- mpn by limb division. Contributed to the GNU project by Niels Möller and Torbjörn Granlund Copyright 1991, 1993, 1994, 1996, 1998, 1999, 2000, 2002, 2003, 2013 Free Software Foundation, Inc. This file is part of the GNU MP Library. The GNU MP Library is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version. The GNU MP Library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details. You should have received a copy of the GNU Lesser General Public License along with the GNU MP Library. If not, see http://www.gnu.org/licenses/. */ #include "gmp.h" #include "gmp-impl.h" #include "longlong.h" #ifndef DIV_QR_1_NORM_THRESHOLD #define DIV_QR_1_NORM_THRESHOLD 3 #endif #ifndef DIV_QR_1_UNNORM_THRESHOLD #define DIV_QR_1_UNNORM_THRESHOLD 3 #endif #if GMP_NAIL_BITS > 0 #error Nail bits not supported #endif /* Divides {up, n} by d. Writes the n-1 low quotient limbs at {qp, * n-1}. Returns remainder and high quotient limb. */ gmp_qr_1_t mpn_div_qr_1 (mp_ptr qp, mp_srcptr up, mp_size_t n, mp_limb_t d) { gmp_qr_1_t res; unsigned cnt; mp_limb_t uh; ASSERT (n > 0); ASSERT (d > 0); if (d & GMP_NUMB_HIGHBIT) { /* Normalized case */ mp_limb_t dinv, q; uh = up[--n]; q = (uh >= d); res.qh = q; uh -= (-q) & d; if (BELOW_THRESHOLD (n, DIV_QR_1_NORM_THRESHOLD)) { cnt = 0; plain: while (n > 0) { mp_limb_t ul = up[--n]; udiv_qrnnd (qp[n], uh, uh, ul, d); } res.r = uh >> cnt; return res; } invert_limb (dinv, d); res.r = mpn_div_qr_1n_pi1 (qp, up, n, uh, d, dinv); return res; } else { /* Unnormalized case */ mp_limb_t dinv, ul; if (! UDIV_NEEDS_NORMALIZATION && BELOW_THRESHOLD (n, DIV_QR_1_UNNORM_THRESHOLD)) { uh = up[--n]; udiv_qrnnd (res.qh, uh, 0, uh, d); cnt = 0; goto plain; } count_leading_zeros (cnt, d); d <<= cnt; #if HAVE_NATIVE_div_qr_1u_pi1 /* FIXME: Call loop doing on-the-fly normalization */ #endif /* Shift up front, use qp area for shifted copy. A bit messy, since we have only n-1 limbs available, and shift the high limb manually. */ uh = up[--n]; ul = (uh << cnt) | mpn_lshift (qp, up, n, cnt); uh >>= (GMP_LIMB_BITS - cnt); if (UDIV_NEEDS_NORMALIZATION && BELOW_THRESHOLD (n, DIV_QR_1_UNNORM_THRESHOLD)) { udiv_qrnnd (res.qh, uh, uh, ul, d); up = qp; goto plain; } invert_limb (dinv, d); udiv_qrnnd_preinv (res.qh, uh, uh, ul, d, dinv); res.r = mpn_div_qr_1n_pi1 (qp, qp, n, uh, d, dinv) >> cnt; return res; } } -- Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26. Internet email is subject to wholesale government surveillance. _______________________________________________ gmp-devel mailing list gmp-devel@gmplib.org http://gmplib.org/mailman/listinfo/gmp-devel