Re: mpn_mulhigh_basecase for Broadwell

Albin Ahlbäck Tue, 13 Feb 2024 07:47:38 -0800

It is understandable that it may not be used in GMP, it was mainlydeveloped to reduce the time for multiplications of two interval numbers.

The returned limb in this case can be used to "extend the precision ofthe produced result"/reduce the produced interval compared to whensimply using MPFR's mulhigh.

Now, MPFR does not deal with interval arithmetic, but for intervalarithmetic where we can afford a somewhat more sloppy interval, thiswill speed up computations compared to simply using mpn_mul_basecase.For profiling results (comparing against mpn_mul_basecase andmpfr_mulhigh_n) on Zen 3, seehttps://github.com/flintlib/flint/pull/1768#issuecomment-1930356804.

More importantly, this function probably reduces the number ofreallocations as we do not require a destination with `2 * n' limbs, butrather `n' limbs.


Best,
Albin

On 2/11/24 19:35, marco.bodr...@tutanota.com wrote:

Thank you Albin,

currently we don't have any mulhi function, so we didn't decide apossible interface for it.There are some comments in the code where a full mul is used, but some"mulhi" function could be enough. Many of them seem to use unbalancedoperands, and might require the exact result or more limbs than thehigher half...

By the way, I like the idea of computing one more limb, that way one mayknow whether the limbs obtained by mulhi are exact or not.


Ĝis,
mb

6 feb 2024, 23:27 da albin.ahlb...@gmail.com:

    Hello,

    I just wrote an implementation for mpn_mulhigh_basecase for
    Broadwell-type processors (that is, x86_64 with BMI2 and ADX
    instructions) based on Torbjörn's `mpn_mullo_basecase'.

    It is currently declared on the form

    mp_limb_t flint_mpn_mulhigh_basecase(mp_ptr rp, mp_srcptr xp,
    mp_srcptr yp, mp_size_t n),

    as it was developed for FLINT (Fast Library for Number Theory). Note
    that `rp' cannot be aliased with `xp' or `yp'.

    It will compute an approximation of the upper most `n + 1' limbs of
    `{xp, n}' times `{yp, n}', where the upper most `n' limbs are pushed
    to `rp[0]', ..., `rp[n - 1]' and the least significant computed limb
    is returned (via %rax). This returned limb should have an error of
    something along `n' ULPs.

    Note that this differs from MPFR's (private) function
    `mpfr_mulhigh_n', which computes the approximation of the upper most
    `n' limbs into `rp[n]', ..., `rp[2 * n - 1]', where `rp[n]' has an
    error of something along `n' ULPs at most.

    Feel free to change it according to your needs (perhaps you do not
    want to compute `n + 1' limbs, but rather `n' limbs).

    If this code will be used in GMP, feel free to remove the copyright
    claim for FLINT and put my name (spelled Albin Ahlbäck) in the GMP
    copyright claim instead.

    Just some notes:

    - We use our own M4 syntax for the beginning and ending of the
    function, but it should be easy to translate to GMP's syntax.
    - It currently only works for n > 5 (I believe) as we in FLINT have
    specialized routines for small n.
    - It would be nice to avoid pushing five register, and only push four.
    - Reduce the size of the `L(end)' code, and try to avoid multiple
    jumps therein.
    - Move the code-snippet of `L(f2)' to just above `L(b2)', so that no
    jump is needed in between. (This currently does not work because
    `L(end)' as well as this code-snippet is too large for a relative
    8-bit jump.)
    - Start out with an mul_1 sequence with just a mulx+add+adc chain,
    just like in `mpn_mullo_basecase'.
    - Remove the first multiplication in each `L(fX)' and put it in
    `L(end)' instead.
    - The `adcx' instruction in `L(fX)' can be removed (then one needs
    to adjust the `L(bX)'-label), but I found it to be slower. Can we
    remove it and somehow maintain the same performance?

    Best,
    Albin

_______________________________________________
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: mpn_mulhigh_basecase for Broadwell

Reply via email to