Dear Torbjörn,
> I am not sure one can draw any conclusions about the relative
> performance of the current GMP code and their suggested new method
> considering how they present the performance results.
I agree. Moreover the experiments consider the first 10^4 primes, which have
at most 1
which compares to GMP 6.0.0: http://eprint.iacr.org/2014/760
Paul
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel
http://eprint.iacr.org/2014/755.pdf, see Fig. 1 page 17.
Paul
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel
ok I see we are far from a public mulmod_bnm1 interface. May I ask that in the
meantime, you don't change the internal functions mpn_mul_fft, mpn_fft_best_k
and mpn_fft_next_size, so that we can use them for the wrap-around trick
(for example in GMP-ECM)?
Paul
_
Hi,
the GMP 6.0.0 documentation mentions mpn_mulmod_bnm1 in the Contributors
section, but this function is not in the public API.
Currently in GMP-ECM we use the internal function mpn_mul_fft for the
"wraparound trick", but it would be better to use a documented function.
Are there any pl
Niels,
sorry for the late reply, we were quite busy with the MPFR/MPC workshop
(which was very successful).
> > it seems it won't be sufficient for
> > our needs in MPFR. Consider for example the following function, which
> > generates nbits random bits into mp[]:
>
> I see. In this parti
> the issue reported in September 2010 is still present:
>
> Such things happens sometimes. It means that we volunteers did not have
> enough spare time for making the GMP gift better in that specific
> respect.
there was no criticism in my mail (sorry if it seemed), I just wanted to
put thi
Hi,
the issue reported in September 2010 is still present:
https://gmplib.org/list-archives/gmp-devel/2010-September/001654.html
tarte% ./speed -s 1000,3000,1 -r mpn_sqrtrem mpn_rootrem.2
overhead 0.2 secs, precision 1 units of 3.53e-10 secs, CPU freq
2833.00 MHz
Dear Marco,
> > Seed GMP_CHECK_RANDOMIZE=1391280408 (include this in bug reports)
> > repl-vsnprintf.c:379: GNU MP assertion failed: len < total_width
> > /bin/bash: line 5: 7656 Aborted (core dumped)
> > MPFR_QUIET=1 ${dir}$tst
> > FAIL: tprintf
>
> > I trust that
> PS: when doing "make check" with mpfr development version and the gmp
> snapshot
> from last night, I get one failure:
>
> Seed GMP_CHECK_RANDOMIZE=1391280408 (include this in bug reports)
> repl-vsnprintf.c:379: GNU MP assertion failed: len < total_width
> /bin/bash: line 5: 7656
Niels,
> There's a ChangeLog entry from the day before yesterday:
>
> * doc/gmp.texi: Undocument mpz_array_init.
>
> I take it Torbjörn really wants to deprecate this function.
>
> Are you using it in mpfr?
no.
Paul
___
gmp-devel maili
Hi,
we have looked at the mpz_limbs interface, it seems it won't be sufficient for
our needs in MPFR. Consider for example the following function, which
generates nbits random bits into mp[]:
/* generate nbits random bits into mp[], assuming mp was allocated to contain
a sufficient numb
Hi,
we have looked at the mpz_limbs interface, it seems it won't be sufficient for
our needs in MPFR. Consider for example the following function, which
generates nbits random bits into mp[]:
/* generate nbits random bits into mp[], assuming mp was allocated to contain
a sufficient numb
Niels,
> > Here are the division functions I had to re-implement on top of mini-gmp
> > (I know mpn_divrem is obsolete, but its interface is better suited for
> > MPFR):
>
> Do you need fraction limbs? From a quick look, it seems like you
> implemented that for mpn_divrem_1, but not for m
Niels,
> > The most wanted are the mpn division functions, I wonder why those functions
> > are not exported.
>
> I think the main reason is that the interface of the internal division
> functions in mini-gmp, in particular mpn_div_qr, is quite different from
> the older interface used in
Niels,
I finally managed to configure MPFR with mini-gmp, and run make and make check.
I had to disable about 15 tests which use either mpf_t or mpq_t.
All the 155 remaining tests pass, except one (texp), I need to investigate.
But so far I found no bug in mini-gmp. Good!
I had to do a f
Hi,
I tried to compile MPFR with mini-gmp. I needed to do a few changes:
Changes in mini-gmp.c:
1) I had to add #define mpz_init __gmpz_init, because in MPFR
configure we check for __gmpz_init
2) I had to add a line:
unsigned int mp_bits_per_limb = GMP_NUMB_BITS;
3) I had to chang
Hi,
about the following macro in gmp-impl.h:
/* Compute the number of limbs corresponding to ndigits base-b digits, rounding
up. */
#define LIMBS_PER_DIGIT_IN_BASE(res, ndigits, b)\
do { \
mp
Hi Torbjörn,
> A significant fraction of the point-wise products' time will be spent in
> mpn_mul_basecase/mpn_sqr_basecase. The time ration for those is close
> to 1/2. This ratio seems to be reflected in toom functions.
this is a very good point. As a consequence, we should have S/M =
Hi,
asymptotically, the cost of a squaring by FFT should be at least 2/3 of the
cost of a multiply, since a multiply performs 3 transforms and 1 pointwise
multiplication, whereas a squaring saves 1 transform.
Moreover GMP is using Schönhage-Strassen's algorithms, where the pointwise
multip
Torbjörn,
> we mean "faster than GMP's conversion functions", but still using GMP for
> the
> low-level operations.
>
> Then please say so in the paper.
indeed that was not clear, we will say it in the revised version.
Paul
___
gmp-devel
Torbjörn,
> From: Torbjorn Granlund
> Date: Mon, 07 Oct 2013 10:45:56 +0200
>
> I did a quick first read.
>
> I take it that "faster than GMP" means "faster than GMP's conversion
> functions, using GMP"... Or did you actually beat GMP without GMP?
we mean "faster than GMP's conversion
Niels,
> From: ni...@lysator.liu.se (Niels Möller)
> Date: Mon, 07 Oct 2013 08:20:13 +0200
>
> Zimmermann Paul writes:
>
> > with Cyril Bouvier we have written a preprint describing several
> > division-free
> > algorithms to convert from binary to
Hi,
with Cyril Bouvier we have written a preprint describing several division-free
algorithms to convert from binary to decimal (the mp?_get_str routines):
http://www.loria.fr/~zimmerma/papers/get_str.pdf
Our implementation of those algorithms gives speedups of 50% (or more) wi
Hi Torbjörn,
> I got back to this now, and finally committed a patch.
>
> New diagrams at gmplib.org/devel/.
>
> Things are still not perfect. More work is needed.
thank you for the feedback. Yes the new curve is not everywhere optimal, but
the important thing is that it is much more re
Marc,
> Date: Fri, 12 Apr 2013 11:28:01 +0200 (CEST)
> From: Marc Glisse
>
> On Fri, 12 Apr 2013, Zimmermann Paul wrote:
>
> > the mpq_get_d function rounds towards zero (i.e., truncates).
> > In several applications, people usually prefer rounding to ne
Hi,
the mpq_get_d function rounds towards zero (i.e., truncates).
In several applications, people usually prefer rounding to nearest.
Is it planned to provide a function (say mpq_get_d_nearest) for that?
I could contribute it, based on the code below (which does not deal with
subnormal numb
Marco,
> This means that MPFR too consider that -0.9 does not fit an unsigned int
> when rounded towards zero... I'd expect all "1".
this is a bug, that is fixed in the development version of MPFR (that now
matches the MPFR documentation).
Paul
Marco,
> Date: Mon, 18 Mar 2013 08:28:55 +0100 (CET)
> From: bodr...@mail.dm.unipi.it
>
> Ciao,
>
> I was reading some sources in mpf, and I do not understand a detail.
>
> Documentation of mpf_fits_uint_p reads:
> "Return non-zero if op would fit in the respective C data type, when
> tr
Torbjörn,
> From: Torbjorn Granlund
> Date: Wed, 06 Mar 2013 11:13:15 +0100
>
> Zimmermann Paul writes:
>
> Finally to increase the technical content on this list Alexander Kruppa
> told me about the following preprint of Ernst Mayer:
> arxiv.org/abs/1303.0
Niels,
> We need a forum for both posting patches, and discussing the same
> patches. To me, it seems easiest to have both on the same list.
however maybe not all people on gmp-devel are interested by the detail of all
patches and the corresponding discussions?
Personnally I subscribed to
Hi all,
the gmp-devel list is for "Technical discussions between developers". We have
seen recently several patches posted, which I believe do no match the list
definition. If there is no other way to transfer source code, maybe one should
create a gmp-patch list?
Paul Zimmermann
_
> There are two minor differences between our functions, one is that the
> bdiv function allow any dividend size, the other is the "rounding mode"
> for the quotient.
maybe a 3rd difference is that the mpn_redc_{1,2} functions do not provide
results in canonical form (contrary to mpn_redc_n I beli
Torbjörn,
> You might want to try this patch. It removes lots of code, originally
> put in place for saving memory. It allows the divappr function to do
> its job more smoothly.
this is much better with this patch. Before I got:
frite% ./bug2 201 101
mpz_tdiv_q(201,101)
Marco,
> After the patch, only changing the way tune/speed allocate memory for the
> operands, their results are comparable:
>
> $ tune/speed -s 80 mpn_mul_n mpn_mul mpn_mul_n mpn_mul
> overhead 0.2 secs, precision 1 units of 3.12e-10 secs, CPU
> freq 3200.20 MHz
>
> if the culprit is the macro used in speed, it should be fixed!
>
> I stared at it for an hour yesterday, and I cannot see any problems.
>
> Operand alignment will differ, but then we shouldn't get consistently
> worse performance from mpn_mul.
strange indeed. Did you try to use the same op
Marco,
> Date: Wed, 6 Feb 2013 17:59:44 +0100 (CET)
> From: bodr...@mail.dm.unipi.it
>
> Ciao Paul!
Ciao!!!
> Of course. With current implementation, unbalanced multiplications need
> some more memory and a few additions/subtractions, but this should not
> give a measurable slow-down. Th
Hi,
> From: Torbjorn Granlund
> Date: Tue, 05 Feb 2013 21:10:29 +0100
>
> ni...@lysator.liu.se (Niels Möller) writes:
>
> > The fix is choosing the inverse size less stupidly, and organising the
> > computation in more similarly sized quotient blocks.
>
> So there should be k bl
Hi,
on January 25 I reported efficiency issues with unbalanced multiplication.
Here I report similar issues with unbalanced division. Consider the program
below. I get on a 2Ghz Intel Xeon E7540 with GMP 5.1.0:
barbecue% ./bug2 201 101
mpz_tdiv_q(201,101) took 2244ms
barbe
Dear Niels,
> But I miss some features in the public interface to do this [...]
I'm glad you hit this issue. I hit it for GMP-ECM and MPFR several years ago.
I reported missing functions in the public interface, and some of them were
added during the years. You have to be patient.
I would
Marco,
> Date: Sat, 26 Jan 2013 16:21:28 +0100 (CET)
> From: bodr...@mail.dm.unipi.it
>
> Ciao,
>
> Il Sab, 26 Gennaio 2013 4:01 pm, bodr...@mail.dm.unipi.it ha scritto:
> > I mean, which timing do you obtain with the following?
> > ./speed -s $((100+775660)/2) mpn_mul_n mpn_mul_n
>
Hi,
in GMP 5.1.0, a multiplication of n x m limbs for m < n can be slower than
a multiplication of n x n limbs. Compare for example the line starting with
775660 in the first output from speed, and the one starting with 100 in
the second one below.
frite% ./speed -s 50-100 -f 1
Hi,
it seems the speed command considers for mpn_set_str the size of the *input*,
i.e., the number of *bytes* of the input string, and not the number of limbs
of the corresponding mpn number.
This makes it difficult to compare it to other mpn functions:
frite% ./speed -s 1000-110 -f 2
Niels,
> Note that large parts of the jacobi code is replaced in the imminent 5.1
> release. So it might be interesting to benchmark also the latest code.
I'm glad you asked (on the AMD processor):
frite% gcc -I$GMP/include -O2 -g e.c $GMP/lib/libgmp.a
frite% ./a.out
GMP: header 5.0.5, li
> It is a pity people will need to locate mpz_kronecker_ui when they want
> a plain Legendre symbol. I don't think the Legendre -> Jacobi ->
> Kronecker generalisations are so widely know that people will
> automatically look for functions with the other names.
by the way, here are some timings o
Niels,
> From: ni...@lysator.liu.se (Niels Möller)
> Date: Mon, 17 Dec 2012 16:53:52 +0100
>
> Zimmermann Paul writes:
>
> > Assume one wants to compute the Legendre symbol (A/P) for P an unsigned long
> > (and A too). What is more efficient?
> >
&
Hi,
I wonder why there are some ui/si variants for mpz_kronecker, but not for the
simpler functions mpz_jacobi and mpz_legendre.
Assume one wants to compute the Legendre symbol (A/P) for P an unsigned long
(and A too). What is more efficient?
1) use mpz_kronecker_ui, which forces to conve
> Is the pseudo-code better now?
>
> http://gmplib.org/devel/
I see no difference, even after clearing the browser cache.
Paul
___
gmp-devel mailing list
gmp-devel@gmplib.org
http://gmplib.org/mailman/listinfo/gmp-devel
Dear Marco,
> > no, I thought about the classical non-redundant representation 0 <= r <
> > B^rn-1.
>
> On the other side... this redundancy shouldn't hurt your wrap-around
> trick, does it?
in the use we have for GMP-ECM, I guess no. But I can imagine other
(wrap-around trick) applicatio
Dear Marco,
> Date: Sat, 20 Oct 2012 17:31:33 +0200 (CEST)
> From: bodr...@mail.dm.unipi.it
>
> Ciao Paul, and hi GMP-ECM developers!
ciao!!!
> Il Gio, 18 Ottobre 2012 11:06 am, Zimmermann Paul ha scritto:
>
> >> > void mpn_mulmod_bnm1 (mp_ptr rp, mp_size
Niels,
> From: ni...@lysator.liu.se (Niels Möller)
> Date: Thu, 18 Oct 2012 10:43:34 +0200
>
> Zimmermann Paul writes:
>
> > void mpn_mulmod_bnm1 (mp_ptr rp, mp_size_t rn, mp_srcptr ap, mp_size_t an,
> > mp_srcptr, bp, mp_size_t bn, mp
Hi GMP developers,
following your answers yesterday, we'll use the following "wrap-around"
interface in GMP-ECM for the next release after GMP 5.1.0 (which is planned
for 2012 according to http://gmplib.org/#FUTURE), or for GMP 5.1.0 if those
functions are already available. This means that
Hi,
> It definitely makes the wrap-around trick useful for fairly small sizes.
indeed. I've changed the development version of GMP-ECM to use mpn_mulmod_bnm1
instead of mpn_mul_fft for the wrap-around trick.
However we also use mpn_mul_fft when the input number is a Fermat number, and
in
Torbjörn,
> As Niels said, mpn_mulmod_bnm1 might be put there. But mpn_mul_fft will
> never be put there, at least not under that name.
>
> (We prefer functions names that describes a function, rather than an
> algorithm.)
ok.
> Note also that mpn_mulmod_bnm1 is currently signi
Niels,
thank you for your answer.
> From: ni...@lysator.liu.se (Niels Möller)
> Date: Wed, 17 Oct 2012 10:22:43 +0200
>
> Zimmermann Paul writes:
>
> > several applications using GMP (in particular GMP-ECM) use the internal fft
> > functions (mainly mpn_mul_
Hi GMP developers,
several applications using GMP (in particular GMP-ECM) use the internal fft
functions (mainly mpn_mul_fft and mpn_fft_best_k) for multiplication modulo
2^N+1. This saves a factor of two with respect to mpn_mul_n if for example you
know the lower part of a N-bit product, a
Marc,
> If the function mpz_get_ld is provided, I expect it to work for the whole
> (exponent) range of long double value, not just the range of double.
> Should we first do something about large z, then apply your method and fix
> the result with ldexpl?
you are right. Here is an impro
Hi,
I am contributing the following function:
long double
mpz_get_ld (mpz_t z)
{
double h, l;
mpz_t tmp;
h = mpz_get_d (z);
mpz_init (tmp);
mpz_set_d (tmp, h);
mpz_sub (tmp, tmp, z);
l = mpz_get_d (tmp);
mpz_clear (tmp);
return (long double) h + (long double) l;
}
Pau
Niels,
> > agreed. But to avoid this dependent multiplication, I believe
> > Montgomery-Svoboda has more potential. Instead of performing n addmul_1
> > calls
> > with N, you perform n-1 call with k*N such that the low limb of k*N is -1
> > (thus the quotient selection is trivial) and one
Niels,
> I think there's some potential for speed up of the linear term, which is
> mostly relevant for small sizes. The addmul_1 calls can run at 3 cycles
> per limb or so. But then the computing the quotient involves dependent
> multiplications with longer latency, so one may be able to c
Niels,
> For these moderate sizes, does it pay off to precompute a full inverse
> for the montgomery reduction, rather than using redc_1 or redc_2? (I
> don't quite understand which lines in you benchmark data I should look
> at).
I believe those sizes are too small. With a full inverse, w
Torbjörn,
> Providing special code for many un,vn combinations (as separate
> functions are as part of mpn_mul_basecase) quickly become unmanageable.
> If we want to handle all sizes <= 16 (say) we'll need 136 variants.
>
> I don't think it makes much sense providing code for just un=vn (e
Hi,
GMP currently has variable-size assembly code for mpn_mul_n on some
processors. Could it be faster to have fixed-size assembly code for
small values of n (say up to n=32)? Then mpn_mul_n() would simply be
a wrapper to those fixed-size functions, or to a variable-size code
for n>32.
Pau
Hi Niels, Torbjörn, Marco,
> I agree with Torbjorn. For base 256 conversion, the function mpz_getnth_ui
> he proposes is far better than the current mpz_getlimbn, because of the
> possible internal use of nails...
since that discussion has no longer to do with mpfr, please follow-up to
gmp
64 matches
Mail list logo