ni...@lysator.liu.se (Niels Möller) writes:
No immediate plans. To me, it seems stable enough, if documented
together with mpn_mulmod_bnm1_itch and mpn_mulmod_bnm1_next_size.
We should integrate the small primes FFT code,
ni...@lysator.liu.se (Niels Möller) writes:
It's usually defined in gmp-mparam.h for your machine, with a fallback
definition in gmp-impl.h. But that definition isn't picked up by
assembly files, so it should also be defined in config.m4, generated by
configure. Not sure how configure
ni...@lysator.liu.se (Niels Möller) writes:
And the value you use (30) is different from the default in gmp-impl.h
(10). I take it you think the larger value is more appropriate for
powerpc?
I looked at the measured values for the powerpc64 hardware we run on,
and 30 seemed to be in the
I think what you suggest is very close to the pseudo code at
https://gmplib.org/devel/, under the header mpz_powm and mpz_powm_ui.
But you suggest several additional refinements.
I wasn't thinking of the case when the base is just a single limb,
but any time 2 * log(b) = log(m).
These
ni...@lysator.liu.se (Niels Möller) writes:
  if (reps < 25 && mpz_cmpabs_ui (n, 5000*5000+5000 + 41) > 0)
    reps = 25;
  else if (reps > 5000)
    reps = 5000;
I didn't follow this thread too closely, but that code seems to suggest
that an argument of 5000 makes sense.
Even the most
ni...@lysator.liu.se (Niels Möller) writes:
It seems mpz_probably_prime_p considers negated primes to also be prime.
E.g., for n == -29 it returns 2, meaning definitely prime.
Mathematically, I think -29 is usually considered neither prime, nor
composite (its prime factorization is -1 * 29
bodr...@mail.dm.unipi.it writes:
On Fri, 17 January 2014 1:10 pm, Vincent Lefevre wrote:
you may also have optimizations based on the fact that some variable
cannot be zero. But you have no types that don't include zero. The
right solution is to make sure that the compiler knows
Perhaps we should add a simple one-level version to mini-gmp?
#define GMP_MINI_VERSION 17
It does not need to be bumped with GMP release if mini-gmp did not
change. Perhaps it should be bumped at each checkin?
Torbjörn
Please encrypt, key id 0xC8601622
John Sully j...@csquare.ca writes:
While testing the latest development code we've discovered that it fails on
specific Pentium D Haswell CPUs. These CPUs are odd in that they don't
have the BMI2 instruction set. Because of this GMP will crash when it
attempts to execute a MULX. The
ni...@lysator.liu.se (Niels Möller) writes:
I'd suggest doing the below (also undoing Marco's previous fix).
To fix the actual failure, one would also need to edit the two
gmp-mparam.h files which set DIV_QR_1_NORM_THRESHOLD to zero.
I checked that in yesterday. But I'm a bit
bodr...@mail.dm.unipi.it writes:
Code says:
  if (d & GMP_NUMB_HIGHBIT)
    { /* Normalized case */
      uh = up[--n]; /* Here n goes to 0 */
      ...
      if (BELOW_THRESHOLD (n, DIV_QR_1_NORM_THRESHOLD))
        {
          while (n > 0)
            udiv_qrnnd (...);
We currently have many spurious failures flagged in red at
https://gmplib.org/devel/tm-date.html, mainly due to hardware errors
with the system `biko'.
But the `hark' failure looks real:
hark$ cd /var/tmp/gmp-obj/hark-stat-64
hark$ GMP_CHECK_RANDOMIZE=3526906869 tests/mpn/t-div
ni...@lysator.liu.se (Niels Möller) writes:
Hmm. Looks like it's the returning of the high limb via the separate *qp
which is broken? And there's no (non-inline) assembly involved, it's
generic/mpn_div_qr_1.c.
I traced it to a nn=0 call to the underlying pi1 call. Dunno if that's
ni...@lysator.liu.se (Niels Möller) writes:
Given the current implementation, it's natural. But we could document
that it is required that any left over bits in the top limb must be
zero. Would that be better?
My take on this is that asking users to keep that zero isn't a
requirement
ni...@lysator.liu.se (Niels Möller) writes:
Torbjorn Granlund t...@gmplib.org writes:
Please use something other than ebits, since that sounds like the
argument contains bits with individual meaning. IIRC enb would
follow conventions used elsewhere in the manual.
Naming
ni...@lysator.liu.se (Niels Möller) writes:
Why isn't __gmp_extract_double's style OK for mpn_set_d? Are its
conventions not neat enough, or are there efficiency reasons?
I found the conventions of __gmp_extract_double hard to understand. And
I think returning a base 2 exponent is
ni...@lysator.liu.se (Niels Möller) writes:
Below is a patch to do this (and return value is long, not mp_bitcnt_t,
since it needs to be signed).
What do you think?
I'm too busy to make an educated analysis.
Why isn't __gmp_extract_double's style OK for mpn_set_d? Are its
conventions
ni...@lysator.liu.se (Niels Möller) writes:
Any idea what's going on?
A quick guess is that the exponent is fixed for powm, not a function of
the input size.
___
gmp-devel mailing list
gmp-devel@gmplib.org
For what it is worth, there are now 32-bit and 64-bit fbsd5 and fbsd56
systems in the test array: https://gmplib.org/devel/testsystems.html
Both seem to allow XMM access. The problem might be limited to fbsd4.
At some point, it would be nice to clean up the broken logic for this
in GMP's
On Sat, 25 January 2014 7:15 pm, Torbjorn Granlund wrote:
operating system support. Now, we suppress use of (some) gcc
sse-related options which trigger bad behaviour (via the acinclude.m4
GMP_GCC_PENTIUM4_SSE2) and in that context check if the OS handles XMM
(via
ni...@lysator.liu.se (Niels Möller) writes:
I think the conclusion was that volatile was not very useful for this
function, but if we add that later, does it make any sense to have const
volatile *ap? The intended meaning would be that writes are invalid, and
that no reads should be
Our configure logic for excluding XMM register use is flawed. We
should keep SSE2 availability and XMM availability apart, since a
CPU which supports SSE2 will always handle SSE2+MMX, while XMM requires
operating system support. Now, we suppress use of (some) gcc
sse-related options which
Zimmermann Paul paul.zimmerm...@inria.fr writes:
the issue reported in September 2010 is still present:
Such things happen sometimes. It means that we volunteers did not have
enough spare time for making the GMP gift better in that specific
respect.
Torbjörn
Please encrypt, key id
bodr...@mail.dm.unipi.it writes:
Maybe our printf/repl-vsnprintf.c is not tested enough?
Oddly enough it is not even listed at e.g.,
https://gmplib.org/devel/lcov/hannahnbsd32v61/gmp/printf/index.html.
Of the existing 21 functions in printf/ only 17 are there. Great
coverage analysis! :-(
Marc Glisse marc.gli...@inria.fr writes:
By the way, do we have a policy about breaking binary compatibility?
In this case, mixing old and new objects could result in crashes
(almost certainly at -O0, seldom at -O3). It should be possible to
prevent this issue by renaming __gmp_unary_expr
bodr...@mail.dm.unipi.it writes:
Well, it is wrapped with
#if ! HAVE_VSNPRINTF /* only need this file if we don't have vsnprintf */
[...]
#endif /* ! HAVE_VSNPRINTF */
so, on many systems it is not compiled at all... (and that's a reason why
it is less tested than other chunks
ni...@lysator.liu.se (Niels Möller) writes:
I see. In this particular case, I think the right gmp interface change
is to add mpn_urandomb and mpn_rrandomb (similar to current mpn_random
and mpn_random2, but with a randstate argument). If I understand this
correctly, the main obstacle is
ni...@lysator.liu.se (Niels Möller) writes:
Torbjorn Granlund t...@gmplib.org writes:
ni...@lysator.liu.se (Niels Möller) writes:
I see. In this particular case, I think the right gmp interface change
is to add mpn_urandomb and mpn_rrandomb (similar to current mpn_random
Marc Glisse marc.gli...@inria.fr writes:
We already have function mpz_array_init which encourages thinking of
I removed its docs the other day.
Torbjörn
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel
ni...@lysator.liu.se (Niels Möller) writes:
This assumes that C++ allows initializers with arbitrary non-constant
expressions (does it?), and that we implement mpn_set_d.
The top-level file extract-dbl.c kind-of does that already.
Torbjörn
I noticed that we still have tests for traditional C varargs.h versus
ISO C90 stdarg.h everywhere a varying number of arguments is used.
Since we cleaned out K&R stuff a few years back, we could require
stdarg.h without causing additional portability problems, right?
Torbjörn
Please encrypt, key id
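A minimal sketch of the ISO C stdarg.h interface that would remain once
the varargs.h fallback is dropped. The function sum_ulongs is a
hypothetical helper written for illustration; it is not part of GMP.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical helper (not a GMP function): add up `count` arguments of
   type unsigned long passed through the ISO C variadic interface.  */
unsigned long
sum_ulongs (int count, ...)
{
  va_list ap;
  unsigned long total = 0;

  assert (count >= 0);
  va_start (ap, count);		/* ISO C names the last fixed parameter */
  for (int i = 0; i < count; i++)
    total += va_arg (ap, unsigned long);
  va_end (ap);
  return total;
}
```

With traditional varargs.h the parameter list would be declared with
va_alist/va_dcl and va_start would take only the va_list, which is the
conditional code the message proposes to retire.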
I did a lot of cleanup changes today:
1. All LGPL copyright headers should now have the same layout, except
for file format mandated line prefixes.
2. Old K&R varargs config checks and conditional code are now gone.
3. mpq_t now used everywhere in place of the old MP_RAT.
4. The old
ni...@lysator.liu.se (Niels Möller) writes:
Question is, when is it useful for our purposes? First example,
mpn_sec_add_1:
mp_limb_t
mpn_sec_add_1 (mp_limb_t *rp, mp_limb_t *ap, mp_size_t n, mp_limb_t b,
mp_ptr scratch)
{
scratch[0] = b;
MPN_ZERO
ni...@lysator.liu.se (Niels Möller) writes:
Torbjorn Granlund t...@gmplib.org writes:
* Make some other sec functions from Niels' list public?
Here's a first patch adding a couple of other functions. Benchmarking
and testing is missing (except that the sec_minvert tests still pass
bodr...@mail.dm.unipi.it writes:
Indeed. I pushed a fix.
Any comment about marking them also with __GMP_NOTHROW ?
Perhaps that too. I suppose __GMP_ATTRIBUTE_PURE should really be the
stronger ATTRIBUTE_CONST, except that we don't yet have any name space
clean way of doing that for
ni...@lysator.liu.se (Niels Möller) writes:
Torbjorn Granlund t...@gmplib.org writes:
I notice you make this non-public. Is it premature to make it part of
the public interface?
Pushed now, with declarations moved to gmp-h.in.
And now some 450 nightly builds have run
ni...@lysator.liu.se (Niels Möller) writes:
* Finalise and commit mpn_sec_minvert.
Here's a new version, including tests. Seems to work. I'll try to get
this committed fairly soon.
Nice!
I notice you make this non-public. Is it premature to make it part of
the public interface?
ni...@lysator.liu.se (Niels Möller) writes:
I just put the declarations together with the other mpn_sec_* functions.
I think it makes sense to make mpn_sec_div_*, mpn_sec_minvert and
mpn_sec_powm public together. Does mpn_sec_powm need more work (besides
the rename) before being made public?
This is what I want to fix before the 5.2 release. Please remind me if
I have forgotten something.
* Strongly consider making mpn_sec_div_qr return high quotient limb and
write just nn-dn quotient limbs to qp area.
* Finalise and commit mpn_sec_minvert.
* Add some other sec functions from
ni...@lysator.liu.se (Niels Möller) writes:
As you can see, it depends on a couple of other functions,
mpn_sec_add_1, mpn_cnd_neg, mpn_cnd_swap, mpn_sec_eq_ui, which would
probably have to be written in assembly to ensure that they avoid
operations with branches or data-dependent timing.
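To illustrate what "no branches or data-dependent timing" means at the
C level, here is a rough sketch of a mask-based conditional swap. The
names limb_t and cnd_swap_sketch are made up for this example, and
32-bit limbs are assumed only to keep it portable. A C compiler is free
to turn this into branching code, which is the reason given above for
writing such functions in assembly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;	/* illustrative stand-in for mp_limb_t */

/* Swap {ap,n} and {bp,n} when cnd is non-zero.  The same loads, stores
   and logic operations are performed whatever the value of cnd.  */
void
cnd_swap_sketch (limb_t cnd, limb_t *ap, limb_t *bp, size_t n)
{
  /* All ones when cnd != 0, all zeros otherwise.  */
  limb_t mask = -(limb_t) (cnd != 0);

  assert (ap != NULL && bp != NULL);
  for (size_t i = 0; i < n; i++)
    {
      limb_t t = (ap[i] ^ bp[i]) & mask;	/* zero when not swapping */
      ap[i] ^= t;
      bp[i] ^= t;
    }
}
```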
I think the mpn_sec_ and mpn_cnd_ functions should never allocate any
memory. Instead, callers should pass all scratch areas to allow the use
of secure memory.
Or is this pointless? The stack frames may get sensitive data, as
determined by the compiler used. When we allocate (small) scratch
ni...@lysator.liu.se (Niels Möller) writes:
Create zero vector, invoke mpn_sub_n.
That doesn't make it conditional. And I see no obvious way to do
conditional negation on top of mpn_cnd_sub_n.
Oops.
Compute T = 2 x A using mpn_add_n or mpn_lshift.
Use mpn_cnd_sub_n with A, T as
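The idea can be sketched as follows: since A - 2A = -A (mod B^n),
subtracting T = 2A conditionally gives a conditional negation. This is
an illustration only; cnd_sub_n_sketch and cnd_neg_sketch are
hypothetical stand-ins with 32-bit limbs, not the GMP functions, and
the carry out is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;	/* illustrative stand-in for mp_limb_t */

/* rp = ap - (cnd ? bp : 0), returning the borrow out.  */
limb_t
cnd_sub_n_sketch (limb_t cnd, limb_t *rp, const limb_t *ap,
		  const limb_t *bp, size_t n)
{
  limb_t mask = -(limb_t) (cnd != 0);
  limb_t borrow = 0;

  for (size_t i = 0; i < n; i++)
    {
      limb_t b = bp[i] & mask;	/* zero when cnd is zero */
      limb_t d = ap[i] - b;
      limb_t c1 = ap[i] < b;	/* borrow from a - b */
      limb_t c2 = d < borrow;	/* borrow from the incoming borrow */
      rp[i] = d - borrow;
      borrow = c1 | c2;		/* at most one of them is set */
    }
  return borrow;
}

/* rp = cnd ? -A : A (mod B^n), using n limbs of scratch at tp.  */
void
cnd_neg_sketch (limb_t cnd, limb_t *rp, const limb_t *ap, size_t n,
		limb_t *tp)
{
  limb_t carry = 0;

  assert (n > 0);
  for (size_t i = 0; i < n; i++)	/* T = 2A mod B^n, a 1-bit left shift */
    {
      tp[i] = (ap[i] << 1) | carry;
      carry = ap[i] >> 31;
    }
  cnd_sub_n_sketch (cnd, rp, ap, tp, n);
}
```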
ni...@lysator.liu.se (Niels Möller) writes:
Should work (except if T is computed mod B^n, one doesn't get the
correct carry out, but that isn't needed here). But it's a bit awkward,
I realise one needs some (straightforward) handling of carry out.
and this is a performance critical
I suppose I already suggested that one computes a^{-1} mod b
as a^{b-2} mod b (for prime b), using a plain old modexp.
I realise that this will be asymptotically slower, in this setting
O(n^3) vs O(n^2), but it ought to have a much lower constant factor.
Torbjörn
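For a prime modulus p, Fermat's little theorem gives a^{p-1} = 1
(mod p), so a^{p-2} is the inverse of a whenever p does not divide a.
Below is a minimal single-word sketch of inversion by modexp. The
names powmod_u32 and invert_mod_prime are invented for this example,
and the plain square-and-multiply loop branches on exponent bits, so
it is not a side-channel silent routine.

```c
#include <assert.h>
#include <stdint.h>

/* base^exp mod m by square-and-multiply.  Operands are below 2^32, so
   every product fits in 64 bits.  */
uint32_t
powmod_u32 (uint32_t base, uint32_t exp, uint32_t m)
{
  assert (m != 0);
  uint64_t result = 1 % m;
  uint64_t b = base % m;

  while (exp > 0)
    {
      if (exp & 1)
	result = result * b % m;
      b = b * b % m;
      exp >>= 1;
    }
  return (uint32_t) result;
}

/* Inverse of a modulo a prime p, assuming p does not divide a.  */
uint32_t
invert_mod_prime (uint32_t a, uint32_t p)
{
  assert (p >= 2);
  return powmod_u32 (a, p - 2, p);
}
```

For example, invert_mod_prime (3, 7) computes 3^5 mod 7 = 5, and
3 * 5 = 15 = 1 (mod 7).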
Vincent Lefevre vinc...@vinc17.net writes:
On 2013-12-25 12:13:39 +0100, Marc Glisse wrote:
Oops, looks like I already asked about that:
https://gmplib.org/list-archives/gmp-bugs/2011-November/002443.html
and the reply was to try including tests.h before gmp-impl.h.
I'd say
Vincent Lefevre vinc...@vinc17.net writes:
I've tried to find something about that on Google, but couldn't
find anything. Any reference?
Perhaps ChangeLog has some history.
Torbjörn
Your powerpc64le patch is now in the main GMP repo,
https://gmplib.org/repo/gmp/.
Thanks for your contribution!
(We're still waiting for any reaction from the FSF staff,
but we have decided to time out after a reasonable time.
Should some problem arise, we'll address it appropriately.)
Ulrich Weigand ulrich.weig...@de.ibm.com writes:
Testing cpp symbols for ABI version makes me a bit nervous. Such things
can easily get out-of-synch. It might be more resilient to check a
generated object.
Well, the _CALL_ELF check is what we use for all other packages that
Vasili Burdo vasili.bu...@gmail.com writes:
I implemented basecase multiplication and squaring for x86 using SSE2
instructions and Comba column-wise multiplication method.
On Ivy Bridge (Intel Core i7 3517U) multiplication is 10-20% faster than
the present GMP basecase MMX multiplication.
Ulrich Weigand uweig...@de.ibm.com writes:
this patch updates GMP to support the little-endian PowerPC64
platform (powerpc64le-linux). This requires two changes:
- Update configfsf.guess/sub to current upstream versions.
I think Niels volunteered to do that...
- Change
Just to make sure I start from the right spot: You're talking about
Hensel norm division here, right?
When Paul posted results in March, I thought your work was on plain old
Euclidean norm.
We (mainly I and Niels, I suppose) have spent much more time on
Euclidean norm division/mod than on the
GMP 4.3:
shell$ ./speed -p1 -s100-1 -f10 mpn_toom3_mul_n
overhead 0.2 secs, precision 1 units of 3.13e-10 secs, CPU
freq 3200.00 MHz
mpn_toom3_mul_n
100 0.05181
1000 0.000169392
1 0.005313959
100.159352000
GMP repo:
shell$ ./speed
I think I understand this issue now.
In the various toom functions, we suppress tests for recursive calls
which cannot happen when each function is invoked for the intended
range. These things are controlled by the relative TOOM threshold.
This makes tune/speed measurements look bad, but
Zimmermann Paul paul.zimmerm...@inria.fr writes:
Moreover GMP is using Schönhage-Strassen's algorithms, where the pointwise
multiplications are not negligible, thus we should have a ratio well above
2/3.
However in GMP 5.1.3 the ratio is around 2/3, and sometimes even below:
Any
Exact decision for the change? I'm not sure what you mean by 'decision'
there. If you're wondering about the _reason_ for the change (why we did
it), the answer is so that ASLR is applied not just to the code in shared
libraries but also the code in executables. If you're wondering
Philip Guenther guent...@gmail.com writes:
Ah, but you are, sorta. In OpenBSD 5.3, platforms where the compiler and
toolchain support was robust enough for it were switched to build PIE
objects and executables by default. So yes, that object _is_ expected to
be position independent.
I am working on getting the GMP bignum library to work better on
OpenBSD.
With current GMP sources (GMP 5.0.x, 5.1.x, and development head) a
'fat' build will not work on amd64 under OpenBSD 5.3 and 5.4. With
older version of OpenBSD (I've tested 4.9, 5.0, 5.2) things work as
expected.
The
ni...@lysator.liu.se (Niels Möller) writes:
A long time ago, we chose an interface for sbpi1_div_qr which does
*not* store the most significant limb; instead it returns it. I think it
was the intention that a new top-level mpn_div_qr should follow that
convention, and not store the top
ni...@lysator.liu.se (Niels Möller) writes:
The interesting thing is that the next higher function, mpn_div_qr_1,
should return the high quotient limb separately.
I am not sure I agree. Please explain.
You're saying that an n-limb consecutive dividend should yield an
(n-1)-limb consecutive
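To make the convention under discussion concrete, here is a rough C
sketch for a normalised divisor: the n-limb dividend yields n-1 stored
quotient limbs plus a high quotient limb (0 or 1) delivered
separately. The name div_qr_1n_sketch, the qh pointer and the 32-bit
limb type are assumptions made for this illustration; this is not the
GMP code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;	/* illustrative stand-in for mp_limb_t */

/* Divide {up,n} by d, which must have its high bit set.  Stores the
   n-1 low quotient limbs at qp and the high quotient limb at *qh, and
   returns the remainder.  */
limb_t
div_qr_1n_sketch (limb_t *qp, limb_t *qh, const limb_t *up, size_t n,
		  limb_t d)
{
  assert (n >= 1);
  assert (d & 0x80000000u);	/* normalised divisor */

  limb_t r = up[n - 1];
  *qh = (r >= d);		/* top limb / d is 0 or 1 when d >= B/2 */
  if (r >= d)
    r -= d;

  /* With n == 1 the loop body never runs and qp is not touched.  */
  for (size_t i = n - 1; i-- > 0;)
    {
      uint64_t t = ((uint64_t) r << 32) | up[i];
      qp[i] = (limb_t) (t / d);	/* fits in one limb since r < d */
      r = (limb_t) (t % d);
    }
  return r;
}
```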
I pushed initial C versions of these functions:
mpn_div_qr_1n_pi2
mpn_div_qr_1u_pi2
I have had these for a long time, judging from the file time stamps.
These accept n-limb dividends in a single consecutive operand and
generate n-limb quotients also in a consecutive operand. I now
Marc Glisse marc.gli...@inria.fr writes:
On the homepage gmplib.org:
Externally supported: High-level floating-point accurately rounding
arithmetic functions (mpfr). See the mpfr site for more
information. Starting with GMP 4.2, mpfr is released separately from
GMP. (New projects should
Regarding http://gmplib.org/devel/tm-date.html.
Some of you might have spotted build errors for solaris, with -fat.
This is due to a static allocation of their m4. I've worked around it,
so next build round should pass.
A much worse error happens with Intel Haswell under FreeBSD 8 and 9; here
ni...@lysator.liu.se (Niels Möller) writes:
Torbjorn Granlund t...@gmplib.org writes:
* The code is no win for AMD k10/k8 (although close to 10 c/l might well be
possible)
I tried replacing one masking op by cmov, as you suggested. We then get
down to 11.25 c/l on K10. I put
It turned out the code was a bit slower on k8.
This patch changes that. With it applied, things take 11 c/l on both
pipelines. This is also a 2 c/l improvement for piledriver.
I have not tested that this is correct. If you like the patch, please
consider putting the result in the k8 subdir.
I played more with the code, now trying to break the add-adc-sbb-cmov
chain, for the benefit of most Intel processors.
But I lack unit testing code for the function, making hacking quite
cumbersome. I don't feel safe hacking *any* GMP assembly code without
tests/devel/try.c's function and access
ni...@lysator.liu.se (Niels Möller) writes:
ni...@lysator.liu.se (Niels Möller) writes:
But sure, support also in try.c would be good.
Added now. Please have a look if the changes are sane. I use the
second source for the uh input, and I added a DATA_DIV_QR_1 to get it in
the
ni...@lysator.liu.se (Niels Möller) writes:
And sure enough, it detects some bugs in the new assembly code. For size
n==1, there's a missing mov. I'll add that shortly. Then there's another
problem with n==2, which needs a bit more debugging.
Good. So now you have debugged the new try.c
I added data for the new code at http://gmplib.org/devel/asm.html.
There is a line for div_qr_1u_pi1 as well, since that will also be
needed. It might actually be more common that the divisor is not
normalised.
I should try to wrap up div_qr_1n_pi2 and div_qr_1u_pi2 as well, and
then add
ni...@lysator.liu.se (Niels Möller) writes:
Will try that. I think one could also try to delay the quotient store
one iteration, keeping Q1 in a register until the next iteration. Then
one gets rid of the
adc Q2,8(QP, UN, 8)
in the loop, using only a single store per
I looked at the logic following this:
sbb U2, U2 C 7 13
You negate the U2 copy in Q2. It seems that three adc by sbb
could avoid the neg.
It might also be possible to replace the early loop and stuff by cmov.
Note that the carry flag survives dec, although that causes a
ni...@lysator.liu.se (Niels Möller) writes:
Torbjorn Granlund t...@gmplib.org writes:
I think x86-64, x86-32, arm32, arm64, powerpc-64, sparc-64 matter.
Unfortunately, powerpc-64 (and -32) return these types onto the stack
via an implicit pointer.
Ok, I think I'll stick
ni...@lysator.liu.se (Niels Möller) writes:
On my core2 laptop:
$ ./speed -s 2-10,100,500 -C mpn_divrem_1.0x
mpn_div_qr_1.0x
overhead 6.13 cycles, precision 1 units of 8.33e-10 secs, CPU freq
1200.00 MHz
mpn_divrem_1.0x
ni...@lysator.liu.se (Niels Möller) writes:
ni...@lysator.liu.se (Niels Möller) writes:
(about using a small struct as return value)
If the caller is going to store the returned value directly in memory
anyway, there's little difference. And if the caller is going to operate
on
ni...@lysator.liu.se (Niels Möller) writes:
ni...@lysator.liu.se (Niels Möller) writes:
To get going, I've written C implementations of mpn_div_qr_1n_pi1 and
mpn_divf_qr1n_pi1, and made divrem_1 call them.
Below, also an mpn_div_qr_1, using these primitives (and with some
I agree with Niels (don't understand + not appropriate for
low-level...).
We should replace mpn_divexact_1 with code that:
(1) Uses Jebelean's trick with a Euclidean division working left-to-
right and a simultaneous Hensel division working right-to-left.
This is faster in the
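As background for the Hensel (right-to-left) half only, here is a
small sketch of exact division by an odd single-limb divisor using its
inverse modulo B. It does not implement Jebelean's bidirectional
trick. The names binvert_sketch and divexact_1_sketch and the 32-bit
limb type are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;	/* illustrative stand-in for mp_limb_t */

/* Inverse of odd d modulo 2^32 by Newton iteration.  d*d = 1 (mod 8),
   so d itself is correct to 3 bits; each step doubles that:
   3 -> 6 -> 12 -> 24 -> 48 bits.  */
limb_t
binvert_sketch (limb_t d)
{
  assert (d & 1);
  limb_t inv = d;
  for (int i = 0; i < 4; i++)
    inv *= 2 - d * inv;
  return inv;
}

/* {qp,n} = {up,n} / d, assuming d is odd and divides {up,n} exactly.
   Quotient limbs are produced from the least significant end.  */
void
divexact_1_sketch (limb_t *qp, const limb_t *up, size_t n, limb_t d)
{
  limb_t dinv = binvert_sketch (d);
  limb_t borrow = 0;

  for (size_t i = 0; i < n; i++)
    {
      limb_t s = up[i] - borrow;
      limb_t c = up[i] < borrow;
      limb_t q = s * dinv;	/* q * d = s (mod 2^32) */
      qp[i] = q;
      /* The high half of q * d is owed by the next limb.  */
      borrow = c + (limb_t) (((uint64_t) q * d) >> 32);
    }
}
```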
Zimmermann Paul paul.zimmerm...@inria.fr writes:
we mean faster than GMP's conversion functions, but still using GMP for the
low-level operations.
Then please say so in the paper.
not only. For large operands we believe there is still room to improve our
code. In particular an
ni...@lysator.liu.se (Niels Möller) writes:
Another feature, which I looked into a while ago without getting very far
with the loopmixer, is to make it understand associativity. I.e., try
reordering certain instructions with the same destination register, like
xor %r8, %rax
xor %r9,
Ondřej Bílka nel...@seznam.cz writes:
It is a possible enhancement, but I am not yet at the stage of calculating
register dependencies on jumps.
That's something we do, but we only handle a simple jump-back for the
loop.
(That branch limitation is a slight problem for some division loops,
which
Ondřej Bílka nel...@seznam.cz writes:
I am writing a tool that might be useful, a simple optimizer of assembly
routines. You need to write a benchmark that measures performance and
prints elapsed time and assembly file. Currently it has two optimization
patterns, first is enclosing block
For the last few months, I have been working on writing and rewriting
basecase code for x86-64 processors. The result is now in the
mainline GMP repo.
The basecase code I have focused on is: mul_basecase, sqr_basecase,
mullo_basecase, and Hensel remainder via redc_1.
At the start of this
Mark Sofroniou ma...@wolfram.com writes:
Thanks. I wasn't completely sure what the right type was in all cases.
Most of the changes are to use mp_size_t instead of int - these are
the important ones. There are a couple (related to the variables K2 and K3)
that change unsigned int to
I fixed some typos using a program I had, plus some of the typos you
found.
--
Torbjörn
Ondřej Bílka nel...@seznam.cz writes:
Are leading ws all spaces or tabs followed by less than 8 spaces?
That's probably what we usually do.
Are there some form-feeds and are they useful?
I don't understand this question.
I understand that some hackers find whitespace consistency to be
I'd like to test GMP on an Intel Haswell CPU.
Could you perhaps offer a guest account for GMP use?
These CPUs have a new bignum-oriented instruction, MULX, which avoids
overwriting the carry flag. That should help GMP a bit, perhaps
significantly.
--
Torbjörn
Daniel Lichtblau d...@wolfram.com writes:
If you do not manage to locate them I can scan and send a pdf. (Least
I can do for someone who shared a room for two months with that
Torbjörn fellow..)
I started to write a reply, but decided against sending it after I read
this unprovoked
Daniel Lichtblau d...@wolfram.com writes:
I simply have no idea why you would choose to take such offense. If it
serves any purpose, it is one I quite fail to see. That said, I'll not
trouble you with further communication.
If you cannot assume a professional attitude on the GMP lists,
Hello Daniel!
We don't yet have any transform-only interface in GMP, but this will
probably change at some point.
The current FFT code uses coefficient rings mod 2^m+1, as per the
Schönhage-Strassen algorithm. In this algorithm, m = O(sqrt(n)) where n
= O(log(a) + log(b)) for multiplication of
Zimmermann Paul paul.zimmerm...@inria.fr writes:
thank you for the feedback. Yes the new curve is not everywhere optimal, but
the important thing is that it is much more regular, which is critical
for algorithms assuming that when we cut both numerator and divisor (for
a fixed-size
The ia64 mpn_divrem_2 bug reported today (and fixed yesterday...)
highlights some shortcomings of GMP testing.
For x86, x86_64 and (since GMP 5.1.2) arm32 we have calling conventions
checking code via tests/*call.asm and tests/*check.c, but this is then
invoked from tests/devel/try.
Flaws:
1.
ni...@lysator.liu.se (Niels Möller) writes:
Torbjorn Granlund t...@gmplib.org writes:
Should we move any of the mini-gmp changes to 5.1.2?
I think the following would make sense to include:
2013-02-25 Niels Möller ni...@lysator.liu.se
* mini-gmp/tests/t-double.c
Should we move any of the mini-gmp changes to 5.1.2?
--
Torbjörn
I think it is time for a 5.1.2 release, since we've found and fixed a
couple of bugs since the last release.
I am redirecting the nightly build scripts to use the 5.1 repo.
The main repository will thus be untested for a while.
Unless I hear protests, I'll make the new release towards the end of
Marc Glisse marc.gli...@inria.fr writes:
I need to backport a couple changes that I made soon after 5.1
branched, I'll try to do that soon...
Sorry, I had missed that.
I will not make the release until you have the time to address this.
--
Torbjörn
ni...@lysator.liu.se (Niels Möller) writes:
I think Newton analogues exist only when b is a power, not in general.
And the most important case is prime b.
I think it exists also when b can be factorised into prime powers...
I am not familiar with the Jebelean (or Möller!) criteria for
[I fixed the grammar in my self-quotations, hopefully not against some
netiquette]
We don't need to insist on keeping operands positive.
Hmm. In general, one needs to replace the largest number, to make
progress. But I guess in the case of many high bits being equal, it might
not
I started a web page on this: gmplib.org/devel/sec.html
Feel free to make changes as usual.
--
Torbjörn
Richard Henderson r...@twiddle.net writes:
Building on the copyi that tege committed the other day, use neon for
the logical operations too.
I committed the 128 bit version to arm/neon, making it become used for
all Neon capable processors. I put it there since it is a speedup for
A9 as
I've been busy improving addmul_1 and submul_1 for Cortex-A15 lately.
It turned out to be possible to reach 2 c/l for addmul_1 using plain
(non-SIMD) operations; such code is in the repo since a few days. The
trick was to move the recurrency path away from multiply-accumulate
instructions, and
David Miller da...@davemloft.net writes:
From: Torbjorn Granlund t...@gmplib.org
Date: Tue, 16 Apr 2013 14:43:58 +0200
If we cannot make a configure test, we need to know if there is a
release where the assembler can be trusted.
After some discussions with my Oracle contact, I
Where to go from here? If we want to clean up some old SPARC code,
then we have learnt that we need to test the result on several key platforms.
We also don't want to create slower code, unless the old code is clearly
broken (in more than a hypothetical way).
For the 64bit case, it is safe to assume