Re: GMP 6 the incomatible GMP?
Most people seem to prefer splitting the alloc data over a field in the structure and a field before mp_d[0], while I think that'd be slower as well as more complex than the "floating-point" coding I have in mind. My proposal will make allocations somewhat inaccurate, but I'll make sure it is never more that one percent of the allocation. I'll try my proposal at some point, to evaluate its effects. I might need to introduce a few more macros in the process, which should then make it easy to try various other alloc representeation styles. -- Torbjörn ___ gmp-devel mailing list gmp-devel@gmplib.org http://gmplib.org/mailman/listinfo/gmp-devel
Re: GMP 6 the incomatible GMP?
Marc Glisse writes: Experimentally (for me, others can have wildly different use cases), what really helps for small nums isn't to save space, it is to not allocate at all, for instance storing a few limbs in the base type and only allocating when that's not enough (recycling storage can work even better, but is for very specific uses and may not make sense for a generic library). There are plans to add a bignum type to the C++ standard, and people are discussing whether it should be part of the specification that constructing small numbers (not even just 0) doesn't use dynamic allocation. Now the target users of this bignum type and GMP are not exactly the same (bignum would be used a lot by people who only need int but want to be safe), so it makes sense if the best solutions differ. I am sure this is often a god idea, perhaps it would be good also in GMP. My main reason for not originally doing it like that is code complexity. Having three operands (thing mpz_add) each with two possible representations, will give some alternative execution paths. Another problem is mpz_t size. It is currently 16 bytes on 64-bit machines (and 12 bytes on on 32-bit machines). Even if only one bit of these bytes are used for typing, we ain't getting much headroom from the remaining 127 and 95 bits, respectively. Making the structures larger is to costly. -- Torbjörn ___ gmp-devel mailing list gmp-devel@gmplib.org http://gmplib.org/mailman/listinfo/gmp-devel
Re: GMP 6 the incomatible GMP?
From: Marc Glisse Date: Wed, 9 Jan 2013 00:14:15 +0100 (CET) > PS: thanks a lot for your sparc assembly work! No problem, it's actually fun, like solving sudoku puzzles, for me. ___ gmp-devel mailing list gmp-devel@gmplib.org http://gmplib.org/mailman/listinfo/gmp-devel
Re: GMP 6 the incomatible GMP?
On Tue, 8 Jan 2013, David Miller wrote: From: Torbjorn Granlund Date: Tue, 08 Jan 2013 12:50:01 +0100 Marc Glisse writes: Then we might as well always put it in _mp_d[-1], no? IIRC that's what mpfr does. I see some problems with that: * There is a serial memory dependency problem, making the allocation check at least one more cache latency away. I don't think OoO execution will be able to hide that, as this is used. My intuition would be that this won't matter, but you have more experience than me with that. I don't know how amicable you are to an idea like this, but if user applications can only treat pointers to these objects as opaque then you can encode information in the lowest bits of the pointer. For example, you could encode in the lowest bit whether the _mp_d[-1] thing is being done or not. Then it's at worst a mis-predicted branch in the allocation check. That seems similar (not saying one is better than the other) to what Niels was proposing, where he was encoding it as a special value of the bitfield _mp_alloc (so basically in a few bits of _mp_size) instead of _mp_d. What is done by bignum libraries in many languages is that the base object is a single pointer, and depending on the lowest (?) bit it is interpreted either as a pointer to (alloc+)size+limb[] or as a small integer. But again, their intended main use is with numbers that would fit in an int. -- Marc Glisse PS: thanks a lot for your sparc assembly work! ___ gmp-devel mailing list gmp-devel@gmplib.org http://gmplib.org/mailman/listinfo/gmp-devel
Re: GMP 6 the incomatible GMP?
From: Torbjorn Granlund Date: Tue, 08 Jan 2013 12:50:01 +0100 > Marc Glisse writes: > > Then we might as well always put it in _mp_d[-1], no? IIRC that's what > mpfr does. > > I see some problems with that: > > * There is a serial memory dependency problem, making the allocation > check at least one more cache latency away. I don't think OoO > execution will be able to hide that, as this is used. I don't know how amicable you are to an idea like this, but if user applications can only treat pointers to these objects as opaque then you can encode information in the lowest bits of the pointer. For example, you could encode in the lowest bit whether the _mp_d[-1] thing is being done or not. Then it's at worst a mis-predicted branch in the allocation check. ___ gmp-devel mailing list gmp-devel@gmplib.org http://gmplib.org/mailman/listinfo/gmp-devel
Re: GMP 6 the incomatible GMP?
Torbjorn Granlund writes: > * There is a serial memory dependency problem, making the allocation > check at least one more cache latency away. I don't think OoO > execution will be able to hide that, as this is used. And that should matter for small bignums only, right, if we care about relative overhead? Then, it's desirable to have *small* allocs directly in the mpz_t struct. > * There is a slight type error mixing limbs and sizes. The ASL patch by > Per Olofsson will require many limbs to represent a size. I think this can be solved in a type safe way, no matter which of mp_limb_T and mp_size_t is the larger type. One would need to use something more complicated than _mp_d[-1] (and this shouold be hidden in some macro anyay), involving something like struct mp_storage { mp_size_t alloc; mp_limb_t limbs[1]; /* Variable size */ }; Use this when allocating, pass around pointers to the limbs member, and to access the alloc field, do some casting and a subtraction of offsetof(struct mp_storage, limbs). Regards, /Niels -- Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26. Internet email is subject to wholesale government surveillance. ___ gmp-devel mailing list gmp-devel@gmplib.org http://gmplib.org/mailman/listinfo/gmp-devel
Re: GMP 6 the incomatible GMP?
Marc Glisse writes: Then we might as well always put it in _mp_d[-1], no? IIRC that's what mpfr does. I see some problems with that: * There is a serial memory dependency problem, making the allocation check at least one more cache latency away. I don't think OoO execution will be able to hide that, as this is used. * There is a slight type error mixing limbs and sizes. The ASL patch by Per Olofsson will require many limbs to represent a size. But this is a testing feature, which should not get in they way of an otherwise desirable change. -- Torbjörn ___ gmp-devel mailing list gmp-devel@gmplib.org http://gmplib.org/mailman/listinfo/gmp-devel
Re: GMP 6 the incomatible GMP?
On Tue, 8 Jan 2013, Torbjorn Granlund wrote: > ni...@lysator.liu.se (Niels M?ller) writes: > > Hmm. I guess one could go down to only 8 bits or so for the alloc field, > and for larger allocations, store the number of allocated limbs at the > head of the limb array (in the case limb size is not artificially small, > it could be just _mp_d[-1]). > > That's another idea. > > Do you think 48 bits (or 47 bits absolute value) for _mp_size will be > large enough for the foreseeable future (say, 10 years)? > > I think that will be sufficient for some hundred years, since any > super-linear algorithm applied to an operand of 2^53 bits (2^47 64-bit > limbs) would hardly ever terminate even with a multi-THz CPU. > > There are also other tweaks discussed for mpz_t, to allow initialization > without allocation. > > Yep, we should do that too. I cannot recall the details, but I assume > that this would not create an incompatibility? Another idea is to embed limbs for small numbers into __mpz_struct itself (I was playing with that idea to reduce the number of allocations done by ISL used by GCC now). Sort of have a magic _mp_alloc number (zero?) that tells GMP that everything after _mp_alloc are the actual limbs. Thus have typedef struct { int _mp_alloc; int _mp_size; union { mp_libm_t *_mp_d; mp_libm_t embed_limbs[N]; }; } __mpz_struct; with suitable N (what's a suitable number of limbs to special-case, given that limb-size is target dependent? One limb would be easiest (just use the space of the _mp_d pointer), extra "half" limbs would be ugly I suppose. ISL has lots of arrays of __mpz_structs which seem to be often copied around and the actual numbers are usually small (would fit into 64bits). Doing this wouldn't create an incompatibility I think, but if you re-org things anyway maybe there is a better opportunity to layout things. Thanks, Richard. -- Richard Biener SUSE / SUSE Labs SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 GF: Jeff Hawn, Jennifer Guild, Felix Imend ___ gmp-devel mailing list gmp-devel@gmplib.org http://gmplib.org/mailman/listinfo/gmp-devel
Re: GMP 6 the incomatible GMP?
On Tue, 8 Jan 2013, Niels Möller wrote: Marc Glisse writes: Then we might as well always put it in _mp_d[-1], no? IIRC that's what mpfr does. That will increase overhead (both allocation and processing, I think) for small bignums, which may be undesirable. But I don't really know how significant that is. I am not sure about processing. The allocated size is hidden behind the pointer, but you are already dereferencing that pointer anyway, and you are saving on the bit manipulations to separate size from alloc. It does indeed waste a bit of heap space (not sure that's significant), which can also increase the processing in malloc/free. Experimentally (for me, others can have wildly different use cases), what really helps for small nums isn't to save space, it is to not allocate at all, for instance storing a few limbs in the base type and only allocating when that's not enough (recycling storage can work even better, but is for very specific uses and may not make sense for a generic library). There are plans to add a bignum type to the C++ standard, and people are discussing whether it should be part of the specification that constructing small numbers (not even just 0) doesn't use dynamic allocation. Now the target users of this bignum type and GMP are not exactly the same (bignum would be used a lot by people who only need int but want to be safe), so it makes sense if the best solutions differ. -- Marc Glisse ___ gmp-devel mailing list gmp-devel@gmplib.org http://gmplib.org/mailman/listinfo/gmp-devel
Re: GMP 6 the incomatible GMP?
Marc Glisse writes: > Then we might as well always put it in _mp_d[-1], no? IIRC that's what > mpfr does. That will increase overhead (both allocation and processing, I think) for small bignums, which may be undesirable. But I don't really know how significant that is. Regards, /Niels -- Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26. Internet email is subject to wholesale government surveillance. ___ gmp-devel mailing list gmp-devel@gmplib.org http://gmplib.org/mailman/listinfo/gmp-devel
Re: GMP 6 the incomatible GMP?
On Tue, 8 Jan 2013, Niels Möller wrote: Torbjorn Granlund writes: The idea is to instead make a 64-bit bitfield of the _mp_size and _mp_alloc fields. One would give _mp_size about 48 bits, and _mp_alloc the remaining 16. The alloc field would lose granularity, but should code small sizes exactly. Hmm. I guess one could go down to only 8 bits or so for the alloc field, and for larger allocations, store the number of allocated limbs at the head of the limb array (in the case limb size is not artificially small, it could be just _mp_d[-1]). Then we might as well always put it in _mp_d[-1], no? IIRC that's what mpfr does. -- Marc Glisse ___ gmp-devel mailing list gmp-devel@gmplib.org http://gmplib.org/mailman/listinfo/gmp-devel
Re: GMP 6 the incomatible GMP?
On Tue, Jan 8, 2013 at 12:28 AM, Torbjorn Granlund wrote: > ni...@lysator.liu.se (Niels Möller) writes: > > Hmm. I guess one could go down to only 8 bits or so for the alloc field, > and for larger allocations, store the number of allocated limbs at the > head of the limb array (in the case limb size is not artificially small, > it could be just _mp_d[-1]). > > That's another idea. > > Do you think 48 bits (or 47 bits absolute value) for _mp_size will be > large enough for the foreseeable future (say, 10 years)? > > I think that will be sufficient for some hundred years, since any > super-linear algorithm applied to an operand of 2^53 bits (2^47 64-bit > limbs) would hardly ever terminate even with a multi-THz CPU. Well, pi has been computed to 10 trillion digits, which is 2^45 bits. Extrapolating the records from the last 60 years, a 2^53-bit multiplication can be expected to occur within 30 years, though certainly with highly parallel hardware, not a single CPU. Actually, NTTs of that size ought to be feasible on the current generation of top supercomputers, in theory at least. Of course 10+ years from now you could just release GMP 7 and break backwards compatibility again ;-) Fredrik ___ gmp-devel mailing list gmp-devel@gmplib.org http://gmplib.org/mailman/listinfo/gmp-devel
Re: GMP 6 the incomatible GMP?
ni...@lysator.liu.se (Niels Möller) writes: Hmm. I guess one could go down to only 8 bits or so for the alloc field, and for larger allocations, store the number of allocated limbs at the head of the limb array (in the case limb size is not artificially small, it could be just _mp_d[-1]). That's another idea. Do you think 48 bits (or 47 bits absolute value) for _mp_size will be large enough for the foreseeable future (say, 10 years)? I think that will be sufficient for some hundred years, since any super-linear algorithm applied to an operand of 2^53 bits (2^47 64-bit limbs) would hardly ever terminate even with a multi-THz CPU. There are also other tweaks discussed for mpz_t, to allow initialization without allocation. Yep, we should do that too. I cannot recall the details, but I assume that this would not create an incompatibility? -- Torbjörn ___ gmp-devel mailing list gmp-devel@gmplib.org http://gmplib.org/mailman/listinfo/gmp-devel
Re: GMP 6 the incomatible GMP?
Torbjorn Granlund writes: > The idea is to instead make a 64-bit bitfield of the _mp_size and > _mp_alloc fields. One would give _mp_size about 48 bits, and _mp_alloc > the remaining 16. The alloc field would lose granularity, but should > code small sizes exactly. Hmm. I guess one could go down to only 8 bits or so for the alloc field, and for larger allocations, store the number of allocated limbs at the head of the limb array (in the case limb size is not artificially small, it could be just _mp_d[-1]). Do you think 48 bits (or 47 bits absolute value) for _mp_size will be large enough for the foreseeable future (say, 10 years)? There are also other tweaks discussed for mpz_t, to allow initialization without allocation. Regards, /Niels -- Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26. Internet email is subject to wholesale government surveillance. ___ gmp-devel mailing list gmp-devel@gmplib.org http://gmplib.org/mailman/listinfo/gmp-devel
GMP 6 the incomatible GMP?
We have discussed, perhaps mostly off-list, to fix several nuisances in the library that will break binary compatibility. There is actually a terse page about it (linked from the developer's corner): https://gmplib.org/devel/incompatibility.html I would like to lift the precision limit (for mpf) and size limit (for mpz, mpq) by changing how mpz_t (etc) works. We could simply change the int fields to size_t, but that would make the structure a lot larger. Programs that need to juggle lots of mostly small GMP numbers would suffer. The idea is to instead make a 64-bit bitfield of the _mp_size and _mp_alloc fields. One would give _mp_size about 48 bits, and _mp_alloc the remaining 16. The alloc field would lose granularity, but should code small sizes exactly. I haven't worked out the details, but I think this can be done without overhead for small allocations (say, up to 255 limbs) and that for larger sizes the overhead will be negligible compared to the time used for the computation. With clever coding, a small result which is about to be written to an operand which has previously been large enough to have had its alloc field coded for large sizes, a plain comparison like needed_alloc < ALLOC(r) would return true. There will only be false positives for larger needed allocation, where a reallocation will be assumed to be needed, but then decoding the _mp_alloc field would conclude new allocation wasn't really needed. Why should we consider this? Well, we today cannot handle operands > 2^37, which is hardly "arithmetic without limitations" on today's computers. And the new small primes multiplication code (the latest version at http://gmplib.org:8000/gcd-nisse/) will boost huge multiplication enough to make operands of 2^40 bits feasible. (It will be challening to test GMP for greater sizes, since one operand of 2^37 is 16 GiB of RAM.) -- Torbjörn ___ gmp-devel mailing list gmp-devel@gmplib.org http://gmplib.org/mailman/listinfo/gmp-devel