Re: [mpir-devel] Re: mpn_lshift alignment bug

2010-01-03 Thread Bill Hart
I've made the proposed improvement to lshift.asm the subject of trac ticket #271. At present issuing a release is being held up by me not setting up the necessary infrastructure to fix our autotools installation, which is broken. But I anticipate having it done in the next few weeks, as I find the

Re: [mpir-devel] Re: mpn_lshift alignment bug

2009-12-21 Thread Bill Hart
2009/12/21 Bill Hart : > 2009/12/21 Dan Grayson : >> Right.  The proposal would be to declare that a bug.  Then we would >> have to fix it.  Fixing it would mean rewriting mpn_lshift so it >> crashes or returns the right answer with misaligned limbs.  If it had >> been written that way in the first

Re: [mpir-devel] Re: mpn_lshift alignment bug

2009-12-21 Thread Bill Hart
x86_64 was introduced less than 10 years ago. It is likely that the assumptions you made then were valid then. Note that on 32 bit machines you only need 4 byte alignment. It is only the x86_64 ABI which requires alignment to 8 bytes. Bill. 2009/12/21 Dan Grayson : > The memory allocator provided

Re: [mpir-devel] Re: mpn_lshift alignment bug

2009-12-21 Thread Bill Hart
2009/12/21 Dan Grayson : > Right.  The proposal would be to declare that a bug.  Then we would > have to fix it.  Fixing it would mean rewriting mpn_lshift so it > crashes or returns the right answer with misaligned limbs.  If it had > been written that way in the first place, we wouldn't be having

Re: [mpir-devel] Re: mpn_lshift alignment bug

2009-12-21 Thread Bill Hart
sizeof(void *) will be fine on most systems including all the important ones we support, as MPIR usually tries to select a limb which is either long or long long depending on what is available, and this is the size of a pointer on all the important systems I am aware of. Bill. 2009/12/21 Dan Gray

Re: [mpir-devel] Re: mpn_lshift alignment bug

2009-12-21 Thread Bill Hart
But as I pointed out. Fixing this "bug" would mean more than rewriting lshift. We'd have to potentially rewrite all the functions in MPIR. Furthermore we'd have to rewrite the entire test suite to certify that each and every function obeyed these new rules. Then we'd have to figure out when the C c

[mpir-devel] Re: mpn_lshift alignment bug

2009-12-21 Thread Dan Grayson
The memory allocator provided with Singular-Factory is in the file libcfmem.a, and the routines are called getBlock, freeBlock, and reallocBlock. I copied the routines more than 10 years ago and continue to use the modified copy. I may even have destroyed 8 byte alignment then while optimizing it

[mpir-devel] Re: mpn_lshift alignment bug

2009-12-21 Thread Dan Grayson
Right. The proposal would be to declare that a bug. Then we would have to fix it. Fixing it would mean rewriting mpn_lshift so it crashes or returns the right answer with misaligned limbs. If it had been written that way in the first place, we wouldn't be having this discussion, because I would

[mpir-devel] Re: mpn_lshift alignment bug

2009-12-20 Thread Cactus
On Dec 20, 8:32 pm, Dan Grayson wrote: > I don't see how your "make check" routine could ever even come into > contact with my custom allocator.  Also, it's probably not worth your > while to try to distinguish a custom allocator from your native > allocator at run time, as it will waste time. >

Re: [mpir-devel] Re: mpn_lshift alignment bug

2009-12-20 Thread Bill Hart
2009/12/20 Dan Grayson : > I don't see how your "make check" routine could ever even come into > contact with my custom allocator. True. > Also, it's probably not worth your > while to try to distinguish a custom allocator from your native > allocator at run time, as it will waste time. Agreed.

[mpir-devel] Re: mpn_lshift alignment bug

2009-12-20 Thread Dan Grayson
I don't see how your "make check" routine could ever even come into contact with my custom allocator. Also, it's probably not worth your while to try to distinguish a custom allocator from your native allocator at run time, as it will waste time. I suggest checking every pointer returned by the a

[mpir-devel] Re: mpn_lshift alignment bug

2009-12-11 Thread Cactus
On Dec 11, 5:48 pm, Bill Hart wrote: > OK, I have now documented the requirements in the relevant section of > the documentation. See here for a preliminary version: > > http://sage.math.washington.edu/home/wbhart/mpir-ac/doc/mpir.pdf (note > the date on the title page is incorrect). > > I have a

Re: [mpir-devel] Re: mpn_lshift alignment bug

2009-12-11 Thread Bill Hart
2009/12/11 Bill Hart : > OK, I have now documented the requirements in the relevant section of > the documentation. See here for a preliminary version: > > http://sage.math.washington.edu/home/wbhart/mpir-ac/doc/mpir.pdf (note > the date on the title page is incorrect). > > I have also added all th

Re: [mpir-devel] Re: mpn_lshift alignment bug

2009-12-11 Thread Bill Hart
OK, I have now documented the requirements in the relevant section of the documentation. See here for a preliminary version: http://sage.math.washington.edu/home/wbhart/mpir-ac/doc/mpir.pdf (note the date on the title page is incorrect). I have also added all the papers which we have used, and Pe

[mpir-devel] Re: mpn_lshift alignment bug

2009-12-11 Thread Cactus
On Dec 11, 5:12 pm, Bill Hart wrote: > Oh OK!! I've been looking at the Core2 file. Sorry. In fact that > instruction is used in the K10 and atom files. > > However, > > a) The K10 file passes the try test code on Selmer, which is a K10 > > b) If the data was not aligned, a segfault would occur,

Re: [mpir-devel] Re: mpn_lshift alignment bug

2009-12-11 Thread Bill Hart
Oh OK!! I've been looking at the Core2 file. Sorry. In fact that instruction is used in the K10 and atom files. However, a) The K10 file passes the try test code on Selmer, which is a K10 b) If the data was not aligned, a segfault would occur, not a wrong answer. Bill. 2009/12/11 Bill Hart : >

Re: [mpir-devel] Re: mpn_lshift alignment bug

2009-12-11 Thread Bill Hart
Right, but the try test would fail in that case if it wasn't handled properly, as it definitely tests alignments on alternate limbs. I would consider it a bug if MPIR used this instruction on data which was not aligned. It should have special code to handle the unaligned case, or use the MOVDQU ins

Re: [mpir-devel] Re: mpn_lshift alignment bug

2009-12-11 Thread Jason Martin
I believe that the alignment needs to be 16 bytes, not just 8 bytes for the MOVDQA instruction. Jason Worth Martin Asst. Professor of Mathematics http://www.math.jmu.edu/~martin On Fri, Dec 11, 2009 at 9:31 AM, Cactus wrote: > > > On Dec 11, 2:11 pm, Bill Hart wrote: >> omalloc is easy to fix
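
The distinction between 8-byte and 16-byte alignment raised here can be illustrated with a small check. This is only a sketch: limb_t is an assumed 8-byte stand-in for MPIR's mp_limb_t, and aligned_to and count_16_aligned are hypothetical helper names, not part of MPIR.

```c
#include <stddef.h>
#include <stdint.h>

typedef unsigned long long limb_t; /* assumed 8-byte stand-in for mp_limb_t */

/* Nonzero when p sits on an n-byte boundary; n must be a power of two. */
static int aligned_to(const void *p, uintptr_t n)
{
    return ((uintptr_t)p & (n - 1)) == 0;
}

/* Count how many of the n limbs starting at p begin on a 16-byte boundary,
   i.e. could be the start of an aligned 16-byte (MOVDQA-style) access. */
static size_t count_16_aligned(const limb_t *p, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        count += (size_t)aligned_to(p + i, 16);
    return count;
}
```

For an array of four limbs that itself starts on a 16-byte boundary, only the even-indexed limbs pass the 16-byte check; the odd-indexed ones are 8-byte aligned but not 16-byte aligned, so an aligned 16-byte load starting there would need an unaligned instruction or special-case code.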

Re: [mpir-devel] Re: mpn_lshift alignment bug

2009-12-11 Thread Bill Hart
I checked the code for Boehm's gc and on x86_64 all pointers are aligned to 8 byte boundaries (see the files gcconfig.h for x86_64 and the definition of ALIGNMENT, gc_priv.h for the definition of BYTES_TO_WORDS, ROUNDED_UP_WORDS and ALIGNED_WORDS. One assumes that this code is actually #included an

[mpir-devel] Re: mpn_lshift alignment bug

2009-12-11 Thread Cactus
On Dec 11, 2:11 pm, Bill Hart wrote: > omalloc is easy to fix. Just configure it with --with-align=8 on 64 > bit systems. There are other configure options for how alignment is > done. In fact it should be configured this way by the build system on 64-bit systems: #ifndef OM_ALIGN_8 /* define

Re: [mpir-devel] Re: mpn_lshift alignment bug

2009-12-11 Thread Bill Hart
omalloc is easy to fix. Just configure it with --with-align=8 on 64 bit systems. There are other configure options for how alignment is done. Bill. 2009/12/11 Cactus : > > > On Dec 10, 9:13 pm, Dan Grayson wrote: >> Well, Macaulay2 comes with thousands of tests, and only one failed, >> deep insi

[mpir-devel] Re: mpn_lshift alignment bug

2009-12-11 Thread Cactus
On Dec 10, 9:13 pm, Dan Grayson wrote: > Well, Macaulay2 comes with thousands of tests, and only one failed, > deep inside a D-modules computation.  It took a lot of single stepping > to figure it out.  Part of the problem is that we have two custom > memory allocators, Boehm's gc and one copied

[mpir-devel] Re: mpn_lshift alignment bug

2009-12-11 Thread Cactus
On Dec 11, 2:38 am, Jason Martin wrote: > Dan, thanks for this bug report.  I could also see that it probably > took you many many tedious hours to find this bug.  Amazing work!! > Thank you sooo much for taking the time to do it!!! > > This is a really subtle bug.  I just checked the malloc doc

Re: [mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Bill Hart
2009/12/11 Bill Hart : > There's still the question of how we have people who write garbage > allocators recognise that this is a bug. > Just in case there is any doubt. Yes, this was only a joke. > Bill.

Re: [mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Bill Hart
It's definitely not that simple, unfortunately. Jason's assembler code optimises the leadins of all the functions. Even a single cycle difference means the entire function needs to be reoptimised. Given that there are now scores of assembler functions, you are talking about months of computing ti

Re: [mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Jason Martin
Dan, thanks for this bug report. I could also see that it probably took you many many tedious hours to find this bug. Amazing work!! Thank you sooo much for taking the time to do it!!! This is a really subtle bug. I just checked the malloc documentation on several Linux distros, Mac OS X, BSD,

Re: [mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Bill Hart
There's still the question of how we have people who write garbage allocators recognise that this is a bug. Bill. 2009/12/10 Cactus : > > > On Dec 10, 10:20 pm, Bill Hart wrote: >> Hi Dan, >> >> indeed I had inferred that you had done a *lot* of work to track this bug >> down. >> >> It's not im

[mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Cactus
On Dec 10, 10:20 pm, Bill Hart wrote: > Hi Dan, > > indeed I had inferred that you had done a *lot* of work to track this bug > down. > > It's not immediately clear how we can cause an error message to be > output if a custom allocator returns non-aligned pointers. > > Two possibilities spring

Re: [mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Bill Hart
By the way, which allocator was it? I did not know about the Singular one and had assumed it was Boehm. Bill. 2009/12/10 Bill Hart : > Hi Dan, > > indeed I had inferred that you had done a *lot* of work to track this bug > down. > > It's not immediately clear how we can cause an error message to

Re: [mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Bill Hart
Hi Dan, indeed I had inferred that you had done a *lot* of work to track this bug down. It's not immediately clear how we can cause an error message to be output if a custom allocator returns non-aligned pointers. Two possibilities spring to mind: 1) Introduce a make check test which checks tha

[mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Dan Grayson
That's a fine decision, but now it becomes super important to help developers discover that their custom allocator is breaking the rule, if it does. It took me many hours of single-stepping to discover it. It would have saved me a lot of time if the program had aborted with a suitable error messag
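
A check of the kind asked for here could look roughly like the following. It is a minimal sketch, not anything MPIR actually ships: is_aligned and checked_alloc are hypothetical names, and the wrapper simply aborts with a message when the wrapped allocator hands back a block that is not on the requested boundary.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: nonzero when p is a multiple of `alignment` bytes. */
static int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p % alignment) == 0;
}

/* Hypothetical wrapper around any malloc-style allocator: fail loudly
   instead of letting assembly code silently compute a wrong answer. */
static void *checked_alloc(void *(*alloc)(size_t), size_t bytes,
                           size_t alignment)
{
    void *p = alloc(bytes);
    if (p != NULL && !is_aligned(p, alignment)) {
        fprintf(stderr, "allocator returned %p, which is not %zu-byte aligned\n",
                p, alignment);
        abort();
    }
    return p;
}
```

An application with a custom allocator could route its allocation callback through such a wrapper during debugging, for example as checked_alloc(my_alloc, n, sizeof(void *)), where my_alloc is a placeholder for the custom routine. As noted elsewhere in the thread, a check on every allocation costs some time, so it may be better suited to a debug build.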

[mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Dan Grayson
Well, Macaulay2 comes with thousands of tests, and only one failed, deep inside a D-modules computation. It took a lot of single stepping to figure it out. Part of the problem is that we have two custom memory allocators, Boehm's gc and one copied from Singular-Factory long ago. The latter is us

[mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Cactus
On Dec 10, 8:09 pm, Bill Hart wrote: > In reality it is rarely a problem for applications, as the ISA usually > only catches people out for SSE stuff, and then the result is a > segfault, which they'll eventually track down. Also, as portable > generic C needs to make no assumption about the end

Re: [mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Bill Hart
In reality it is rarely a problem for applications, as the ISA usually only catches people out for SSE stuff, and then the result is a segfault, which they'll eventually track down. Also, as portable generic C needs to make no assumption about the endianness of the CPU then good C code is rarely wr

[mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Cactus
On Dec 10, 7:09 pm, Bill Hart wrote: > In fact it seems there is plenty of documentary evidence on the web > that GMP also expects 8 byte aligned memory locations for 8 byte limb > arrays. > > Apparently some of the gcc optimisations also expect this (an array of > unsigned longs on a 64 bit mac

Re: [mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Bill Hart
In fact the x86_64 ABI definition requires that unsigned longs are 8 byte aligned: http://www.x86-64.org/documentation/abi-0.99.pdf (page 12). This is why gcc can make this assumption. Of course different vendors implement the ABI differently, and usually drop the requirement. Bill. 2009/12/10

Re: [mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Bill Hart
In fact it seems there is plenty of documentary evidence on the web that GMP also expects 8 byte aligned memory locations for 8 byte limb arrays. Apparently some of the gcc optimisations also expect this (an array of unsigned longs on a 64 bit machine is assumed to have 8 byte aligned addresses -

Re: [mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Bill Hart
Yeah I was wrong! See my other response. 2009/12/10 Cactus : > > > On Dec 10, 5:55 pm, Bill Hart wrote: >> But then why test for non-aligned limbs? >> >> Currently try tests for all possible byte alignments! Or at least it is >> supposed to. I'm still not sure I see what is going wrong with try. T

Re: [mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Bill Hart
Yeah, I see why the test code passes. Here is how it does alignment. Basically it sets the pointer to the source region as follows: s[i].p = s[i].region.ptr + s[i].align; Here the s[i].align is supposed to take care of the alignment. Of course the type of s[i].p can be determined from the defini
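
The message is cut off before the explanation, so the following is only one plausible reading, assuming s[i].region.ptr is a limb-typed pointer. In C, adding an integer to a typed pointer advances it by whole elements, so an offset applied that way can never produce a byte-misaligned limb address. limb_t, limb_step_bytes and byte_offset below are illustrative names, not taken from the try test code.

```c
#include <stddef.h>
#include <stdint.h>

typedef unsigned long long limb_t; /* assumed 8-byte stand-in for mp_limb_t */

/* Distance in bytes covered by the typed pointer arithmetic base + k:
   always k * sizeof(limb_t), so limb alignment is preserved. */
static size_t limb_step_bytes(limb_t *base, size_t k)
{
    return (size_t)((char *)(base + k) - (char *)base);
}

/* Byte-granular offset: going through char * is what it would take
   to produce a genuinely misaligned limb address. */
static void *byte_offset(void *base, size_t nbytes)
{
    return (char *)base + nbytes;
}
```

Under that assumption, an "align" value of 1 moves the pointer one limb forward (8 bytes on x86_64), which exercises alternate limbs but never an address that is off by 1 to 7 bytes.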

[mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Cactus
On Dec 10, 5:55 pm, Bill Hart wrote: > But then why test for non-aligned limbs? > > Currently try tests for all possible byte alignments! Or at least it is > supposed to. I'm still not sure I see what is going wrong with try. To > me this is still a puzzle. Are you sure about this? I have run t

[mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Cactus
On Dec 10, 5:15 pm, Cactus wrote: > On Dec 10, 3:56 pm, Dan Grayson wrote: > > > > > > > He could also mean that the machine instructions that move 8 byte > > words to vector registers would segfault if you tried to use them on > > nonaligned data, and that he tried to write mpn_lshift to try t

Re: [mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Bill Hart
But then why test for non-aligned limbs? Currently try tests for all possible byte alignments! Or at least it is supposed to. I'm still not sure I see what is going wrong with try. To me this is still a puzzle. I didn't actually look at the assembler yet. I am quite sure it *is* broken, for some s

[mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Cactus
On Dec 10, 3:56 pm, Dan Grayson wrote: > He could also mean that the machine instructions that move 8 byte > words to vector registers would segfault if you tried to use them on > nonaligned data, and that he tried to write mpn_lshift to try to avoid > doing that.  Those are the lines of code I

Re: [mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Bill Hart
Ah, ok, that makes more sense. Yeah the SSE stuff requires aligned data. Looking into the test code, I can't figure out why it doesn't fail on Core2 with the current test code, which appears to be using all possible alignments. (I've just run it again for some time and it doesn't fail.) I still h

[mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Dan Grayson
He could also mean that the machine instructions that move 8 byte words to vector registers would segfault if you tried to use them on nonaligned data, and that he tried to write mpn_lshift to try to avoid doing that. Those are the lines of code I pointed to in the original report. It would be a

Re: [mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Bill Hart
Actually, if you are preparing binaries of Macaulay the normal thing would be to use --enable-fat. But for core2, this will specifically pick the broken core2 code in your case. So, in actual fact, the --build=x86_64 would be the only safe option if you are using custom allocation, at the moment an

Re: [mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Bill Hart
Either way, I believe the test code alternately uses both uniformly distributed numbers and numbers with long strings of zeros and ones. Actually, I am embarrassed to say that I don't actually know if it allows the top bit to be zero, or not. Bill. 2009/12/10 Dan Grayson : > I think the bug is ac

[mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Dan Grayson
... oops, I was implicitly referring to mpn/tdiv_qr.c, which counts leading bits before shifting by that amount; probably mpn_lshift itself has no such dependency. On Dec 10, 8:28 am, Dan Grayson wrote: > I think the bug is activated only when the number of lead zero bits in > the denominator is

[mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Dan Grayson
I think the bug is activated only when the number of lead zero bits in the denominator is in a certain narrow range, since that number determines by how much to shift, but I haven't explored that carefully. Therefore, for tests involving alignment, picking integers with random bits is not good (if

[mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Dan Grayson
Yup, that's how I was going to work around it. Actually, I should have been doing it anyway, because I'm trying to prepare distributions of Macaulay2 that will work on all architectures. Somehow I forgot to do it. (A consequence of that is that if someone wants a fast Macaulay2 in which the mpir