Re: [mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Bill Hart
2009/12/11 Bill Hart : > There's still the question of how we have people who write garbage > allocators recognise that this is a bug. > Just in case there is any doubt. Yes, this was only a joke. > Bill.

Re: [mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Bill Hart
It's definitely not that simple, unfortunately. Jason's assembler code optimises the leadins of all the functions. Even a single cycle difference means the entire function needs to be reoptimised. Given that there are now scores of assembler functions, you are talking about months of computing ti

Re: [mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Jason Martin
Dan, thanks for this bug report. I could also see that it probably took you many many tedious hours to find this bug. Amazing work!! Thank you sooo much for taking the time to do it!!! This is a really subtle bug. I just checked the malloc documentation on several Linux distros, Mac OS X, BSD,

Re: [mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Bill Hart
There's still the question of how we have people who write garbage allocators recognise that this is a bug. Bill. 2009/12/10 Cactus : > > > On Dec 10, 10:20 pm, Bill Hart wrote: >> Hi Dan, >> >> indeed I had inferred that you had done a *lot* of work to track this bug >> down. >> >> It's not im

[mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Cactus
On Dec 10, 10:20 pm, Bill Hart wrote: > Hi Dan, > > indeed I had inferred that you had done a *lot* of work to track this bug > down. > > It's not immediately clear how we can cause an error message to be > output if a custom allocator returns non-aligned pointers. > > Two possibilities spring

Re: [mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Bill Hart
By the way, which allocator was it? I did not know about the Singular one and had assumed it was Boehm. Bill. 2009/12/10 Bill Hart : > Hi Dan, > > indeed I had inferred that you had done a *lot* of work to track this bug > down. > > It's not immediately clear how we can cause an error message to

Re: [mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Bill Hart
Hi Dan, indeed I had inferred that you had done a *lot* of work to track this bug down. It's not immediately clear how we can cause an error message to be output if a custom allocator returns non-aligned pointers. Two possibilities spring to mind: 1) Introduce a make check test which checks tha

[mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Dan Grayson
That's a fine decision, but now it becomes super important to help developers discover that their custom allocator is breaking the rule, if it does. It took me many hours of single-stepping to discover it. It would have saved me a lot of time if the program had aborted with a suitable error messag
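
The kind of check Dan asks for could be sketched as follows. This is only an illustration under stated assumptions: limbs are taken to be 8 bytes (as on x86_64), and `is_limb_aligned` and `checked_alloc` are hypothetical names, not part of MPIR. A wrapper of this shape could be adapted to sit in front of whatever allocation function an application hands to `mp_set_memory_functions`.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Assumption: limbs are 8 bytes wide, as on x86_64. */
#define LIMB_BYTES 8u

/* Returns 1 if p lies on a limb boundary, 0 otherwise. */
static int is_limb_aligned(const void *p)
{
    return ((uintptr_t)p % LIMB_BYTES) == 0;
}

/* Hypothetical wrapper (not an MPIR function): calls the underlying
   allocator and aborts with a message if the block is misaligned. */
static void *checked_alloc(void *(*alloc)(size_t), size_t n)
{
    void *p = alloc(n);
    if (p != NULL && !is_limb_aligned(p)) {
        fprintf(stderr,
                "allocator returned %p, which is not aligned to %u bytes\n",
                p, LIMB_BYTES);
        abort();
    }
    return p;
}
```

Failing at allocation time, rather than deep inside a later computation, is the point: the abort happens at the first misaligned block instead of surfacing as a wrong answer many calls later.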

[mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Dan Grayson
Well, Macaulay2 comes with thousands of tests, and only one failed, deep inside a D-modules computation. It took a lot of single stepping to figure it out. Part of the problem is that we have two custom memory allocators, Boehm's gc and one copied from Singular-Factory long ago. The latter is us

[mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Cactus
On Dec 10, 8:09 pm, Bill Hart wrote: > In reality it is rarely a problem for applications, as the ISA usually > only catches people out for SSE stuff, and then the result is a > segfault, which they'll eventually track down. Also, as portable > generic C needs to make no assumption about the end

Re: [mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Bill Hart
In reality it is rarely a problem for applications, as the ISA usually only catches people out for SSE stuff, and then the result is a segfault, which they'll eventually track down. Also, as portable generic C needs to make no assumption about the endianness of the CPU then good C code is rarely wr

[mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Cactus
On Dec 10, 7:09 pm, Bill Hart wrote: > In fact it seems there is plenty of documentary evidence on the web > that GMP also expects 8 byte aligned memory locations for 8 byte limb > arrays. > > Apparently some of the gcc optimisations also expect this (an array of > unsigned longs on a 64 bit mac

Re: [mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Bill Hart
In fact the x86_64 ABI definition requires that unsigned longs are 8 byte aligned: http://www.x86-64.org/documentation/abi-0.99.pdf (page 12). This is why gcc can make this assumption. Of course different vendors implement the ABI differently, and usually drop the requirement. Bill. 2009/12/10

Re: [mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Bill Hart
In fact it seems there is plenty of documentary evidence on the web that GMP also expects 8 byte aligned memory locations for 8 byte limb arrays. Apparently some of the gcc optimisations also expect this (an array of unsigned longs on a 64 bit machine is assumed to have 8 byte aligned addresses -

Re: [mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Bill Hart
Yeah I was wrong! See my other response. 2009/12/10 Cactus : > > > On Dec 10, 5:55 pm, Bill Hart wrote: >> But then why test for non-aligned limbs? >> >> Currently try tests for all possible byte alignments! Or at least it is >> supposed to. I'm still not sure I see what is going wrong with try. T

Re: [mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Bill Hart
Yeah, I see why the test code passes. Here is how it does alignment. Basically it sets the pointer to the source region as follows: s[i].p = s[i].region.ptr + s[i].align; Here the s[i].align is supposed to take care of the alignment. Of course the type of s[i].p can be determined from the defini
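
A small sketch of why an offset added to a typed pointer might never produce a misaligned address. It assumes, which the truncated message above does not confirm, that the region pointer is a limb pointer, so that the offset counts limbs rather than bytes. `limb_t` is a stand-in for `mp_limb_t`, and all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;  /* stand-in for mp_limb_t, assumed 8 bytes */

/* Offset applied to a limb pointer: C scales it by sizeof(limb_t),
   so the result has the same byte alignment as base. */
static limb_t *offset_in_limbs(limb_t *base, size_t align)
{
    return base + align;
}

/* Offset applied to a byte pointer: moves by exactly `align` bytes,
   which is what a true misalignment test would need. */
static unsigned char *offset_in_bytes(limb_t *base, size_t align)
{
    return (unsigned char *)base + align;
}

/* Distance in bytes from the previous limb boundary. */
static size_t misalignment(const void *p)
{
    return (size_t)((uintptr_t)p % sizeof(limb_t));
}
```

With a limb-typed base, `misalignment(offset_in_limbs(base, k))` equals `misalignment(base)` for every `k`, so a test built this way would only ever exercise limb-aligned operands.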

[mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Cactus
On Dec 10, 5:55 pm, Bill Hart wrote: > But then why test for non-aligned limbs? > > Currently try tests for all possible byte alignments! Or at least it is > supposed to. I'm still not sure I see what is going wrong with try. To > me this is still a puzzle. Are you sure about this? I have run t

[mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Cactus
On Dec 10, 5:15 pm, Cactus wrote: > On Dec 10, 3:56 pm, Dan Grayson wrote: > > He could also mean that the machine instructions that move 8 byte > > words to vector registers would segfault if you tried to use them on > > nonaligned data, and that he tried to write mpn_lshift to try t

Re: [mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Bill Hart
But then why test for non-aligned limbs? Currently try tests for all possible byte alignments! Or at least it is supposed to. I'm still not sure I see what is going wrong with try. To me this is still a puzzle. I didn't actually look at the assembler yet. I am quite sure it *is* broken, for some s

[mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Cactus
On Dec 10, 3:56 pm, Dan Grayson wrote: > He could also mean that the machine instructions that move 8 byte > words to vector registers would segfault if you tried to use them on > nonaligned data, and that he tried to write mpn_lshift to try to avoid > doing that.  Those are the lines of code I

Re: [mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Bill Hart
Ah, ok, that makes more sense. Yeah the SSE stuff requires aligned data. Looking into the test code, I can't figure out why it doesn't fail on Core2 with the current test code, which appears to be using all possible alignments. (I've just run it again for some time and it doesn't fail.) I still h

[mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Dan Grayson
He could also mean that the machine instructions that move 8 byte words to vector registers would segfault if you tried to use them on nonaligned data, and that he tried to write mpn_lshift to try to avoid doing that. Those are the lines of code I pointed to in the original report. It would be a

Re: [mpir-devel] mpn_lshift alignment bug

2009-12-10 Thread Bill Hart
I found the following in correspondence with Jason (I can't just ask him, as his internet seems to be offline at the moment): "Misaligned data will segfault on some arches and instructions eg K10 and lshift" So presumably he has assumed data is aligned on K10, written a fast version of the code f

Re: [mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Bill Hart
Actually, if you are preparing binaries of Macaulay the normal thing would be to use --enable-fat. But for core2, this will specifically pick the broken core2 code in your case. So, in actual fact, the --build=x86_64 would be the only safe option if you are using custom allocation, at the moment an

Re: [mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Bill Hart
Either way, I believe the test code alternately uses both uniformly distributed numbers and numbers with long strings of zeros and ones. Actually, I am embarrassed to say that I don't actually know if it allows the top bit to be zero, or not. Bill. 2009/12/10 Dan Grayson : > I think the bug is ac

[mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Dan Grayson
... oops, I was implicitly referring to mpn/tdiv_qr.c, which counts leading bits before shifting by that amount; probably mpn_lshift itself has no such dependency. On Dec 10, 8:28 am, Dan Grayson wrote: > I think the bug is activated only when the number of lead zero bits in > the denominator is

[mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Dan Grayson
I think the bug is activated only when the number of lead zero bits in the denominator is in a certain narrow range, since that number determines by how much to shift, but I haven't explored that carefully. Therefore, for tests involving alignment, picking integers with random bits is not good (if
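
To illustrate the point about shift counts: a uniformly random 64-bit limb has its top bit set half the time, so large leading-zero counts are almost never generated. A hedged sketch of building a top limb with an exact leading-zero count, so that every shift amount can be covered, might look like this. The function names are hypothetical and 64-bit limbs are assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Number of leading zero bits in a 64-bit limb (64 for x == 0). */
static unsigned leading_zeros(uint64_t x)
{
    unsigned n = 0;
    if (x == 0)
        return 64;
    while ((x >> 63) == 0) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Builds a limb with exactly k leading zeros (0 <= k <= 63):
   the bit just below the zeros is forced to 1, and the lower
   bits are taken from random_bits. */
static uint64_t limb_with_leading_zeros(unsigned k, uint64_t random_bits)
{
    uint64_t top = (uint64_t)1 << (63 - k);
    return top | (random_bits & (top - 1));
}
```

Looping `k` from 0 to 63 then exercises every possible normalisation shift, rather than relying on random data to hit a narrow range.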

[mpir-devel] Re: mpn_lshift alignment bug

2009-12-10 Thread Dan Grayson
Yup, that's how I was going to work around it. Actually, I should have been doing it anyway, because I'm trying to prepare distributions of Macaulay2 that will work on all architectures. Somehow I forgot to do it. (A consequence of that is that if someone wants a fast Macaulay2 in which the mpir

Re: [mpir-devel] mpn_lshift alignment bug

2009-12-10 Thread Bill Hart
By the way, in the meantime, the following may be a workaround for you. As allocation on non-limb aligned boundaries will imply a performance penalty and throw off all the cycle counts for the highly optimised MPIR assembly functions anyway, you may as well use a default x86_64 build if your memo

Re: [mpir-devel] mpn_lshift alignment bug

2009-12-10 Thread Bill Hart
Hi Dan, Thanks for the report! I'm a little surprised that this bug was not picked up during testing. The try program specifically tests for all memory alignment possibilities. I'll have to look into why it was not picked up. Nonetheless, a bug is a bug. We'll sort this out before releasing MPIR

[mpir-devel] mpn_lshift alignment bug

2009-12-10 Thread Dan Grayson
/* Bug in mpir 1.2.1. This program demonstrates that mpn_lshift, as implemented on 64 bit intel machines by mpn/x86_64/core2/lshift.as, gives the wrong answer if the limbs are not aligned to an 8 byte boundary. The confusion in the code starts with these lines: and r9