I've made the proposed improvement to lshift.asm the subject of trac
ticket #271.
At present, issuing a release is being held up by my not having set up the
necessary infrastructure to fix our autotools installation, which is
broken. But I anticipate having it done in the next few weeks, as I
find the
2009/12/21 Bill Hart :
> 2009/12/21 Dan Grayson :
>> Right. The proposal would be to declare that a bug. Then we would
>> have to fix it. Fixing it would mean rewriting mpn_lshift so it
>> crashes or returns the right answer with misaligned limbs. If it had
>> been written that way in the first
x86_64 was introduced less than 10 years ago. It is likely that the
assumptions you made then were valid then. Note that on 32 bit
machines you only need 4 byte alignment. It is only the x86_64 ABI
which requires alignment to 8 bytes.
Bill.
2009/12/21 Dan Grayson :
> The memory allocator provided
2009/12/21 Dan Grayson :
> Right. The proposal would be to declare that a bug. Then we would
> have to fix it. Fixing it would mean rewriting mpn_lshift so it
> crashes or returns the right answer with misaligned limbs. If it had
> been written that way in the first place, we wouldn't be having
sizeof(void *) will be fine on most systems including all the
important ones we support, as MPIR usually tries to select a limb
which is either long or long long depending on what is available, and
this is the size of a pointer on all the important systems I am aware
of.
Bill.
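As a rough illustration of that size match (a sketch only: sketch_limb_t and
the preprocessor test below are hypothetical stand-ins, not MPIR's actual
limb selection logic), the following picks unsigned long when it is as wide
as a pointer and unsigned long long otherwise, then compares the result
with sizeof(void *):

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical limb type: unsigned long if it is pointer-sized,
   otherwise unsigned long long. Not MPIR's real configuration logic. */
#if ULONG_MAX == UINTPTR_MAX
typedef unsigned long sketch_limb_t;
#else
typedef unsigned long long sketch_limb_t;
#endif

/* Returns 1 when one limb is exactly as wide as a pointer. */
static int limb_is_pointer_sized(void)
{
    return sizeof(sketch_limb_t) == sizeof(void *);
}

/* Aborts (via assert) on a platform where the two sizes differ. */
static void require_pointer_sized_limb(void)
{
    assert(limb_is_pointer_sized());
}
```

On a platform where the two sizes differ, an alignment check based on
sizeof(void *) would no longer say anything about limb alignment, so the
assertion is worth running once at start-up.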
2009/12/21 Dan Gray
But as I pointed out, fixing this "bug" would mean more than rewriting
lshift. We'd have to potentially rewrite all the functions in MPIR.
Furthermore we'd have to rewrite the entire test suite to certify that
each and every function obeyed these new rules. Then we'd have to
figure out when the C c
The memory allocator provided with Singular-Factory is in the file
libcfmem.a, and the routines are called getBlock, freeBlock, and
reallocBlock. I copied the routines more than 10 years ago and
continue to use the modified copy. I may even have destroyed 8 byte
alignment then while optimizing it
Right. The proposal would be to declare that a bug. Then we would
have to fix it. Fixing it would mean rewriting mpn_lshift so it
crashes or returns the right answer with misaligned limbs. If it had
been written that way in the first place, we wouldn't be having this
discussion, because I would
On Dec 20, 8:32 pm, Dan Grayson wrote:
> I don't see how your "make check" routine could ever even come into
> contact with my custom allocator. Also, it's probably not worth your
> while to try to distinguish a custom allocator from your native
> allocator at run time, as it will waste time.
>
2009/12/20 Dan Grayson :
> I don't see how your "make check" routine could ever even come into
> contact with my custom allocator.
True.
> Also, it's probably not worth your
> while to try to distinguish a custom allocator from your native
> allocator at run time, as it will waste time.
Agreed.
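One way to get an early abort without telling allocators apart is to check
each pointer as it comes back from whatever allocator is installed. The
following is a minimal sketch, not MPIR code: pointer_aligned and
checked_alloc are hypothetical names, malloc stands in for the custom
allocator, and the alignment tested is sizeof(void *), which is an
assumption about the limb size.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Returns 1 if p is a multiple of the pointer size (assumed here to be
   the limb size), 0 otherwise. */
static int pointer_aligned(const void *p)
{
    return ((uintptr_t)p % sizeof(void *)) == 0;
}

/* Allocation wrapper with the same shape as the allocate callback taken
   by mp_set_memory_functions. malloc is only a stand-in for the custom
   allocator being checked. */
static void *checked_alloc(size_t n)
{
    void *p = malloc(n);
    if (p == NULL) {
        fprintf(stderr, "checked_alloc: out of memory\n");
        abort();
    }
    if (!pointer_aligned(p)) {
        fprintf(stderr, "checked_alloc: misaligned pointer %p\n", p);
        abort();
    }
    return p;
}
```

The cost is one remainder operation per allocation, and the failure shows
up at the allocation site rather than deep inside a computation.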
I don't see how your "make check" routine could ever even come into
contact with my custom allocator. Also, it's probably not worth your
while to try to distinguish a custom allocator from your native
allocator at run time, as it will waste time.
I suggest checking every pointer returned by the a
On Dec 11, 5:48 pm, Bill Hart wrote:
> OK, I have now documented the requirements in the relevant section of
> the documentation. See here for a preliminary version:
>
> http://sage.math.washington.edu/home/wbhart/mpir-ac/doc/mpir.pdf (note
> the date on the title page is incorrect).
>
> I have a
2009/12/11 Bill Hart :
> OK, I have now documented the requirements in the relevant section of
> the documentation. See here for a preliminary version:
>
> http://sage.math.washington.edu/home/wbhart/mpir-ac/doc/mpir.pdf (note
> the date on the title page is incorrect).
>
> I have also added all th
OK, I have now documented the requirements in the relevant section of
the documentation. See here for a preliminary version:
http://sage.math.washington.edu/home/wbhart/mpir-ac/doc/mpir.pdf (note
the date on the title page is incorrect).
I have also added all the papers which we have used, and Pe
On Dec 11, 5:12 pm, Bill Hart wrote:
> Oh OK!! I've been looking at the Core2 file. Sorry. In fact that
> instruction is used in the K10 and atom files.
>
> However,
>
> a) The K10 file passes the try test code on Selmer, which is a K10
>
> b) If the data was not aligned, a segfault would occur,
Oh OK!! I've been looking at the Core2 file. Sorry. In fact that
instruction is used in the K10 and atom files.
However,
a) The K10 file passes the try test code on Selmer, which is a K10
b) If the data was not aligned, a segfault would occur, not a wrong answer.
Bill.
2009/12/11 Bill Hart :
>
Right, but the try test would fail in that case if it wasn't handled
properly, as it definitely tests alignments on alternate limbs. I
would consider it a bug if MPIR used this instruction on data which
was not aligned. It should have special code to handle the unaligned
case, or use the MOVDQU ins
I believe that the alignment needs to be 16 bytes, not just 8 bytes
for the MOVDQA instruction.
Jason Worth Martin
Asst. Professor of Mathematics
http://www.math.jmu.edu/~martin
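To make the 8 versus 16 byte distinction concrete, here is a small sketch
using the SSE2 intrinsics _mm_load_si128 and _mm_loadu_si128, which
compilers typically translate to MOVDQA and MOVDQU respectively. The
function name copy_two_limbs is made up for illustration and this is not
how the MPIR assembler files are written; without SSE2 the sketch falls
back to memcpy.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Copies two 64-bit limbs (16 bytes) from src to dst. The aligned load
   is used only when src is on a 16 byte boundary; a limb pointer that is
   merely 8 byte aligned takes the unaligned load instead. */
static void copy_two_limbs(uint64_t *dst, const uint64_t *src)
{
    assert(dst != NULL && src != NULL);
#if defined(__SSE2__)
    __m128i v;
    if (((uintptr_t)(const void *)src % 16) == 0)
        v = _mm_load_si128((const __m128i *)src);   /* aligned load */
    else
        v = _mm_loadu_si128((const __m128i *)src);  /* unaligned load */
    _mm_storeu_si128((__m128i *)dst, v);
#else
    memcpy(dst, src, 16);
#endif
}
```

Note that in a 16 byte aligned array of 64-bit limbs, every second limb
address is 8 byte aligned but not 16 byte aligned, so limb alignment alone
does not make the aligned load safe.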
On Fri, Dec 11, 2009 at 9:31 AM, Cactus wrote:
>
>
> On Dec 11, 2:11 pm, Bill Hart wrote:
>> omalloc is easy to fix
I checked the code for Boehm's gc and on x86_64 all pointers are
aligned to 8 byte boundaries (see the files gcconfig.h for x86_64 and
the definition of ALIGNMENT, gc_priv.h for the definition of
BYTES_TO_WORDS, ROUNDED_UP_WORDS and ALIGNED_WORDS). One assumes that
this code is actually #included an
On Dec 11, 2:11 pm, Bill Hart wrote:
> omalloc is easy to fix. Just configure it with --with-align=8 on 64
> bit systems. There are other configure options for how alignment is
> done.
In fact it should be configured this way by the build system on 64-bit
systems:
#ifndef OM_ALIGN_8
/* define
omalloc is easy to fix. Just configure it with --with-align=8 on 64
bit systems. There are other configure options for how alignment is
done.
Bill.
2009/12/11 Cactus :
>
>
> On Dec 10, 9:13 pm, Dan Grayson wrote:
>> Well, Macaulay2 comes with thousands of tests, and only one failed,
>> deep insi
On Dec 10, 9:13 pm, Dan Grayson wrote:
> Well, Macaulay2 comes with thousands of tests, and only one failed,
> deep inside a D-modules computation. It took a lot of single stepping
> to figure it out. Part of the problem is that we have two custom
> memory allocators, Boehm's gc and one copied
On Dec 11, 2:38 am, Jason Martin wrote:
> Dan, thanks for this bug report. I could also see that it probably
> took you many many tedious hours to find this bug. Amazing work!!
> Thank you sooo much for taking the time to do it!!!
>
> This is a really subtle bug. I just checked that malloc doc
2009/12/11 Bill Hart :
> There's still the question of how we have people who write garbage
> allocators recognise that this is a bug.
>
Just in case there is any doubt. Yes, this was only a joke.
> Bill.
It's definitely not that simple, unfortunately.
Jason's assembler code optimises the leadins of all the functions.
Even a single cycle difference means the entire function needs to be
reoptimised.
Given that there are now scores of assembler functions, you are
talking about months of computing ti
Dan, thanks for this bug report. I could also see that it probably
took you many many tedious hours to find this bug. Amazing work!!
Thank you sooo much for taking the time to do it!!!
This is a really subtle bug. I just checked that malloc documentation
on several Linux distros, Mac OS X, BSD,
There's still the question of how we have people who write garbage
allocators recognise that this is a bug.
Bill.
2009/12/10 Cactus :
>
>
> On Dec 10, 10:20 pm, Bill Hart wrote:
>> Hi Dan,
>>
>> indeed I had inferred that you had done a *lot* of work to track this bug
>> down.
>>
>> It's not im
On Dec 10, 10:20 pm, Bill Hart wrote:
> Hi Dan,
>
> indeed I had inferred that you had done a *lot* of work to track this bug
> down.
>
> It's not immediately clear how we can cause an error message to be
> output if a custom allocator returns non-aligned pointers.
>
> Two possibilities spring
By the way, which allocator was it? I did not know about the Singular
one and had assumed it was Boehm.
Bill.
2009/12/10 Bill Hart :
> Hi Dan,
>
> indeed I had inferred that you had done a *lot* of work to track this bug
> down.
>
> It's not immediately clear how we can cause an error message to
Hi Dan,
indeed I had inferred that you had done a *lot* of work to track this bug down.
It's not immediately clear how we can cause an error message to be
output if a custom allocator returns non-aligned pointers.
Two possibilities spring to mind:
1) Introduce a make check test which checks tha
That's a fine decision, but now it becomes super important to help
developers discover that their custom allocator is breaking the rule,
if it does. It took me many hours of single-stepping to discover it.
It would have saved me a lot of time if the program had aborted with a
suitable error messag
Well, Macaulay2 comes with thousands of tests, and only one failed,
deep inside a D-modules computation. It took a lot of single stepping
to figure it out. Part of the problem is that we have two custom
memory allocators, Boehm's gc and one copied from Singular-Factory
long ago. The latter is us
On Dec 10, 8:09 pm, Bill Hart wrote:
> In reality it is rarely a problem for applications, as the ISA usually
> only catches people out for SSE stuff, and then the result is a
> segfault, which they'll eventually track down. Also, as portable
> generic C needs to make no assumption about the end
In reality it is rarely a problem for applications, as the ISA usually
only catches people out for SSE stuff, and then the result is a
segfault, which they'll eventually track down. Also, as portable
generic C needs to make no assumption about the endianness of the CPU
then good C code is rarely wr
On Dec 10, 7:09 pm, Bill Hart wrote:
> In fact it seems there is plenty of documentary evidence on the web
> that GMP also expects 8 byte aligned memory locations for 8 byte limb
> arrays.
>
> Apparently some of the gcc optimisations also expect this (an array of
> unsigned longs on a 64 bit mac
In fact the x86_64 ABI definition requires that unsigned longs are 8
byte aligned:
http://www.x86-64.org/documentation/abi-0.99.pdf (page 12). This is
why gcc can make this assumption.
Of course different vendors implement the ABI differently, and usually
drop the requirement.
Bill.
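Because the compiler is allowed to rely on that alignment, reading through
a misaligned unsigned long pointer is undefined behaviour in C. A portable
way to read a word from an arbitrary byte address is memcpy, sketched below
(load_ulong_any_alignment is a hypothetical helper name, not something from
MPIR or GMP):

```c
#include <assert.h>
#include <string.h>

/* Reads an unsigned long from p without assuming any alignment.
   memcpy copies byte by byte as far as the language is concerned, so no
   misaligned unsigned long access is ever expressed in the C code. */
static unsigned long load_ulong_any_alignment(const void *p)
{
    unsigned long v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}
```

Compilers generally turn this into a single plain load on machines where
unaligned access is cheap, so the portable form need not be slower.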
2009/12/10
In fact it seems there is plenty of documentary evidence on the web
that GMP also expects 8 byte aligned memory locations for 8 byte limb
arrays.
Apparently some of the gcc optimisations also expect this (an array of
unsigned longs on a 64 bit machine is assumed to have 8 byte aligned
addresses -
Yeah I was wrong! See my other response.
2009/12/10 Cactus :
>
>
> On Dec 10, 5:55 pm, Bill Hart wrote:
>> But then why test for non-aligned limbs?
>>
>> Currently try tests for all possible byte alignments! Or at least it is
>> supposed to. I'm still not sure I see what is going wrong with try. T
Yeah, I see why the test code passes.
Here is how it does alignment. Basically it sets the pointer to the
source region as follows:
s[i].p = s[i].region.ptr + s[i].align;
Here the s[i].align is supposed to take care of the alignment. Of
course the type of s[i].p can be determined from the defini
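The effect of that assignment depends on the pointer type. As a sketch
under the assumption that region.ptr is a limb pointer (the message is cut
off before saying so; sketch_limb_t and both helper names are made up for
illustration), adding align then steps in whole limbs, so the result stays
limb-aligned, whereas the same addition on a char pointer steps in single
bytes:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limb type for this sketch only. */
typedef unsigned long sketch_limb_t;

/* Byte distance covered by adding `align` to a limb pointer. */
static size_t bytes_moved_limb(sketch_limb_t *base, size_t align)
{
    assert(base != NULL);
    return (size_t)((char *)(base + align) - (char *)base);
}

/* Byte distance covered by adding `align` to a byte pointer. */
static size_t bytes_moved_byte(char *base, size_t align)
{
    assert(base != NULL);
    return (size_t)((base + align) - base);
}
```

With an offset of 1, the first helper reports sizeof(sketch_limb_t) bytes
and the second reports 1 byte. Only the second kind of offset can produce
an address that is misaligned for a limb.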
On Dec 10, 5:55 pm, Bill Hart wrote:
> But then why test for non-aligned limbs?
>
> Currently try tests for all possible byte alignments! Or at least it is
> supposed to. I'm still not sure I see what is going wrong with try. To
> me this is still a puzzle.
Are you sure about this?
I have run t
On Dec 10, 5:15 pm, Cactus wrote:
> On Dec 10, 3:56 pm, Dan Grayson wrote:
>
>
>
>
>
> > He could also mean that the machine instructions that move 8 byte
> > words to vector registers would segfault if you tried to use them on
> > nonaligned data, and that he tried to write mpn_lshift to try t
But then why test for non-aligned limbs?
Currently try tests for all possible byte alignments! Or at least it is
supposed to. I'm still not sure I see what is going wrong with try. To
me this is still a puzzle.
I didn't actually look at the assembler yet. I am quite sure it *is*
broken, for some s
On Dec 10, 3:56 pm, Dan Grayson wrote:
> He could also mean that the machine instructions that move 8 byte
> words to vector registers would segfault if you tried to use them on
> nonaligned data, and that he tried to write mpn_lshift to try to avoid
> doing that. Those are the lines of code I
Ah, ok, that makes more sense. Yeah the SSE stuff requires aligned data.
Looking into the test code, I can't figure out why it doesn't fail on
Core2 with the current test code, which appears to be using all
possible alignments. (I've just run it again for some time and it
doesn't fail.)
I still h
He could also mean that the machine instructions that move 8 byte
words to vector registers would segfault if you tried to use them on
nonaligned data, and that he tried to write mpn_lshift to try to avoid
doing that. Those are the lines of code I pointed to in the original
report. It would be a
Actually, if you are preparing binaries of Macaulay the normal thing
would be to use --enable-fat. But for core2, this will specifically
pick the broken core2 code in your case. So, in actual fact, the
--build=x86_64 would be the only safe option if you are using custom
allocation, at the moment an
Either way, I believe the test code alternately uses both uniformly
distributed numbers and numbers with long strings of zeros and ones.
Actually, I am embarrassed to say that I don't actually know if it
allows the top bit to be zero, or not.
Bill.
2009/12/10 Dan Grayson :
> I think the bug is ac
... oops, I was implicitly referring to mpn/tdiv_qr.c, which counts
leading bits before shifting by that amount; probably mpn_lshift
itself has no such dependency.
On Dec 10, 8:28 am, Dan Grayson wrote:
> I think the bug is activated only when the number of lead zero bits in
> the denominator is
I think the bug is activated only when the number of lead zero bits in
the denominator is in a certain narrow range, since that number
determines by how much to shift, but I haven't explored that
carefully. Therefore, for tests involving alignment, picking integers
with random bits is not good (if
Yup, that's how I was going to work around it. Actually, I should
have been doing it anyway, because I'm trying to prepare distributions
of Macaulay2 that will work on all architectures. Somehow I forgot to
do it. (A consequence of that is that if someone wants a fast
Macaulay2 in which the mpir