Yeah, I see why the test code passes. Here is how it does alignment. Basically it sets the pointer to the source region as follows:

    s[i].p = s[i].region.ptr + s[i].align;

Here s[i].align is supposed to take care of the alignment. The type of
s[i].p can be determined from the definition of the struct:

    struct source_t {
      struct region_t  region;
      int        high;
      mp_size_t  align;
      mp_ptr     p;
    };

i.e. it is an mp_ptr, i.e. an mp_limb_t *. In other words, align tests
alignment to different limbs in memory! In this case, align can take one
of 4 values: 0, 1, 2, 3. But SSE is only 128 bits, i.e. two limbs, so I
just don't get why four offsets are needed.

Yes, it tests data at odd limb offsets (i.e. not 16-byte aligned), which
might break SSE code. But it doesn't test for different byte alignments,
as I had originally thought.

On the other hand, destination pointers do test different byte
alignments, provided that tr->dst_bytes[i] is nonzero for destination i.
However, this is only set for the get_str function (as one would imagine
it should be).

Thus the very clear intent of the test code is that for n byte limbs,
only n byte aligned data is allowed.

So the conclusion is clear. There is no intention to support any other
kind of alignment, and so the documentation must be changed.

I am surprised that I did not know this before now. However, it makes
full sense, as there is no point having a high performance math library
if the custom memory allocator can basically ruin the performance. But
Dan is right that the documentation most definitely should mention this.

Bill.

2009/12/10 Bill Hart <goodwillh...@googlemail.com>:
> But then why test for non-aligned limbs?
>
> Currently try tests for all possible byte alignments! Or at least it is
> supposed to. I'm still not sure I see what is going wrong with try. To
> me this is still a puzzle.
>
> I didn't actually look at the assembler yet. I am quite sure it *is*
> broken, for some suitable definition of broken. But the real question
> is how can that be? This code passes the tests and most certainly
> should not!
>
> Bill.
>
> 2009/12/10 Cactus <rieman...@googlemail.com>:
>>
>>
>> On Dec 10, 3:56 pm, Dan Grayson <d...@math.uiuc.edu> wrote:
>>> He could also mean that the machine instructions that move 8 byte
>>> words to vector registers would segfault if you tried to use them on
>>> nonaligned data, and that he tried to write mpn_lshift to try to avoid
>>> doing that. Those are the lines of code I pointed to in the original
>>> report. It would be a bug to write code that depended on alignment,
>>> as I pointed out, unless you change the documentation about custom
>>> memory allocators, which should not be done lightly.
>>>
>>> On Dec 10, 9:43 am, Bill Hart <goodwillh...@googlemail.com> wrote:
>>>
>>> > I found the following in correspondence with Jason (I can't just ask
>>> > him, as his internet seems to be offline at the moment):
>>>
>>> > "Misaligned data will segfault on some arches and instructions eg K10
>>> > and lshift"
>>>
>>> > So presumably he has assumed data is aligned on K10, written a fast
>>> > version of the code for this case, then propagated it to other arches
>>> > where no segfault will occur. Actually, I am unsure whether that is
>>> > what he means, but that is what I infer.
>>>
>>> > If so, the solution would seem to be to replace the Core2 and Atom
>>> > code with a safer version and leave the K10 code as is. That's really
>>> > something Jason should do though, so we probably have to wait for him
>>> > to pop up again.
>>
>> Looking at the code, it seems to me that it has been designed for
>> doing 16 byte operations that are at least 8 byte aligned. This is
>> what I would expect in 64-bit assembler code, which can reasonably be
>> designed to rely on 8-byte alignment.
>>
>> Moreover, Jason's code does appear to account for any additional
>> 8-byte segments that are needed at the start or the end of a sequence
>> of 16-byte aligned operations.
>>
>> So it looks to me like a code bug rather than a documentation bug.
>>
>> But the Custom Allocation section of the manual says nothing at all
>> about alignment. It doesn't even require that limb allocations are
>> properly aligned on the target machine - i.e. 4 byte alignment in
>> 32-bit MPIR and 8 byte alignment in 64-bit MPIR!
>>
>> I suspect a lot of the assembler code won't deal well (if at all) with
>> mis-aligned limbs so we should surely say something about this in the
>> documentation of Custom Allocation. In my view we should indicate
>> that use of assembler code is suspect if memory allocation does not
>> ensure correct limb alignment - i.e. n byte limbs should be at least n
>> byte aligned.
>>
>> Brian
>>
>> --
>>
>> You received this message because you are subscribed to the Google Groups
>> "mpir-devel" group.
>> To post to this group, send email to mpir-de...@googlegroups.com.
>> To unsubscribe from this group, send email to
>> mpir-devel+unsubscr...@googlegroups.com.
>> For more options, visit this group at
>> http://groups.google.com/group/mpir-devel?hl=en.
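
P.S. To make the point about the source pointers concrete, here is a minimal sketch (not the actual try.c code) of why adding a limb count to an mp_limb_t * can never produce a pointer that is misaligned within a limb, while it does alternate between 16-byte aligned and 16-byte misaligned. The names limb_t, misalignment and offset_by_limbs are hypothetical stand-ins, and the sketch assumes 8-byte limbs, as on a 64-bit build.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for mp_limb_t; assumes a 64-bit build (8-byte limbs). */
typedef uint64_t limb_t;

/* Byte offset of p past the previous multiple of `boundary`.
   `boundary` must be a power of two. */
unsigned misalignment(const void *p, unsigned boundary)
{
    assert(boundary != 0 && (boundary & (boundary - 1)) == 0);
    return (unsigned)((uintptr_t)p & (uintptr_t)(boundary - 1));
}

/* Mimics s[i].p = s[i].region.ptr + s[i].align:
   the offset is counted in limbs, not in bytes. */
const limb_t *offset_by_limbs(const limb_t *base, size_t align)
{
    return base + align;
}

/* Returns 1 if, starting from a 16-byte aligned base, limb offsets
   0..3 all stay 8-byte aligned while the 16-byte misalignment
   alternates 0, 8, 0, 8. Returns 0 otherwise. */
int limb_offsets_keep_limb_alignment(void)
{
    /* Over-allocate and round up so the base is 16-byte aligned. */
    static unsigned char raw[8 * sizeof(limb_t) + 16];
    uintptr_t start = ((uintptr_t)raw + 15) & ~(uintptr_t)15;
    const limb_t *region = (const limb_t *)start;
    size_t align;

    for (align = 0; align < 4; align++) {
        const limb_t *p = offset_by_limbs(region, align);
        if (misalignment(p, (unsigned)sizeof(limb_t)) != 0)
            return 0;                 /* would be a byte-misaligned limb */
        if (misalignment(p, 16) != (unsigned)(align % 2) * 8)
            return 0;                 /* 16-byte pattern is not 0, 8, 0, 8 */
    }
    return 1;
}
```

So with offsets 0..3 the source operands only ever exercise the "8-byte aligned but not 16-byte aligned" case. A byte-misaligned limb can only come from a byte offset, which the test code applies to destinations alone, and only when tr->dst_bytes[i] is nonzero.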