Yeah, I see why the test code passes.

Here is how it does alignment. Basically it sets the pointer to the
source region as follows:

s[i].p = s[i].region.ptr + s[i].align;

Here the s[i].align is supposed to take care of the alignment. Of
course the type of s[i].p can be determined from the definition of the
struct:

struct source_t {
  struct region_t  region;
  int        high;
  mp_size_t  align;
  mp_ptr     p;
};

i.e. it is an mp_ptr, i.e. an mp_limb_t *.

In other words, align tests alignment to different limbs in memory! In
this case, align can take one of 4 values: 0, 1, 2 or 3. But SSE is
only 128 bits, i.e. two limbs, so I just don't get this at all. Yes, it
tests for data that is limb aligned but not 16 byte aligned, which
might break SSE code. But it doesn't test for different byte
alignments, as I had originally thought.
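
To make the point concrete, here is a minimal sketch in C. It is not
taken from the try source: limb_t is a stand-in for mp_limb_t (assumed
to be 8 bytes), and misalignment() and limb_offset_misalignment() are
hypothetical helper names. It only shows what adding a limb count to a
limb pointer can and cannot change.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;   /* assumption: stand-in for an 8 byte mp_limb_t */

/* Number of bytes by which p sits past the previous multiple of boundary. */
static unsigned misalignment(const void *p, unsigned boundary)
{
    return (unsigned)((uintptr_t)p % boundary);
}

/* For align = 0..3, record the misalignment of base + align with respect
   to 8 byte and 16 byte boundaries. base needs room for at least 4 limbs. */
static void limb_offset_misalignment(const limb_t *base,
                                     unsigned mod8[4], unsigned mod16[4])
{
    for (size_t align = 0; align < 4; align++) {
        /* mirrors s[i].p = s[i].region.ptr + s[i].align */
        const limb_t *p = base + align;
        mod8[align]  = misalignment(p, 8);
        mod16[align] = misalignment(p, 16);
    }
}
```

If base is 16 byte aligned, mod8 comes out as 0, 0, 0, 0 and mod16 as
0, 8, 0, 8. The offsets only ever move the pointer by whole limbs, so
an odd byte alignment can never be produced this way.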

On the other hand, destination pointers do test different byte
alignments, provided that tr->dst_bytes[i] is nonzero for destination
i. However, this is only set for the get_str function (as one would
imagine it should be).

Thus the very clear intent of the test code is that for n byte limbs,
only n byte aligned data is allowed. There is no intention to support
any other kind of alignment, and so the documentation must be changed.
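
As an illustration of what such a documented requirement would ask of
a custom allocator, here is a rough sketch. LIMB_BYTES,
is_limb_aligned() and checked_alloc() are hypothetical names, and 8
byte limbs are assumed. The allocation hook has the same shape as the
allocate callback one would pass to mp_set_memory_functions, but
nothing here is registered with the library.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define LIMB_BYTES 8   /* assumption: 64-bit build, n = 8 byte limbs */

/* Nonzero if p is aligned to a limb boundary. */
static int is_limb_aligned(const void *p)
{
    return ((uintptr_t)p % LIMB_BYTES) == 0;
}

/* Hypothetical allocation hook: wraps malloc and refuses to hand out
   memory that is not limb aligned. */
static void *checked_alloc(size_t n)
{
    void *p = malloc(n);
    if (p == NULL)
        abort();                    /* out of memory: give up */
    assert(is_limb_aligned(p));     /* n byte limbs need n byte alignment */
    return p;
}
```

A plain malloc satisfies this on the usual platforms. The check only
matters for allocators that carve blocks out of a larger pool at
arbitrary byte offsets.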

I am surprised that I did not know this before now. However, it makes
full sense, as there is no point having a high performance math
library if the custom memory allocator can basically ruin the
performance. But Dan is right that the documentation most definitely
should mention this.

Bill.

2009/12/10 Bill Hart <goodwillh...@googlemail.com>:
> But then why test for non-aligned limbs?
>
> Currently try tests for all possible byte alignments! Or at least it is
> supposed to. I'm still not sure I see what is going wrong with try. To
> me this is still a puzzle.
>
> I didn't actually look at the assembler yet. I am quite sure it *is*
> broken, for some suitable definition of broken. But the real question
> is how can that be? This code passes the tests and most certainly
> should not!
>
> Bill.
>
> 2009/12/10 Cactus <rieman...@googlemail.com>:
>>
>>
>> On Dec 10, 3:56 pm, Dan Grayson <d...@math.uiuc.edu> wrote:
>>> He could also mean that the machine instructions that move 8 byte
>>> words to vector registers would segfault if you tried to use them on
>>> nonaligned data, and that he tried to write mpn_lshift to try to avoid
>>> doing that.  Those are the lines of code I pointed to in the original
>>> report.  It would be a bug to write code that depended on alignment,
>>> as I pointed out, unless you change the documentation about custom
>>> memory allocators, which should not be done lightly.
>>>
>>> On Dec 10, 9:43 am, Bill Hart <goodwillh...@googlemail.com> wrote:
>>>
>>>
>>>
>>> > I found the following in correspondence with Jason (I can't just ask
>>> > him, as his internet seems to be offline at the moment):
>>>
>>> > "Misaligned data will segfault on some arches and instructions eg K10
>>> > and lshift"
>>>
>>> > So presumably he has assumed data is aligned on K10, written a fast
>>> > version of the code for this case, then propagated it to other arches
>>> > where no segfault will occur. Actually, I am unsure whether that is
>>> > what he means, but that is what I infer.
>>>
>>> > If so, the solution would seem to be to replace the Core2 and Atom
>>> > code with a safer version and leave the K10 code as is. That's really
>>> > something Jason should do though, so we probably have to wait for him
>>> > to pop up again.
>>
>> Looking at the code, it seems to me that it has been designed for
>> doing 16 byte operations that are at least 8 byte aligned.  This is
>> what I would expect in 64-bit assembler code, which can reasonably be
>> designed to rely on 8-byte alignment.
>>
>> Moreover, Jason's code does appear to account for any additional
>> 8-byte segments that are needed at the start or the end of a sequence
>> of 16-byte aligned operations.
>>
>> So it looks to me like a code bug rather than a documentation bug.
>>
>> But the Custom Allocation section of the manual says nothing at all
>> about alignment. It doesn't even require that limb allocations are
>> properly aligned on the target machine - i.e. 4 byte alignment in
>> 32-bit MPIR and 8 byte alignment in 64-bit MPIR!
>>
>> I suspect a lot of the assembler code won't deal well (if at all) with
>> mis-aligned limbs so we should surely say something about this in the
>> documentation of Custom Allocation.  In my view we should indicate
>> that use of assembler code is suspect if memory allocation does not
>> ensure correct limb alignment - i.e. n byte limbs should be at least n
>> byte aligned.
>>
>>    Brian
>>
>> --
>>
>> You received this message because you are subscribed to the Google Groups 
>> "mpir-devel" group.
>> To post to this group, send email to mpir-de...@googlegroups.com.
>> To unsubscribe from this group, send email to 
>> mpir-devel+unsubscr...@googlegroups.com.
>> For more options, visit this group at 
>> http://groups.google.com/group/mpir-devel?hl=en.
>>
>>
>>
>
