> On Aug 3, 2017, at 11:40 AM, David Miller <da...@davemloft.net> wrote:
> 
> From: Qing Zhao <qing.z...@oracle.com>
> Date: Thu, 3 Aug 2017 10:37:15 -0500
> 
>> all the special handling on STRICT_ALIGNMENT or
>> SLOW_UNALIGNMENT_ACCESS in these codes have the following common
>> logic:
>> 
>> if the memory access is known to be not well aligned at
>> compilation time, and the targeted platform does NOT support faster
>> unaligned memory access, the compiler will try to make the memory
>> access well aligned. Otherwise, if the targeted platform supports
>> faster unaligned memory access, it will leave the compile-time
>> known unaligned memory access as is, and later the hardware support
>> will kick in for these unaligned memory accesses.
>> 
>> this behavior is consistent with the high level definition of 
>> STRICT_ALIGNMENT. 
> 
> That's exactly the problem.
> 
> What you want with this M8 feature is simply to let the compiler know
> that if it is completely impossible to make some memory object
> aligned, then the cpu can handle this with special instructions.

> 
> You still want the compiler to make the effort to align data when it
> can because the accesses will be faster than if it used the unaligned
> loads and stores.

I don’t think the above is true.

First, a misaligned memory access that is known at compile time can always
be emulated with aligned accesses (byte-sized loads/stores). If the compiler
always does that, there will be no compile-time-known misaligned memory
access left for the special misaligned ld/st insns.
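
To make the two strategies concrete, here is a minimal C sketch. The
function names are hypothetical and not taken from the patch. It contrasts
byte-wise emulation of a 32-bit load with a single logical access, which the
compiler may lower to byte loads or to one misaligned load, depending on the
target.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Software emulation: build a 32-bit value (big-endian byte order, as on
   SPARC) from four byte loads.  Every byte load is trivially aligned. */
static uint32_t load_u32_bytewise(const unsigned char *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Single logical access: a fixed-size memcpy leaves the choice to the
   compiler.  On a strict-alignment target it is typically expanded to
   byte accesses; with hardware misaligned loads it could become one
   instruction.  The result is in the host's native byte order. */
static uint32_t load_u32_memcpy(const unsigned char *p)
{
    uint32_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}
```

Both functions accept a pointer at any byte offset. Which one wins is
exactly the kind of question that needs to be measured on real hardware.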

Second, there is always some overhead when the compiler turns a
compile-time-known unaligned memory access into aligned accesses (adding
extra padding, or splitting the unaligned multi-byte access into single-byte
loads/stores). That overhead might be even bigger than the cost of the
special misaligned load/store itself.
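
As an illustration of the padding side of that trade-off, the sketch below
uses GCC's `packed` attribute. The struct and function names are made up for
the example. In the packed layout the compiler knows at compile time that
`value` sits at offset 1; in the default layout it pays for alignment with
padding bytes instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Default layout: the compiler inserts padding after 'tag' so that
   'value' is naturally aligned (typically 3 bytes of padding). */
struct padded_rec {
    char     tag;
    uint32_t value;
};

/* Packed layout (GCC extension): no padding, so 'value' is known at
   compile time to be misaligned. */
struct packed_rec {
    char     tag;
    uint32_t value;
} __attribute__((packed));

/* With 'packed', GCC guarantees these layout facts. */
static_assert(offsetof(struct packed_rec, value) == 1, "value at offset 1");
static_assert(sizeof(struct packed_rec) == 5, "no padding in packed_rec");

/* Reading the field is a compile-time-known misaligned access; the
   compiler must either split it into byte loads or use a misaligned
   load instruction if the target provides one. */
static uint32_t get_packed_value(const struct packed_rec *r)
{
    return r->value;
}
```

On common ABIs `sizeof(struct padded_rec)` is 8, so alignment here costs
three bytes per object, while the packed form costs extra instructions (or
one special instruction) per access.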

To decide which is better (software emulation or hardware misaligned
load/store insns), experiments might be needed to measure the performance
impact.

This set of changes provides a way to use misaligned load/store insns to
implement compile-time-known unaligned memory accesses. -mno-misalign can be
used to disable this behavior very easily if our performance data shows that
misaligned load/store insns are slower than the current software emulation.

Qing


> 
> This is incredibly important for on-stack objects.
