> On Aug 3, 2017, at 11:40 AM, David Miller <da...@davemloft.net> wrote:
>
> From: Qing Zhao <qing.z...@oracle.com>
> Date: Thu, 3 Aug 2017 10:37:15 -0500
>
>> all the special handling on STRICT_ALIGNMENT or
>> SLOW_UNALIGNMENT_ACCESS in these codes have the following common
>> logic:
>>
>> if the memory access is known to be not-aligned well during
>> compilation time, if the targeted platform does NOT support faster
>> unaligned memory access, the compiler will try to make the memory
>> access aligned well. Otherwise, if the targeted platform supports
>> faster unaligned memory access, it will leave the compiler-time
>> known not-aligned memory access as it, later the hardware support
>> will kicked in for these unaligned memory access.
>>
>> this behavior is consistent with the high level definition of
>> STRICT_ALIGNMENT.
>
> That's exactly the problem.
>
> What you want with this M8 feature is simply to let the compiler know
> that if it is completely impossible to make some memory object
> aligned, then the cpu can handle this with special instructions.
>
> You still want the compiler to make the effort to align data when it
> can because the accesses will be faster than if it used the unaligned
> loads and stores.

I don't think the above is true.

First, a memory access that is known at compile time to be misaligned
can always be emulated with aligned accesses (byte-size loads/stores).
If the compiler always does that, no compile-time-known misaligned
accesses are left for the special misaligned ld/st insns.

Second, the compiler's effort to turn a compile-time-known unaligned
access into aligned accesses always has a cost (adding padding, or
splitting an unaligned multi-byte access into single-byte loads/stores).
That overhead might be even bigger than the overhead of the special
misaligned load/store itself.

To decide which is better (software emulation or the hardware misaligned
load/store insns), experiments might be needed to measure the
performance impact.

This set of changes provides a way to use the misaligned load/store
insns to implement compile-time-known unaligned memory accesses.
-mno-misalign can be used to disable this behavior very easily if our
performance data shows that the misaligned load/store insns are slower
than the current software emulation.

Qing

>
> This is incredibly important for on-stack objects.
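
For illustration only (this is not code from the patch): a minimal C
sketch of what "emulating a misaligned access with byte-size
loads/stores" means in the reply above. All function names here are
hypothetical, and the big-endian byte order is an assumption chosen to
match SPARC. The memcpy variants show the other option, where the
compiler decides how to lower the access for the target.

```c
#include <stdint.h>
#include <string.h>

/* Software emulation (hypothetical helper): a 32-bit load from an
   address that may be misaligned, done as four single-byte loads.
   Byte accesses are always aligned, so this is safe on a
   strict-alignment target.  The value is assembled in big-endian
   order (an assumption made here to match SPARC). */
static uint32_t load_be32_bytewise(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Matching store: four single-byte stores, big-endian order. */
static void store_be32_bytewise(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}

/* Alternative (hypothetical helpers): memcpy with a constant size
   leaves the lowering to the compiler, which may pick byte accesses
   or a single unaligned access depending on the target.  The value
   is in native byte order. */
static uint32_t load_u32_memcpy(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

static void store_u32_memcpy(unsigned char *p, uint32_t v)
{
    memcpy(p, &v, sizeof v);
}
```

The byte-wise pair costs four memory operations plus shifts per 32-bit
access, which is the kind of overhead that would have to be measured
against a single hardware misaligned load or store.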