On Mar 30, 2016, at 2:36 AM, Paul Sandoz <paul.san...@oracle.com> wrote:
>
> When access is performed in loops this can cost, as the alignment checks are
> not hoisted out. Theoretically could for regular 2, 4, 8 strides through the
> buffer contents. For such cases alignment of the base address can be checked.
> Not sure how complicated that would be to support.
>
> I lack knowledge of the SPARC instruction set to know if we could do
> something clever as an intrinsic.

A couple of partial thoughts:

If we had bitfield type inference, we would be able to deduce that the low bits of p and p+8 are the same. Graal has this (because I gave them the formulae[1]). C2 may be too brittle to add it into TypeInt.

Bitfield inference on expressions of the form p&7 and (p+8)&7 would allow commoning tests in an unrolled loop, hoisting the alignment logic to the top of the loop body, and (perhaps) through the phi to the loop head.

[1]: http://hg.openjdk.java.net/graal/graal-core/file/ea5cc66ec5f2/graal/com.oracle.graal.compiler.common/src/com/oracle/graal/compiler/common/type/IntegerStamp.java#l460

An intrinsic that would guide the JIT more explicitly would (I think) need an extra argument. Something like: getLongUnaligned(p, uo, ao), where uo and ao are both longs, but only ao is required to be naturally aligned ((ao&7)==0). Yuck.

Maybe this could be pattern-matched in C2; it would be a kludge.

— John
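
P.S. A rough sketch of the kind of bitfield inference meant above, for illustration only. It is not the Graal or C2 code; the class KnownBits and all of its method names are hypothetical. Each value carries two masks (bits that must be set, bits that may be set), and the add rule marks a result bit as unknown if either input bit is unknown or if the carry into that bit can differ between the smallest and largest possible inputs. Under those assumptions, once a guard has established (p&7)==0, the expression (p+8)&7 folds to the constant 0, so the second alignment test in an unrolled loop body becomes redundant.

```java
final class KnownBits {
    final long mustBeSet; // bits known to be 1
    final long mayBeSet;  // bits that may be 1; a clear bit here is known to be 0

    KnownBits(long mustBeSet, long mayBeSet) {
        if ((mustBeSet & ~mayBeSet) != 0) {
            throw new IllegalArgumentException("contradictory masks");
        }
        this.mustBeSet = mustBeSet;
        this.mayBeSet = mayBeSet;
    }

    static KnownBits constant(long c) { return new KnownBits(c, c); }

    static KnownBits unknown() { return new KnownBits(0L, -1L); }

    // Refinement on the path where the guard (x & mask) == 0 succeeded.
    // (If a masked bit was known to be 1, that path is dead; ignored here.)
    KnownBits afterZeroTest(long mask) {
        return new KnownBits(mustBeSet & ~mask, mayBeSet & ~mask);
    }

    // Carry-in at every bit position when computing x + y.
    private static long carries(long x, long y) { return (x + y) ^ x ^ y; }

    KnownBits add(KnownBits o) {
        long sumOfMin = mustBeSet + o.mustBeSet;
        // Unknown input bits make the corresponding result bit unknown.
        long variable = (mustBeSet ^ mayBeSet) | (o.mustBeSet ^ o.mayBeSet);
        // So does a carry-in that differs between the extreme inputs.
        variable |= carries(mustBeSet, o.mustBeSet) ^ carries(mayBeSet, o.mayBeSet);
        return new KnownBits(sumOfMin & ~variable, sumOfMin | variable);
    }

    KnownBits and(long mask) {
        return new KnownBits(mustBeSet & mask, mayBeSet & mask);
    }

    boolean isConstant() { return mustBeSet == mayBeSet; }

    public static void main(String[] args) {
        KnownBits eight = constant(8);

        // p has passed an alignment guard: its low three bits are known zero.
        KnownBits guarded = unknown().afterZeroTest(7);
        KnownBits next = guarded.add(eight).and(7);
        System.out.println(next.isConstant() && next.mustBeSet == 0); // true

        // Without the guard nothing is known about (p+8)&7.
        System.out.println(unknown().add(eight).and(7).isConstant()); // false
    }
}
```

Note that this per-value form only helps after some test has pinned the low bits of p. Saying "p and p+8 have the same low bits" for a completely unknown p is a relation between two values, which a per-value lattice like this one cannot express on its own.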