> On 31 Mar 2016, at 09:40, Peter Levart <peter.lev...@gmail.com> wrote:
> 
> 
> 
> On 03/31/2016 08:33 AM, John Rose wrote:
>> On Mar 30, 2016, at 2:36 AM, Paul Sandoz <paul.san...@oracle.com> wrote:
>>> When access is performed in loops this can be costly, as the alignment checks 
>>> are not hoisted out. Theoretically they could be for regular 2, 4, 8 strides 
>>> through the buffer contents. For such cases the alignment of the base address 
>>> can be checked. Not sure how complicated that would be to support.
>>> 
>>> I lack knowledge of the SPARC instruction set to know if we could do 
>>> something clever as an intrinsic.
>> A couple of partial thoughts:
>> 
>> If we had bitfield type inference, we would be able to deduce that the low 
>> bits of p and p+8 are the same.  Graal has this (because I gave them the 
>> formulae[1]).  C2 may be too brittle to add it into TypeInt.  Bitfield 
>> inference on expressions of the form p&7 and (p+8)&7 would allow commoning 
>> tests in an unrolled loop, hoisting the alignment logic to the top of the 
>> loop body, and (perhaps) through the phi to the loop head.
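
The observation that p and p+8 share their low three bits can be checked directly. The following is only a minimal sketch: the class and method names are hypothetical and it does not reproduce Graal's IntegerStamp logic, it just demonstrates the arithmetic fact that such an inference would rely on.

```java
// Hypothetical demo class; not part of the JDK or of Graal.
final class LowBitsDemo {
    /** Misalignment of address p relative to an 8-byte boundary, i.e. p & 7. */
    static long misalignment(long p) {
        return p & 7;
    }

    /** True if p, p+8, ..., p+8*(n-1) all have the same low three bits as p. */
    static boolean strideKeepsLowBits(long p, int n) {
        long expected = misalignment(p);
        for (int k = 0; k < n; k++) {
            // Adding a multiple of 8 never changes bits 0..2.
            if (misalignment(p + 8L * k) != expected) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        for (long p = 0; p < 16; p++) {
            System.out.println(p + " keeps low bits: " + strideKeepsLowBits(p, 4));
        }
    }
}
```

Since the test (p+8k)&7 gives the same answer for every k, a compiler that can prove this needs to evaluate it only once per loop rather than once per access.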
>> 
>> [1]: 
>> http://hg.openjdk.java.net/graal/graal-core/file/ea5cc66ec5f2/graal/com.oracle.graal.compiler.common/src/com/oracle/graal/compiler/common/type/IntegerStamp.java#l460
>> 
>> An intrinsic that would guide the JIT more explicitly would (I think) need 
>> an extra argument.  Something like:  getLongUnaligned(p, uo, ao), where uo 
>> and ao are both longs, but only ao is required to be naturally aligned 
>> ((ao&7)==0).  Yuck.  Maybe this could be pattern-matched in C2; it would be 
>> a kludge.
>> 
>> — John
> 
> Yes, an API that allows "indexed" unaligned access might be helpful. For 
> example (in Unsafe):
> 

Thanks, that actually works out quite well on x86, where it appears in some cases 
to guide the JIT to create better unrolled loops. Alas, it does not perform 
so well on SPARC under the same tests.

Paul.

>    public final long getLongUnaligned(Object o, long offset, int i) {
>        long iOffset = (long) i << 3;
>        if ((offset & 7) == 0) {
>            return getLong(o, offset + iOffset);
>        } else if ((offset & 3) == 0) {
>            return makeLong(getInt(o, offset + iOffset),
>                            getInt(o, offset + iOffset + 4));
>        } else if ((offset & 1) == 0) {
>            return makeLong(getShort(o, offset + iOffset),
>                            getShort(o, offset + iOffset + 2),
>                            getShort(o, offset + iOffset + 4),
>                            getShort(o, offset + iOffset + 6));
>        } else {
>            return makeLong(getByte(o, offset + iOffset),
>                            getByte(o, offset + iOffset + 1),
>                            getByte(o, offset + iOffset + 2),
>                            getByte(o, offset + iOffset + 3),
>                            getByte(o, offset + iOffset + 4),
>                            getByte(o, offset + iOffset + 5),
>                            getByte(o, offset + iOffset + 6),
>                            getByte(o, offset + iOffset + 7));
>        }
>    }
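
The point of the indexed form is that the alignment test depends only on the base offset, not on the index, so it is invariant across a loop over i. The sketch below illustrates that shape outside the JDK. It is an assumption-laden stand-in, not the proposed Unsafe method: IndexedUnalignedDemo, littleEndian and getLongIndexed are hypothetical names, a heap ByteBuffer in little-endian order replaces raw memory, and the 2-byte-aligned branch is folded into the byte-wise fallback for brevity.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical stand-in for the indexed accessor; uses a ByteBuffer instead of Unsafe.
final class IndexedUnalignedDemo {
    /** Wraps a byte[] in little-endian order, which the composition below assumes. */
    static ByteBuffer littleEndian(byte[] bytes) {
        return ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN);
    }

    /** Reads the i-th long counted from byte position base; branches test base only. */
    static long getLongIndexed(ByteBuffer buf, int base, int i) {
        int pos = base + (i << 3);
        if ((base & 7) == 0) {
            // Base is 8-byte aligned, so every indexed access is too.
            return buf.getLong(pos);
        } else if ((base & 3) == 0) {
            // Base is 4-byte aligned: combine two ints, low half first.
            return (buf.getInt(pos) & 0xFFFFFFFFL) | ((long) buf.getInt(pos + 4) << 32);
        } else {
            // Fallback: assemble from single bytes, most significant byte first.
            long v = 0;
            for (int k = 7; k >= 0; k--) {
                v = (v << 8) | (buf.get(pos + k) & 0xFF);
            }
            return v;
        }
    }

    /** Sums count longs; the alignment branch has the same outcome on every iteration. */
    static long sum(ByteBuffer buf, int base, int count) {
        long total = 0;
        for (int i = 0; i < count; i++) {
            total += getLongIndexed(buf, base, i);
        }
        return total;
    }
}
```

Whether a given JIT actually hoists the branch out of such a loop is a separate question that would have to be measured on each platform.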
> 
> 
> ..with usage in ByteBufferAs-X-Buffer instead of:
> 
>      public $type$ get(int i) {
>          $memtype$ x = unsafe.get$Memtype$Unaligned(bb.hb,
>              byteOffset(checkIndex(i)),
>              {#if[boB]?true:false});
>          return $fromBits$(x);
>      }
> 
> 
> the following could then be used:
> 
>      public $type$ get(int i) {
>          $memtype$ x = unsafe.get$Memtype$Unaligned(bb.hb, bb.address,
>              checkIndex(i),
>              {#if[boB]?true:false});
>          return $fromBits$(x);
>      }
> 
> 
> Regards, Peter
> 
