Isn't the C2 intrinsic just reading the value starting at the specified offset directly (when unaligned access is supported) and not doing the branching?
On Thu, Mar 12, 2015 at 1:15 PM, Peter Levart <peter.lev...@gmail.com> wrote:

> On 03/10/2015 08:02 PM, Andrew Haley wrote:
>
> > The new algorithm does an N-way branch, always loading and storing
> > subwords according to their natural alignment.  So, if the address is
> > random and the size is long it will access 8 bytes 50% of the time, 4
> > shorts 25% of the time, 2 ints 12.5% of the time, and 1 long 12.5% of
> > the time.  So, for every random load/store we have a 4-way branch.
>
> ...so do you think it would be better if the order of checks in if/else
> chain:
>
>     public final long getLongUnaligned(Object o, long offset) {
>         if ((offset & 7) == 0) {
>             return getLong(o, offset);
>         } else if ((offset & 3) == 0) {
>             return makeLong(getInt(o, offset),
>                             getInt(o, offset + 4));
>         } else if ((offset & 1) == 0) {
>             return makeLong(getShort(o, offset),
>                             getShort(o, offset + 2),
>                             getShort(o, offset + 4),
>                             getShort(o, offset + 6));
>         } else {
>             return makeLong(getByte(o, offset),
>                             getByte(o, offset + 1),
>                             getByte(o, offset + 2),
>                             getByte(o, offset + 3),
>                             getByte(o, offset + 4),
>                             getByte(o, offset + 5),
>                             getByte(o, offset + 6),
>                             getByte(o, offset + 7));
>         }
>     }
>
> ...was reversed:
>
>     if ((offset & 1) == 1) {
>         // bytes
>     } else if ((offset & 2) == 2) {
>         // shorts
>     } else if ((offset & 4) == 4) {
>         // ints
>     } else {
>         // longs
>     }
>
> ...or are JIT+CPU smart enough and there would be no difference?
>
> Peter
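
To put rough numbers on the two orderings quoted above, here is a minimal sketch that only counts source-level condition tests for each of the eight possible values of (offset & 7). The class and method names (BranchOrderSketch, widestFirst, narrowestFirst, averageTests) are made up for illustration and are not part of Unsafe. The sketch does not model what the JIT emits or how the CPU predicts branches, so treat it as a back-of-the-envelope count, not a benchmark.

```java
// Hypothetical helper, not JDK code: compares the two if/else orderings
// by counting how many conditions each one evaluates per offset.
final class BranchOrderSketch {

    /** Widest-first order, as in the quoted getLongUnaligned.
     *  Returns {number of memory accesses, number of conditions evaluated}. */
    static int[] widestFirst(long offset) {
        if ((offset & 7) == 0) return new int[] {1, 1}; // one long
        if ((offset & 3) == 0) return new int[] {2, 2}; // two ints
        if ((offset & 1) == 0) return new int[] {4, 3}; // four shorts
        return new int[] {8, 3};                        // eight bytes
    }

    /** Reversed (narrowest-first) order, as proposed in the quoted message. */
    static int[] narrowestFirst(long offset) {
        if ((offset & 1) == 1) return new int[] {8, 1}; // eight bytes
        if ((offset & 2) == 2) return new int[] {4, 2}; // four shorts
        if ((offset & 4) == 4) return new int[] {2, 3}; // two ints
        return new int[] {1, 3};                        // one long
    }

    /** Average number of conditions evaluated, assuming every value of
     *  (offset & 7) is equally likely (the "random address" case). */
    static double averageTests(boolean narrowFirst) {
        int total = 0;
        for (long off = 0; off < 8; off++) {
            total += (narrowFirst ? narrowestFirst(off) : widestFirst(off))[1];
        }
        return total / 8.0;
    }

    public static void main(String[] args) {
        // Both orderings must pick the same access width for every offset.
        for (long off = 0; off < 8; off++) {
            if (widestFirst(off)[0] != narrowestFirst(off)[0]) {
                throw new AssertionError("orderings disagree at offset " + off);
            }
        }
        System.out.println("widest-first:    " + averageTests(false)); // 2.625
        System.out.println("narrowest-first: " + averageTests(true));  // 1.75
    }
}
```

Under the uniform-offset assumption the reversed chain evaluates fewer conditions on average (1.75 versus 2.625), simply because the 50% byte case exits on the first test. Whether that shows up in practice depends on the JIT and the branch predictor, and it is moot wherever the intrinsic replaces the Java code entirely.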