Vladimir Ivanov left some insights on how C2 handles this issue 
here: http://richardstartin.uk/mmm-revisited/#comment-4915. In short, C2's 
autovectorizer gives up when it can't prove that dest and source don't alias 
each other, for instance when there are offsets into the arrays and the 
offsets differ.
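
To make the two cases concrete, here is a minimal sketch of the loop shapes being contrasted. The class and method names are made up for illustration and are not from the linked comment. Whether C2 actually vectorizes either loop depends on the JVM version, so this is something to measure (for example with JMH, or by inspecting the generated code with -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly) rather than assume.

```java
// Hypothetical illustration class; the names are assumptions, not an existing API.
final class OffsetShapes {
    private OffsetShapes() {}

    // Shape 1: both arrays are indexed by the same expression.
    // Even if dest == src, element i only depends on element i.
    static void multiplySameIndex(int[] dest, int[] src, int n) {
        for (int i = 0; i < n; i++) {
            dest[i] *= src[i];
        }
    }

    // Shape 2: independent offsets into each array.
    // If dest == src and the offsets differ, an iteration may read a value
    // written by an earlier iteration, so the compiler has to be conservative.
    static void multiplyWithOffsets(int[] dest, int destOffset,
                                    int[] src, int srcOffset, int n) {
        for (int i = 0; i < n; i++) {
            dest[destOffset + i] *= src[srcOffset + i];
        }
    }
}
```

The second shape shows why the proof matters: calling multiplyWithOffsets(c, 1, c, 0, 4) on c = {1, 2, 3, 4, 5} yields {1, 2, 6, 24, 120} when run sequentially, because each iteration reads the element the previous one just wrote. A version that loaded all four source values first would produce {1, 2, 6, 12, 20} instead.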

On Saturday, January 19, 2019 at 7:51:33 PM UTC, Steven Stewart-Gallus 
wrote:
>
> It is well known that compilers can't optimise things as well if values 
> may alias. Because src and dest may overlap:
>
> public static void multiply(int[] dest, int destOffset, int[] src, int 
> srcOffset, int n) {
>     for (var ii = 0; ii < n; ++ii) {
>         dest[destOffset + ii] *= src[srcOffset + ii];
>     }
> }
>
> does not have the same semantics as possibly better optimised versions.
>
> This is behind the difference between the memcpy and memmove functions in 
> C and why C99 has the restrict keyword and also why Fortran was 
> historically faster for a long time.
>
> Now Java doesn't have the restrict keyword, but you can create separate 
> byte arrays. Checking whether two arrays can possibly overlap is then just a 
> pointer comparison, and if you have to, you can dispatch to manually 
> optimised versions for that case.
>
> The thing is I am starting to wonder if aliasing is preventing one of my 
> programs from being optimised as much as it should.
>
> Do modern JVMs use aliasing information? Is it possible to give them more 
> aliasing information?
>
> Should I look into doing something like:
>
> public static void multiply(int[] dest, int destStart, int[] src, int 
> srcStart, int n) {
>     if (src != dest) {
>         multiply_nonaliasing(dest, destStart, src, srcStart, n);
>     } else if (srcStart + n < destStart || destStart + n < srcStart) {
>         multiply_nonoverlapping(dest, destStart, src, srcStart, n);
>     } else {
>         // maybe throw an error instead
>         multiply_overlapping(dest, destStart, src, srcStart, n);
>     }
> }
>
> How would one code multiply_nonaliasing and multiply_nonoverlapping to give 
> the JVM such information?
>
> I believe I have one of the worst-case examples, where I am emulating a big, 
> long, flat heap memory.
>
> private static byte[] bigHeap = new byte[8 * 4096];
>
> It seems silly to me but I am starting to wonder if an approach like:
>
> private static byte[][] bigHeapOfPages = new byte[8][4096];
>
> might theoretically be possible to make faster because I can make sure 
> things do not alias each other.
>
> There seem to be some complicated scholarly papers online about alias 
> disambiguation which mention this would be useful for solving performance 
> difficulties on the JVM, but I don't know what the current state of things 
> is and what I can do personally right now.
>
