Just an update for the list: Steve seems to have narrowed it down to a compiler bug in the array assignment, as Matt suggested.

Mark
Quick update: just checking ierr isn’t enough to avoid the compiler bug, but adding zero to n1 is.

Compiling the code as sent:

    145, Loop not fused: function call before adjacent loop
         Generated vector sse code for the loop
         Generated 2 prefetch instructions for the loop
    …

Line 145 is the implied assignment loop for apar, which works in the test code. There is no output for the implied loop at the assignment a_n1 = n1.

If I add 0D0 to n1:

    140, Loop not fused: function call before adjacent loop
         Generated vector sse code for the loop
         Generated 3 prefetch instructions for the loop
         Generated vector sse code for the loop

Line 140 is where I do the add. Note that we’ve changed from 2 prefetch instructions to 3: it’s issuing prefetches for all three assignments, where it used to do just the last two. The output is correct in this case.

Now, for the full code. As written in the repository (the buggy version):

    scatter_to_xgc:
       2184, Loop not vectorized/parallelized: contains call
       2189, Loop not vectorized/parallelized: contains call
       2194, Loop not vectorized/parallelized: contains call
       2200, Loop not vectorized/parallelized: contains call

Line 2200 is where the ‘n1’ assignment occurs, which is the only one that actually works. Lines 2204 and 2208, the broken apar and phi assignments, are curiously missing.

If I add in the extra error checks (which makes the code work):

    scatter_to_xgc:
       2184, Loop not vectorized/parallelized: contains call
       2189, Loop not vectorized/parallelized: contains call
       2194, Loop not vectorized/parallelized: contains call
       2201, Loop not vectorized/parallelized: contains call
       2206, Loop not vectorized/parallelized: contains call
       2211, Loop not vectorized/parallelized: contains call

Lines 2201, 2206, and 2211 are the n1, apar, and phi assignments respectively.

I’m 99% sure this is a compiler bug involving the assignments. I’ll still try compiling the full code with ‘-g’ and valgrind, but I’m pretty sure this is it.

--steve
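
For anyone who wants to see the shape of the "add zero" workaround in isolation, here is a minimal sketch. It is not the test code from this thread: the program name, the array size, and the declarations are assumptions made up for illustration, and only the names n1 and a_n1 and the 0D0 add come from the messages above. It is not expected to reproduce the bug by itself; it only shows the plain whole-array assignment next to the variant with the extra add, plus a check that the two copies agree.

```fortran
program assign_workaround_sketch
  implicit none
  ! Assumed declarations: the real code's types and sizes are not shown in the thread.
  integer, parameter :: dp = kind(0D0)
  integer, parameter :: n = 16            ! hypothetical array length
  real(dp) :: n1(n), a_plain(n), a_n1(n)
  integer :: i

  ! Fill the source array with known values.
  do i = 1, n
     n1(i) = real(i, dp)
  end do

  ! Plain whole-array assignment (the form discussed as "a_n1 = n1").
  a_plain = n1

  ! Workaround form: adding 0D0 turns the right-hand side into an expression.
  a_n1 = n1 + 0D0

  ! Both copies should match the source exactly on a correct compiler.
  if (all(a_plain == n1) .and. all(a_n1 == n1)) then
     print *, 'copies match'
  else
     print *, 'copies differ'
  end if
end program assign_workaround_sketch
```

Building this with the same compiler and loop-report options used for the output above should show whether each assignment gets its own entry in the report, which is the symptom being compared in this thread.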