On 08/16/2016 10:51 AM, Johan Engelen wrote:
> On Tuesday, 16 August 2016 at 01:28:05 UTC, Ali Çehreli wrote:
>>
>> With ldc2, the best option is to go with a dynamic array ONLY IF you
>> access the elements through the .ptr property. As seen in the last
>> result, using the [] operator on the array is about 4 times slower
>> than that.
>
> As Yuxuan Shui mentioned the difference is in vectorization. The
> non-POINTER version is not vectorized because the semantics of the code
> is not the same as the POINTER version. Indexing `arr`, and writing to
> that address could change `arr.ptr`, and so the loop would do something
> different when "caching" `arr.ptr` in `p` (POINTER version) versus the
> case without caching (non-POINTER version).
>
> Evil code demonstrating the problem:
> ```
> ubyte evil;
> ubyte[] arr;
>
> void doEvil() {
>     // TODO: use this in the obfuscated-D contest
>     arr = (&evil)[0..50];
> }
> ```
>
> The compiler somehow has to prove that `arr[i]` will never point to
> `arr.ptr` (it's called Alias Analysis in LLVM).
>
> Perhaps it is UB in D to have `arr[i]` ever point into `arr` itself, I
> don't know. If so, the code is vectorizable and we can try to make it so.
>
> -Johan

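To make the aliasing point above concrete, here is a minimal sketch of the two loop shapes being compared. It is not the code from deneme.d: the function names `fillIndexed` and `fillViaPtr` and the array size are made up for illustration, and the comments describe what the compiler has to assume, not measured behavior.

```d
import std.stdio : writeln;

ubyte[] arr; // module-level slice, as in the quoted example

// Non-POINTER shape: arr.ptr and arr.length are read on every iteration.
// A ubyte store may alias anything, so the compiler has to assume that
// writing arr[i] could modify the slice `arr` itself.
void fillIndexed(ubyte value)
{
    for (size_t i = 0; i < arr.length; ++i)
        arr[i] = value;
}

// POINTER shape: the pointer and length are copied into locals once,
// so stores through p cannot change what the loop iterates over.
void fillViaPtr(ubyte value)
{
    ubyte* p = arr.ptr;
    immutable n = arr.length;
    for (size_t i = 0; i < n; ++i)
        p[i] = value;
}

void main()
{
    arr = new ubyte[](1024); // hypothetical size, chosen for the sketch

    fillIndexed(1);
    auto a = arr.dup;

    fillViaPtr(1);
    auto b = arr.dup;

    // For a well-behaved (non-evil) array, both shapes do the same work.
    assert(a == b);
    writeln("both versions filled ", arr.length, " bytes");
}
```

The two functions agree only because `arr` here does not overlap its own storage. The quoted `doEvil` is the case where they could differ, and that is why the optimizer cannot simply treat one as the other.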
Thank you all. That makes sense...

Agreed that the POINTER version is applicable only in some cases. Looking only at the non-POINTER cases, a static array is faster with ldc2, which makes the "arbitrary" 16MiB limit a performance issue. With ldc2, the static array version takes about 40% less time:

6) ldc2 deneme.d -ofdeneme -O5 -release -boundscheck=off -d-version=STATIC
0.472s

8) ldc2 deneme.d -ofdeneme -O5 -release -boundscheck=off
0.792s

It's the opposite for dmd:

2) dmd deneme.d -ofdeneme -O -boundscheck=off -inline -version=STATIC
4.238s

4) dmd deneme.d -ofdeneme -O -boundscheck=off -inline
3.845s

Ali