On 08/16/2016 10:51 AM, Johan Engelen wrote:
> On Tuesday, 16 August 2016 at 01:28:05 UTC, Ali Çehreli wrote:
>>
>> With ldc2, the best option is to go with a dynamic array ONLY IF you
>> access the elements through the .ptr property. As seen in the last
>> result, using the [] operator on the array is about 4 times slower
>> than that.
>
> As Yuxuan Shui mentioned, the difference is in vectorization. The
> non-POINTER version is not vectorized because the semantics of the code
> are not the same as the POINTER version. Indexing `arr`, and writing to
> that address could change `arr.ptr`, and so the loop would do something
> different when "caching" `arr.ptr` in `p` (POINTER version) versus the
> case without caching (non-POINTER version).
>
> Evil code demonstrating the problem:
> ```
> ubyte evil;
> ubyte[] arr;
>
> void doEvil() {
>     // TODO: use this in the obfuscated-D contest
>     arr = (&evil)[0..50];
> }
> ```
>
> The compiler somehow has to prove that `arr[i]` will never point to
> `arr.ptr` (it's called Alias Analysis in LLVM).
>
> Perhaps it is UB in D to have `arr[i]` ever point into `arr` itself, I
> don't know. If so, the code is vectorizable and we can try to make it so.
>
> -Johan

Thank you all. That makes sense. Granted, the POINTER version is applicable only in some cases. Looking only at the non-POINTER cases, a static array is faster with ldc2, which makes the "arbitrary" 16MiB limit on static array size a performance issue. With ldc2, the static array version takes about 40% less time:

6) ldc2 deneme.d -ofdeneme  -O5 -release -boundscheck=off -d-version=STATIC

  0.472s


8) ldc2 deneme.d -ofdeneme  -O5 -release -boundscheck=off

  0.792s


It's the opposite for dmd, where the dynamic array version takes about 9% less time:

2) dmd deneme.d -ofdeneme -O -boundscheck=off -inline -version=STATIC

   4.238s


4) dmd deneme.d -ofdeneme -O -boundscheck=off -inline

   3.845s
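
For anyone who has not seen the earlier messages, here is a minimal sketch of the kind of loop being compared. It is NOT the actual deneme.d: the names `N` and `fill` and the array size are made up for illustration, and only the `STATIC` and `POINTER` version identifiers come from this thread. It shows, under those assumptions, why the non-POINTER dynamic case is the hard one for the optimizer: `arr` is a module-level slice, so a store through `arr[i]` could in principle modify `arr.ptr` itself.

```d
// Hypothetical sketch, not the original deneme.d.
// N and fill() are illustrative names; STATIC and POINTER are the
// version identifiers used in the commands in this thread.
enum size_t N = 1024;

version (STATIC) {
    ubyte[N] arr;              // static array: its address never changes
} else {
    ubyte[] arr;               // dynamic array: a (length, ptr) pair stored in a variable

    static this() {
        arr = new ubyte[N];    // allocate before main() runs
    }
}

void fill() {
    version (POINTER) {
        auto p = arr.ptr;      // read the pointer once, before the loop
        foreach (i; 0 .. N) {
            p[i] = cast(ubyte) i;
        }
    } else {
        foreach (i; 0 .. N) {
            // With a dynamic array, arr.ptr is re-read on each iteration
            // unless the compiler can prove this store does not alias it.
            arr[i] = cast(ubyte) i;
        }
    }
}

void main() {
    fill();
    assert(arr[0] == 0 && arr[255] == 255 && arr[256] == 0);
}
```

Saved under a name like sketch.d (my choice), the four variants can be built by adding -d-version=STATIC and/or -d-version=POINTER to the ldc2 command line, or -version=... for dmd. I have not timed this sketch, so do not expect it to reproduce the numbers above.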

Ali
