On 15/05/2020 12:28, Joseph Rushton Wakeling via Digitalmars-d-announce wrote:
> On Thursday, 14 May 2020 at 13:26:23 UTC, Mike Parker wrote:
>> After reading a paper that grabbed his curiosity and wouldn't let go, Andrei 
>> set out to determine if Lomuto partitioning should still be considered 
>> inferior to Hoare for quicksort on modern hardware. This blog post details 
>> his results.
>>
>> Blog:
>> https://dlang.org/blog/2020/05/14/lomutos-comeback/
> 
> Nice stuff!
> 
> One curious question -- unless I've misread things horribly, it looks like 
> the D benchmarks for Lomuto branch-free are consistently slower than for C++. 
>  Any idea why that is?  I would expect gcc/gdc and clang/ldc to produce 
> effectively identical results for code like this.

Sorry for the belated response, as far as I can see, gdc and g++ only differ on 
one line.

    auto delta = smaller & (read - first);

This is lowered as:

    delta = smaller & (read - first) / 8;

However, g++ uses an exact divide operator (semantically that ignores 
rounding), whereas gdc uses a truncating divide operator (semantically rounds 
the quotient towards zero).

I'm willing to bet a beer on tweaking pointer subtraction will get gdc in 
lockstep with g++.

Reply via email to