On Thursday, 14 May 2020 at 13:26:23 UTC, Mike Parker wrote:
After reading a paper that grabbed his curiosity and wouldn't
let go, Andrei set out to determine if Lomuto partitioning
should still be considered inferior to Hoare for quicksort on
modern hardware. This blog post details his results.
Blog:
https://dlang.org/blog/2020/05/14/lomutos-comeback/
Nice stuff!
One question out of curiosity -- unless I've misread things
horribly, it looks like the D benchmarks for branch-free Lomuto
are consistently slower than the C++ ones. Any idea why that is?
I would expect gcc/gdc and clang/ldc to produce effectively
identical results for code like this.