On Tuesday, 19 January 2016 at 00:11:40 UTC, Andrei Alexandrescu wrote:
4. sort was and is attackable before all of these changes

(b) Improve sort() first, then apply a similar strategy to improving topN. Do not use RNGs at all.

Since point 4 is in fact already fixed a good while ago, my suggestion would be (b): to do the same for topN.

Introselect (https://en.wikipedia.org/wiki/Introselect), already mentioned in this thread, uses median-of-medians to achieve worst case O(n) performance if we recurse too deep. There is already an issue suggesting to implement it: https://issues.dlang.org/show_bug.cgi?id=12141 (std.algorithm: implement deterministic topN).

In fact, the O(n log n) heap approach as it is implemented now could be faster than O(n) median-of-medians approach on reasonable inputs. So, someone will have to implement median-of-medians and, well, measure.

At the very least, googling for "median of medians in practice" and such yields the tag wiki from StackOverflow.com: "The constant factor in the O(n) is large, and the algorithm is not commonly used in practice.".

Ivan Kazmenko.

Reply via email to