[Bug libstdc++/35968] nth_element fails to meet its complexity requirements
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=35968

--- Comment #14 from Stephen Howe ---

(In reply to Anders Kaseorg from comment #13)
> (In reply to Patrick J. LoPresti from comment #12)
> > I am familiar with the usual algorithmic complexity definitions.
> >
> > So, just to be clear... Your assertion is that the C++ standards committee
> > adopted a specification that rules out a deterministic implementation?
>
> I should have been clearer: I’m saying it rules out quickselect with a
> deterministic pivot selection rule that doesn’t inspect Θ(n) elements.
> Quickselect with randomized Θ(1) pivot selection would satisfy the
> specification, as would quickselect with deterministic Θ(n) pivot selection
> by median-of-medians or similar, but not quickselect with deterministic Θ(1)
> pivot selection by taking the first element or similar.

I am the original bug reporter. The C++ standard is not strict enough for this
algorithm. In David Musser's original paper
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.14.5196&rep=rep1&type=pdf
he describes introsort. In the ISO C++ 2011 standard, the complexity
requirement for std::sort() was tightened from O(n * log n) on average to
O(n * log n) in the worst case, and introsort guarantees that: it uses
quicksort unless it detects that pivot selection is going bad for portions of
the array, in which case it switches to heapsort, which has a guaranteed
O(n * log n) worst case. So the worst-case performance is bounded.

Musser also wrote about introselect, which analogously uses quickselect, but
his paper did not say which algorithm should be swapped in if quickselect goes
bad. Median-of-medians (with groups of at least 5) is such an algorithm, with
O(n) worst-case performance; heapselect is not O(n). So the C++ standard could
require introselect, with guaranteed O(n) performance rather than O(n) only on
average. The big-O requirement could be tightened in the same way that
std::sort() was tightened in ISO C++ 2011.

> Quickselect with randomized Θ(1) pivot selection

No. It can be beaten: even randomized Θ(1) pivot selection is not good enough.
Doug McIlroy's antiqsort adversary can defeat it. See
https://www.cs.dartmouth.edu/~doug/aqsort.c

Cheers
Stephen Howe
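The introselect scheme Musser left open — quickselect with an effort budget, falling back to a guaranteed-O(n) selection once pivot choice is going bad — can be sketched roughly as follows. This is an illustrative Python sketch, not Musser's algorithm verbatim and not libstdc++ code; the function names and the budget of about 2·log2(n) rounds are my own assumptions.

```python
import random

def median_of_medians_select(a, k):
    """Deterministic selection: worst-case O(n) via medians of groups of 5."""
    a = list(a)
    while True:
        if len(a) <= 5:
            return sorted(a)[k]
        # Median of each group of 5, then recurse to choose a good pivot.
        medians = [sorted(a[i:i + 5])[len(a[i:i + 5]) // 2]
                   for i in range(0, len(a), 5)]
        pivot = median_of_medians_select(medians, len(medians) // 2)
        lo = [x for x in a if x < pivot]
        eq = [x for x in a if x == pivot]
        if k < len(lo):
            a = lo
        elif k < len(lo) + len(eq):
            return pivot
        else:
            k -= len(lo) + len(eq)
            a = [x for x in a if x > pivot]

def introselect(a, k, depth_factor=2):
    """Quickselect, but give up after ~depth_factor*log2(n) rounds and
    switch to the guaranteed-O(n) median-of-medians fallback."""
    a = list(a)
    budget = depth_factor * max(1, len(a)).bit_length()
    while len(a) > 1:
        if budget == 0:          # pivot selection is "going bad": fall back
            return median_of_medians_select(a, k)
        budget -= 1
        pivot = a[random.randrange(len(a))]
        lo = [x for x in a if x < pivot]
        eq = [x for x in a if x == pivot]
        if k < len(lo):
            a = lo
        elif k < len(lo) + len(eq):
            return pivot
        else:
            k -= len(lo) + len(eq)
            a = [x for x in a if x > pivot]
    return a[0]
```

The quickselect rounds dominate on typical inputs; the fallback bounds the worst case at O(n), which is the tightening argued for above.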
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=35968

--- Comment #13 from Anders Kaseorg ---

(In reply to Patrick J. LoPresti from comment #12)
> I am familiar with the usual algorithmic complexity definitions.
>
> So, just to be clear... Your assertion is that the C++ standards committee
> adopted a specification that rules out a deterministic implementation?

I should have been clearer: I’m saying it rules out quickselect with a
deterministic pivot selection rule that doesn’t inspect Θ(n) elements.
Quickselect with randomized Θ(1) pivot selection would satisfy the
specification, as would quickselect with deterministic Θ(n) pivot selection by
median-of-medians or similar, but not quickselect with deterministic Θ(1)
pivot selection by taking the first element or similar.
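The case ruled out here is easy to demonstrate by counting comparisons: quickselect that deterministically takes the first element as pivot degrades to Θ(n²) on an already-sorted input. A small hypothetical counting sketch in Python (not libstdc++ code):

```python
def quickselect_first_pivot(a, k):
    """Quickselect with the first element as pivot.
    Returns (k-th smallest element, number of comparisons performed)."""
    a = list(a)
    comparisons = 0
    while True:
        pivot = a[0]
        lo = [x for x in a[1:] if x < pivot]
        hi = [x for x in a[1:] if x >= pivot]
        comparisons += len(a) - 1      # pivot compared against every other element
        if k < len(lo):
            a = lo
        elif k == len(lo):
            return pivot, comparisons
        else:
            k -= len(lo) + 1
            a = hi

if __name__ == "__main__":
    # Sorted input, asking for the maximum: each round eliminates only the pivot.
    _, comparisons = quickselect_first_pivot(list(range(1000)), 999)
    print(comparisons)  # 999 + 998 + ... + 1 = 499500, i.e. Θ(n^2)
```

The same input costs only Θ(n) in expectation once the pivot is chosen at random, which is the distinction the comment draws.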
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=35968

--- Comment #12 from Patrick J. LoPresti ---

(In reply to Anders Kaseorg from comment #11)
> (In reply to Patrick J. LoPresti from comment #10)
> > Complexity: Linear on average.
> >
> > It is not obvious (to me) what distribution the "on average" refers to. All
> > permutations of input with equal probability, I suppose (?)
>
> Expected running time is typically measured over the random choices (if any)
> made internally by the algorithm, not over a random input distribution. For
> example, quickselect with random pivot selection would satisfy this, but not
> quickselect with deterministic pivot selection.

I am familiar with the usual algorithmic complexity definitions.

So, just to be clear... Your assertion is that the C++ standards committee
adopted a specification that rules out a deterministic implementation?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=35968

Anders Kaseorg changed:
           What    |Removed |Added
           CC      |        |andersk at mit dot edu

--- Comment #11 from Anders Kaseorg ---

(In reply to Patrick J. LoPresti from comment #10)
> Complexity: Linear on average.
>
> It is not obvious (to me) what distribution the "on average" refers to. All
> permutations of input with equal probability, I suppose (?)

Expected running time is typically measured over the random choices (if any)
made internally by the algorithm, not over a random input distribution. For
example, quickselect with random pivot selection would satisfy this, but not
quickselect with deterministic pivot selection.

Sometimes expected running time on average-case inputs is studied, but
referring to that definitely requires more words.
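The distinction can be illustrated with a sketch of quickselect using a uniformly random pivot: for every fixed input, the expected number of comparisons is O(n), where the expectation is over the algorithm's own coin flips rather than over a distribution of inputs. Illustrative Python, not libstdc++ code:

```python
import random

def randomized_quickselect(a, k, rng=random):
    """Quickselect with a uniformly random pivot.
    Returns (k-th smallest element, number of comparisons). Its *expected*
    cost is O(n) for every fixed input, because the expectation is taken
    over the algorithm's internal random pivot choices."""
    a = list(a)
    comparisons = 0
    while True:
        pivot = a[rng.randrange(len(a))]
        lo = [x for x in a if x < pivot]
        eq = [x for x in a if x == pivot]
        comparisons += len(a)          # pivot compared against every element
        if k < len(lo):
            a = lo
        elif k < len(lo) + len(eq):
            return pivot, comparisons
        else:
            k -= len(lo) + len(eq)
            a = [x for x in a if x > pivot]
```

On the sorted input that drives a first-element-pivot rule to ~n²/2 comparisons, this version averages only a small constant times n.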
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=35968

Patrick J. LoPresti changed:
           What    |Removed |Added
           CC      |        |lopresti at gmail dot com

--- Comment #10 from Patrick J. LoPresti ---

Worth noting, perhaps, that the C++ standard does _not_ require O(n)
worst-case behavior for nth_element. The exact wording (C++11 section 25.4.2
[alg.nth.element] paragraph (3)) says:

    Complexity: Linear on average.

It is not obvious (to me) what distribution the "on average" refers to. All
permutations of input with equal probability, I suppose (?)

Anyway, just because GCC falls back to an O(N log N) algorithm under some
circumstances does not necessarily mean it violates the C++ specification, as
long as that fallback only happens in 1 in log N of the cases or thereabouts.
Not that I am up for actually performing this complexity analysis.
--- Comment #9 from paolo dot carlini at oracle dot com 2010-01-28 17:16 ---

Roger told me in private that he doesn't mean to actively work on this issue
any time soon. Unassigning.

paolo dot carlini at oracle dot com changed:
           What      |Removed                   |Added
           AssignedTo|roger at eyesopen dot com |unassigned at gcc dot gnu dot org
           Status    |ASSIGNED                  |NEW
--- Comment #8 from paolo dot carlini at oracle dot com 2009-12-15 17:19 ---

Roger, are you still actively working on this?
--- Comment #7 from sjhowe at dial dot pipex dot com 2008-04-28 20:17 ---

Roger

I agree with your analysis. I am slightly crestfallen, as I had been
suspicious that Battiato, Hofri et al.'s paper only ever discussed the
approximate case, never the worst case. And I have now seen the BFPRT73 paper,
where it says of the number of columns (c): "Note, however, that we must have
c >= 5 for PICK to run in linear time."

So yes, median-of-medians of 5 seems the best you can do for the worst case.
The only point in shifting to a different algorithm for the worst case is if
(i) it is cheaper in terms of total comparisons, swaps, assignments etc., and
(ii) it is still linear in N in the worst case.

Cheers
Stephen Howe
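The c >= 5 threshold can be made concrete by evaluating the two cost recurrences numerically. With groups of 5 the subproblem sizes sum to n/5 + 7n/10 = 9n/10 < n, so the per-level work forms a geometric series and the total is linear; with groups of 3 they sum to n/3 + 2n/3 = n, and the total grows like n log n. A rough numerical check (the floor divisions and base cases are my own simplifications, not from BFPRT73):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def t_groups_of_5(n):
    """Cost recurrence for median-of-medians with groups of 5:
    T(n) = T(n/5) + T(7n/10) + n, which solves to O(n)."""
    if n <= 1:
        return 1
    return n + t_groups_of_5(n // 5) + t_groups_of_5(7 * n // 10)

@lru_cache(maxsize=None)
def t_groups_of_3(n):
    """Cost recurrence for groups of 3: T(n) = T(n/3) + T(2n/3) + n.
    The subproblem sizes sum to n, so this solves to Θ(n log n)."""
    if n <= 1:
        return 1
    return n + t_groups_of_3(n // 3) + t_groups_of_3(2 * n // 3)
```

Evaluating both at large n shows the groups-of-5 cost staying within a constant factor of n while the groups-of-3 cost per element keeps growing, matching the quoted c >= 5 requirement.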
--- Comment #5 from roger at eyesopen dot com 2008-04-24 15:01 ---

Well, I've now had time to read the Battiato, Hofri et al. 2002 paper, and the
bad news is that such an approximate median selection algorithm can't be used
to guarantee an O(N) worst-case std::nth_element implementation. It could be
used in an implementation to guess a good pivot, but the quality of this
median, i.e. how approximate it is, doesn't meet the necessary criterion to
ensure an O(N) worst case. You'd still need a fallback method with guaranteed
bounds, or an exact median, in order to achieve O(N). i.e. it could help
improve the average-case performance, but doesn't help with the worst case.

For the mathematically inclined: in order to achieve O(N) worst-case
performance, you need to guarantee that a constant fraction of the elements
can be eliminated at each level of the recursion. In comment #4, Stephen
fixates on "just as long as N/2 elements are reduced each time round", but the
sum of a geometric series shows that eliminating *any* constant fraction
guarantees an O(N) worst case. Hence even if you can only guarantee
eliminating 10% each round, you still achieve an O(N) worst case. So you need
a method that provides an approximate median that can, in the worst case,
guarantee the elimination of, say, 10% of the elements from consideration.
This is why approximate medians offer some utility over exact medians, if
they can be found faster.

Unfortunately, the method of Battiato et al. referenced in comment #1 doesn't
provide such a constant-fraction guarantee. An analysis shows that at each
round it can only eliminate (2^n-1)/3^n of the elements in its worst case,
where n = log_3(N). By hand, naming the ranks 0..N-1: when N=3, the true
median at rank 1 is selected. For N=9, the elements at rank 3, 4 or 5 may be
returned as a median, i.e. 1/3 eliminated. For N=27, the elements between
ranks 8 and 20 may be returned as the median, i.e. 7/27 eliminated. In the
limit, as N (and hence n) tends to infinity, the eliminated fraction
(2^n-1)/3^n tends to zero. i.e. the larger the input size, the less useful
the worst-case guarantee on the median. The poor quality of the median is
lamented by the authors in the penultimate paragraph of section 4.1 of the
paper. They then go on to show that statistically such a worst case is rare,
but unfortunately even a rare worst case breaks the C++ standard library's
O(N) constraint.

This Achilles heel is already well documented in the algorithmic complexity
community. The Blum, Floyd, Pratt, Rivest and Tarjan paper [BFPRT73] and the
Floyd and Rivest paper [FR75] analyse the issues with median-of-k-medians, and
show that k=5 is the lowest value capable of a guaranteed fractional worst
case, i.e. they already considered and rejected the algorithm given in the
cited work (k=3) for the purpose of exact median finding.

Anyway, I hope you find this interesting. There will always be efficient
methods for finding approximate medians; the question is how efficient vs.
how approximate. Many quicksort implementations select the first element as
the pivot, an O(1) method for selecting an extremely approximate median!
Statistically, over all possible input orders, this first element will on
average partition the input array at the median, with some variance. It's not
that the paper is wrong or incorrect; it does what it describes, finding a
statistically good approximate median very efficiently with excellent
worst-case performance. Unfortunately, for the problem we need to solve, which
is not the problem the paper's authors were attempting to solve, we need a
better approximation, perhaps using a more complex implementation.

Anyway, thanks again for the reference. I'd not come across it before and
really enjoyed reading it. Let me know if you spot a flaw in my reasoning
above.

Dr Roger Sayle, Ph.D. Computer Science
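The (2^n-1)/3^n bound quoted above is easy to tabulate: the guaranteed fraction matches the hand analysis for N = 9 and N = 27, and it falls below any fixed constant as N grows, which is exactly why the geometric-series argument fails for this method. A small check, assuming N is an exact power of 3:

```python
from fractions import Fraction

def guaranteed_eliminated_fraction(N):
    """Worst-case fraction of elements the Battiato et al. approximate
    median is guaranteed to eliminate for N = 3^n, i.e. (2^n - 1) / 3^n."""
    n = 0
    while 3 ** n < N:
        n += 1
    assert 3 ** n == N, "N must be a power of 3"
    return Fraction(2 ** n - 1, 3 ** n)
```

Since (2/3)^n tends to 0, no constant fraction is guaranteed, so no geometric series bounds the total work, and the O(N) worst case is lost.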
--- Comment #6 from bkoz at gcc dot gnu dot org 2008-04-24 23:39 ---

Can I re-classify this as an enhancement?
--- Comment #3 from pcarlini at suse dot de 2008-04-21 08:24 ---

Many thanks Roger for your further help on nth_element, excellent news.
--- Comment #4 from sjhowe at dial dot pipex dot com 2008-04-21 18:51 ---

Yes. You want a pivot selection that is O(1) and that each time round
eliminates about N/2 elements (bearing in mind the post-condition for
nth_element: iterators after the nth iterator refer to elements >= the nth
element, and iterators before the nth iterator refer to elements <= the nth
element). So median-of-3, or better for large N, is a must. And this is run
for 3/2 * log2(N) steps.

If it has not converged by the end of those steps (so it has been a bad run),
or the new N is below some constant (making binary insertion sort worthwhile),
then at that point you want the cheapest approximate median algorithm that is
_guaranteed_ O(N). The overall algorithm is still O(N), as choosing the median
and partitioning occur in series; in this case it is minimising the constant
factor that matters. The median-of-medians of 5 is well known, but this
approximate median is less well known. So what you want is the combination of
(i) guaranteed O(N) partitioning and (ii) the cheapest constant factor
(minimising iterator traversal, copy construction, swapping, comparing etc.).
I have not yet checked median-of-medians of 5 against this, but I believe (ii)
works out cheaper. And if it turns out that there exists an even cheaper
guaranteed-O(N) partitioning that finds an approximate median, gcc should use
that. It does not matter whether the exact median is found each time round,
just as long as N/2 elements are eliminated each time round.

I have also been alerted to an introselect variation of nth_element() that
looks promising. It is this: if you wanted the iterator that was 20% in from
the start, and you sampled elements to choose a partition element, then
instead of choosing an approximate median, choose the element that is 20% in.
So if the sample was 5 elements, choose the 2nd element. Now even if
partitioning is lop-sided, you have drastically reduced the number of
candidate elements, massively reducing the problem.

This is brand new research on nth_element, and I have not yet seen much
analysis.

Stephen Howe
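The "20% in" sampling idea can be sketched as follows. This is an illustrative Python sketch of the variation described, not a published algorithm or libstdc++ code; the sample size of 5 and the helper names are my own choices.

```python
import random

def proportional_pivot(a, k, sample_size=5, rng=random):
    """Pick a pivot whose rank within a small random sample is proportional
    to the rank k sought in the whole range (the '20% in' idea): if k is 20%
    of the way in, take the element 20% of the way into the sorted sample."""
    sample = sorted(rng.sample(a, min(sample_size, len(a))))
    i = min(len(sample) - 1, (k * len(sample)) // len(a))
    return sample[i]

def quickselect_proportional(a, k, rng=random):
    """Quickselect driven by the proportional sample pivot above."""
    a = list(a)
    while True:
        pivot = proportional_pivot(a, k, rng=rng)
        lo = [x for x in a if x < pivot]
        eq = [x for x in a if x == pivot]
        if k < len(lo):
            a = lo
        elif k < len(lo) + len(eq):
            return pivot
        else:
            k -= len(lo) + len(eq)
            a = [x for x in a if x > pivot]
```

With k at 20% and a sample of 5, `proportional_pivot` picks the sample's 2nd element, matching the example in the comment; even a lop-sided partition around such a pivot tends to discard the large side far from the target rank.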
--- Comment #1 from pcarlini at suse dot de 2008-04-20 17:30 ---

Roger, can you have a look? Thanks in advance.

pcarlini at suse dot de changed:
           What    |Removed |Added
           CC      |        |roger at eyesopen dot com
--- Comment #2 from roger at eyesopen dot com 2008-04-21 03:22 ---

Yep, now that we're back in stage1, it's about time I got around to submitting
the O(n) worst-case nth_element implementation that I mentioned last year. For
Stephen's benefit, the implementation I've already coded up uses the
median-of-medians in groups of five strategy as a fallback to a modified
quickselect. [Though I'll need to quickly read the paper you cite.]

The trick for libstdc++ is to attempt to make the typical case as fast as or
faster than the existing implementation. Whilst the standards now require an
O(n) worst case, the perceived performance for g++'s users is the average
case, and changing to an O(n) implementation that has a large constant
coefficient may upset some folks.

roger at eyesopen dot com changed:
           What            |Removed                           |Added
           AssignedTo      |unassigned at gcc dot gnu dot org |roger at eyesopen dot com
           Status          |UNCONFIRMED                       |ASSIGNED
           Ever Confirmed  |0                                 |1
           Last reconfirmed|-00-00 00:00:00                   |2008-04-21 03:22:22