On 1/22/15 2:06 PM, Justin Whear wrote:
On Thu, 22 Jan 2015 13:40:56 -0800, Andrei Alexandrescu wrote:
There's this classic patter on Unix: |sort|uniq, i.e. sort some data and
only display the unique elements.
What would be a better integrated version - one that does sorting and
uniq in one shot? I suspect the combination could be quite a bit better
than doing the two in sequence.
A few google searches didn't yield much. Ideas?
Thanks,
Andrei
Efficiency improvement-wise, perhaps a generalization of a counting sort
(http://en.wikipedia.org/wiki/Counting_sort), see "Variant algorithms."
Thanks. I'm thinking of something based on general comparisons.
Something like a modified quicksort with fat pivot:
A. Partition into three subranges: R1<pivot, R2=pivot, R3>pivot
B. Recurse on R1 obtaining R11, a subrange of R1 containing only unique
elements.
C. Sort-move-unique R2 into the portion after R11, obtaining the result.
A and B seem fine, I'm unclear about C. It would be an algorithm that
sorts and simultaneously moves data in a range that overlaps the input.
Andrei