Re: sortUniq

H. S. Teoh via Digitalmars-d Thu, 22 Jan 2015 16:28:47 -0800

On Thu, Jan 22, 2015 at 04:10:45PM -0800, Andrei Alexandrescu via Digitalmars-d 
wrote:
> On 1/22/15 2:51 PM, H. S. Teoh via Digitalmars-d wrote:
> >Wait, if R1 and R3, after the recursion, are sorted and contain only
> >unique elements, doesn't that mean that we can just collapse R2 into
> >a single element along with the endpoints of R1 and R3? So if R1[$-1]
> >< pivot, then we're done, otherwise coalesce it with the pivot, and
> >likewise with R3[0].
> 
> Right, forgot to mention the pivot.
> 
> You don't recurse on the right, only the left; that way you massage
> the unique stuff in R1 toward the left in the possibly smaller range
> R11. Then the challenge is to sort, uinique-fy, and move in one fell
> swoop from R3 (padded with the pivot to the left) into the portion
> adjacent to R11.
> 
> Example: say the range has 100 elements. Then we divide like this:
> 
> r[0 .. 30] is <pivot
> r[30 .. 34] is =pivot
> r[34 .. 100] is >pivot
> 
> First, recurse on r[0 .. 30] getting a smaller range that's sorted and
> unique. Say r[0 .. 24].
> 
> The step needed now is to take r[33 .. 100], which is ONE copy of the
> pivot plus everything greater than pivot, and in one shot deposit it
> starting at r[24], sorted and uniquefied.
> 
> I'm thinking of a modified heapsort that progressively creates the
> heap at r[24], or something like that.
[...]


Well, that would work. Recurse on left to get a uniqSorted left segment,
then modified heapsort on right (instead of heapsorting in place,
deposit extracted heap elements into the target range).

The question, though, is whether this hybrid recursion actually wins us
anything, since heapsort is known to be somewhat slower than quicksort.
In the worst case, you choose a poor pivot and get *both* the worst case
of quicksort and the inferior performance of heapsort compounded
together.

Unless there's a way to modify heapsort to be more efficient as well?

Actually, another idea occurred to me: suppose we modify partition()
instead, so that instead of partitioning in-place, it deposits
partitioned elements into a target (possibly overlapping) range. Then we
modify the right-recursion, so that its partitioning step actually
simultaneously moves the range over to the target area for us. Not sure
if this will eliminate the copying cost, but it might, since partition
has to traverse the entire subrange anyway.


T

-- 
Старый друг лучше новых двух.

Re: sortUniq

Reply via email to