On Jun 27, 2007, at 5:29 PM, Andrew Lentvorski wrote:

Christopher Smith wrote:

Are we constructing or reusing lists?
Haskell is a purely functional language. :-)

Ignoring the obvious non sequitur ...
It's not a non-sequitur. In a functional language you are always *defining* a new list, and whether a copy is made or not isn't really specified by the source code. Since you can't specify state, you can't specify copying.

Unfortunately, I'm pretty sure that the algorithm is O(n^2) time *and* o(n^2) space in *all* cases because of the requirement to copy the lists around.
No such requirement exists (and yes, the wiki page suggests otherwise, but that's not exactly what it is saying), and even if it did, I don't see how that would do anything beyond growing constants on the time side and making the space side grow just as much as the time side (i.e. be O(nlogn) for the common case).

This reference seems to agree:
http://en.literateprograms.org/Quicksort_(Haskell)

Stewart should look at that link as it addresses most of his issues, although it doesn't use quickcheck to do unit testing.

and needs to use this code to get O(n log n) time with O(n) space:

This isn't needed to get this level of big O performance (not surprising, since that is a function of the algorithm, not the code). Rather, it is necessary to get better performance out of the code. Indeed, if it were, you'd expect qsort3 to be thousands of times (100000/log(100000) times) faster and consume 1/100000th the space than qsort1 for sorting 100,000 elements (well, I guess it is conceivable that qsort3's constant factor could be substantially worse, but there is no reason to think that this is true for at least running time).

qsort3' (x:xs) y = part xs [] [x] []
    where
        part [] l e g = qsort3' l (e ++ (qsort3' g y))
        part (z:zs) l e g
            | z > x     = part zs l e (z:g)
            | z < x     = part zs (z:l) e g
            | otherwise = part zs l (z:e) g


Doesn't look so clean or obvious anymore.

Perhaps I'm insane, but this doesn't seem any harder to read. I'll agree that coming up with it in the first place requires some knowledge of standard optimization tricks (namely the time honored tradition of avoiding concatenation), but reading it isn't that difficult.

In fact, what you described is the *classic* pitfall of functional languages. It is very easy to write code that performs absolutely horribly because of all the implied copying.

Yes, although runtimes can and do figure out how to minimize this, which is why you don't see the "slow" version being that much worse than the "fast" version. The same is not true with other languages.

In this instance, someone spent the extra time to do a functional- style mergesort that would be matched in performance (and better in space!) by someone who wrote a stupid shell sort ( O(n^2) time, O (n) space) in imperative style.

a) They didn't do a mergesort. The GHC List.sort function does a merge sort, which tends to be the preferred way to do a stable sort (for example, std::sort is quicksort but std::stable_sort is mergesort) for obvious reasons. b) "stupid" shell sort is theoretically faster than quicksort for worst cases, and not much worse on average. Quicksort tends to be faster in practice because real world machines don't follow the abstract computing model. Similarly, it is not clear that even the original qsort implementation posted would necessarily perform more slowly on average than "a stupid shell sort in imperative style".

Worse, a naive Haskell programmer would actually have been more successful with writing a shell sort even in Haskell (which would give the expected O(n^2) time, O(n) space)!
This is also far from clear to me.

Looking "just like the textbook algorithm" and then executing in the wrong O() bounds is hardly a good example of where functional programming shines.
Again, it is not clear to me at all that it would perform in the wrong O() bounds of the equivalent pseudo code.

--Chris

--
[email protected]
http://www.kernel-panic.org/cgi-bin/mailman/listinfo/kplug-lpsg

Reply via email to