> -----Original Message----- > From: Tom Lane [mailto:[EMAIL PROTECTED] > Sent: Friday, December 16, 2005 10:41 PM > To: Dann Corbit > Cc: Qingqing Zhou; Bruce Momjian; Luke Lonergan; Neil Conway; pgsql- > [EMAIL PROTECTED] > Subject: Re: [HACKERS] Re: Which qsort is used > > "Dann Corbit" <[EMAIL PROTECTED]> writes: > >> I've still got a problem with these checks; I think they are a net > >> waste of cycles on average. > > > The benchmarks say that they (order checks) are a good idea on average > > for ordered data, random data, and partly ordered data. > > There are lies, damn lies, and benchmarks ;-) > > The problem with citing a benchmark for this discussion is that a > benchmark can't tell you anything about real-world probabilities; > it only tells you about the probabilities occuring in the benchmark > case. You need to make the case that the benchmark reflects the > real world, which you didn't. > > > If you trace the algorithms in a debugger you will be surprised at how > > often the partitions are ordered, even with random sets as input. > > Well, I do agree that checking for orderedness on small partitions would > succeed more often than on larger partitions or the whole file --- but > the code-as-given checks all the way down. Moreover, the argument given > for spending these cycles is that insertion sort sucks on reverse-order > input ... where "sucks" means that it spends O(N^2) time. But it spends > O(N^2) in the average case, too.
I agree that in general these checks are not important and they complicate what is a simple and elegant algorithm. The cases where the checks are important (highly ordered data sets) are rare and so the value added is minimal. I am actually quite impressed with the excellence of Bentley's sort out of the box. It's definitely the best library implementation of a sort I have seen. ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly