Dave Storrs wrote: > Hmmm...this is interesting. A friend of mine who is in the > process of getting her graduate degree in CS/information theory stuff > recently told me that it has been mathematically proven that no sort can > run faster than O(n log n) unless you know something about the incoming > data. I guess that radix sort _only_ works on strings (or stringified > things), then?
Radix works on numbers, too; you start with the least significant digit and work to the left. Of course, I think of numbers as stringified, anyways. There's no difference between base 10 and base n where n is the number of characters. Radix sorts in columns. So, if you are sorting 5 digit numbers, you make 5 passes through the data. This would be o(5n), but as n gets huge, the contant 5 really doesn't matter, so you drop it and it becomes O(n). Some algorithms are O(n^2) but really take o(n^2 + 2n + 1) to run, but big O makes it (n^2) because we're interested in what happens as n goes towards infinity. To take a stab at your question about "knowing something about the incoming data", and I'm guessing here, we do know how many columns of data we have, and therefore know just how big our constant in front of the n is. If this constant is too big (bigger than log n I would guess), we might want to consider a different sort. In order to use radix, we'd probably also want to know that our data is not "in order" or even "close to order". Part of the problem is that we now are taking extra time to look at our data and analyse it before deciding what sort to use. Maybe at this point you might want to just use mergesort or quicksort and be done with it. I'm going to go mess with Benchmark.pm and see what happens. Okay, maybe it's wandering a little far from Perl Beginners, but it sure is interesting! Pete -- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]