From: "danielx" <[EMAIL PROTECTED]> Date: 22 Jul 2006 01:43:30 -0700
Boris Borcic wrote: > does > > x.sort(cmp = lambda x,y : cmp(random.random(),0.5)) > > pick a random shuffle of x with uniform distribution ? ... Let e be the element which was in the first position to begin with. What are its chances of being there after being "sorted"? Well, in order for it to still be there, it would have to survive N rounds of selection. In each selection, it has 1/2 chance of surviving. That means its total chance of survival is 1/(2**N), which is much less than 1/N. QED! This proof makes sense to me if the sorting algorithm makes a random decision every time it swaps. Couldn't we easily get an n*log(n) shuffle by performing a _single_ mapping of each element in the collection to a pair made up of a random key and its value, and then sorting based on the keys and stripping them out? i.e., in O(n) time, turn "2 clubs", "2 diamonds", "2 hearts", "2 spades", "3 clubs" into something like [0.395, "2 clubs"], [0.158, "2 diamonds"], [0.432, "2 hearts"], [0.192, "2 spades"], [0.266, "3 clubs"] and then in O(n*log(n)) time, sort it into [0.158, "2 diamonds"], [0.192, "2 spades"], [0.266, "3 clubs"], [0.395, "2 clubs"], [0.432, "2 hearts"] and then strip out the keys again in O(n) time? Or perhaps this has been discussed already and I missed that part? I just joined the list... apologies if this is a repeat. You can accomplish this by doing what I will call "random selection". It would be like linear selection, except you wouldn't bother checking the head against every other element. When you want to figure out what to put at the head, just pick at random! I suspect this is what Python does in random.shuffle, since it is rather an easy to see it would result in something uniform (I swear I haven't looked at the source code for random.shuffle :P). But, after making the linear selection, don't you still need to use O(log(n)) time to find and remove the item from the collection? I don't know much about Python's internal data structures, but if there's an array in there, you need O(n) time to move up everything after the deleted item; if there's a list, you need O(n) time to find the element you picked at random; if there's a balanced tree, you could find it and delete it in O(log(n)) time... Help me review my undergraduate data structures here -- is there some way to pick each item _exactly_once_ in O(n) time? Dave Wonnacott -- http://mail.python.org/mailman/listinfo/python-list