On Jan 4, 7:55 pm, [EMAIL PROTECTED] wrote: > Hello, > This is a question for the best method (in terms of performance > only) to choose a random element from a list among those that satisfy > a certain property. > > This is the setting: I need to pick from a list a random element > that satisfies a given property. All or none of the elements may have > the property. Most of the time, many of the elements will satisfy the > property, and the property is a bit expensive to evaluate. Chance of > having the property are uniform among elements. > > A simple approach is: > > import random > def random_pick(a_list,property): > '''Returns a random element from a list that has the property > > Returns None if no element has the property > ''' > random.shuffle(a_list) > for i in a_list: > if property(i): return i > > but that requires to shuffle the list every time. > > A second approach, that works if we know that at least one element of > the list has the property, is: > > import random > def random_pick(a_list,property): > '''Returns a random element from a list that has the property > > Loops forever if no element has the property > ''' > while 1: > i=random.choice(a_list) > if property(i): return i > > which is more efficient (on average) if many elements of the list have > the property and less efficient if only few elements of the list has > the property (and goes crazy if no element has the property) > > Yet another one: > > import random > def random_pick(a_list,property): > '''Returns a random element from a list that has the property > ''' > b_list=[x for x in a_list if property(x)] > try: > return random.choice(b_list) > finally: return None > > but this one checks the property on all the elements, which is no > good. > > I don't need strong random numbers, so a simple solution like: > import random > globalRNG=random.Random() > > def random_pick(a_list,property): > '''Returns a random element from a list that has the property > > Works only if len(a_list)+1 is prime: uses Fermat's little theorem > ''' > a=globalRNG(1,len(a_list)) > ind=a > for i in xrange(len(a_list)): > x=a_list[a-1] > if property(x):return x > ind*=a > > but this works only if len(a_list)+1 is prime!!! Now this one could be > saved if there were an efficient method to find a prime number given > than a number n but not very much greater... > > Any other ideas? Thanks everybody
Caching might help. If random_pick is called several times with the same list(s) then cache the result of [property(i) for i in a_list] against a_list. If random_pick is called several times with list(s) whith multiple instances of list items then cache property(i) against i for i in a_list . You could do both. You might investigate wether property(i) could be implemented using a faster algorithm, maybe table lookup, or interpolation from initial table lookup. - Paddy. -- http://mail.python.org/mailman/listinfo/python-list