Re: Weighted random selection from list of lists

2005-10-09 Thread Steven D'Aprano
On Sat, 08 Oct 2005 12:48:26 -0400, Jesse Noller wrote:

> Once main_list is populated, I want to build a sequence from items
> within the lists, randomly with a defined percentage of the sequence
> coming from the various lists. For example, if I want a 6 item
> sequence, I might want:
>
> 60% from list 1 (main_list[0])
> 30% from list 2 (main_list[1])
> 10% from list 3 (main_list[2])

If you are happy enough to match the percentages statistically rather than
exactly, simply do something like this:

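import random

def pick_item(main_list):
    # pick_item is only an illustrative wrapper so that the return is valid
    pr = random.random()
    if pr < 0.6:
        list_num = 0
    elif pr < 0.9:
        list_num = 1
    else:
        list_num = 2
    return random.choice(main_list[list_num])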

or however you want to extract an item.

On average, this means 60% of the items will come from list1, 30% from
list2 and 10% from list3, but for a small number of trials the observed
proportions may differ significantly from those figures.
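
For readers on Python 3.6 or later, the same statistical selection can be sketched with random.choices, which did not exist when this thread was written. This is only an illustration under the 60/30/10 weights from the question; pick_items and the rng parameter are hypothetical names, not part of anyone's posted code.

```python
import random
from collections import Counter

list1 = ['a', 'b', 'c']
list2 = ['dog', 'cat', 'panda']
list3 = ['blue', 'red', 'green']
main_list = [list1, list2, list3]
weights = [60, 30, 10]  # relative weights; they need not sum to 100


def pick_items(pools, weights, k, rng=random):
    """Return k items; each draw picks a pool by weight, then an item from it."""
    chosen_pools = rng.choices(pools, weights=weights, k=k)
    return [rng.choice(pool) for pool in chosen_pools]


# Tally which list each of 10,000 draws came from to inspect the proportions.
rng = random.Random(0)  # seeded only so that a run is repeatable
sample = pick_items(main_list, weights, 10000, rng)
origin = Counter(
    'list1' if item in list1 else 'list2' if item in list2 else 'list3'
    for item in sample
)
```

With many draws the tallies in origin should sit close to 60/30/10; calling pick_items(main_list, weights, 6) instead shows how far a 6 item sequence can stray from those figures.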



-- 
Steven.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Weighted random selection from list of lists

2005-10-08 Thread Ron Adam
Jesse Noller wrote:


> 60% from list 1 (main_list[0])
> 30% from list 2 (main_list[1])
> 10% from list 3 (main_list[2])
>
> I know how to pull a random sequence (using random()) from the lists,
> but I'm not sure how to pick it with the desired percentages.
>
> Any help is appreciated, thanks
>
> -jesse

Just add up the total of all lists.

total = len(list1)+len(list2)+len(list3)
n1 = .60 * total  # number from list 1
n2 = .30 * total  # number from list 2
n3 = .10 * total  # number from list 3

You'll need to decide how to handle the case where a list has too few
items in it.
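
One way to turn fixed percentages into exact counts is sketched below. It is a sketch under assumptions, not the method from the post: the counts are computed for a requested sequence length n rather than for the combined length of the lists, rounding uses the largest remainder rule, and a list that is too short is sampled with replacement. The names exact_counts and build_sequence are hypothetical.

```python
import random

list1 = ['a', 'b', 'c']
list2 = ['dog', 'cat', 'panda']
list3 = ['blue', 'red', 'green']
main_list = [list1, list2, list3]


def exact_counts(weights, n):
    """Split n into integer counts proportional to integer weights."""
    total = sum(weights)
    counts = [w * n // total for w in weights]
    remainders = [w * n % total for w in weights]
    leftover = n - sum(counts)
    # Give the leftover slots to the largest remainders; ties favour earlier lists.
    order = sorted(range(len(weights)), key=lambda i: -remainders[i])
    for i in order[:leftover]:
        counts[i] += 1
    return counts


def build_sequence(pools, weights, n, rng=random):
    """Draw an exact number of items from each pool, then shuffle them together."""
    result = []
    for pool, count in zip(pools, exact_counts(weights, n)):
        if count <= len(pool):
            result.extend(rng.sample(pool, count))  # enough items, no repeats
        else:
            # Assumed policy: a pool that is too small is sampled with replacement.
            result.extend(rng.choice(pool) for _ in range(count))
    rng.shuffle(result)
    return result


sequence = build_sequence(main_list, [60, 30, 10], 10)
```

Other policies for a short list are just as defensible, for example raising an error or capping the count at len(pool) and returning a shorter sequence.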

Cheers,
Ron


Re: Weighted random selection from list of lists

2005-10-08 Thread Peter Otten
Jesse Noller wrote:

> I'm probably missing something here, but I have a problem where I am
> populating a list of lists like this:
>
> list1 = [ 'a', 'b', 'c' ]
> list2 = [ 'dog', 'cat', 'panda' ]
> list3 = [ 'blue', 'red', 'green' ]
>
> main_list = [ list1, list2, list3 ]
>
> Once main_list is populated, I want to build a sequence from items
> within the lists, randomly with a defined percentage of the sequence
> coming from the various lists. For example, if I want a 6 item
> sequence, I might want:
>
> 60% from list 1 (main_list[0])
> 30% from list 2 (main_list[1])
> 10% from list 3 (main_list[2])
>
> I know how to pull a random sequence (using random()) from the lists,
> but I'm not sure how to pick it with the desired percentages.


If the percentages can be normalized to small integral numbers, just make a
pool where each list is repeated according to its weight, e.g. list1
occurs 6 times, list2 3 times, and list3 once:

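import random

# list1, list2 and list3 are the lists from the question
pools = [list1, list2, list3]
weights = [6, 3, 1]
sample_size = 10

# Repeat each pool according to its weight: 6 + 3 + 1 = 10 entries
weighted_pools = []
for p, w in zip(pools, weights):
    weighted_pools.extend([p]*w)

# Pick a pool (weighted by repetition), then an item from that pool
sample = [random.choice(random.choice(weighted_pools))
          for _ in range(sample_size)]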
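
The normalization step mentioned above can be sketched by dividing the percentages by their greatest common divisor. This assumes the percentages are positive integers; normalize_weights is a hypothetical helper name, and the code targets Python 3.5 or later for math.gcd.

```python
from functools import reduce
from math import gcd


def normalize_weights(percentages):
    """Divide positive integer percentages by their greatest common divisor."""
    divisor = reduce(gcd, percentages)
    return [p // divisor for p in percentages]


weights = normalize_weights([60, 30, 10])   # the 60/30/10 split from the question
pool_size = sum(weights)                    # number of entries in the repeated pool
awkward = normalize_weights([33, 33, 34])   # no common divisor, so nothing shrinks
```

When the percentages share no common divisor, as in the last line, the repeated pool needs as many entries as the percentages add up to, which is where the bisect() approach becomes the more economical choice.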


Another option is to use bisect() to choose a pool:

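import bisect
import random

# list1, list2 and list3 are the lists from the question
pools = [list1, list2, list3]
sample_size = 10

def isum(items, sigma=0.0):
    # Yield the running total after each item
    for item in items:
        sigma += item
        yield sigma

cumulated_weights = list(isum([60, 30, 10], 0))  # [60, 90, 100]
sigma = cumulated_weights[-1]

sample = []
for _ in range(sample_size):
    # bisect() finds the first cumulated weight greater than the random score
    pool = pools[bisect.bisect(cumulated_weights, random.random()*sigma)]
    sample.append(random.choice(pool))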

(all code untested)
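
To see how the bisect() lookup maps a score to a pool, here is a small deterministic sketch using the 60/30/10 weights. It swaps the hand-written running sum for itertools.accumulate (Python 3.2 or later); pool_index is a hypothetical helper name.

```python
import bisect
from itertools import accumulate

weights = [60, 30, 10]
cumulated_weights = list(accumulate(weights))  # running totals: [60, 90, 100]


def pool_index(score):
    """Map a score in [0, 100) to the index of the pool it falls in."""
    return bisect.bisect(cumulated_weights, score)


# Scores below 60 select pool 0, 60 up to (not including) 90 select pool 1,
# and 90 up to (not including) 100 select pool 2.
boundaries = [pool_index(s) for s in (0, 59.9, 60, 89.9, 90, 99.9)]
# boundaries == [0, 0, 1, 1, 2, 2]
```

Because bisect.bisect() is the same as bisect.bisect_right(), a score that lands exactly on a boundary goes to the next pool, so each pool owns a half-open interval whose width equals its weight.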

Peter


Re: Weighted random selection from list of lists

2005-10-08 Thread Scott David Daniels
Jesse Noller wrote:
(paraphrased)
> Once main_list is populated, I want to build a sequence from items
> within the lists, randomly with a defined percentage of the sequence
> coming from the various lists. For example:
> 60% from list 1 (main_list[0]), 30% from list 2 (main_list[1]), 10% from
> list 3 (main_list[2])


import bisect, random
main_list = [['a', 'b', 'c'],
             ['dog', 'cat', 'panda'],
             ['blue', 'red', 'green']]
weights = [60, 30, 10]

# Build the running totals of the weights: [60, 90, 100]
cumulative = []
total = 0
for index, value in enumerate(weights):
    total += value
    cumulative.append(total)

for i in range(20):
    # Scale a random number to [0, total) and find the list it falls in
    score = random.random() * total
    index = bisect.bisect(cumulative, score)
    print(random.choice(main_list[index]), end=' ')


-- 
-Scott David Daniels