elsa wrote:
Hi guys,

I've got a problem with my program, in that the code just takes too
long to run. Here's what I'm doing. If anyone has any tips, they'd be
much appreciated!

So, say I have a list of lists that looks something like this (I'm
using a list of lists, rather than a list of tuples, as I need it to
be mutable):

myList = [[786,0],[45, 1],[673,1],...................[23,46]]

there are enough entries in the outer list, that the sum of myList[i]
[0] across all i could be as high as 10^7.

Now, what I need to do is randomly choose one myList[i], however the
distribution of my random choice needs to be proportional to the
values of myList[i][0]. So, for this list above, I'd have a much
higher chance of choosing myList[0] than myList[1].

Here is how I'm doing it at the moment:

def chooseI(myList):
        mySum=0
        choice = random.choice(range(1,sum([i[0] for i in myList])+1))
        for i in range(len(myList)):
                mySum+=myList[i][0]
                if mySum>=choice:
                        return i
                        break

This works just fine if sum([i[0] for i in myList]) is less than
10,000, say. However if its up around 10^7, the whole thing crashes.
Is there a faster way of doing this, that doesn't involve as many
computational steps?

Thanks!

elsa

"the whole thing crashes" -- give us the specific crash you see, including the stack trace. "Crash" is far too vague.

At first glance you could be running out of memory. But not with the values you're quoting. If a typical average value for i[0] is 50, that's only 200k elements. Anyway, you could print out the len(myList) to see how big the list is.

There are quicker ways to do the sum, but the first thing to change is the random.choice() stuff. Once you have a sum, you can use random.randint() on it directly. No need to make a range list.

Another approach might be to build a new list with a running sum in it. Then after doing the randint() on the last (highest) item, you can do a bisect.bisect on that list. The index that returns will be your return value.

DaveA

--
http://mail.python.org/mailman/listinfo/python-list

Reply via email to