On Jul 25, 1:46 pm, Iain King <[EMAIL PROTECTED]> wrote:
> On Jul 25, 10:57 am, Suresh Pillai <[EMAIL PROTECTED]> wrote:
>
> > I am performing simulations on networks (graphs). I have a question on
> > speed of execution (assuming very ample memory for now). I simplify the
> > details of my simulation below, as the question I ask applies more
> > generally than my specific case. I would greatly appreciate general
> > feedback in terms of computing and of course considerations specific to
> > implementation in Python.
>
> > The nodes in my network may be ON or OFF. The network starts off with
> > all nodes in the OFF state. I loop through the nodes. For each node
> > that is OFF, I consider some probability of it turning ON based on the
> > states of its neighbours. I MUST GO THROUGH ALL NODES BEFORE DECIDING
> > WHICH ONES TO TURN ON.
>
> > So my question is whether it is faster to
>
> > 1. loop through a list of ALL nodes and check for OFF nodes using ifs
>
> > or to
>
> > 2. loop through a container of OFF nodes and remove from this when they
> > turn ON
>
> or 3. build a new list every iteration instead of deleting from the old
> one:
>
> while processing:
>     new_off_list = []
>     for x in off_list:
>         if goes_on(x):
>             on_list.append(x)
>         else:
>             new_off_list.append(x)
>     off_list = new_off_list
>     generation += 1
>
> Iain
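For concreteness, here is roughly what (3) looks like wired into the original problem. This is only a sketch: the step() function, the neighbours dict and the turn-on rule (probability proportional to the number of ON neighbours) are made up for illustration, since the original post doesn't specify the rule. The point is that the neighbour states are snapshotted before the pass, so every decision in a generation is based on the previous generation's state, as required:

import random

def step(off_nodes, on_nodes, neighbours, p_per_on_neighbour=0.1):
    # One synchronous generation: every decision reads the snapshot of which
    # nodes were ON at the start of the pass, never the updates made during it.
    currently_on = set(on_nodes)
    new_off = []
    for node in off_nodes:
        n_on = sum(1 for nb in neighbours[node] if nb in currently_on)
        if random.random() < p_per_on_neighbour * n_on:
            on_nodes.append(node)      # turns ON this generation
        else:
            new_off.append(node)       # stays OFF; carried to the next pass
    return new_off, on_nodes

# toy usage: a 4-node ring with node 0 seeded ON
neighbours = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
off_nodes, on_nodes = [1, 2, 3], [0]
generation = 0
while off_nodes:
    off_nodes, on_nodes = step(off_nodes, on_nodes, neighbours)
    generation += 1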
I was curious to what extent the different methods varied in time, so I
checked it out. There are three procedures here: test_every, which matches
your (1); destructive, which matches your (2); and constructive, which is
(3) as I've outlined above. On varying the size of the dataset I get this
(probability a node goes on = 50%):

Length of initial list: 100000
Test every: 1.16085492357
Destructive: 2.592310272
Constructive: 0.850312458886

Length of initial list: 200000
Test every: 2.48013843287
Destructive: 9.20894689718
Constructive: 1.73562198439

Length of initial list: 400000
Test every: 5.00652267447
Destructive: 44.9696004134
Constructive: 3.51687329373

Length of initial list: 800000
Test every: 9.67657648655
Destructive: 220.57583941
Constructive: 7.06614485537

and changing the probability that a node goes on (dataset size = 200000):

Probability goes on: 1/2
Test every: 2.24765364513
Destructive: 9.28801971614
Constructive: 1.62770773816

Probability goes on: 1/4
Test every: 4.77387350904
Destructive: 13.4432467571
Constructive: 3.45467140006

Probability goes on: 1/8
Test every: 11.0514899721
Destructive: 18.4026878278
Constructive: 6.86778036177

Probability goes on: 1/16
Test every: 22.5896021593
Destructive: 25.7784044083
Constructive: 13.8631404605

Probability goes on: 1/32
Test every: 49.7667941179
Destructive: 39.3652502735
Constructive: 27.2527219598

Probability goes on: 1/64
Test every: 91.0523955153
Destructive: 65.7747103963
Constructive: 54.4087322936

Code:

import random
from timeit import Timer

SIZE = 100000
MAX = 2

def goes_on(x):
    # independent coin flip: node goes ON with probability 1/MAX
    global MAX
    return random.randint(1, MAX) == 1

def test_every():
    # (1): one list of all nodes plus a parallel "already on?" flag list;
    # every pass re-examines every node, on or off
    global SIZE
    print "Test every:",
    nodes = range(SIZE)
    is_on = [False for x in xrange(SIZE)]
    count = SIZE
    while count:
        for i, x in enumerate(nodes):
            if not is_on[i] and goes_on(x):
                is_on[i] = True
                count -= 1

def destructive():
    # (2): keep a list of OFF nodes and delete entries in place as they turn
    # on (iterating backwards so deletions don't shift the unvisited part)
    global SIZE
    print "Destructive:",
    off_list = range(SIZE)
    on_list = []
    count = SIZE
    while count:
        for i in xrange(len(off_list) - 1, -1, -1):
            x = off_list[i]
            if goes_on(x):
                on_list.append(x)
                del off_list[i]
                count -= 1

def constructive():
    # (3): build a fresh list of the still-OFF nodes on every pass
    global SIZE
    print "Constructive:",
    off_list = range(SIZE)
    on_list = []
    count = SIZE
    while count:
        new_off_list = []
        for x in off_list:
            if goes_on(x):
                on_list.append(x)
                count -= 1
            else:
                new_off_list.append(x)
        off_list = new_off_list

#SIZE = 200000
while True:
    print "Length of initial list:", SIZE
    #print "Probability goes on: 1/%d" % MAX
    print Timer("test_every()", "from __main__ import test_every").timeit(1)
    print Timer("destructive()", "from __main__ import destructive").timeit(1)
    print Timer("constructive()", "from __main__ import constructive").timeit(1)
    print
    SIZE *= 2
    #MAX *= 2

Conclusions:

On size, (2) really doesn't like bigger datasets, taking roughly
quadratically longer as the list grows (every del from the middle of a
Python list shifts everything after it, so each deletion is O(n)), while
(1) and (3) happily increase linearly. (3) is faster.

On probability it's (1) who's the loser, since it keeps re-testing nodes
that are already on, while (2) and (3) only ever walk the remaining OFF
nodes. (3) is once again faster.

I think (2)'s poor performance is being amplified by how Python handles
lists and list deletions; the effect may be mitigated in other languages,
or by using other data structures in Python (like a dictionary or a
user-made list class). If you were short on memory then (2) would have an
advantage, but as it is, (3) is the clear winner.

I'm a fan of list comprehensions, and it feels like they could be nice
here, but since we are making two lists at once I don't see how to...
anyone see how to use them (or 'map' if you want to be old school)?

Iain
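One possible answer to the comprehension question, sketched below and not benchmarked, so no claim that it beats the plain loop: flip each node's coin exactly once into a list of decisions, then build both lists from those decisions with a comprehension each. The goes_on() here is a hypothetical stand-in with a fixed probability, not the MAX-based version from the benchmark:

import random

def goes_on(x, p=0.5):
    # hypothetical stand-in for the benchmark's goes_on(): ON with probability p
    return random.random() < p

def constructive_comprehension(off_list, on_list):
    # one generation of approach (3) written with comprehensions: record each
    # coin flip once, then build both lists from the recorded decisions
    decisions = [(x, goes_on(x)) for x in off_list]
    on_list.extend(x for x, went_on in decisions if went_on)
    new_off_list = [x for x, went_on in decisions if not went_on]
    return new_off_list, on_list

# usage
off_list, on_list = list(range(10)), []
while off_list:
    off_list, on_list = constructive_comprehension(off_list, on_list)
print(len(on_list))  # 10: every node eventually turned on

It walks the recorded decisions twice instead of once, so it is tidier rather than obviously faster. And if memory were the concern that motivates (2), a set of OFF nodes would give O(1) removal and avoid the quadratic deletion cost seen above.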