Liam Clarke wrote:
> Hi all,
>
> I'm working in my first multi-threaded environments, and I think I
> might have just been bitten by that.
>
> class Parser:
>     def __init__(self, Q):
>         self.Q = Q
>         self.players = {}
>         self.teams = {}
>
>     def sendData(self):
>         if not self.players or not self.teams: return
>         self.Q.put((self.players, self.teams))
>         self.resetStats()
>
>     def resetStats(self):
>         for key in self.players:
>             self.players[key] = 0
>         for key in self.teams:
>             self.teams[key] = 0
>
> What I'm finding is that if a lot more sets of zeroed data are being
> sent to the DAO than should occur.
>
> If the resetStats() call is commented out, data is sent correctly. I
> need to reset the variables after each send so as to not try and
> co-ordinate state with a database, otherwise I'd be away laughing.
>
> My speculation is that because the Queue is shared between two
> threads, one of which is looping on it, that a data write to the Queue
> may actually occur after the next method call, the resetStats()
> method, has occurred.
>
> So, the call to Queue.put() is made, but the actual data is accessed in
> memory by the Queue after resetStats has changed it.

You're close. The call to Queue.put() is synchronous - it will finish before the call to resetStats() is made - but the *data* is still shared. What the Queue holds are references to the same dicts that self.players and self.teams refer to. The dicts themselves are not copied! This is normal Python function call and assignment semantics, but in this case it's not what you want.

You have a race condition - if the data in the Queue is processed before the call to resetStats() is made, it will work fine; if resetStats() is called first, the consumer will see zeroed data. Actually there are many possible failures: since resetStats() loops over the dicts, the consumer could be interleaving its reads with the writes in resetStats().

What you need to do is copy the data, either before you put it in the queue or as part of the reset. I suggest rewriting resetStats() to create new dicts, because dict.fromkeys() will do just what you want:

    def resetStats(self):
        self.players = dict.fromkeys(self.players.keys(), 0)
        self.teams = dict.fromkeys(self.teams.keys(), 0)

This way self.players and self.teams are rebound to fresh dicts, and you won't change the data seen by the consumer thread.

> I've spent about eight hours so far trying to debug this; I've never
> been this frustrated in a Python project before to be honest... I've
> reached my next skill level bump, so to speak.
Yes, threads can be mind-bending until you learn to spot the gotchas like this.

By the way, you also have a race condition here:

> if self.dump:
>     self.parser.sendData()
>     self.dump = False

Possibly the thread that sets self.dump will set it again between the time you test it and the time you reset it, in which case that second request is silently lost. If the setting thread is on a timer and the interval is long enough, it won't be a problem, but it is a potential bug.

Kent
_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
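
To make the shared-reference point above concrete, here is a minimal single-threaded sketch. It assumes Python 3 (where the module is named queue rather than Queue), and the helper reset_in_place() and the sample keys "alice" and "red" are made up purely for illustration. It only shows that an in-place reset changes what is already sitting in the queue, while rebinding with dict.fromkeys() does not; it is not a full reproduction of the threaded program.

```python
import queue


def reset_in_place(d):
    """Zero the values of d without creating a new dict (the problematic pattern)."""
    for key in d:
        d[key] = 0


class Parser:
    def __init__(self, Q):
        self.Q = Q
        self.players = {}
        self.teams = {}

    def sendData(self):
        if not self.players or not self.teams:
            return
        # put() stores references to the two dicts, not copies of them.
        self.Q.put((self.players, self.teams))
        self.resetStats()

    def resetStats(self):
        # Rebind the attributes to brand-new dicts; the dicts that were
        # just queued are left untouched for the consumer.
        self.players = dict.fromkeys(self.players, 0)
        self.teams = dict.fromkeys(self.teams, 0)


# Problematic pattern: the queued dict is the same object that gets zeroed.
q_bad = queue.Queue()
shared = {"alice": 3}          # hypothetical sample data
q_bad.put(shared)
reset_in_place(shared)
seen_by_consumer_bad = q_bad.get()

# Suggested pattern: the queued dicts keep their values after the reset.
q_good = queue.Queue()
parser = Parser(q_good)
parser.players = {"alice": 3}  # hypothetical sample data
parser.teams = {"red": 5}
parser.sendData()
sent_players, sent_teams = q_good.get()
```

In the first case the consumer would read zeros even though put() finished before the reset; in the second, the consumer's dicts and the parser's fresh dicts are separate objects.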