On Dec 22, 2:59 am, Duncan Booth <duncan.bo...@invalid.invalid> wrote:
> RajNewbie <raj.indian...@gmail.com> wrote:
> > Say, I have two threads, updating the same dictionary object - but for
> > different parameters:
> > Please find an example below:
> > a = {'file1Data' : '',
> >      'file2Data' : ''}
> >
> > Now, I send it to two different threads, both of which are looping
> > infinitely:
> > In thread1:
> > a['file1Data'] = open(filename1).read()
> > and
> > in thread2:
> > a['file2Data'] = open(filename2).read()
> >
> > My question is - is this object threadsafe? - since we are working on
> > two different parameters in the object. Or should I have to block the
> > whole object?
>
> It depends exactly what you mean by 'threadsafe'. The GIL will guarantee
> that you can't screw up Python's internal data structures: so your
> dictionary always remains a valid dictionary rather than a pile of bits.
>
> However, when you dig a bit deeper, it makes very few guarantees at the
> Python level. Individual bytecode instructions are not guaranteed
> atomic: for example, any assignment (including setting a new value into
> the dictionary) could overwrite an existing value and the value which is
> overwritten may have a destructor written in Python. If that happens you
> can get context switches within the assignment.
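To make the point about destructors concrete, here is a minimal sketch. It assumes CPython, where reference counting finalizes the overwritten value as soon as its last reference goes away; the class name `Noisy` and the `log` list are made up for illustration.

```python
class Noisy:
    """Value whose destructor is written in Python (hypothetical example)."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def __del__(self):
        # Arbitrary Python code runs here, inside whatever operation
        # dropped the last reference to this object.
        self.log.append("del " + self.name)


log = []
d = {"k": Noisy("X", log)}

log.append("before assignment")
d["k"] = Noisy("Y", log)  # overwrites X; on CPython, X.__del__ runs right here
log.append("after assignment")

print(log)
```

On CPython the "del X" entry lands between the other two, so Python-level code really does run in the middle of the dictionary assignment, and the interpreter may switch threads while it runs. Other implementations may finalize X later.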

    Th.1    Th.2
    a=X
    a=Y
            a=Z

You are saying that if 'a=Z' interrupts 'a=Y' at the wrong time, the
destructor for 'X' or 'Y' might not get called. Correct? In serial
flow, the destructor for X is called, then Y.

> Other nasty things can happen if you use dictionaries from multiple
> threads. You cannot add or remove a dictionary key while iterating over
> a dictionary. This isn't normally a big issue, but as soon as you try to
> share the dictionary between threads you'll have to be careful never to
> iterate through it.

These aren't documented, IIRC. Did you just discover them by trial
and error?

> You will probably find it less error prone in the long run if you get
> your threads to write (key,value) tuples into a queue which the
> consuming thread can read and use to update the dictionary.

Perhaps there's a general data structure which can honor
'fire-and-forget' method calls in serial.

    a= async( {} )
    a[0]= X
    a[0]= Y
    -->
    obj_queue[a].put( a.__setitem__, 0, X )
    obj_queue[a].put( a.__setitem__, 0, Y )

If you need the return value, you'll need to block.

    print a[0]
    -->
    res= obj_queue[a].put( a.__getitem__, 0 )
    res.wait()
    return res
    print res

Or you can use a Condition object. But you can also delegate the print
farther down the line of processing:

    obj_queue[a].link( print ).link( a.__getitem__, 0 )

(As you can see, the author (I) finds it a more interesting problem to
get required information in the right places at the right times in
execution. The actual implementation is left to the reader; I'm merely
claiming that there exists a consistent one taking the above
instructions to be sufficient givens.)
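For what it's worth, one possible reading of the async( {} ) idea above can be sketched with the standard library. This is only a sketch under stated assumptions: `SerialProxy` and its `call` method are hypothetical names, not an existing API, and `concurrent.futures.Future` stands in for the `res` object that the pseudocode waits on.

```python
import queue
import threading
from concurrent.futures import Future


class SerialProxy:
    """Hypothetical helper: every call on `target` runs in one worker thread."""

    def __init__(self, target):
        self._target = target
        self._calls = queue.Queue()  # FIFO, so calls run in submission order
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def _run(self):
        # Only this thread ever touches the wrapped object.
        while True:
            fut, method, args = self._calls.get()
            try:
                fut.set_result(getattr(self._target, method)(*args))
            except Exception as exc:
                fut.set_exception(exc)

    def call(self, method, *args):
        """Enqueue target.method(*args) and return a Future immediately."""
        fut = Future()
        self._calls.put((fut, method, args))
        return fut

    def __setitem__(self, key, value):
        self.call("__setitem__", key, value)  # fire and forget

    def __getitem__(self, key):
        return self.call("__getitem__", key).result()  # block for the answer


a = SerialProxy({})
a[0] = "X"
a[0] = "Y"
value = a[0]  # queued behind both writes, so it sees the last one
print(value)
```

The 'link' variant could be approximated with `a.call("__getitem__", 0).add_done_callback(lambda f: print(f.result()))`, which hands the print off to whichever thread completes the future instead of blocking the caller. Because the queue is FIFO and there is a single worker, writes from any one thread are applied in the order that thread issued them.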