Re: Are python objects thread-safe?
On Dec 23, 7:30 am, Duncan Booth wrote:
> Aaron Brady wrote:
> > Th.1 Th.2
> > a=X
> > a=Y
> > a=Z
> >
> > You are saying that if 'a=Z' interrupts 'a=Y' at the wrong time, the
> > destructor for 'X' or 'Y' might not get called. Correct? In serial
> > flow, the destructor for X is called, then Y.
>
> No, the destructors will be called, but the destructors can do pretty much
> anything they want so you can't say the assignment is atomic. This isn't
> actually a threading issue: you don't need multiple threads to experience
> weird issues here. If you do strange things in a destructor then you can
> come up with confusing code even with a single thread.

I see. What about

    del a
    a = Z

Then, can we say 'a=Z' is atomic? At least, it eliminates the destructor
issue you raise.

> >> Other nasty things can happen if you use dictionaries from multiple
> >> threads. You cannot add or remove a dictionary key while iterating over
> >> a dictionary. This isn't normally a big issue, but as soon as you try to
> >> share the dictionary between threads you'll have to be careful never to
> >> iterate through it.
> >
> > These aren't documented, IIRC. Did you just discover them by trial
> > and error?
>
> It is documented, but I can't remember where for Python 2.x. For Python 3,
> PEP 3106 says: "As in Python 2.x, mutating a dict while iterating over it
> using an iterator has an undefined effect and will in most cases raise a
> RuntimeError exception. (This is similar to the guarantees made by the Java
> Collections Framework.)"

I infer that d.items() holds the GIL during the entire operation, and
it's safe to put in a thread. It is merely using an iterator that is
unsafe. (Python 3.0 changed d.items() to return a view rather than a
list, and removed d.iteritems(), I understand.)

I'm looking at the code, and I don't see where the size is safely
checked. That is, can't I sneak in an add and a remove during
iteration, so long as it doesn't catch me?

I'm looking at 'dict_traverse':

    while (PyDict_Next(op, &i, &pk, &pv)) {
        Py_VISIT(pk);
        Py_VISIT(pv);
    }

No locks are acquired here, though I might have missed acquiring the GIL
somewhere else.

In the OP's example, he wasn't changing the size of the dict.

--
http://mail.python.org/mailman/listinfo/python-list
Re: Are python objects thread-safe?
On Tue, 23 Dec 2008 11:30:25 -0200, Duncan Booth wrote:
> Aaron Brady wrote:
> > Th.1 Th.2
> > a=X
> > a=Y
> > a=Z
> >
> > You are saying that if 'a=Z' interrupts 'a=Y' at the wrong time, the
> > destructor for 'X' or 'Y' might not get called. Correct? In serial
> > flow, the destructor for X is called, then Y.
>
> No, the destructors will be called, but the destructors can do pretty much
> anything they want so you can't say the assignment is atomic. This isn't
> actually a threading issue: you don't need multiple threads to experience
> weird issues here. If you do strange things in a destructor then you can
> come up with confusing code even with a single thread.

A simple example showing what you said:

    py> class A:
    ...     def __del__(self):
    ...         global a
    ...         a = None
    ...
    py> a = A()
    py> a = 3
    py> print a
    None

--
Gabriel Genellina
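For readers on Python 3, the same effect can be sketched as below. This is only an illustration, not Gabriel's code: the `Box` holder and the `demo` function are made-up names, and the result depends on CPython's reference counting, which runs `__del__` as soon as the last reference to the old value is dropped. Other implementations may run the destructor later or not at all.

```python
def demo():
    """Rebind an attribute whose old value has a side-effecting __del__."""

    class Box:
        # Hypothetical holder object; it stands in for the global name 'a'.
        pass

    box = Box()

    class A:
        def __del__(self):
            # On CPython this runs while the rebinding below is still in
            # progress, after the new value has already been stored.
            box.value = None

    box.value = A()
    box.value = 3   # drops the A instance, whose __del__ then overwrites the 3
    return box.value


result = demo()
print(result)
```

On CPython the function should return None rather than 3, with a single thread and no context switch involved.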
Re: Are python objects thread-safe?
Aaron Brady wrote:
> Th.1 Th.2
> a=X
> a=Y
> a=Z
>
> You are saying that if 'a=Z' interrupts 'a=Y' at the wrong time, the
> destructor for 'X' or 'Y' might not get called. Correct? In serial
> flow, the destructor for X is called, then Y.

No, the destructors will be called, but the destructors can do pretty much
anything they want so you can't say the assignment is atomic. This isn't
actually a threading issue: you don't need multiple threads to experience
weird issues here. If you do strange things in a destructor then you can
come up with confusing code even with a single thread.

>> Other nasty things can happen if you use dictionaries from multiple
>> threads. You cannot add or remove a dictionary key while iterating over
>> a dictionary. This isn't normally a big issue, but as soon as you try to
>> share the dictionary between threads you'll have to be careful never to
>> iterate through it.
>
> These aren't documented, IIRC. Did you just discover them by trial
> and error?

It is documented, but I can't remember where for Python 2.x. For Python 3,
PEP 3106 says: "As in Python 2.x, mutating a dict while iterating over it
using an iterator has an undefined effect and will in most cases raise a
RuntimeError exception. (This is similar to the guarantees made by the Java
Collections Framework.)"

--
Duncan Booth http://kupuguy.blogspot.com
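The behaviour the PEP describes can be reproduced with a single thread. The sketch below uses two made-up helper functions and Python 3 syntax. The first adds a key while iterating and captures the resulting RuntimeError. The second iterates over a list copy of the items, a common way to sidestep the problem in single-threaded code. Taking the copy does not by itself make a dict shared between threads safe.

```python
def mutate_while_iterating():
    """Add keys to a dict while iterating over it; return the error text."""
    d = {"a": 1, "b": 2}
    try:
        for key in d:
            d[key + "_copy"] = d[key]   # changes the dict's size mid-iteration
    except RuntimeError as exc:
        return str(exc)
    return None


def mutate_over_snapshot():
    """Iterate over a list copy of the items, so the dict may be changed."""
    d = {"a": 1, "b": 2}
    for key, value in list(d.items()):
        d[key + "_copy"] = value
    return d


print(mutate_while_iterating())
print(mutate_over_snapshot())
```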
Re: Are python objects thread-safe?
On Dec 22, 2:59 am, Duncan Booth wrote:
> RajNewbie wrote:
> > Say, I have two threads, updating the same dictionary object - but for
> > different parameters:
> > Please find an example below:
> > a = {'file1Data': '',
> >      'file2Data': ''}
> >
> > Now, I send it to two different threads, both of which are looping
> > infinitely:
> > In thread1:
> > a['file1Data'] = open(filename1).read()
> > and
> > in thread2:
> > a['file2Data'] = open(filename2).read()
> >
> > My question is - is this object threadsafe? - since we are working on
> > two different parameters in the object. Or should I have to block the
> > whole object?
>
> It depends exactly what you mean by 'threadsafe'. The GIL will guarantee
> that you can't screw up Python's internal data structures: so your
> dictionary always remains a valid dictionary rather than a pile of bits.
>
> However, when you dig a bit deeper, it makes very few guarantees at the
> Python level. Individual bytecode instructions are not guaranteed
> atomic: for example, any assignment (including setting a new value into
> the dictionary) could overwrite an existing value and the value which is
> overwritten may have a destructor written in Python. If that happens you
> can get context switches within the assignment.

Th.1 Th.2
a=X
a=Y
a=Z

You are saying that if 'a=Z' interrupts 'a=Y' at the wrong time, the
destructor for 'X' or 'Y' might not get called. Correct? In serial
flow, the destructor for X is called, then Y.

> Other nasty things can happen if you use dictionaries from multiple
> threads. You cannot add or remove a dictionary key while iterating over
> a dictionary. This isn't normally a big issue, but as soon as you try to
> share the dictionary between threads you'll have to be careful never to
> iterate through it.

These aren't documented, IIRC. Did you just discover them by trial
and error?

> You will probably find it less error prone in the long run if you get
> your threads to write (key,value) tuples into a queue which the
> consuming thread can read and use to update the dictionary.

Perhaps there's a general data structure which can honor
'fire-and-forget' method calls in serial.

    a= async( {} )
    a[0]= X
    a[0]= Y
    -->
    obj_queue[a].put( a.__setitem__, 0, X )
    obj_queue[a].put( a.__setitem__, 0, Y )

If you need the return value, you'll need to block.

    print a[0]
    -->
    res= obj_queue[a].put( a.__getitem__, 0 )
    res.wait()
    return res
    print res

Or you can use a Condition object. But you can also delegate the print
farther down the line of processing:

    obj_queue[a].link( print ).link( a.__getitem__, 0 )

(As you can see, the author (I) finds it a more interesting problem to
get required information in the right places at the right times in
execution. The actual implementation is left to the reader; I'm merely
claiming that there exists a consistent one taking the above
instructions to be sufficient givens.)
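One way to approximate the idea in today's standard library, offered only as a sketch and not as the implementation the post has in mind, is a `concurrent.futures.ThreadPoolExecutor` limited to one worker. Calls submitted to it run one at a time in submission order, and the returned future plays the role of `res`. The function name `serialized_updates` and the string values are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def serialized_updates():
    """Funnel every operation on the dict through one worker thread."""
    d = {}
    # A single worker takes calls off its internal queue in FIFO order,
    # so the two assignments and the lookup cannot interleave.
    with ThreadPoolExecutor(max_workers=1) as calls:
        calls.submit(d.__setitem__, 0, "X")    # fire and forget
        calls.submit(d.__setitem__, 0, "Y")    # fire and forget
        res = calls.submit(d.__getitem__, 0)   # the return value is needed...
        return res.result()                    # ...so block until it is ready


print(serialized_updates())
```

The blocking `res.result()` corresponds to `res.wait()` in the post. Something like the `link` chaining could be approximated with `Future.add_done_callback`, although that hands the callback the future rather than the bare value.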
Re: Are python objects thread-safe?
RajNewbie wrote:
> Say, I have two threads, updating the same dictionary object - but for
> different parameters:
> Please find an example below:
> a = {'file1Data': '',
>      'file2Data': ''}
>
> Now, I send it to two different threads, both of which are looping
> infinitely:
> In thread1:
> a['file1Data'] = open(filename1).read()
> and
> in thread2:
> a['file2Data'] = open(filename2).read()
>
> My question is - is this object threadsafe? - since we are working on
> two different parameters in the object. Or should I have to block the
> whole object?

It depends exactly what you mean by 'threadsafe'. The GIL will guarantee
that you can't screw up Python's internal data structures: so your
dictionary always remains a valid dictionary rather than a pile of bits.

However, when you dig a bit deeper, it makes very few guarantees at the
Python level. Individual bytecode instructions are not guaranteed
atomic: for example, any assignment (including setting a new value into
the dictionary) could overwrite an existing value and the value which is
overwritten may have a destructor written in Python. If that happens you
can get context switches within the assignment.

Other nasty things can happen if you use dictionaries from multiple
threads. You cannot add or remove a dictionary key while iterating over
a dictionary. This isn't normally a big issue, but as soon as you try to
share the dictionary between threads you'll have to be careful never to
iterate through it.

You will probably find it less error prone in the long run if you get
your threads to write (key,value) tuples into a queue which the
consuming thread can read and use to update the dictionary.

--
Duncan Booth http://kupuguy.blogspot.com
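The queue-based arrangement suggested above might look roughly like the sketch below, written for Python 3 with the standard `queue` and `threading` modules. The function names, the `None` sentinel used to stop the consumer, and the placeholder strings standing in for file contents are all assumptions made for illustration. Only the consumer thread ever touches the dictionary.

```python
import queue
import threading


def producer(q, key, value):
    # Worker threads never touch the dict; they only enqueue updates.
    q.put((key, value))


def consumer(q, target):
    # The single consumer is the only thread that writes to the dict.
    while True:
        item = q.get()
        if item is None:        # sentinel: no more updates will arrive
            break
        key, value = item
        target[key] = value


def run():
    q = queue.Queue()
    shared = {}
    reader = threading.Thread(target=consumer, args=(q, shared))
    reader.start()
    workers = [
        threading.Thread(target=producer, args=(q, "file1Data", "contents 1")),
        threading.Thread(target=producer, args=(q, "file2Data", "contents 2")),
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    q.put(None)                 # tell the consumer to stop
    reader.join()
    return shared


print(run())
```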
Re: Are python objects thread-safe?
On Dec 21, 11:51 am, RajNewbie wrote:
> Say, I have two threads, updating the same dictionary object - but for
> different parameters:
> Please find an example below:
> a = {'file1Data': '',
>      'file2Data': ''}
>
> Now, I send it to two different threads, both of which are looping
> infinitely:
> In thread1:
> a['file1Data'] = open(filename1).read()
> and
> in thread2:
> a['file2Data'] = open(filename2).read()
>
> My question is - is this object threadsafe? - since we are working on
> two different parameters in the object. Or should I have to block the
> whole object?

In general, Python makes few promises. It has a *strong* preference
towards failing gracefully (i.e. an exception rather than a segfault),
which implies atomic operations underneath, but makes no promise as to
the granularity of those atomic operations.

In practice though, it is safe to update two distinct keys in a dict.
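A small experiment along these lines, assuming CPython and Python 3 syntax, is sketched below. Two threads repeatedly assign to their own key of a shared dict, both keys exist before the threads start, and the dict is never iterated while they run. The names `update` and `run` and the loop count are illustrative. A clean run shows what happens in practice on one interpreter. It does not prove a language-level guarantee.

```python
import threading


def update(shared, key, n):
    # Each thread only ever assigns to its own, already existing key.
    for i in range(n):
        shared[key] = i


def run(n=10000):
    shared = {"file1Data": None, "file2Data": None}
    threads = [
        threading.Thread(target=update, args=(shared, "file1Data", n)),
        threading.Thread(target=update, args=(shared, "file2Data", n)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared


print(run())
```

If either thread needed to read a value, modify it and write it back, a `threading.Lock` around that sequence would be the more cautious choice.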
Re: Are python objects thread-safe?
On Dec 21, 12:51 pm, RajNewbie wrote:
> Say, I have two threads, updating the same dictionary object - but for
> different parameters:
> Please find an example below:
> a = {'file1Data': '',
>      'file2Data': ''}
>
> Now, I send it to two different threads, both of which are looping
> infinitely:
> In thread1:
> a['file1Data'] = open(filename1).read()
> and
> in thread2:
> a['file2Data'] = open(filename2).read()
>
> My question is - is this object threadsafe? - since we are working on
> two different parameters in the object. Or should I have to block the
> whole object?

Threads take turns with the Global Interpreter Lock, so a Python thread
is sure to have the GIL before it calls a method on some object. So
yes, with one rare exception that I should mention: if you've somehow
got non-Python threads running in your process, nothing guarantees that
they acquire the GIL before touching the object.
Re: Are python objects thread-safe?
On Mon, Dec 22, 2008 at 4:51 AM, RajNewbie wrote:
> Say, I have two threads, updating the same dictionary object - but for
> different parameters:
> Please find an example below:
> a = {'file1Data': '',
>      'file2Data': ''}
>
> Now, I send it to two different threads, both of which are looping
> infinitely:
> In thread1:
> a['file1Data'] = open(filename1).read()
> and
> in thread2:
> a['file2Data'] = open(filename2).read()
>
> My question is - is this object threadsafe? - since we are working on
> two different parameters in the object. Or should I have to block the
> whole object?

I believe (IIRC) all basic data types and objects are thread-safe.
I could be wrong though - I don't tend to use threads much myself :)

cheers
James
Are python objects thread-safe?
Say, I have two threads, updating the same dictionary object - but for
different parameters:
Please find an example below:

    a = {'file1Data': '',
         'file2Data': ''}

Now, I send it to two different threads, both of which are looping
infinitely:
In thread1:

    a['file1Data'] = open(filename1).read()

and in thread2:

    a['file2Data'] = open(filename2).read()

My question is - is this object threadsafe? - since we are working on
two different parameters in the object. Or should I have to block the
whole object?