Re: Lockless algorithms in python (Nothing to do with GIL)
I'm not an expert on this, but wouldn't it be more dependent on the platform than python version? Perhaps it is Windows 7 that is very slow. Perhaps my processor architecture. Not sure... Here are some for 3.1.2x64 import timeit timeit.timeit('Lock()', 'from threading import Lock') 1.4162585386092708 timeit.timeit('dict()', 'from threading import Lock') 0.2730348901369162 timeit.timeit('list()', 'from threading import Lock') 0.1719480219512306 -Zac On Fri, Jul 2, 2010 at 1:26 PM, Antoine Pitrou solip...@pitrou.net wrote: On Mon, 28 Jun 2010 16:46:41 -0700 Zac Burns zac...@gmail.com wrote: In my experience it is far more expensive to allocate a lock in python then it is the types that use them. Here are some examples: timeit.timeit('Lock()', 'from threading import Lock') 1.4449114807669048 timeit.timeit('dict()') 0.2821554294221187 timeit.timeit('list()') 0.17358153222312467 I'm not sure what Python version on what machine you are using, but here (Python 2.7): timeit.timeit('Lock()', 'from threading import Lock') 0.09944796562194824 timeit.timeit('dict()') 0.17817902565002441 timeit.timeit('list()') 0.19633007049560547 timeit.timeit('{}') 0.03823709487915039 timeit.timeit('[]') 0.05156302452087402 -- http://mail.python.org/mailman/listinfo/python-list -- http://mail.python.org/mailman/listinfo/python-list
Re: Lockless algorithms in python (Nothing to do with GIL)
On Mon, 28 Jun 2010 16:46:41 -0700 Zac Burns zac...@gmail.com wrote: In my experience it is far more expensive to allocate a lock in python then it is the types that use them. Here are some examples: timeit.timeit('Lock()', 'from threading import Lock') 1.4449114807669048 timeit.timeit('dict()') 0.2821554294221187 timeit.timeit('list()') 0.17358153222312467 I'm not sure what Python version on what machine you are using, but here (Python 2.7): timeit.timeit('Lock()', 'from threading import Lock') 0.09944796562194824 timeit.timeit('dict()') 0.17817902565002441 timeit.timeit('list()') 0.19633007049560547 timeit.timeit('{}') 0.03823709487915039 timeit.timeit('[]') 0.05156302452087402 -- http://mail.python.org/mailman/listinfo/python-list
Re: Lockless algorithms in python (Nothing to do with GIL)
On Mon, 2010-06-28 at 16:46 -0700, Zac Burns wrote: In my experience it is far more expensive to allocate a lock in python then it is the types that use them. Here are some examples: timeit.timeit('Lock()', 'from threading import Lock') 1.4449114807669048 timeit.timeit('dict()') 0.2821554294221187 timeit.timeit('list()') 0.17358153222312467 Granted, I know these types do not allocate a lock - but a similarly defined user type would need to allocate a lock to guarantee thread-safety. Sure, but I think you're timing the wrong thing here. You would only allocate the lock relatively rarely - it's the overhead of *acquiring* the lock that's the real problem. r...@durian:~$ python -m timeit -s from threading import Lock; l = Lock() l.acquire(); l.release() 100 loops, best of 3: 0.378 usec per loop Compare to one of my favourite dict methods, setdefault: r...@durian:~$ python -m timeit -s d = dict() d.setdefault('hello','world') 1000 loops, best of 3: 0.189 usec per loop In the same ballpark really. I have not done a great deal of research into lockless algorithms thusfar, but would it be worthwhile to investigate implementing some of them in python to optimize performance critical sections of code? Many lockless algorithms that I have looked at thusfar require a Compare and Swap operation. Does python have an equivalent to this? Is python high level enough that it has something better than this or it simply doesn't need it? I've managed to avoid locking in some cases by using dict.setdefault() as a kind of atomic test-and-set operation. It's not a full compare-and-swap but you could implement a simple locking scheme using it like this: class PretendLock: def __init__(self): self.d = dict() def acquire(self): myid = threading.currentThread().ident while self.d.setdefault(owner,myid) != myid: pass def release(self): del self.d[owner] This depends on some peculiarities of the CPython implementation that aren't guaranteed by the language spec - namely that thread switches can't happen while executing C code, that dict.setdefault is implemented in C, and that is can calculate the hash of a string key without calling back out to python code. Usually, it's not worth it. The one time I use this regularly is for lazy object initialisation, e.g. something like this: def get_my_object(key): try: return objects[key] except KeyError: return objects.setdefault(key,create_my_object(key)) If two threads happen to try initialising the object at the same time, only one will wind up in the dict and being used; the other will be discarded and garbage-collected. Cheers, Ryan -- Ryan Kelly http://www.rfk.id.au | This message is digitally signed. Please visit r...@rfk.id.au| http://www.rfk.id.au/ramblings/gpg/ for details signature.asc Description: This is a digitally signed message part -- http://mail.python.org/mailman/listinfo/python-list
Re: Lockless algorithms in python (Nothing to do with GIL)
Sure, but I think you're timing the wrong thing here. You would only allocate the lock relatively rarely - it's the overhead of *acquiring* the lock that's the real problem. r...@durian:~$ python -m timeit -s from threading import Lock; l = Lock() l.acquire(); l.release() 100 loops, best of 3: 0.378 usec per loop Compare to one of my favourite dict methods, setdefault: r...@durian:~$ python -m timeit -s d = dict() d.setdefault('hello','world') 1000 loops, best of 3: 0.189 usec per loop In the same ballpark really. I've managed to avoid locking in some cases by using dict.setdefault() as a kind of atomic test-and-set operation. It's not a full compare-and-swap but you could implement a simple locking scheme using it like this: class PretendLock: def __init__(self): self.d = dict() def acquire(self): myid = threading.currentThread().ident while self.d.setdefault(owner,myid) != myid: pass def release(self): del self.d[owner] This depends on some peculiarities of the CPython implementation that aren't guaranteed by the language spec - namely that thread switches can't happen while executing C code, that dict.setdefault is implemented in C, and that is can calculate the hash of a string key without calling back out to python code. Usually, it's not worth it. The one time I use this regularly is for lazy object initialisation, e.g. something like this: def get_my_object(key): try: return objects[key] except KeyError: return objects.setdefault(key,create_my_object(key)) If two threads happen to try initialising the object at the same time, only one will wind up in the dict and being used; the other will be discarded and garbage-collected. Thanks Ryan, Agreed, I was timing the wrong thing - but largely because I wasn't sure what to time against. setdefault certainly seems to be a candidate for something to time. Even so, setdefault wins. Though, I wouldn't want to use PretendLock in this case... it doesn't seem that having the spin loop would justify reducing reducing the lock allocation overhead in a no-contention scenario. Yes, get_my_object is a good trick. I did just that to avoid instantiating extra locks in one scenario. -- Zachary Burns (407)590-4814 Aim - Zac256FL Production Engineer Zindagi Games -- http://mail.python.org/mailman/listinfo/python-list
Re: Lockless algorithms in python (Nothing to do with GIL)
On Mon, 2010-06-28 at 18:27 -0700, Zac Burns wrote: I've managed to avoid locking in some cases by using dict.setdefault() as a kind of atomic test-and-set operation. It's not a full compare-and-swap but you could implement a simple locking scheme using it like this: class PretendLock: def __init__(self): self.d = dict() def acquire(self): myid = threading.currentThread().ident while self.d.setdefault(owner,myid) != myid: pass def release(self): del self.d[owner] This depends on some peculiarities of the CPython implementation that aren't guaranteed by the language spec - namely that thread switches can't happen while executing C code, that dict.setdefault is implemented in C, and that is can calculate the hash of a string key without calling back out to python code. Thanks Ryan, Agreed, I was timing the wrong thing - but largely because I wasn't sure what to time against. setdefault certainly seems to be a candidate for something to time. Even so, setdefault wins. Though, I wouldn't want to use PretendLock in this case... it doesn't seem that having the spin loop would justify reducing reducing the lock allocation overhead in a no-contention scenario. Oh totally - nobody reading this should even consider using that PretendLock class for anything. It's just an example of how to use setdefault as a test-and-set kind of operator. Ryan -- Ryan Kelly http://www.rfk.id.au | This message is digitally signed. Please visit r...@rfk.id.au| http://www.rfk.id.au/ramblings/gpg/ for details signature.asc Description: This is a digitally signed message part -- http://mail.python.org/mailman/listinfo/python-list
Re: Lockless algorithms in python (Nothing to do with GIL)
Many lockless algorithms that I have looked at thusfar require a Compare and Swap operation. Does python have an equivalent to this? Is python high level enough that it has something better than this or it simply doesn't need it? Python does have a GIL, and contrary to the title, this has a lot to do with the GIL. Specifically, any set of operations protected by the GIL are atomic in the CPython interpreter. We can therefore get global synchronization 'for free', allowing 'lock free' data structures that avoid using threading.Lock, and replaces it with, well, nothing. :) How to do it (it takes tiny amounts of C or Cython): Take at look at page 14: http://www.dabeaz.com/python/UnderstandingGIL.pdf The critical part is the variable _Py_Ticker in Python's source file ceval.c, which is declared volatile int. As it is not declared static it is a global variable, and we can actually manipulate its value from a C extension: extern volatile int _Py_Ticker; // evil!!! Now imagine temporarily setting _Py_Ticker to some ridiculous high value like 0x7fff. That locks out all other threads, practically forever, until we restore _Py_Ticker back to it's original value. What this gives us is global synchronization for free! The GIL is there anyway, so it adds no additional overhead. By messing with _Py_Ticker we can force blocks of Python code to become atomic in the interpreter. With this hack we can therefore create data structures that are lock free relative to the CPython interpreter. It is evil, but very cheap compared to using threading.Lock. In fact it has exactly zero overhead: the GIL is there anyway, and it is already acquired! The safest placement of a _Py_Ticker hack is in a context manager. We have to make sure it gets restored to a sane value, even in presence of an exception. Enjoy :) P.S. If you get smart and think sys.setcheckinterval(sys.maxint) would do the trick, you are actually not that smart. It will lock our own thread out with a Poisson rate of 1./sys.getcheckinterval(). Thus we have to attack _Py_Ticker from C. P.P.S. And when we have taken control over the GIL, don't do anything that might release it from C (like reading from a file). We cannot afford to loose it :) -- http://mail.python.org/mailman/listinfo/python-list
Re: Lockless algorithms in python (Nothing to do with GIL)
Many lockless algorithms that I have looked at thusfar require a Compare and Swap operation. Does python have an equivalent to this? Is python high level enough that it has something better than this or it simply doesn't need it? Python does have a GIL, and contrary to the title, this has a lot to do with the GIL. Specifically, any set of operations protected by the GIL are atomic in the CPython interpreter. We can therefore get global synchronization 'for free', allowing 'lock free' data structures that avoid using threading.Lock, and replaces it with, well, nothing. :) How to do it (it takes tiny amounts of C or Cython): Take at look at page 14: http://www.dabeaz.com/python/UnderstandingGIL.pdf The critical part is the variable _Py_Ticker in Python's source file ceval.c, which is declared volatile int. As it is not declared static it is a global variable, and we can actually manipulate its value from a C extension: extern volatile int _Py_Ticker; // evil!!! Now imagine temporarily setting _Py_Ticker to some ridiculous high value like 0x7fff. That locks out all other threads, practically forever, until we restore _Py_Ticker back to it's original value. What this gives us is global synchronization for free! The GIL is there anyway, so it adds no additional overhead. By messing with _Py_Ticker we can force blocks of Python code to become atomic in the interpreter. With this hack we can therefore create data structures that are lock free relative to the CPython interpreter. It is evil, but very cheap compared to using threading.Lock. In fact it has exactly zero overhead: the GIL is there anyway, and it is already acquired! The safest placement of a _Py_Ticker hack is in a context manager. We have to make sure it gets restored to a sane value, even in presence of an exception. Enjoy :) P.S. If you get smart and think sys.setcheckinterval(sys.maxint) would do the trick, you are actually not that smart. It will lock our own thread out with a Poisson rate of 1./sys.getcheckinterval(). Thus we have to attack _Py_Ticker from C. P.P.S. And when we have taken control over the GIL, don't do anything that might release it from C (like reading from a file). We cannot afford to loose it :) -- http://mail.python.org/mailman/listinfo/python-list
Re: Lockless algorithms in python (Nothing to do with GIL)
On Mon, 2010-06-28 at 19:45 -0700, sturlamolden wrote: Many lockless algorithms that I have looked at thusfar require a Compare and Swap operation. Does python have an equivalent to this? Is python high level enough that it has something better than this or it simply doesn't need it? Python does have a GIL, and contrary to the title, this has a lot to do with the GIL. Specifically, any set of operations protected by the GIL are atomic in the CPython interpreter. We can therefore get global synchronization 'for free', allowing 'lock free' data structures that avoid using threading.Lock, and replaces it with, well, nothing. :) How to do it (it takes tiny amounts of C or Cython): Take at look at page 14: http://www.dabeaz.com/python/UnderstandingGIL.pdf The critical part is the variable _Py_Ticker in Python's source file ceval.c, which is declared volatile int. As it is not declared static it is a global variable, and we can actually manipulate its value from a C extension: extern volatile int _Py_Ticker; // evil!!! Now imagine temporarily setting _Py_Ticker to some ridiculous high value like 0x7fff. That locks out all other threads, practically forever, until we restore _Py_Ticker back to it's original value. What this gives us is global synchronization for free! The GIL is there anyway, so it adds no additional overhead. Very interesting idea. Will it work if accessed through ctypes? ticker = ctypes.c_int.in_dll(ctypes.pythonapi,_Py_Ticker) ticker.value = 0x7fff Or does ctypes muck with the GIL in a way that would break this idea? Ryan -- Ryan Kelly http://www.rfk.id.au | This message is digitally signed. Please visit r...@rfk.id.au| http://www.rfk.id.au/ramblings/gpg/ for details signature.asc Description: This is a digitally signed message part -- http://mail.python.org/mailman/listinfo/python-list
Re: Lockless algorithms in python (Nothing to do with GIL)
On 29 Jun, 05:11, Ryan Kelly r...@rfk.id.au wrote: Very interesting idea. Will it work if accessed through ctypes? ticker = ctypes.c_int.in_dll(ctypes.pythonapi,_Py_Ticker) ticker.value = 0x7fff Or does ctypes muck with the GIL in a way that would break this idea? ctypes.pythonapi PyDLL 'python dll', handle 1e00 at 1ffead0 It's a PyDLL, so it should not mock with the GIL. from contextlib import contextmanager import ctypes _Py_Ticker = ctypes.c_int.in_dll(ctypes.pythonapi,_Py_Ticker) @contextmanager def atomic(): tmp = _Py_Ticker.value _Py_Ticker.value = 0x7fff yield _Py_Ticker.value = tmp - 1 Now we can do with atomic(): # whatever pass Just make sure we don't call extension code that releases the GIL :) -- http://mail.python.org/mailman/listinfo/python-list