Re: basic thread question
On 25 Aug, 13:33, Piet van Oostrum wrote:
> I have heard about that also, but is there a Python implementation that
> uses this? (Just curious, I am not using Windows.)

On Windows we have three different versions of Python 2.6:

* Python 2.6 for Win32/64 (from python.org) does not have os.fork.
* Python 2.6 for Cygwin has os.fork, but it is non-COW and sluggish.
* Python 2.6 for SUA has a fast os.fork with copy-on-write.

You can get Python 2.6.2 for SUA prebuilt by Microsoft from http://www.interopsystems.com.

Using Python 2.6 for SUA is not without surprises: for example, the process is not executed from the Win32 subsystem, hence the Windows API is inaccessible. That means we cannot use a native Windows GUI. Instead we must run an X11 server on the Windows subsystem (e.g. X-Win32) and use the Xlib that SUA has installed. You can compare SUA to a stripped-down Linux distro, on which you have to build and install most of the software you want to use.

I do not recommend using Python for SUA instead of Python for Windows unless you absolutely need a fast os.fork or have a program that otherwise requires Unix. But for running Unix apps on Windows, SUA is clearly superior to Cygwin. Licensing is also better: programs compiled against Cygwin libraries are GPL (unless you buy a commercial license); programs compiled against SUA libraries are not.

Sturla Molden
--
http://mail.python.org/mailman/listinfo/python-list
Re: basic thread question
> sturlamolden (s) wrote:
>s> On 25 Aug, 01:26, Piet van Oostrum wrote:
>>> That's because it doesn't use copy-on-write. Thereby losing most of its
>>> advantages. I don't know SUA, but I have vaguely heard about it.
>s> SUA is a version of UNIX hidden inside Windows Vista and Windows 7
>s> (except in Home and Home Premium), but very few seem to know of it.
>s> SUA (Subsystem for Unix based Applications) is formerly known as
>s> Interix, which is a certified version of UNIX based on OpenBSD. If you
>s> go to http://www.interopsystems.com (a website run by Interop Systems
>s> Inc., a company owned by Microsoft), you will find a lot of common
>s> unix tools prebuilt for SUA, including Python 2.6.2.
>s> The NT-kernel supports copy-on-write fork with a special system call
>s> (ZwCreateProcess in ntdll.dll), which is what SUA's implementation of
>s> fork() uses.

I have heard about that also, but is there a Python implementation that uses this? (Just curious, I am not using Windows.)
--
Piet van Oostrum
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: p...@vanoostrum.org
Re: basic thread question
On 25 Aug, 01:26, Piet van Oostrum wrote:
> That's because it doesn't use copy-on-write. Thereby losing most of its
> advantages. I don't know SUA, but I have vaguely heard about it.

SUA is a version of UNIX hidden inside Windows Vista and Windows 7 (except in Home and Home Premium), but very few seem to know of it. SUA (Subsystem for Unix-based Applications) was formerly known as Interix, which is a certified version of UNIX based on OpenBSD. If you go to http://www.interopsystems.com (a website run by Interop Systems Inc., a company owned by Microsoft), you will find a lot of common Unix tools prebuilt for SUA, including Python 2.6.2.

The NT kernel supports copy-on-write fork with a special system call (ZwCreateProcess in ntdll.dll), which is what SUA's implementation of fork() uses.
Re: basic thread question
> sturlamolden (s) wrote:
>s> On 24 Aug, 13:21, Piet van Oostrum wrote:
>>> But os.fork() is not available on Windows. And I guess refcounts et al.
>>> will soon destroy the sharing.
>s> Well, there is os.fork in Cygwin and SUA (SUA is the Unix subsystem in
>s> Windows Vista Professional). Cygwin's fork is a bit sluggish.

That's because it doesn't use copy-on-write, thereby losing most of its advantages. I don't know SUA, but I have vaguely heard about it.
--
Piet van Oostrum
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: p...@vanoostrum.org
Re: basic thread question
On 24 Aug, 13:21, Piet van Oostrum wrote:
> But os.fork() is not available on Windows. And I guess refcounts et al.
> will soon destroy the sharing.

Well, there is os.fork in Cygwin and SUA (SUA is the Unix subsystem in Windows Vista Professional). Cygwin's fork is a bit sluggish. Multiprocessing works on Windows and Linux alike.

Apart from that, how are you going to use threads? The GIL will not be a problem if it can be released. Mostly, the GIL is a hypothetical problem. It is only a problem for compute-bound code written in pure Python, and very few use Python for that. However, if you do, and can afford the 200x speed penalty of using Python (instead of C, C++, Fortran, or Cython), you can just as well accept that only one CPU is used.

Sturla Molden
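As a minimal illustration of the fork semantics being discussed here (a sketch for POSIX systems only, so it will not run on the python.org Windows build), the child gets its own copy-on-write view of the parent's memory, and its mutations never reach the parent:

```python
import os

data = [1, 2, 3]

pid = os.fork()          # POSIX only: child starts with a COW image of the parent
if pid == 0:
    # Child process: appending here touches only the child's copy of `data`.
    data.append(4)
    os._exit(0)          # leave the child without running cleanup handlers
else:
    os.waitpid(pid, 0)   # wait for the child to finish
    # The parent's list is untouched by the child's append.
    print(data)          # -> [1, 2, 3]
```

(As noted upthread, CPython's reference counting writes to object headers, so in practice pages are unshared fairly quickly even when nothing is mutated explicitly.)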
Re: basic thread question
On 18 Aug, 22:10, Derek Martin wrote:
> I have some simple threaded code... If I run this
> with an arg of 1 (start one thread), it pegs one cpu, as I would
> expect. If I run it with an arg of 2 (start 2 threads), it uses both
> CPUs, but utilization of both is less than 50%. Can anyone explain
> why?

Access to the Python interpreter is serialized by the global interpreter lock (GIL). You created two threads the OS could schedule, but they competed for access to the Python interpreter. If you want to utilize more than one CPU, you have to release the GIL or use multiple processes instead (os.fork, since you are using Linux).

This is how the GIL can be released:

* Many functions in Python's standard library, particularly all blocking I/O functions, release the GIL. This covers by far the most common use of threads.
* In C or C++ extensions, use the macros Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS.
* With ctypes, functions called from a cdll release the GIL, whereas functions called from a pydll do not.
* In f2py, declaring a Fortran function threadsafe in a .pyf file or cf2py comment releases the GIL.
* In Cython or Pyrex extensions, use a "with nogil:" block to execute code without holding the GIL.

Sturla Molden
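The first bullet is easy to demonstrate from pure Python. A small sketch (written for a current Python 3 rather than the 2.6 of this thread, and using time.sleep as a stand-in for any blocking call that releases the GIL):

```python
import threading
import time

def sleeper():
    # time.sleep is a blocking call: it releases the GIL while waiting,
    # so all four threads can wait concurrently.
    time.sleep(0.5)

threads = [threading.Thread(target=sleeper) for _ in range(4)]
start = time.monotonic()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.monotonic() - start

# Four 0.5 s sleeps overlap instead of running back to back (~2 s total),
# which would be impossible if the GIL were held during the sleep.
print("elapsed: %.2f s" % elapsed)
```

The same program with a CPU-bound loop in place of the sleep would show no such overlap, for exactly the reason described above.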
Re: basic thread question
> Dave Angel (DA) wrote:
>DA> Dennis Lee Bieber wrote:
>>> On Sun, 23 Aug 2009 22:14:17 -0700, John Nagle
>>> declaimed the following in gmane.comp.python.general:
>>>
>>> Multiple Python processes can run concurrently, but each process has a
>>> copy of the entire Python system, so the memory and cache footprints are
>>> far larger than for multiple threads.
>>> One would think a smart enough OS would be able to share the
>>> executable (interpreter) code, and only create a new stack/heap
>>> allocation for data.
>>>
>DA> That's what fork is all about. (See os.fork(), available on most
>DA> Unix/Linux.) The two processes start out sharing their state, and only
>DA> the things subsequently written need separate swap space.

But os.fork() is not available on Windows. And I guess refcounts et al. will soon destroy the sharing.
--
Piet van Oostrum
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: p...@vanoostrum.org
Re: basic thread question
Dennis Lee Bieber wrote:
> On Sun, 23 Aug 2009 22:14:17 -0700, John Nagle
> declaimed the following in gmane.comp.python.general:
>> Multiple Python processes can run concurrently, but each process has a
>> copy of the entire Python system, so the memory and cache footprints are
>> far larger than for multiple threads.
> One would think a smart enough OS would be able to share the
> executable (interpreter) code, and only create a new stack/heap
> allocation for data.

That's what fork is all about. (See os.fork(), available on most Unix/Linux.) The two processes start out sharing their state, and only the things subsequently written need separate swap space.

In Windows (and probably Unix/Linux), the swap space taken by the executable and DLLs (shared libraries) is minimal. Each DLL may have a "preferred location", and if that part of the address space is available, it takes no swap space at all, except for static variables, which are usually allocated together. I don't know whether the standard build of CPython (python.exe and the pyo libraries) uses such a linker option, but I'd bet they do. It also speeds startup time.

On my system, a minimal Python program uses about 50k of swap space. But I'm sure that goes way up with lots of imports.

DaveA
Re: basic thread question
> Dennis Lee Bieber (DLB) wrote:
>DLB> On Sun, 23 Aug 2009 22:14:17 -0700, John Nagle
>DLB> declaimed the following in gmane.comp.python.general:
>>> Multiple Python processes can run concurrently, but each process
>>> has a copy of the entire Python system, so the memory and cache
>>> footprints are far larger than for multiple threads.
>>>
>DLB> One would think a smart enough OS would be able to share the
>DLB> executable (interpreter) code, and only create a new stack/heap
>DLB> allocation for data.

Of course they do, but a significant portion of a Python system consists of imported modules, and these are data as far as the OS is concerned. Only the modules written in C that are loaded as DLLs (shared libs), and of course the interpreter executable itself, will be shared.
--
Piet van Oostrum
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: p...@vanoostrum.org
Re: basic thread question
Jan Kaliszewski wrote:
> 18-08-2009 o 22:10:15 Derek Martin wrote:
>> I have some simple threaded code... If I run this with an arg of 1
>> (start one thread), it pegs one cpu, as I would expect. If I run it
>> with an arg of 2 (start 2 threads), it uses both CPUs, but utilization
>> of both is less than 50%. Can anyone explain why? I do not pretend
>> it's impeccable code, and I'm not looking for a critique of the code
>> per se, excepting the case where what I've written is actually
>> *wrong*. I hacked this together in a couple of minutes, with the
>> intent of pegging my CPUs. Performance with two threads is actually
>> *worse* than with one, which is highly unintuitive. I can accomplish
>> my goal very easily with bash, but I still want to understand what's
>> going on here... The OS is Linux 2.6.24, on a Ubuntu base. Here's the
>> code:
> Python threads can't benefit from multiple processors (because of GIL,
> see: http://docs.python.org/glossary.html#term-global-interpreter-lock).

This is a CPython implementation restriction; it's not inherent in the language.

Multiple threads make overall performance worse because Python's approach to thread locking produces a large number of context switches. The interpreter unlocks the "Global Interpreter Lock" every N interpreter cycles and on any system call that can block, which, if there is a thread waiting, causes a context switch.

Multiple Python processes can run concurrently, but each process has a copy of the entire Python system, so the memory and cache footprints are far larger than for multiple threads.

John Nagle
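The "every N interpreter cycles" behaviour described here is the check-interval model of CPython 2.x (tunable with sys.setcheckinterval). For readers on CPython 3, that counter was replaced by a time-based switch interval (default 5 ms); a small sketch:

```python
import sys

# CPython 3 swaps the GIL between threads on a timed interval rather
# than an opcode counter.  The default is 5 ms:
print(sys.getswitchinterval())   # default: 0.005

# Raising it reduces the context-switch overhead John describes for
# CPU-bound threads, at the cost of responsiveness for other threads.
sys.setswitchinterval(0.05)
print(sys.getswitchinterval())
```

Tuning this changes how often threads trade the GIL, but it does not change the fundamental point of the thread: only one of them runs Python bytecode at a time.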
Re: basic thread question
On Aug 18, 4:58 pm, birdsong wrote:
> On Aug 18, 3:18 pm, Derek Martin wrote:
> > On Tue, Aug 18, 2009 at 03:10:15PM -0500, Derek Martin wrote:
> > > I have some simple threaded code... If I run this
> > > with an arg of 1 (start one thread), it pegs one cpu, as I would
> > > expect. If I run it with an arg of 2 (start 2 threads), it uses both
> > > CPUs, but utilization of both is less than 50%. Can anyone explain
> > > why?
> >
> > Ah, searching while waiting for an answer (the e-mail gateway is a bit
> > slow, it seems...) I discovered that the GIL is the culprit.
> > Evidently this question comes up a lot. It would probably save a lot
> > of time on the part of those who answer questions here, as well as
> > those implementing solutions in Python, if whoever is maintaining the
> > docs these days would put a blurb about this in the docs in big bold
> > letters... Concurrency being perhaps the primary reason to use
> > threading, essentially it means that Python is not useful for the
> > sorts of problems that one would be inclined to solve the way my code
> > works (or rather, was meant to). It would be very helpful to know
> > that *before* one tried to implement a solution that way... especially
> > for solutions significantly less trivial than mine. ;-)
> >
> > Thanks
> >
> > --
> > Derek D. Martin
> > http://www.pizzashack.org/
> > GPG Key ID: 0x81CFE75D
>
> I would still watch that video which will explain a bit more about the
> GIL.

Thank you for the video! It's good to know, but it raises lots of other questions in my mind. Lots of examples would have helped.

~Sean
Re: basic thread question
On Aug 18, 3:18 pm, Derek Martin wrote:
> On Tue, Aug 18, 2009 at 03:10:15PM -0500, Derek Martin wrote:
> > I have some simple threaded code... If I run this
> > with an arg of 1 (start one thread), it pegs one cpu, as I would
> > expect. If I run it with an arg of 2 (start 2 threads), it uses both
> > CPUs, but utilization of both is less than 50%. Can anyone explain
> > why?
>
> Ah, searching while waiting for an answer (the e-mail gateway is a bit
> slow, it seems...) I discovered that the GIL is the culprit.
> Evidently this question comes up a lot. It would probably save a lot
> of time on the part of those who answer questions here, as well as
> those implementing solutions in Python, if whoever is maintaining the
> docs these days would put a blurb about this in the docs in big bold
> letters... Concurrency being perhaps the primary reason to use
> threading, essentially it means that Python is not useful for the
> sorts of problems that one would be inclined to solve the way my code
> works (or rather, was meant to). It would be very helpful to know
> that *before* one tried to implement a solution that way... especially
> for solutions significantly less trivial than mine. ;-)
>
> Thanks
>
> --
> Derek D. Martin
> http://www.pizzashack.org/
> GPG Key ID: 0x81CFE75D

I would still watch that video which will explain a bit more about the GIL.
Re: basic thread question
On Tue, Aug 18, 2009 at 03:10:15PM -0500, Derek Martin wrote:
> I have some simple threaded code... If I run this
> with an arg of 1 (start one thread), it pegs one cpu, as I would
> expect. If I run it with an arg of 2 (start 2 threads), it uses both
> CPUs, but utilization of both is less than 50%. Can anyone explain
> why?

Ah, searching while waiting for an answer (the e-mail gateway is a bit slow, it seems...) I discovered that the GIL is the culprit. Evidently this question comes up a lot. It would probably save a lot of time, on the part of those who answer questions here as well as those implementing solutions in Python, if whoever is maintaining the docs these days would put a blurb about this in the docs in big bold letters... Concurrency being perhaps the primary reason to use threading, essentially it means that Python is not useful for the sorts of problems that one would be inclined to solve the way my code works (or rather, was meant to). It would be very helpful to know that *before* one tried to implement a solution that way... especially for solutions significantly less trivial than mine. ;-)

Thanks
--
Derek D. Martin
http://www.pizzashack.org/
GPG Key ID: 0x81CFE75D
Re: basic thread question
18-08-2009 o 22:10:15 Derek Martin wrote:
> I have some simple threaded code... If I run this with an arg of 1
> (start one thread), it pegs one cpu, as I would expect. If I run it
> with an arg of 2 (start 2 threads), it uses both CPUs, but utilization
> of both is less than 50%. Can anyone explain why?
>
> I do not pretend it's impeccable code, and I'm not looking for a
> critique of the code per se, excepting the case where what I've written
> is actually *wrong*. I hacked this together in a couple of minutes,
> with the intent of pegging my CPUs. Performance with two threads is
> actually *worse* than with one, which is highly unintuitive. I can
> accomplish my goal very easily with bash, but I still want to
> understand what's going on here...
>
> The OS is Linux 2.6.24, on a Ubuntu base. Here's the code:

Python threads can't benefit from multiple processors (because of the GIL, see: http://docs.python.org/glossary.html#term-global-interpreter-lock).

The 'multiprocessing' module is what you need: http://docs.python.org/library/multiprocessing.html

Cheers,
*j
--
Jan Kaliszewski (zuo)
Re: basic thread question
On Aug 18, 1:10 pm, Derek Martin wrote:
> I have some simple threaded code... If I run this
> with an arg of 1 (start one thread), it pegs one cpu, as I would
> expect. If I run it with an arg of 2 (start 2 threads), it uses both
> CPUs, but utilization of both is less than 50%. Can anyone explain
> why?
>
> I do not pretend it's impeccable code, and I'm not looking for a
> critique of the code per se, excepting the case where what I've written
> is actually *wrong*. I hacked this together in a couple of minutes,
> with the intent of pegging my CPUs. Performance with two threads is
> actually *worse* than with one, which is highly unintuitive. I can
> accomplish my goal very easily with bash, but I still want to
> understand what's going on here...
>
> The OS is Linux 2.6.24, on a Ubuntu base. Here's the code:
>
> Thanks
>
> -=-=-=-=-
>
> #!/usr/bin/python
>
> import thread, sys, time
>
> def busy(thread):
>     x = 0
>     while True:
>         x += 1
>
> if __name__ == '__main__':
>     try:
>         cpus = int(sys.argv[1])
>     except ValueError:
>         cpus = 1
>     print "cpus = %d, argv[1] = %s\n" % (cpus, sys.argv[1])
>     i = 0
>     thread_list = []
>     while i < cpus:
>         x = thread.start_new_thread(busy, (i,))
>         thread_list.append(x)
>         i += 1
>     while True:
>         pass
>
> --
> Derek D. Martin
> http://www.pizzashack.org/
> GPG Key ID: 0x81CFE75D

watch this and all your findings will be explained: http://blip.tv/file/2232410

this talk marked a pivotal moment in my understanding of python threads and signal handling in threaded programs.