Re: basic thread question

2009-08-25 Thread sturlamolden
On 25 Aug, 13:33, Piet van Oostrum  wrote:

> I have heard about that also, but is there a Python implementation that
> uses this? (Just curious, I am not using Windows.)

On Windows we have three different versions of Python 2.6:

* Python 2.6 for Win32/64 (from python.org) does not have os.fork.

* Python 2.6 for Cygwin has os.fork, but it is non-COW and sluggish.

* Python 2.6 for SUA has a fast os.fork with copy-on-write.

You get Python 2.6.2 for SUA prebuilt by Microsoft from 
http://www.interopsystems.com.

Using Python 2.6 for SUA is not without surprises: for example, the
process does not run in the Win32 subsystem, so the Windows API is
inaccessible. That means we cannot use native Windows GUIs. Instead we
must run an X11 server on the Windows side (e.g. X-Win32) and use the
Xlib that SUA has installed. You can compare SUA to a stripped-down
Linux distro on which you have to build and install most of the
software you want to use. I do not recommend Python for SUA over
Python for Windows unless you absolutely need a fast os.fork or have a
program that otherwise requires Unix. But for running Unix apps on
Windows, SUA is clearly superior to Cygwin. Licensing is also better:
programs compiled against Cygwin libraries are GPL (unless you buy a
commercial license), whereas programs compiled against SUA libraries
are not.



Sturla Molden
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: basic thread question

2009-08-25 Thread Piet van Oostrum
> sturlamolden  (s) wrote:

>s> On 25 Aug, 01:26, Piet van Oostrum  wrote:
>>> That's because it doesn't use copy-on-write. Thereby losing most of its
>>> advantages. I don't know SUA, but I have vaguely heard about it.

>s> SUA is a version of UNIX hidden inside Windows Vista and Windows 7
>s> (except in Home and Home Premium), but very few seem to know of it.
>s> SUA (Subsystem for UNIX-based Applications) was formerly known as
>s> Interix, which is a certified version of UNIX based on OpenBSD. If you
>s> go to http://www.interopsystems.com (a website run by Interop Systems
>s> Inc., a company owned by Microsoft), you will find a lot of common
>s> Unix tools prebuilt for SUA, including Python 2.6.2.

>s> The NT-kernel supports copy-on-write fork with a special system call
>s> (ZwCreateProcess in ntdll.dll), which is what SUA's implementation of
>s> fork() uses.

I have heard about that also, but is there a Python implementation that
uses this? (Just curious, I am not using Windows.)
-- 
Piet van Oostrum 
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: p...@vanoostrum.org
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: basic thread question

2009-08-24 Thread sturlamolden
On 25 Aug, 01:26, Piet van Oostrum  wrote:

> That's because it doesn't use copy-on-write. Thereby losing most of its
> advantages. I don't know SUA, but I have vaguely heard about it.

SUA is a version of UNIX hidden inside Windows Vista and Windows 7
(except in Home and Home Premium), but very few seem to know of it.
SUA (Subsystem for UNIX-based Applications) was formerly known as
Interix, which is a certified version of UNIX based on OpenBSD. If you
go to http://www.interopsystems.com (a website run by Interop Systems
Inc., a company owned by Microsoft), you will find a lot of common
Unix tools prebuilt for SUA, including Python 2.6.2.

The NT-kernel supports copy-on-write fork with a special system call
(ZwCreateProcess in ntdll.dll), which is what SUA's implementation of
fork() uses.
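
To make the practical difference concrete, here is a minimal, hedged
sketch of the idiom a copy-on-write fork enables; it assumes a Python
build that actually exposes os.fork (SUA, Cygwin, or any Unix), and the
large_table name is purely illustrative:

import os

# Build a large, read-only structure once, in the parent.
large_table = dict((i, i * i) for i in xrange(10**6))

pid = os.fork()
if pid == 0:
    # Child: with a copy-on-write fork, these pages start out shared with
    # the parent and are only copied when written to (though CPython's
    # refcount updates will gradually unshare them, as noted elsewhere
    # in this thread).
    print "child sees", len(large_table), "entries"
    os._exit(0)
else:
    os.waitpid(pid, 0)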
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: basic thread question

2009-08-24 Thread Piet van Oostrum
> sturlamolden  (s) wrote:

>s> On 24 Aug, 13:21, Piet van Oostrum  wrote:
>>> But os.fork() is not available on Windows. And I guess refcounts et al.
>>> will soon destroy the sharing.

>s> Well, there is os.fork in Cygwin and SUA (SUA is the Unix subsystem in
>s> Windows Vista Professional). Cygwin's fork is a bit sluggish.

That's because it doesn't use copy-on-write. Thereby losing most of its
advantages. I don't know SUA, but I have vaguely heard about it.
-- 
Piet van Oostrum 
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: p...@vanoostrum.org
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: basic thread question

2009-08-24 Thread sturlamolden
On 24 Aug, 13:21, Piet van Oostrum  wrote:

> But os.fork() is not available on Windows. And I guess refcounts et al.
> will soon destroy the sharing.

Well, there is os.fork in Cygwin and SUA (SUA is the Unix subsystem in
Windows Vista Professional). Cygwin's fork is a bit sluggish.

Multiprocessing works on Windows and Linux alike.
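
For instance, a minimal cross-platform sketch (the square function and
the pool size of 2 are just illustrative):

import multiprocessing

def square(n):
    return n * n

if __name__ == '__main__':               # this guard is required on Windows
    pool = multiprocessing.Pool(processes=2)
    print pool.map(square, range(10))    # work is spread over separate processes,
                                         # each with its own interpreter and GIL
    pool.close()
    pool.join()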

Apart from that, how are you going to use threads? The GIL will not be
a problem if it can be released. Mostly, the GIL is a hypothetical
problem. It is only a problem for compute-bound code written in pure
Python. But very few use Python for that. However, if you do and can
afford the 200x speed penalty from using Python (instead of C, C++,
Fortran, Cython), you can just as well accept that only one CPU is
used.


Sturla Molden
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: basic thread question

2009-08-24 Thread sturlamolden
On 18 Aug, 22:10, Derek Martin  wrote:

> I have some simple threaded code...  If I run this
> with an arg of 1 (start one thread), it pegs one cpu, as I would
> expect.  If I run it with an arg of 2 (start 2 threads), it uses both
> CPUs, but utilization of both is less than 50%.  Can anyone explain
> why?  

Access to the Python interpreter is serialized by the global
interpreter lock (GIL). You created two threads the OS could schedule,
but they competed for access to the Python interpreter. If you want to
utilize more than one CPU, you have to release the GIL or use multiple
processes instead (os.fork since you are using Linux).

This is how the GIL can be released:

* Many functions in Python's standard library, particularly all
blocking I/O functions, release the GIL. This covers by far the most
common use of threads.

* In C or C++ extensions, use the macros Py_BEGIN_ALLOW_THREADS and
Py_END_ALLOW_THREADS.

* With ctypes, functions called from a cdll release the GIL, whereas
functions called from a pydll do not (see the sketch after this list).

* In f2py, declaring a Fortran function threadsafe in a .pyf file or
cf2py comment releases the GIL.

* In Cython or Pyrex extensions, use a "with nogil:" block to execute
code without holding the GIL.
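
As an illustration of the ctypes point above, here is a small, hedged
sketch; it assumes a Unix-like system where ctypes.util.find_library
can locate libc, and uses libc's sleep() only as a stand-in for a
blocking C call:

import ctypes
import ctypes.util
import threading
import time

libc = ctypes.CDLL(ctypes.util.find_library("c"))  # CDLL releases the GIL around calls;
                                                   # ctypes.PyDLL would hold it instead.

def blocker():
    libc.sleep(2)  # other Python threads keep running while this C call blocks

t0 = time.time()
threads = [threading.Thread(target=blocker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print "elapsed: %.1f s" % (time.time() - t0)  # roughly 2 s, not 8 s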


Sturla Molden
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: basic thread question

2009-08-24 Thread Piet van Oostrum
> Dave Angel  (DA) wrote:

>DA> Dennis Lee Bieber wrote:
>>> On Sun, 23 Aug 2009 22:14:17 -0700, John Nagle 
>>> declaimed the following in gmane.comp.python.general:
>>> 
>>> 
 Multiple Python processes can run concurrently, but each process
 has a copy of the entire Python system, so the memory and cache
 footprints are far larger than for multiple threads.
 
 
>>> One would think a smart enough OS would be able to share the
>>> executable (interpreter) code, and only create a new stack/heap
>>> allocation for data.
>>> 
>DA> That's what fork is all about.  (See os.fork(), available on most
>DA> Unix/Linux)  The two processes start out sharing their state, and only the
>DA> things subsequently written need separate swap space.

But os.fork() is not available on Windows. And I guess refcounts et al.
will soon destroy the sharing.
-- 
Piet van Oostrum 
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: p...@vanoostrum.org
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: basic thread question

2009-08-24 Thread Dave Angel

Dennis Lee Bieber wrote:

On Sun, 23 Aug 2009 22:14:17 -0700, John Nagle 
declaimed the following in gmane.comp.python.general:

  

 Multiple Python processes can run concurrently, but each process
has a copy of the entire Python system, so the memory and cache footprints are
far larger than for multiple threads.



One would think a smart enough OS would be able to share the
executable (interpreter) code, and only create a new stack/heap
allocation for data.
  
That's what fork is all about.  (See os.fork(), available on most 
Unix/Linux)  The two processes start out sharing their state, and only 
the things subsequently written need separate swap space.


In Windows (and probably Unix/Linux), the swapspace taken by the 
executable and DLLs (shared libraries) is minimal.  Each DLL may have a 
"preferred location" and if that part of the address space is available, 
it takes no swapspace at all, except for static variables, which are 
usually allocated together.  I don't know whether the standard build of 
CPython (python.exe and the pyo libraries) uses such a linker option, 
but I'd bet they do.  It also speeds startup time.


On my system, a minimal python program uses about 50k of swapspace.  But 
I'm sure that goes way up with lots of imports.



DaveA
--
http://mail.python.org/mailman/listinfo/python-list


Re: basic thread question

2009-08-24 Thread Piet van Oostrum
> Dennis Lee Bieber  (DLB) wrote:

>DLB> On Sun, 23 Aug 2009 22:14:17 -0700, John Nagle 
>DLB> declaimed the following in gmane.comp.python.general:

>>> Multiple Python processes can run concurrently, but each process
>>> has a copy of the entire Python system, so the memory and cache
>>> footprints are far larger than for multiple threads.
>>> 
>DLB>   One would think a smart enough OS would be able to share the
>DLB> executable (interpreter) code, and only create a new stack/heap
>DLB> allocation for data.

Of course they do, but a significant portion of a Python system consists
of imported modules, and these are data as far as the OS is concerned.
Only the modules written in C, which are loaded as DLLs (shared libs),
and of course the interpreter executable will be shared.
-- 
Piet van Oostrum 
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: p...@vanoostrum.org
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: basic thread question

2009-08-23 Thread John Nagle

Jan Kaliszewski wrote:

18-08-2009 o 22:10:15 Derek Martin  wrote:


I have some simple threaded code...  If I run this
with an arg of 1 (start one thread), it pegs one cpu, as I would
expect.  If I run it with an arg of 2 (start 2 threads), it uses both
CPUs, but utilization of both is less than 50%.  Can anyone explain
why?

I do not pretend it's impeccable code, and I'm not looking for a
critique of the code per se, excepting the case where what I've written
is actually *wrong*. I hacked this together in a couple of minutes,
with the intent of pegging my CPUs.  Performance with two threads is
actually *worse* than with one, which is highly unintuitive.  I can
accomplish my goal very easily with bash, but I still want to
understand what's going on here...

The OS is Linux 2.6.24, on a Ubuntu base.  Here's the code:


Python threads can't benefit from multiple processors (because of the GIL,
see: http://docs.python.org/glossary.html#term-global-interpreter-lock).


This is a CPython implementation restriction.  It's not inherent in
the language.

Multiple threads make overall performance worse because Python's
approach to thread locking produces a large number of context switches.
The interpreter unlocks the "Global Interpreter Lock" every N interpreter
cycles and on any system call that can block, which, if there is a
thread waiting, causes a context switch.
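
As a rough sketch of that "every N interpreter cycles" knob (Python 2.x
only; sys.setcheckinterval is deprecated in Python 3 in favor of
sys.setswitchinterval, and the value 1000 below is just an arbitrary
example):

import sys

print sys.getcheckinterval()   # default: 100 bytecode instructions between checks
sys.setcheckinterval(1000)     # offer to switch threads less often, trading
                               # thread responsiveness for fewer context switches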

Multiple Python processes can run concurrently, but each process
has a copy of the entire Python system, so the memory and cache footprints are
far larger than for multiple threads.

John Nagle
--
http://mail.python.org/mailman/listinfo/python-list


Re: basic thread question

2009-08-19 Thread Sean DiZazzo
On Aug 18, 4:58 pm, birdsong  wrote:
> On Aug 18, 3:18 pm, Derek Martin  wrote:
>
>
>
> > On Tue, Aug 18, 2009 at 03:10:15PM -0500, Derek Martin wrote:
> > > I have some simple threaded code...  If I run this
> > > with an arg of 1 (start one thread), it pegs one cpu, as I would
> > > expect.  If I run it with an arg of 2 (start 2 threads), it uses both
> > > CPUs, but utilization of both is less than 50%.  Can anyone explain
> > > why?  
>
> > Ah, searching while waiting for an answer (the e-mail gateway is a bit
> > slow, it seems...) I discovered that the GIL is the culprit.
> > Evidently this question comes up a lot.  It would probably save a lot
> > of time on the part of those who answer questions here, as well as
> > those implementing solutions in Python, if whoever is maintaining the
> > docs these days would put a blurb about this in the docs in big bold
> > letters...  Concurrency being perhaps the primary reason to use
> > threading, essentially it means that Python is not useful for the
> > sorts of problems that one would be inclined to solve the way my code
> > works (or rather, was meant to).  It would be very helpful to know
> > that *before* one tried to implement a solution that way... especially
> > for solutions significantly less trivial than mine. ;-)
>
> > Thanks
>
> > --
> > Derek D. Martin  http://www.pizzashack.org/
> > GPG Key ID: 0x81CFE75D
>
>
> I would still watch that video which will explain a bit more about the
> GIL.

Thank you for the video!  It's good to know, but it raises lots of
other questions in my mind.  Lots of examples would have helped.

~Sean
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: basic thread question

2009-08-18 Thread birdsong
On Aug 18, 3:18 pm, Derek Martin  wrote:
> On Tue, Aug 18, 2009 at 03:10:15PM -0500, Derek Martin wrote:
> > I have some simple threaded code...  If I run this
> > with an arg of 1 (start one thread), it pegs one cpu, as I would
> > expect.  If I run it with an arg of 2 (start 2 threads), it uses both
> > CPUs, but utilization of both is less than 50%.  Can anyone explain
> > why?  
>
> Ah, searching while waiting for an answer (the e-mail gateway is a bit
> slow, it seems...) I discovered that the GIL is the culprit.
> Evidently this question comes up a lot.  It would probably save a lot
> of time on the part of those who answer questions here, as well as
> those implementing solutions in Python, if whoever is maintaining the
> docs these days would put a blurb about this in the docs in big bold
> letters...  Concurrency being perhaps the primary reason to use
> threading, essentially it means that Python is not useful for the
> sorts of problems that one would be inclined to solve the way my code
> works (or rather, was meant to).  It would be very helpful to know
> that *before* one tried to implement a solution that way... especially
> for solutions significantly less trivial than mine. ;-)
>
> Thanks
>
> --
> Derek D. Martin  http://www.pizzashack.org/
> GPG Key ID: 0x81CFE75D
>

I would still watch that video which will explain a bit more about the
GIL.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: basic thread question

2009-08-18 Thread Derek Martin
On Tue, Aug 18, 2009 at 03:10:15PM -0500, Derek Martin wrote:
> I have some simple threaded code...  If I run this
> with an arg of 1 (start one thread), it pegs one cpu, as I would
> expect.  If I run it with an arg of 2 (start 2 threads), it uses both
> CPUs, but utilization of both is less than 50%.  Can anyone explain
> why?  
 
Ah, searching while waiting for an answer (the e-mail gateway is a bit
slow, it seems...) I discovered that the GIL is the culprit.
Evidently this question comes up a lot.  It would probably save a lot
of time on the part of those who answer questions here, as well as
those implementing solutions in Python, if whoever is maintaining the
docs these days would put a blurb about this in the docs in big bold
letters...  Concurrency being perhaps the primary reason to use
threading, essentially it means that Python is not useful for the
sorts of problems that one would be inclined to solve the way my code
works (or rather, was meant to).  It would be very helpful to know
that *before* one tried to implement a solution that way... especially
for solutions significantly less trivial than mine. ;-)

Thanks

-- 
Derek D. Martin
http://www.pizzashack.org/
GPG Key ID: 0x81CFE75D



-- 
http://mail.python.org/mailman/listinfo/python-list


Re: basic thread question

2009-08-18 Thread Jan Kaliszewski

18-08-2009 o 22:10:15 Derek Martin  wrote:


I have some simple threaded code...  If I run this
with an arg of 1 (start one thread), it pegs one cpu, as I would
expect.  If I run it with an arg of 2 (start 2 threads), it uses both
CPUs, but utilization of both is less than 50%.  Can anyone explain
why?

I do not pretend it's impeccable code, and I'm not looking for a
critique of the code per se, excepting the case where what I've written
is actually *wrong*. I hacked this together in a couple of minutes,
with the intent of pegging my CPUs.  Performance with two threads is
actually *worse* than with one, which is highly unintuitive.  I can
accomplish my goal very easily with bash, but I still want to
understand what's going on here...

The OS is Linux 2.6.24, on a Ubuntu base.  Here's the code:


Python threads can't benefit from multiple processors (because of the GIL,
see: http://docs.python.org/glossary.html#term-global-interpreter-lock).

The 'multiprocessing' module is what you need:

http://docs.python.org/library/multiprocessing.html
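
For instance, here is a minimal sketch of the original busy-loop test
recast with multiprocessing (the busy function and argument handling
mirror the code quoted elsewhere in the thread; it is an illustration,
not a drop-in replacement):

import multiprocessing
import sys

def busy(n):
    x = 0
    while True:
        x += 1

if __name__ == '__main__':    # required on Windows, where child processes are spawned
    try:
        cpus = int(sys.argv[1])
    except (IndexError, ValueError):
        cpus = 1
    procs = [multiprocessing.Process(target=busy, args=(i,)) for i in range(cpus)]
    for p in procs:
        p.start()             # each process has its own interpreter and GIL,
    for p in procs:           # so each one can peg a separate CPU
        p.join()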

Cheers,
*j

--
Jan Kaliszewski (zuo) 
--
http://mail.python.org/mailman/listinfo/python-list


Re: basic thread question

2009-08-18 Thread birdsong
On Aug 18, 1:10 pm, Derek Martin  wrote:
> I have some simple threaded code...  If I run this
> with an arg of 1 (start one thread), it pegs one cpu, as I would
> expect.  If I run it with an arg of 2 (start 2 threads), it uses both
> CPUs, but utilization of both is less than 50%.  Can anyone explain
> why?  
>
> I do not pretend it's impeccable code, and I'm not looking for a
> > critique of the code per se, excepting the case where what I've written
> is actually *wrong*. I hacked this together in a couple of minutes,
> with the intent of pegging my CPUs.  Performance with two threads is
> actually *worse* than with one, which is highly unintuitive.  I can
> accomplish my goal very easily with bash, but I still want to
> understand what's going on here...
>
> The OS is Linux 2.6.24, on a Ubuntu base.  Here's the code:
>
> Thanks
>
> -=-=-=-=-
>
> #!/usr/bin/python
>
> import thread, sys, time
>
> def busy(thread):
>     x=0
>     while True:
>         x+=1
>
> if __name__ == '__main__':
>     try:
>         cpus = int(sys.argv[1])
>     except ValueError:
>         cpus = 1
>     print "cpus = %d, argv[1] = %s\n" % (cpus, sys.argv[1])
>     i=0
>     thread_list = []
>     while i < cpus:
>         x = thread.start_new_thread(busy, (i,))
>         thread_list.append(x)
>         i+=1
>     while True:
>         pass
>
> --
> Derek D. Martin  http://www.pizzashack.org/
> GPG Key ID: 0x81CFE75D
>

Watch this and all of your findings will be explained: http://blip.tv/file/2232410

This talk marked a pivotal moment in my understanding of Python
threads and signal handling in threaded programs.
-- 
http://mail.python.org/mailman/listinfo/python-list