Re: Please help with Threading
On Saturday, 18 May 2013 10:58:13 UTC+2, Jurgens de Bruin wrote: This is my first script where I want to use the python threading module. I have a large dataset which is a list of dicts; this can be as many as 200 dictionaries in the list. The final goal is a histogram for each dict, 16 histograms on a page (4x4) - this already works. What I currently do is create a nested list [ [ {} ], [ {} ] ]; each inner list contains 16 dictionaries, thus each inner list is a single page of 16 histograms. Iterating over the outer list and creating the graphs takes too long. So I would like multiple inner lists to be processed simultaneously, creating the graphs in parallel. I am trying to use python threading for this. I create 4 threads, loop over the outer list and send an inner list to each thread. This seems to work if my nested list only contains 2 elements - thus fewer elements than threads. Currently the script runs and then seems to get hung up. I monitor the resources on my mac and python starts off good, using 80% CPU, and when the 4th thread is created the CPU usage drops to 0%. My thread creation is based on the following: http://www.tutorialspoint.com/python/python_multithreading.htm Any help would be great!!!

Thanks to all for the discussion/comments on threading; although I have not been commenting I have been following. I have learnt a lot and I am still reading up on everything mentioned. Thanks again. Will see how I am going to solve my scenario.
-- 
http://mail.python.org/mailman/listinfo/python-list
Re: Please help with Threading
On 18 May 2013 20:33, Dennis Lee Bieber wlfr...@ix.netcom.com wrote: Python threads work fine if the threads either rely on intelligent DLLs for number crunching (instead of doing nested Python loops to process a numeric array you pass it to something like NumPy which releases the GIL while crunching a copy of the array) or they do lots of I/O and have to wait for I/O devices (while one thread is waiting for the write/read operation to complete, another thread can do some number crunching). Has nobody thought of a context manager to allow a part of your code to free up the GIL? I think the GIL is not inherently bad, but if it poses a problem at times, there should be a way to get it out of your... Way. -- http://mail.python.org/mailman/listinfo/python-list
Re: Please help with Threading
On 20May2013 07:25, Fábio Santos fabiosantos...@gmail.com wrote:
| On 18 May 2013 20:33, Dennis Lee Bieber wlfr...@ix.netcom.com wrote:
| Python threads work fine if the threads either rely on intelligent
| DLLs for number crunching (instead of doing nested Python loops to
| process a numeric array you pass it to something like NumPy which
| releases the GIL while crunching a copy of the array) or they do lots of
| I/O and have to wait for I/O devices (while one thread is waiting for
| the write/read operation to complete, another thread can do some number
| crunching).
|
| Has nobody thought of a context manager to allow a part of your code to
| free up the GIL? I think the GIL is not inherently bad, but if it poses a
| problem at times, there should be a way to get it out of your... Way.

The GIL makes individual python operations thread safe by never running two at once. This makes the implementation of the operations simpler, faster and safer. It is probably totally infeasible to write meaningful python code inside your suggested context manager that didn't rely on the GIL; if the GIL were not held the code would be unsafe.

It is easy for a C extension to release the GIL, and then to do meaningful work until it needs to return to python land. Most C extensions will do that around non-trivial sections, and anything that may stall in the OS.

So your use case for the context manager doesn't fit well.
-- 
Cameron Simpson c...@zip.com.au

Gentle suggestions being those which are written on rocks of less than 5lbs. - Tracy Nelson in comp.lang.c
-- 
http://mail.python.org/mailman/listinfo/python-list
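[The behaviour described above can be observed from pure Python: time.sleep blocks in the OS with the GIL released, so several sleeping threads overlap just as threads inside a GIL-releasing C extension would. A small demonstration; timings are approximate:]

```python
import threading
import time

def blocking_call():
    # time.sleep waits in the OS with the GIL released, much as a C
    # extension can release the GIL around a long-running section
    time.sleep(0.2)

start = time.monotonic()
threads = [threading.Thread(target=blocking_call) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.monotonic() - start  # roughly 0.2s, not 0.8s: the waits overlap
```

Four CPU-bound pure-Python loops in threads would show no such overlap, because only the thread holding the GIL executes bytecode.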
RE: Please help with Threading
Date: Sun, 19 May 2013 13:10:36 +1000
From: c...@zip.com.au
To: carlosnepomuc...@outlook.com
CC: python-list@python.org
Subject: Re: Please help with Threading

On 19May2013 03:02, Carlos Nepomuceno carlosnepomuc...@outlook.com wrote:
| Just been told that the GIL doesn't make things slower, but as I
| didn't know that such a thing even existed I went out looking for
| more info and found this document:
| http://www.dabeaz.com/python/UnderstandingGIL.pdf
|
| Is it current? I didn't know Python threads aren't preemptive.
| Seems to be something really old considering the state of the art
| on parallel execution on multi-cores.
| What's the catch in making Python threads preemptive? Are there any ongoing projects to make that?

Depends what you mean by preemptive. If you have multiple CPU-bound pure Python threads they will all get CPU time without any of them explicitly yielding control. But thread switching happens between python instructions, mediated by the interpreter.

I meant operating system preemptive. I've just checked and Python does not start Windows threads.

The standard answers for using multiple cores are to either run multiple processes (either explicitly spawning other executables, or spawning child python processes using the multiprocessing module), or to use (as suggested) libraries that can do the compute-intensive bits themselves, releasing the GIL while doing so, so that the Python interpreter can run other bits of your python code.

I've just discovered the multiprocessing module[1] and will make some tests with it later. Are there any other modules for that purpose?

I've found the following articles about Python threads. Any suggestions?
http://www.ibm.com/developerworks/aix/library/au-threadingpython/
http://pymotw.com/2/threading/index.html
http://www.laurentluce.com/posts/python-threads-synchronization-locks-rlocks-semaphores-conditions-events-and-queues/

[1] http://docs.python.org/2/library/multiprocessing.html

Plenty of OS system calls (and calls to other libraries from the interpreter) release the GIL during the call. Other python threads can run during that window. And there are other Python implementations other than CPython.

Cheers,
-- 
Cameron Simpson c...@zip.com.au

Processes are like potatoes. - NCR device driver manual
-- 
http://mail.python.org/mailman/listinfo/python-list
RE: Please help with Threading
Date: Mon, 20 May 2013 17:45:14 +1000
From: c...@zip.com.au
To: fabiosantos...@gmail.com
Subject: Re: Please help with Threading
CC: python-list@python.org; wlfr...@ix.netcom.com

On 20May2013 07:25, Fábio Santos fabiosantos...@gmail.com wrote:
| On 18 May 2013 20:33, Dennis Lee Bieber wlfr...@ix.netcom.com wrote:
| Python threads work fine if the threads either rely on intelligent
| DLLs for number crunching (instead of doing nested Python loops to
| process a numeric array you pass it to something like NumPy which
| releases the GIL while crunching a copy of the array) or they do lots of
| I/O and have to wait for I/O devices (while one thread is waiting for
| the write/read operation to complete, another thread can do some number
| crunching).
|
| Has nobody thought of a context manager to allow a part of your code to
| free up the GIL? I think the GIL is not inherently bad, but if it poses a
| problem at times, there should be a way to get it out of your... Way.

The GIL makes individual python operations thread safe by never running two at once. This makes the implementation of the operations simpler, faster and safer. It is probably totally infeasible to write meaningful python code inside your suggested context manager that didn't rely on the GIL; if the GIL were not held the code would be unsafe.

I just got my hands dirty trying to synchronize Python prints from many threads. Sometimes they mess up when printing the newlines. I tried several approaches using threading.Lock and Condition. None of them worked perfectly and all of them made the code sluggish.

Is there a 100% sure method to make print thread safe? Can it be fast???

It is easy for a C extension to release the GIL, and then to do meaningful work until it needs to return to python land. Most C extensions will do that around non-trivial sections, and anything that may stall in the OS.

So your use case for the context manager doesn't fit well.
-- Cameron Simpson c...@zip.com.au Gentle suggestions being those which are written on rocks of less than 5lbs. - Tracy Nelson in comp.lang.c -- http://mail.python.org/mailman/listinfo/python-list -- http://mail.python.org/mailman/listinfo/python-list
Re: Please help with Threading
My use case was a tight loop processing an image pixel by pixel, or crunching a CSV file. If it only uses local variables (and probably holds a lock before releasing the GIL) it should be safe, no?

My idea is that it's a little bad to have to write C or use multiprocessing just to do simultaneous calculations. I think an application using a reactor loop such as twisted would actually benefit from this. Sure, it will be slower than a C implementation of the same loop, but isn't fast prototyping a very important feature of the Python language?

On 20 May 2013 08:45, Cameron Simpson c...@zip.com.au wrote: On 20May2013 07:25, Fábio Santos fabiosantos...@gmail.com wrote:
| On 18 May 2013 20:33, Dennis Lee Bieber wlfr...@ix.netcom.com wrote:
| Python threads work fine if the threads either rely on intelligent
| DLLs for number crunching (instead of doing nested Python loops to
| process a numeric array you pass it to something like NumPy which
| releases the GIL while crunching a copy of the array) or they do lots of
| I/O and have to wait for I/O devices (while one thread is waiting for
| the write/read operation to complete, another thread can do some number
| crunching).
|
| Has nobody thought of a context manager to allow a part of your code to
| free up the GIL? I think the GIL is not inherently bad, but if it poses a
| problem at times, there should be a way to get it out of your... Way.

The GIL makes individual python operations thread safe by never running two at once. This makes the implementation of the operations simpler, faster and safer. It is probably totally infeasible to write meaningful python code inside your suggested context manager that didn't rely on the GIL; if the GIL were not held the code would be unsafe.

It is easy for a C extension to release the GIL, and then to do meaningful work until it needs to return to python land. Most C extensions will do that around non-trivial sections, and anything that may stall in the OS.
So your use case for the context manager doesn't fit well. -- Cameron Simpson c...@zip.com.au Gentle suggestions being those which are written on rocks of less than 5lbs. - Tracy Nelson in comp.lang.c -- http://mail.python.org/mailman/listinfo/python-list
Re: Please help with Threading
On 20May2013 10:53, Carlos Nepomuceno carlosnepomuc...@outlook.com wrote:
| I just got my hands dirty trying to synchronize Python prints from many threads.
| Sometimes they mess up when printing the newlines.
| I tried several approaches using threading.Lock and Condition.
| None of them worked perfectly and all of them made the code sluggish.

Show us some code, with specific complaints. Did you try this?

    _lock = Lock()

    def lprint(*a, **kw):
        global _lock
        with _lock:
            print(*a, **kw)

and use lprint() everywhere?

For generality the lock should be per file: the above hack uses one lock for any file, so that's going to stall overlapping prints to different files; inefficient. There are other things than the above, but at least individual prints will never overlap. If you have interleaved prints, show us.

| Is there a 100% sure method to make print thread safe? Can it be fast???

Depends on what you mean by fast. It will be slower than code with no lock; how much would require measurement.

Cheers,
-- 
Cameron Simpson c...@zip.com.au

My own suspicion is that the universe is not only queerer than we suppose, but queerer than we *can* suppose. - J.B.S. Haldane, On Being the Right Size, in the (1928) book Possible Worlds
-- 
http://mail.python.org/mailman/listinfo/python-list
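[The per-file-lock refinement mentioned above can be sketched as follows. This keys the lock table on the file object's id(), which is an assumption of this sketch: id values can be reused once a file object is garbage-collected, so a real implementation would want a stronger key or a weak-reference mapping.]

```python
import sys
import threading
from collections import defaultdict

_locks = defaultdict(threading.Lock)   # one lock per output file
_locks_guard = threading.Lock()        # protects the lock table itself

def lprint(*args, **kwargs):
    f = kwargs.get("file", sys.stdout)
    with _locks_guard:
        # fetch (or lazily create) the lock for this particular file
        lock = _locks[id(f)]
    with lock:
        # only prints to the *same* file serialize against each other
        print(*args, **kwargs)
```

With this, a thread writing a log file never stalls a thread printing to stdout, while prints to any one file still cannot interleave.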
Re: Please help with Threading
On Mon, May 20, 2013 at 6:35 PM, Cameron Simpson c...@zip.com.au wrote:

    _lock = Lock()

    def lprint(*a, **kw):
        global _lock
        with _lock:
            print(*a, **kw)

and use lprint() everywhere?

Fun little hack:

    def print(*args, print=print, lock=Lock(), **kwargs):
        with lock:
            print(*args, **kwargs)

Question: Is this a cool use or a horrible abuse of the scoping rules?

ChrisA
-- 
http://mail.python.org/mailman/listinfo/python-list
Re: Please help with Threading
It is pretty cool although it looks like a recursive function at first ;)

On 20 May 2013 10:13, Chris Angelico ros...@gmail.com wrote: On Mon, May 20, 2013 at 6:35 PM, Cameron Simpson c...@zip.com.au wrote:

    _lock = Lock()

    def lprint(*a, **kw):
        global _lock
        with _lock:
            print(*a, **kw)

and use lprint() everywhere?

Fun little hack:

    def print(*args, print=print, lock=Lock(), **kwargs):
        with lock:
            print(*args, **kwargs)

Question: Is this a cool use or a horrible abuse of the scoping rules?

ChrisA
-- 
http://mail.python.org/mailman/listinfo/python-list
-- 
http://mail.python.org/mailman/listinfo/python-list
Re: Please help with Threading
On 20May2013 19:09, Chris Angelico ros...@gmail.com wrote:
| On Mon, May 20, 2013 at 6:35 PM, Cameron Simpson c...@zip.com.au wrote:
| _lock = Lock()
|
| def lprint(*a, **kw):
|     global _lock
|     with _lock:
|         print(*a, **kw)
|
| and use lprint() everywhere?
|
| Fun little hack:
|
| def print(*args, print=print, lock=Lock(), **kwargs):
|     with lock:
|         print(*args, **kwargs)
|
| Question: Is this a cool use or a horrible abuse of the scoping rules?

I carefully avoided monkey patching print itself :-)

That's... mad! I can see what the end result is meant to be, but it looks like a debugging nightmare. Certainly my scoping-fu is too weak to see at a glance how it works.
-- 
Cameron Simpson c...@zip.com.au

I will not do it as a hack, I will not do it for my friends
I will not do it on a Mac, I will not write for Uncle Sam
I will not do it on weekends, I won't do ADA, Sam-I-Am
- Gregory Bond g...@bby.com.au
-- 
http://mail.python.org/mailman/listinfo/python-list
RE: Please help with Threading
Date: Mon, 20 May 2013 18:35:20 +1000
From: c...@zip.com.au
To: carlosnepomuc...@outlook.com
CC: python-list@python.org
Subject: Re: Please help with Threading

On 20May2013 10:53, Carlos Nepomuceno carlosnepomuc...@outlook.com wrote:
| I just got my hands dirty trying to synchronize Python prints from many threads.
| Sometimes they mess up when printing the newlines.
| I tried several approaches using threading.Lock and Condition.
| None of them worked perfectly and all of them made the code sluggish.

Show us some code, with specific complaints. Did you try this?

    _lock = Lock()

    def lprint(*a, **kw):
        global _lock
        with _lock:
            print(*a, **kw)

and use lprint() everywhere?

It works! Think I was running the wrong script... Anyway, the suggestion you've made is the third and latest attempt that I've tried to synchronize the print outputs from the threads. I've also used:

    ### 1st approach ###
    lock = threading.Lock()
    [...]
    try:
        lock.acquire()
        [thread protected code]
    finally:
        lock.release()

    ### 2nd approach ###
    cond = threading.Condition()
    [...]
    try:
        [thread protected code]
        with cond:
            print '[...]'

    ### 3rd approach ###
    from __future__ import print_function

    def safe_print(*args, **kwargs):
        global print_lock
        with print_lock:
            print(*args, **kwargs)

    [...]
    try:
        [thread protected code]
        safe_print('[...]')

Except for the first one, all of them have roughly the same performance. The problem was that I placed the acquire/release around the whole code block instead of only the print statements. Thanks a lot! ;)

For generality the lock should be per file: the above hack uses one lock for any file, so that's going to stall overlapping prints to different files; inefficient. There are other things than the above, but at least individual prints will never overlap. If you have interleaved prints, show us.

| Is there a 100% sure method to make print thread safe? Can it be fast???

Depends on what you mean by fast. It will be slower than code with no lock; how much would require measurement.
Cheers, -- Cameron Simpson c...@zip.com.au My own suspicion is that the universe is not only queerer than we suppose, but queerer than we *can* suppose. - J.B.S. Haldane On Being the Right Size in the (1928) book Possible Worlds -- http://mail.python.org/mailman/listinfo/python-list
Re: Please help with Threading
On Mon, May 20, 2013 at 7:54 PM, Cameron Simpson c...@zip.com.au wrote: On 20May2013 19:09, Chris Angelico ros...@gmail.com wrote:
| On Mon, May 20, 2013 at 6:35 PM, Cameron Simpson c...@zip.com.au wrote:
| _lock = Lock()
|
| def lprint(*a, **kw):
|     global _lock
|     with _lock:
|         print(*a, **kw)
|
| and use lprint() everywhere?
|
| Fun little hack:
|
| def print(*args, print=print, lock=Lock(), **kwargs):
|     with lock:
|         print(*args, **kwargs)
|
| Question: Is this a cool use or a horrible abuse of the scoping rules?

I carefully avoided monkey patching print itself :-)

That's... mad! I can see what the end result is meant to be, but it looks like a debugging nightmare. Certainly my scoping-fu is too weak to see at a glance how it works.

Hehe. Like I said, could easily be called abuse. Referencing a function's own name in a default has to have one of these interpretations:

1) It's a self-reference, which can be used to guarantee recursion even if the name is rebound
2) It references whatever previously held that name before this def statement.

Either would be useful. Python happens to follow #2; though I can't point to any piece of specification that mandates that, so all I can really say is that CPython 3.3 appears to follow #2. But both interpretations make sense, and both would be of use, and use of either could be called abusive of the rules. Figure that out. :)

The second defaulted argument (lock=Lock()), of course, is a common idiom. No abuse there, that's pretty Pythonic.

This same sort of code could be done as a decorator:

    def serialize(fn):
        lock = Lock()
        def locked(*args, **kw):
            with lock:
                fn(*args, **kw)
        return locked

    print = serialize(print)

Spelled like this, it's obvious that the argument to serialize has to be the previous 'print'. The other notation achieves the same thing, just in a quirkier way :)

ChrisA
-- 
http://mail.python.org/mailman/listinfo/python-list
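[A self-contained variant of the serialize decorator sketched above, extended to pass the wrapped function's return value through (the original sketch drops it) and bound to a new name instead of rebinding print:]

```python
import threading
from functools import wraps

def serialize(fn):
    """Wrap fn so that a private lock serializes every call to it."""
    lock = threading.Lock()

    @wraps(fn)  # keep fn's name and docstring on the wrapper
    def locked(*args, **kwargs):
        with lock:
            return fn(*args, **kwargs)   # pass the result through
    return locked

safe_print = serialize(print)   # a new name, rather than shadowing print
```

Because each serialize() call creates its own lock, two independently wrapped functions never block each other; only calls to the same wrapped function serialize.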
Re: Please help with Threading
On 5/20/2013 6:09 AM, Chris Angelico wrote: Referencing a function's own name in a default has to have one of these interpretations: 1) It's a self-reference, which can be used to guarantee recursion even if the name is rebound 2) It references whatever previously held that name before this def statement.

The meaning must be #2. A def statement is nothing more than a fancy assignment statement. This:

    def foo(a):
        return a + 1

is really just the same as:

    foo = lambda a: a + 1

(in fact, they compile to identical bytecode). More complex def's don't have equivalent lambdas, but are still assignments to the name of the function. So your apparently recursive print function is no more ambiguous than x = x + 1. The x on the right-hand side is the old value of x; the x on the left-hand side will be the new value of x.

    # Each of these updates a name
    x = x + 1

    def print(*args, print=print, lock=Lock(), **kwargs):
        with lock:
            print(*args, **kwargs)

Of course, if you're going to use that code, a comment might be in order to help the next reader through the trickiness...

--Ned.
-- 
http://mail.python.org/mailman/listinfo/python-list
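[Interpretation #2 is easy to demonstrate without touching print: a default argument is evaluated when the def statement executes, so it captures whatever the name was bound to beforehand. The name f here is purely illustrative.]

```python
def f():
    return "old"

# This def is an assignment to the name f. Its default argument is
# evaluated *now*, so old captures the previous binding of f -- exactly
# interpretation #2 above.
def f(old=f):
    return "new wrapping " + old()

result = f()
print(result)   # -> new wrapping old
```

The same mechanism is what lets `print=print` in the earlier hack capture the original print before the def rebinds the name.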
Re: Please help with Threading
On 05/20/2013 03:55 AM, Fábio Santos wrote: My use case was a tight loop processing an image pixel by pixel, or crunching a CSV file. If it only uses local variables (and probably hold a lock before releasing the GIL) it should be safe, no? Are you making function calls, using system libraries, or creating or deleting any objects? All of these use the GIL because they use common data structures shared among all threads. At the lowest level, creating an object requires locked access to the memory manager. Don't forget, the GIL gets used much more for Python internals than it does for the visible stuff. -- DaveA -- http://mail.python.org/mailman/listinfo/python-list
Re: Please help with Threading
On Mon, May 20, 2013 at 8:46 PM, Ned Batchelder n...@nedbatchelder.com wrote: On 5/20/2013 6:09 AM, Chris Angelico wrote: Referencing a function's own name in a default has to have one of these interpretations: 1) It's a self-reference, which can be used to guarantee recursion even if the name is rebound 2) It references whatever previously held that name before this def statement. The meaning must be #2. A def statement is nothing more than a fancy assignment statement.

Sure, but the language could have been specced up somewhat differently, with the same syntax. I was fairly confident that this would be universally true (well, can't do it with 'print' per se in older Pythons, but for others); my statement about CPython 3.3 was just because I hadn't actually hunted down specification proof.

So your apparently recursive print function is no more ambiguous than x = x + 1. The x on the right-hand side is the old value of x; the x on the left-hand side will be the new value of x.

    # Each of these updates a name
    x = x + 1

    def print(*args, print=print, lock=Lock(), **kwargs):
        with lock:
            print(*args, **kwargs)

Yeah. The decorator example makes that fairly clear.

Of course, if you're going to use that code, a comment might be in order to help the next reader through the trickiness...

Absolutely!!

ChrisA
-- 
http://mail.python.org/mailman/listinfo/python-list
Re: Please help with Threading
I didn't know that. On 20 May 2013 12:10, Dave Angel da...@davea.name wrote: Are you making function calls, using system libraries, or creating or deleting any objects? All of these use the GIL because they use common data structures shared among all threads. At the lowest level, creating an object requires locked access to the memory manager. Don't forget, the GIL gets used much more for Python internals than it does for the visible stuff. I did not know that. It's both interesting and somehow obvious, although I didn't know it yet. -- http://mail.python.org/mailman/listinfo/python-list
Re: Please help with Threading
On Monday, 20 May 2013 17:09:13 UTC+8, Chris Angelico wrote: On Mon, May 20, 2013 at 6:35 PM, Cameron Simpson c...@zip.com.au wrote:

    _lock = Lock()

    def lprint(*a, **kw):
        global _lock
        with _lock:
            print(*a, **kw)

and use lprint() everywhere?

Fun little hack:

    def print(*args, print=print, lock=Lock(), **kwargs):
        with lock:
            print(*args, **kwargs)

Question: Is this a cool use or a horrible abuse of the scoping rules?

ChrisA

OK, if the python interpreter has a global hidden print-out buffer of, say, 2 to 16 KB, and all string print functions just construct the output string from the format into this buffer in an efficient low-level way, then the next question would be whether users can use the functions of this low-level buffer for other string formatting jobs.
-- 
http://mail.python.org/mailman/listinfo/python-list
Re: Please help with Threading
On Tue, May 21, 2013 at 11:44 AM, 8 Dihedral dihedral88...@googlemail.com wrote: OK, if the python interpreter has a global hidden print-out buffer of, say, 2 to 16 KB, and all string print functions just construct the output string from the format into this buffer in an efficient low-level way, then the next question would be whether users can use the functions of this low-level buffer for other string formatting jobs.

You remind me of George. http://www.chroniclesofgeorge.com/ Both make great reading when I'm at work and poking around with random stuff in our .SQL file of carefully constructed mayhem.

ChrisA
-- 
http://mail.python.org/mailman/listinfo/python-list
RE: Please help with Threading
sys.stdout.write() does not suffer from the newline mess-up when printing from many threads, like the print statement does. The only usage difference, AFAIK, is having to add '\n' at the end of the string. It's faster and thread safe (really?) by default.

BTW, why didn't I find the source code for the sys module in the 'Lib' directory?

Date: Tue, 21 May 2013 11:50:17 +1000
Subject: Re: Please help with Threading
From: ros...@gmail.com
To: python-list@python.org

On Tue, May 21, 2013 at 11:44 AM, 8 Dihedral dihedral88...@googlemail.com wrote: OK, if the python interpreter has a global hidden print-out buffer of, say, 2 to 16 KB, and all string print functions just construct the output string from the format into this buffer in an efficient low-level way, then the next question would be whether users can use the functions of this low-level buffer for other string formatting jobs.

You remind me of George. http://www.chroniclesofgeorge.com/ Both make great reading when I'm at work and poking around with random stuff in our .SQL file of carefully constructed mayhem.

ChrisA
-- 
http://mail.python.org/mailman/listinfo/python-list
-- 
http://mail.python.org/mailman/listinfo/python-list
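[The observation above amounts to issuing one write() call per line at the Python level, instead of print's separate writes for the text and the trailing newline. A sketch; note that the language itself makes no hard atomicity guarantee for write(), so this is "safe in practice" rather than by specification:]

```python
import sys

def write_line(*parts):
    # Build the complete line first, then emit it with a single
    # Python-level write(); the interleaving seen with the Python 2
    # print statement comes from its separate write of the newline
    sys.stdout.write(" ".join(str(p) for p in parts) + "\n")
```

For a guarantee rather than a habit, the locking approaches earlier in the thread remain the robust option.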
RE: Please help with Threading
On Tue, May 21, 2013 at 11:44 AM, 8 Dihedral dihedral88...@googlemail.com wrote: OK, if the python interpreter has a global hidden print-out buffer of, say, 2 to 16 KB, and all string print functions just construct the output string from the format into this buffer in an efficient low-level way, then the next question would be whether users can use the functions of this low-level buffer for other string formatting jobs.

You remind me of George. http://www.chroniclesofgeorge.com/ Both make great reading when I'm at work and poking around with random stuff in our .SQL file of carefully constructed mayhem.

ChrisA

lol I need more cowbell!!! Please!!! lol
-- 
http://mail.python.org/mailman/listinfo/python-list
Re: Please help with Threading
On Tue, 21 May 2013 05:53:46 +0300, Carlos Nepomuceno wrote: BTW, why didn't I find the source code for the sys module in the 'Lib' directory?

Because sys is a built-in module. It is embedded in the Python interpreter.

-- Steven
-- 
http://mail.python.org/mailman/listinfo/python-list
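[This is easy to confirm from the interpreter: built-in modules are listed in sys.builtin_module_names, and unlike modules loaded from Lib/ they carry no source file:]

```python
import sys

# sys is compiled into the interpreter itself, which is why there is
# no sys.py anywhere under Lib/
print('sys' in sys.builtin_module_names)   # -> True
print(getattr(sys, '__file__', None))      # -> None: no source file on disk
```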
Re: Please help with Threading
On Mon, May 20, 2013 at 7:46 AM, Dennis Lee Bieber wlfr...@ix.netcom.com wrote: On Sun, 19 May 2013 10:38:14 +1000, Chris Angelico ros...@gmail.com declaimed the following in gmane.comp.python.general: With interpreted code eg in CPython, it's easy to implement preemption in the interpreter. I don't know how it's actually done, but one easy implementation would be every N bytecode instructions, context switch. It's still done at a lower level than user code (N bytecode Which IS how the common Python interpreter does it -- barring the thread making some system call that triggers a preemption ahead of time (even time.sleep(0.0) triggers scheduling). Forget if the default is 20 or 100 byte-code instructions -- as I recall, it DID change a few versions back. Incidentally, is the context-switch check the same as the check for interrupt signal raising KeyboardInterrupt? ISTR that was another every N instructions check. ChrisA -- http://mail.python.org/mailman/listinfo/python-list
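[For reference, the tunable being discussed: Python 2 exposed the bytecode-count check interval via sys.getcheckinterval()/sys.setcheckinterval() (default 100 bytecodes), and the "new GIL" in Python 3.2 replaced it with a time-based slice:]

```python
import sys

# On Python 3.2+ threads are asked to yield the GIL on a time slice
# rather than after a bytecode count; the old check-interval API was
# deprecated and later removed
interval = sys.getswitchinterval()   # seconds; the default is 0.005
print(interval)
```

Raising the interval reduces switching overhead for CPU-bound threads at the cost of responsiveness; sys.setswitchinterval() adjusts it.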
Re: Please help with Threading
On 05/19/2013 05:46 PM, Dennis Lee Bieber wrote: On Sun, 19 May 2013 10:38:14 +1000, Chris Angelico ros...@gmail.com declaimed the following in gmane.comp.python.general: On Sun, May 19, 2013 at 10:02 AM, Carlos Nepomuceno carlosnepomuc...@outlook.com wrote: I didn't know Python threads aren't preemptive. Seems to be something really old considering the state of the art on parallel execution on multi-cores. What's the catch in making Python threads preemptive? Are there any ongoing projects to make that? snip With interpreted code eg in CPython, it's easy to implement preemption in the interpreter. I don't know how it's actually done, but one easy implementation would be every N bytecode instructions, context switch. It's still done at a lower level than user code (N bytecode Which IS how the common Python interpreter does it -- barring the thread making some system call that triggers a preemption ahead of time (even time.sleep(0.0) triggers scheduling). Forget if the default is 20 or 100 byte-code instructions -- as I recall, it DID change a few versions back.

Part of the context switch is to transfer the GIL from the preempted thread to the new thread. So, overall, on a SINGLE CORE processor running multiple CPU-bound threads takes a bit longer just due to the overhead of thread swapping. On a multi-core processor, the effect is the same, since -- even though one may have a thread running on each core -- the GIL is only assigned to one thread, and other threads get blocked when trying to access runtime data structures. And you may have even more overhead from processor cache misses if a thread gets assigned to a different core. (yes -- I'm restating the same thing as I had just trimmed below this point... but the target is really the OP, where repetition may be helpful in understanding)

So what's the mapping between real (OS) threads, and the fake ones Python uses?
The OS keeps track of a separate stack and context for each thread it knows about; are they one-to-one with the ones you're describing here? If so, then any OS thread that gets scheduled will almost always find it can't get the GIL, and spend time thrashing. But the change that CPython does intentionally would be equivalent to a sleep(0). On the other hand, if these threads are distinct from the OS threads, is it done with some sort of thread pool, where CPython has its own stack, and doesn't really use the one managed by the OS? Understand the only OS threading I really understand is the one in Windows (which I no longer use). So assuming Linux has some form of lightweight threading, the distinction above may not map very well. -- DaveA -- http://mail.python.org/mailman/listinfo/python-list
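[On the mapping question above: CPython threads are ordinary OS threads (pthreads on POSIX, native threads on Windows), one Python thread per OS thread with its own OS-managed stack; the GIL is simply a lock they all share, so a scheduled thread that lacks the GIL blocks on the lock rather than busy-thrashing. That each thread has a real OS-level identity can be seen with threading.get_ident():]

```python
import threading

ids = []

def record():
    # get_ident() returns the identifier the underlying OS thread
    # library assigned to the calling thread
    ids.append(threading.get_ident())

workers = [threading.Thread(target=record) for _ in range(3)]
for w in workers:
    w.start()
for w in workers:
    w.join()
# list.append is a single bytecode-protected operation, so the
# unlocked shared list is safe here under the GIL
```

(Thread identifiers may be recycled after a thread exits, so distinctness of the collected values is not guaranteed; their existence is the point.)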
Re: Please help with Threading
Jurgens de Bruin wrote: This is my first script where I want to use the python threading module. I have a large dataset which is a list of dicts; this can be as many as 200 dictionaries in the list. The final goal is a histogram for each dict, 16 histograms on a page (4x4) - this already works. What I currently do is create a nested list [ [ {} ], [ {} ] ]; each inner list contains 16 dictionaries, thus each inner list is a single page of 16 histograms. Iterating over the outer list and creating the graphs takes too long. So I would like multiple inner lists to be processed simultaneously, creating the graphs in parallel. I am trying to use python threading for this. I create 4 threads, loop over the outer list and send an inner list to each thread. This seems to work if my nested list only contains 2 elements - thus fewer elements than threads. Currently the script runs and then seems to get hung up. I monitor the resources on my mac and python starts off good, using 80% CPU, and when the 4th thread is created the CPU usage drops to 0%. My thread creation is based on the following: http://www.tutorialspoint.com/python/python_multithreading.htm Any help would be great!!!

Can you show us the code?
-- 
http://mail.python.org/mailman/listinfo/python-list
Re: Please help with Threading
I will post code - the entire script is 1000 lines of code - can I post the threading functions only?
-- 
http://mail.python.org/mailman/listinfo/python-list
Re: Please help with Threading
Jurgens de Bruin wrote: I will post code - the entire scripts is 1000 lines of code - can I post the threading functions only? Try to condense it to the relevant parts, but make sure that it can be run by us. As a general note, when you add new stuff to an existing longish script it is always a good idea to write it in such a way that you can test it standalone so that you can have some confidence that it will work as designed once you integrate it with your old code. -- http://mail.python.org/mailman/listinfo/python-list
Re: Please help with Threading
On 05/18/2013 04:58 AM, Jurgens de Bruin wrote: This is my first script where I want to use the Python threading module. I have a large dataset which is a list of dicts; there can be as many as 200 dictionaries in the list. The final goal is a histogram for each dict, 16 histograms on a page (4x4) - this already works. What I currently do is create a nested list [ [ {} ], [ {} ] ] where each inner list contains 16 dictionaries, so each inner list is a single page of 16 histograms. Iterating over the outer list and creating the graphs takes too long, so I would like multiple inner lists to be processed simultaneously, creating the graphs in parallel. I am trying to use Python threading for this. I create 4 threads, loop over the outer list and send an inner list to each thread. This seems to work if my nested list only contains 2 elements - thus fewer elements than threads. Currently the script runs and then seems to hang. I monitor the resources on my Mac and Python starts off well, using 80%, but when the 4th thread is created the CPU usage drops to 0%. My thread creation is based on the following: http://www.tutorialspoint.com/python/python_multithreading.htm Any help would be great!!! CPython, and apparently (all of?) the other current Python implementations, use a GIL to prevent multi-threaded applications from shooting themselves in the foot. However, the practical effect of the GIL is that CPU-bound applications do not multi-thread efficiently; the single-threaded version usually runs faster. The place where CPython programs gain from multithreading is where each thread spends much of its time waiting for some external trigger. (More specifically, if such a wait happens inside well-written C code, it releases the GIL so other threads can get useful work done. An example is a thread waiting for network activity, blocking inside a system call.) -- DaveA
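A minimal sketch of the waiting-on-an-external-trigger case Dave describes, using only the standard library. Here time.sleep stands in for a blocking read or network wait; like other blocking calls it releases the GIL, so the four waits overlap:

```python
import threading
import time

results = []
lock = threading.Lock()

def worker(name):
    # time.sleep blocks outside the GIL, so all four threads can
    # wait concurrently -- the kind of workload where CPython
    # threads actually pay off.
    time.sleep(0.1)
    with lock:
        results.append(name)

threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
start = time.monotonic()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.monotonic() - start
# The four 0.1-second waits overlap, so elapsed is close to 0.1 s,
# not the 0.4 s it would take if the sleeps ran back to back.
```

If worker did pure-Python number crunching instead of sleeping, the same structure would run no faster than a single thread, for exactly the GIL reasons discussed above.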
RE: Please help with Threading
To: python-list@python.org From: wlfr...@ix.netcom.com Subject: Re: Please help with Threading Date: Sat, 18 May 2013 15:28:56 -0400 On Sat, 18 May 2013 01:58:13 -0700 (PDT), Jurgens de Bruin debrui...@gmail.com declaimed the following in gmane.comp.python.general: This is my first script where I want to use the Python threading module. I have a large dataset which is a list of dicts; there can be as many as 200 dictionaries in the list. The final goal is a histogram for each dict, 16 histograms on a page (4x4) - this already works. What I currently do is create a nested list [ [ {} ], [ {} ] ] where each inner list contains 16 dictionaries, so each inner list is a single page of 16 histograms. Iterating over the outer list and creating the graphs takes too long, so I would like multiple inner lists to be processed simultaneously, creating the graphs in parallel. I am trying to use Python threading for this. I create 4 threads, loop over the outer list and send an inner list to each thread. This seems to work if my nested list only contains 2 elements - thus fewer elements than threads. Currently the script runs and then seems to hang. I monitor the resources on my Mac and Python starts off well, using 80%, but when the 4th thread is created the CPU usage drops to 0%. The odds are good that this is just going to run slower... I've just been told that the GIL doesn't make things slower, but as I didn't know that such a thing even existed I went out looking for more info and found this document: http://www.dabeaz.com/python/UnderstandingGIL.pdf Is it current? I didn't know Python threads aren't preemptive. That seems really dated considering the state of the art in parallel execution on multi-cores. What's the catch in making Python threads preemptive? Are there any ongoing projects to do that? One: the common Python implementation uses a global interpreter lock to prevent interpreted code from interfering with itself across multiple threads.
So number-crunching applications don't gain any speed from being partitioned into threads -- even on a multicore processor, only one thread can hold the GIL at a time. On top of that, you have the overhead of the interpreter switching between threads (a GIL release on one thread, a GIL acquire on the next). Python threads work fine if the threads either rely on intelligent DLLs for number crunching (instead of doing nested Python loops to process a numeric array, you pass it to something like NumPy, which releases the GIL while crunching a copy of the array) or do lots of I/O and have to wait for I/O devices (while one thread is waiting for a write/read operation to complete, another thread can do some number crunching). If you really need to do this type of number crunching in Python-level code, you'll want to look into the multiprocessing library instead. That will create actual OS processes (each with a copy of the interpreter, not sharing memory), and each of those can run on a core without conflicting over the GIL. Which library do you suggest? -- Wulfraed Dennis Lee Bieber AF6VN wlfr...@ix.netcom.com HTTP://wlfraed.home.netcom.com/
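Applied to the original poster's scenario, the multiprocessing suggestion might look like the sketch below. render_page is a hypothetical stand-in for the real histogram-drawing code (here it just reduces each dict to a sum so the example is self-contained); each page of 16 dicts is handed to a pool of 4 worker processes:

```python
from multiprocessing import Pool

def render_page(page):
    # Hypothetical stand-in for drawing 16 histograms from one page
    # of dicts; the real version would do the matplotlib work here.
    return [sum(d.values()) for d in page]

if __name__ == "__main__":
    # 200 dicts split into pages of 16, mirroring the poster's layout.
    data = [{"a": i, "b": 2 * i} for i in range(200)]
    pages = [data[i:i + 16] for i in range(0, len(data), 16)]
    # Four real OS processes: no GIL contention, unlike threads.
    with Pool(processes=4) as pool:
        results = pool.map(render_page, pages)
    print(len(results))  # one result per page
```

The `if __name__ == "__main__"` guard matters: on platforms that spawn rather than fork, each worker re-imports the module, and the guard keeps the pool from being created recursively.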
Re: Please help with Threading
On Sun, May 19, 2013 at 10:02 AM, Carlos Nepomuceno carlosnepomuc...@outlook.com wrote: I didn't know Python threads aren't preemptive. That seems really dated considering the state of the art in parallel execution on multi-cores. What's the catch in making Python threads preemptive? Are there any ongoing projects to do that? Preemption isn't really the issue here. At the C level, preemptive vs. cooperative usually means the difference between a stalled thread locking everyone else out and not doing so. Preemption is done at a lower level than user code (e.g. the operating system or the CPU), meaning that user code can't retain control of the CPU. With interpreted code, e.g. in CPython, it's easy to implement preemption in the interpreter. I don't know how it's actually done, but one easy implementation would be: every N bytecode instructions, context switch. It's still done at a lower level than user code (N bytecode instructions might all actually be a single tight loop that the programmer didn't realize was infinite), but it's not at the OS level. But none of that has anything to do with multiple-core usage. The problem there is that shared data structures need to be accessed simultaneously, and in CPython there's a Global Interpreter Lock to simplify that; the consequence of the GIL is that no two threads can simultaneously execute user-level code. There have been GIL-removal proposals at various times, but the fact remains that a global lock makes a huge amount of sense and gives pretty good performance across the board. There's always multiprocessing when you need multiple CPU-bound threads; it's an explicit way to separate shared data (what gets transferred) from local data (what doesn't). ChrisA
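For the record, CPython 2 did switch threads after a configurable number of bytecode instructions (sys.setcheckinterval), much as Chris guesses; CPython 3.2 and later instead switch on a time slice that you can inspect and tune. A small sketch of the Python 3 knob:

```python
import sys

# CPython 3.2+ asks threads to yield the GIL on a time interval
# rather than a bytecode count (Python 2's sys.setcheckinterval).
default = sys.getswitchinterval()   # typically 0.005 seconds
sys.setswitchinterval(0.05)         # coarser slices: fewer forced switches
coarse = sys.getswitchinterval()
sys.setswitchinterval(default)      # restore the interpreter default
```

A coarser interval reduces switching overhead at the cost of responsiveness between threads; it does nothing to let two threads run Python bytecode at once.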
Re: Please help with Threading
On 19May2013 03:02, Carlos Nepomuceno carlosnepomuc...@outlook.com wrote: | I've just been told that the GIL doesn't make things slower, but as I | didn't know that such a thing even existed I went out looking for | more info and found this document: | http://www.dabeaz.com/python/UnderstandingGIL.pdf | | Is it current? I didn't know Python threads aren't preemptive. | That seems really dated considering the state of the art | in parallel execution on multi-cores. | What's the catch in making Python threads preemptive? Are there any ongoing projects to do that? Depends what you mean by preemptive. If you have multiple CPU-bound pure-Python threads, they will all get CPU time without any of them explicitly yielding control. But thread switching happens between Python instructions, mediated by the interpreter. The standard answers for using multiple cores are to either run multiple processes (explicitly spawning other executables, or spawning child Python processes using the multiprocessing module), or to use (as suggested) libraries that can do the compute-intensive bits themselves, releasing the GIL while doing so, so that the Python interpreter can run other bits of your Python code. Plenty of OS system calls (and calls to other libraries from the interpreter) release the GIL during the call. Other Python threads can run during that window. And there are Python implementations other than CPython. Cheers, -- Cameron Simpson c...@zip.com.au Processes are like potatoes. - NCR device driver manual
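Cameron's point about library calls releasing the GIL can be seen without NumPy: in CPython, hashlib's C hashing loop releases the GIL for buffers larger than roughly 2 KiB, so threads hashing large blobs genuinely overlap. A minimal sketch:

```python
import hashlib
import threading

digests = {}

def hash_blob(name, blob):
    # CPython's hashlib releases the GIL while hashing large buffers,
    # so these threads can run their C-level work concurrently.
    digests[name] = hashlib.sha256(blob).hexdigest()

# Four 1 MB blobs, one per thread.
blobs = {i: bytes([i]) * 1_000_000 for i in range(4)}
threads = [threading.Thread(target=hash_blob, args=(i, b))
           for i, b in blobs.items()]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The same pattern with a pure-Python hash function would serialize on the GIL; it is the C extension dropping the lock that makes the threads worthwhile.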