Re: Auto-parallelizing with decorators?

2007-07-07 Thread Josiah Carlson
Kirk Strauser wrote:
 In article [EMAIL PROTECTED],
  Kirk Strauser [EMAIL PROTECTED] wrote:
 
 I was thinking about how a lot of Lisp proponents claim that Lisp is
 inherently parallelizable because its functions don't have (or are not
 supposed to have) side effects, and therefore the compiler can easily tell
 which calls may be run in parallel instead of strictly serially.  I'm not a
 Lisp expert so I can't say whether that's true or not, but it seems like an
 interesting idea for Python.
 
 By the way, I uploaded a sample implementation (in Python) of what I had 
 in mind to http://www.honeypot.net/multi-processing-map-python .  Please 
 let me know what you think of it and whether it seems remotely 
 interesting or goofy.

Try the Processing package available on the Python Package Index. 
Create as many processes as you want, then toss the data you want 
processed into a Queue.  Watch magically as your data gets processed.

  - Josiah
-- 
http://mail.python.org/mailman/listinfo/python-list


Auto-parallelizing with decorators?

2007-07-06 Thread Kirk Strauser
I was thinking about how a lot of Lisp proponents claim that Lisp is
inherently parallelizable because its functions don't have (or are not
supposed to have) side effects, and therefore the compiler can easily tell
which calls may be run in parallel instead of strictly serially.  I'm not a
Lisp expert so I can't say whether that's true or not, but it seems like an
interesting idea for Python.

Suppose that Python had a new decorator, say @parallelizable.  Functions
so decorated would be eligible for multi-processed or multi-threaded
execution by such native constructs as list comprehensions, or the map()
function.  Their logic could be something along the lines of:

1) Is every function in the function expression of map() or a list
comprehension decorated with @parallelizable?  If not, use the normal
implementation.
2) Spawn an appropriate number of child threads and queue values to be
processed.
3) As results return, either store them in the result list (for map()), or
insert them into their place in the output queue to be consumed by users of
a list comprehension.
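The three steps above could be sketched with a thread pool along these lines (the `parallelizable` registry and `parallel_map` names are hypothetical; this is one possible shape for the logic, not a proposed implementation):

```python
import queue
import threading

_PARALLELIZABLE = set()

def parallelizable(func):
    """Hypothetical decorator: mark func as safe for parallel execution."""
    _PARALLELIZABLE.add(func)
    return func

def parallel_map(func, items, workers=4):
    items = list(items)
    # Step 1: fall back to the normal implementation if func is not marked.
    if func not in _PARALLELIZABLE:
        return list(map(func, items))
    # Step 2: spawn workers and queue (index, value) pairs to be processed.
    in_q = queue.Queue()
    results = [None] * len(items)
    for i, x in enumerate(items):
        in_q.put((i, x))

    def worker():
        while True:
            try:
                i, x = in_q.get_nowait()
            except queue.Empty:
                return
            # Step 3: store each result in its ordered slot.
            results[i] = func(x)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

@parallelizable
def square(x):
    return x * x

print(parallel_map(square, range(5)))  # [0, 1, 4, 9, 16]
```

Enqueueing (index, value) pairs is what lets the results land in list order even though the workers finish in no particular order.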

In #2, "appropriate" could default to something like 2 times the number of
CPUs detected in the system (which could be found the first time it was
needed so that programs that don't use this functionality don't pay for
finding that information).  I could imagine that it'd be useful to pass
hints to the decorator to more explicitly size the thread pool, such as:

@parallelizable(number=100)
def lightweight_function():
    """Launch as many parallel copies as you can."""

@parallelizable(perproc=1)
def heavier_function():
    """Only run one copy per CPU."""

@parallelizable(number=2)
def big_slow_function():
    """This benefits from parallel execution, but don't kill the system."""
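A parametrized decorator like that could record its sizing hints on the function object for a scheduler to read later; a minimal sketch (the `_parallel_hints` attribute name is my invention):

```python
def parallelizable(number=None, perproc=None):
    """Hypothetical parametrized decorator: stash pool-sizing hints
    on the function itself so a scheduler can consult them later."""
    def mark(func):
        func._parallel_hints = {"number": number, "perproc": perproc}
        return func
    return mark

@parallelizable(perproc=1)
def heavier(x):
    return x * x

print(heavier._parallel_hints)  # {'number': None, 'perproc': 1}
print(heavier(3))               # 9 -- the function itself is unchanged
```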

Would something like this have merit?  I like the idea because it's
completely optional; if you don't use it, you never have to deal with the
side effects.  However, if you take care to make your functions loosely
coupled and not dependent on execution order, you could get a nice speedup
every single time you give Python permission to schedule your repeated
function calls.
-- 
Kirk Strauser


Re: Auto-parallelizing with decorators?

2007-07-06 Thread Stefan Behnel
Kirk Strauser wrote:
 Suppose that Python had a new decorator, say @parallelizable.  Functions
 so decorated would be eligible for multi-processed or multi-threaded
 execution by such native constructs as list comprehensions, or the map()
 function.

Wouldn't that require parallelism in the interpreter first? Mind the GIL...

Stefan


Re: Auto-parallelizing with decorators?

2007-07-06 Thread Kirk Strauser
Stefan Behnel wrote:

 Wouldn't that require parallelism in the interpreter first? Mind the
 GIL...

That may be.  I'd bet, though, that I could whip up a native Python workalike
using os.fork() that would work around the GIL, at least on Unix (just because
I don't know if Windows has os.fork(), having never looked for it).
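A bare-bones sketch of such a fork-based map (Unix-only; the `fork_map` name and the choice of pipes plus pickle for returning results are my assumptions, not code from the page linked elsewhere in the thread):

```python
import os
import pickle

def fork_map(func, items):
    """Evaluate func over items in forked child processes (Unix only).

    Note: this forks one child per item, so it is only a sketch --
    a real version would cap the number of concurrent children.
    """
    pipes = []
    for item in items:
        r, w = os.pipe()
        pid = os.fork()
        if pid == 0:                      # child: compute, send result, exit
            os.close(r)
            with os.fdopen(w, "wb") as f:
                pickle.dump(func(item), f)
            os._exit(0)
        os.close(w)                       # parent keeps only the read end
        pipes.append((pid, r))
    results = []
    for pid, r in pipes:
        with os.fdopen(r, "rb") as f:
            results.append(pickle.load(f))
        os.waitpid(pid, 0)
    return results

print(fork_map(lambda x: x * x, [1, 2, 3]))  # [1, 4, 9]
```

Each child gets its own interpreter (and its own GIL) courtesy of fork, which is exactly the workaround Kirk is describing.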

I know something like this wouldn't be the easiest thing to implement, but
honestly, having such beautiful constructs as map() and list comprehensions
available is just begging for an application like this.
-- 
Kirk Strauser


Re: Auto-parallelizing without decorators?

2007-07-06 Thread John Nagle
Kirk Strauser wrote:
 I was thinking about how a lot of Lisp proponents claim that Lisp is
 inherently parallelizable because its functions don't have (or are not
 supposed to have) side effects, and therefore the compiler can easily tell
 which calls may be run in parallel instead of strictly serially.  I'm not a
 Lisp expert so I can't say whether that's true or not, but it seems like an
 interesting idea for Python.
 
 Suppose that Python had a new decorator, say @parallelizable.  Functions
 so decorated would be eligible for multi-processed or multi-threaded
 execution by such native constructs as list comprehensions, or the map()
 function.

That implies much smarter compilers than we've seen to date for Python.

A more useful idea might be to provide a map/reduce library in the
sense that Google uses the term.

Google's concept of map/reduce is that the mapped function is applied
to all the elements of the list simultaneously, up to the limits of resources
available.  The result of each function is a name/value tuple.  The
reduction process consists of collecting all tuples with the same name
and feeding them to copies of the reduce function in no particular order,
with the reduce function producing fewer output tuples than input tuples.
The reduce function is applied repeatedly in a tree sense until all
output tuples have been reduced.
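The shape of that model, minus the parallelism and the tree-structured reduction, can be shown in a few lines (a toy, sequential sketch of the idea Nagle describes, not Google's implementation):

```python
from collections import defaultdict

def mapreduce(map_fn, reduce_fn, inputs):
    """Toy MapReduce: map_fn emits (key, value) tuples, the tuples are
    grouped by key, and reduce_fn folds each group to a single value."""
    groups = defaultdict(list)
    for item in inputs:
        for key, value in map_fn(item):
            groups[key].append(value)
    return {key: reduce_fn(key, values) for key, values in groups.items()}

# The classic example: word count.
docs = ["the cat", "the dog"]
counts = mapreduce(
    lambda doc: [(word, 1) for word in doc.split()],
    lambda key, values: sum(values),
    docs,
)
print(counts)  # {'the': 2, 'cat': 1, 'dog': 1}
```

The point of the real system is that both phases parallelize: the map calls are independent, and because reduction is associative it can be applied to partial groups in any order, tree-fashion.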

See:

http://labs.google.com/papers/mapreduce.html

Contrast this with Python's reduce, which is inherently sequential.

John Nagle


Re: Auto-parallelizing with decorators?

2007-07-06 Thread Kirk Strauser
In article [EMAIL PROTECTED],
 Kirk Strauser [EMAIL PROTECTED] wrote:

 I was thinking about how a lot of Lisp proponents claim that Lisp is
 inherently parallelizable because its functions don't have (or are not
 supposed to have) side effects, and therefore the compiler can easily tell
 which calls may be run in parallel instead of strictly serially.  I'm not a
 Lisp expert so I can't say whether that's true or not, but it seems like an
 interesting idea for Python.

By the way, I uploaded a sample implementation (in Python) of what I had 
in mind to http://www.honeypot.net/multi-processing-map-python .  Please 
let me know what you think of it and whether it seems remotely 
interesting or goofy.
-- 
Kirk Strauser