Threads vs. processes, what to consider in choosing ?

2009-02-17 Thread Barak, Ron
Hi,

May I have your recommendations on choosing threads or processes for the 
following?

I have a wxPython application that builds an internal database from a list of 
files and then displays various aspects of that data, 
in response to user's requests.

I want to add a module that finds events in a set of log files (LogManager).
These log files are potentially huge, and the initial processing is lengthy 
(several minutes).
Thus, when the user chooses LogManager, it would be unacceptable to block 
the other parts of the program, so the initial LogManager processing
would need to be done separately from the normal run of the program.
Once the initial processing is done, the main program would be notified and 
could display the results of LogManager processing.

I was thinking of either using threads or using separate processes for the 
main program and LogManager.

What would you suggest I consider in choosing between the two options?
Are there other options besides threads and multi-processing?

Thanks,
Ron.
--
http://mail.python.org/mailman/listinfo/python-list


Re: Threads vs. processes, what to consider in choosing ?

2009-02-17 Thread Philip Semanchuk


On Feb 17, 2009, at 10:18 AM, Barak, Ron wrote:

I have a wxPython application that builds an internal database from  
a list of files and then displays various aspects of that data,

in response to user's requests.

I want to add a module that finds events in a set of log files  
(LogManager).
These log files are potentially huge, and the initial processing is  
lengthy (several minutes).
Thus, when the user will choose LogManager, it would be unacceptable  
to block the other parts of the program, and so - the initial  
LogManager processing

would need to be done separately from the normal run of the program.
Once the initial processing is done, the main program would be  
notified and could display the results of LogManager processing.


I was thinking of either using threads, or using separate processes,  
for the main programs and LogManager.


What would you suggest I should consider in choosing between the two  
options ?

Are there other options besides threads and multi-processing ?


Hi Ron,
The general rule is that it is a lot easier to share data between  
threads than between processes. The multiprocessing library makes the  
latter easier but is only part of the standard library in Python >=  
2.6. The design of your application matters a lot. For instance, will  
the processing code write its results to a database, ping the GUI code  
and then exit, allowing the GUI to read the database? That sounds like  
an excellent setup for processes.
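
A rough sketch of that kind of setup, using the multiprocessing module
(Python >= 2.6, or the backport mentioned below); the names and the fake
two-second delay are only illustrative, not part of Ron's application:

import multiprocessing
import time

def build_log_index(log_files, result_queue):
    # Stand-in for the lengthy initial processing (several minutes in
    # the real application); it runs in a separate process, so the GUI
    # process is never blocked.
    time.sleep(2)
    result_queue.put({"files": len(log_files), "events": 42})

if __name__ == "__main__":
    queue = multiprocessing.Queue()
    worker = multiprocessing.Process(target=build_log_index,
                                     args=(["a.log", "b.log"], queue))
    worker.start()
    # A real GUI would poll the queue from a timer or idle handler
    # instead of blocking here.
    print(queue.get())
    worker.join()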


In addition, there's the GIL to consider. Multi-process applications  
aren't affected by it while multi-threaded applications may be. In  
these days where multi-processor/multi-core machines are more common,  
this fact is ever more important. Torrents of words have been written  
about the GIL on this list and elsewhere and I have nothing useful to  
add to the torrents. I encourage you to read some of those  
conversations.
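
To see the effect concretely, a small, admittedly crude comparison of a
CPU-bound function run in four threads vs. four processes (assumes a
multi-core machine and multiprocessing; absolute numbers will vary):

import multiprocessing
import threading
import time

def burn(n=5000000):
    # Pure-Python CPU-bound loop; a thread holds the GIL while running it.
    while n:
        n -= 1

def timed(factory, label):
    workers = [factory(target=burn) for _ in range(4)]
    start = time.time()
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    print("%s %.2f seconds" % (label, time.time() - start))

if __name__ == "__main__":
    timed(threading.Thread, "threads:")           # serialized by the GIL
    timed(multiprocessing.Process, "processes:")  # can use several cores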


FWIW, when I was faced with a similar setup, I went with multiple  
processes rather than threads.


Last but not least, since you asked about alternatives to threads and  
multiprocessing, I'll point you to some low level libraries I wrote  
for doing interprocess communication:

http://semanchuk.com/philip/posix_ipc/
http://semanchuk.com/philip/sysv_ipc/

Good luck
Philip




--
http://mail.python.org/mailman/listinfo/python-list


Re: Threads vs. processes, what to consider in choosing ?

2009-02-17 Thread Christian Heimes
Philip Semanchuk wrote:
 The general rule is that it is a lot easier to share data between
 threads than between processes. The multiprocessing library makes the
 latter easier but is only part of the standard library in Python >= 2.6.
 The design of your application matters a lot. For instance, will the
 processing code write its results to a database, ping the GUI code and
 then exit, allowing the GUI to read the database? That sounds like an
 excellent setup for processes.

A backport for Python 2.4 and 2.5 is available on pypi. Python 2.5.4 is
recommended though.

Christian

--
http://mail.python.org/mailman/listinfo/python-list


Re: Threads vs Processes

2006-07-28 Thread H J van Rooyen
Dennis Lee Bieber [EMAIL PROTECTED] Wrote:

| On Thu, 27 Jul 2006 09:17:56 -0700, Carl J. Van Arsdall
| [EMAIL PROTECTED] declaimed the following in comp.lang.python:
|
|  Ah, alright, I think I understand, so threading works well for sharing
|  python objects.  Would a scenario for this be something like a a job
|  queue (say Queue.Queue) for example.  This is a situation in which each
|  process/thread needs access to the Queue to get the next task it must
|  work on.  Does that sound right?  Would the same apply to multiple
|  threads needed access to a dictionary? list?
| 
| Python's Queue module is only (to my knowledge) an internal
| (thread-shared) communication channel; you'd need something else for
| IPC -- VMS mailboxes, for example (more general than UNIX pipes with
| their single reader/writer concept)
|
|  shared memory mean something more low-level like some bits that don't
|  necessarily mean anything to python but might mean something to your
|  application?
| 
| Most OSs support creation and allocation of memory blocks with an
| attached name; this allows multiple processes to map that block of
| memory into their address space. The contents of said memory block is
| totally up to application agreements (won't work well with Python native
| objects).
|
| mmap()
|
| is one such system. By rough description, it maps a disk file into a
| block of memory, so the OS handles loading the data (instead of, say,
| file.seek(somewhere_long) followed by file.read(some_data_type) you
| treat the mapped memory as an array and use x = mapped[somewhere_long];
| if somewhere_long is not yet in memory, the OS will page swap that part
| of the file into place). The file can be shared, so different
| processes can map the same file, and thereby, the same memory contents.
|
| This can be useful, for example, with multiple identical processes
| feeding status telemetry. Each process is started with some ID, and the
| ID determines which section of mapped memory it is to store its status
| into. The controller program can just do a loop over all the mapped
| memory, updating a display with whatever is current -- doesn't matter if
| process_N manages to update a field twice while the monitor is
| scanning... The display always shows the data that was current at the
| time of the scan.
|
| Carried further -- special memory cards can (at least they were
| where I work) be obtained. These cards have fiber-optic connections. In
| a closely distributed system, each computer has one of these cards, and
| the fiber-optics link them in a cycle. Each process (on each computer)
| maps the memory of the card -- the cards then have logic to relay all
| memory changes, via fiber, to the next card in the link... Thus, all the
| closely linked computers share this block of memory.

This is nice to share inputs from the real world - but there are some hairy
issues if it is to be used for general purpose consumption - unless there are
hardware restrictions to stop machines stomping on each other's memories - i.e.
the machines have to be *polite* and *well behaved* - or you can easily have a
major smash...
A structure has to be agreed on, and respected...
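
A minimal sketch of that kind of agreed-on structure, using the standard
mmap and struct modules on a POSIX system (the file name, slot count and
record layout here are invented for the example; each writer sticks to
its own slot, which is the *polite* and *well behaved* part):

import mmap
import os
import struct

RECORD = struct.Struct("ii")     # agreed layout per slot: (writer_id, counter)
NSLOTS = 4

# Create a zero-filled file big enough for all slots, then map it.
with open("telemetry.dat", "wb") as f:
    f.write(b"\0" * RECORD.size * NSLOTS)
f = open("telemetry.dat", "r+b")
shared = mmap.mmap(f.fileno(), RECORD.size * NSLOTS)

pid = os.fork()
if pid == 0:
    # Child process: writes only into slot 1, nobody else's.
    RECORD.pack_into(shared, 1 * RECORD.size, 1, 123)
    os._exit(0)

os.waitpid(pid, 0)
print([RECORD.unpack_from(shared, i * RECORD.size) for i in range(NSLOTS)])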

 - Hendrik


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Threads vs Processes

2006-07-28 Thread Tobias Brox
[mark]
 http://twistedmatrix.com/projects/core/documentation/howto/async.html .

At my work, we started writing a web app using the twisted framework,
but it was somehow too twisted for the developers, so actually they
chose to do threading rather than using twisted's async methods.

-- 
Tobias Brox, 69°42'N, 18°57'E
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Threads vs Processes

2006-07-28 Thread mark
On Thu, 27 Jul 2006 20:53:54 -0700, Nick Vatamaniuc wrote:
 Debugging all those threads should be a project in an of itself.

Ahh, debugging - I forgot to bring that one up in my argument! Thanks
Nick ;)

Certainly I agree of course that there are many applications which suit
a threaded design. I just think there is a general over-emphasis on
using threads and see it applied very often where an event based
approach would be cleaner and more efficient. Thanks for your comments
Bryan and Nick, an interesting debate.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Threads vs Processes

2006-07-28 Thread sturlamolden
Chance Ginger wrote:

 Not quite that simple. In most modern OS's today there is something
 called COW - copy on write. What happens is when you fork a process
 it will make an identical copy. Whenever the forked process does
 write will it make a copy of the memory. So it isn't quite as bad.

A notable exception is a toy OS from a manufacturer in Redmond,
Washington. It does not do COW fork. It does not even fork.

To make a server system scale well on Windows you need to use threads,
not processes. That is why the global interpreter lock sucks so badly
on Windows.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Threads vs Processes

2006-07-28 Thread bryanjugglercryptographer

sturlamolden wrote:
 A noteable exception is a toy OS from a manufacturer in Redmond,
 Washington. It does not do COW fork. It does not even fork.

 To make a server system scale well on Windows you need to use threads,
 not processes.

Here's one to think about: if you have a bunch of threads running,
and you fork, should the child process be born running all the
threads? Neither answer is very attractive. It's a matter of which
will probably do the least damage in most cases (and the answer
the popular threading systems choose is 'no'; the child process
runs only the thread that called fork).

MS-Windows is more thread-oriented than *nix, and it avoids this
particular problem by not using fork() to create new processes.

 That is why the global interpreter lock sucks so badly
 on Windows.

It sucks about the same on Windows and *nix: hardly at all on
single-processors, moderately on multi-processors.


-- 
--Bryan

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Threads vs Processes

2006-07-28 Thread [EMAIL PROTECTED]
sturlamolden wrote:
 Chance Ginger wrote:

  Not quite that simple. In most modern OS's today there is something
  called COW - copy on write. What happens is when you fork a process
  it will make an identical copy. Whenever the forked process does
  write will it make a copy of the memory. So it isn't quite as bad.

 A noteable exception is a toy OS from a manufacturer in Redmond,
 Washington. It does not do COW fork. It does not even fork.

That's only true for Windows 98/95/Windows 3.x and other DOS-based
Windows versions.

NTCreateProcess with SectionHandle=NULL creates a new process with a
COW version of the parent process's address space.

It's not called fork, but it does the same thing.  There's a new name
for it in Win2K or XP (maybe CreateProcessEx?) but the functionality
has been there since the NT 3.x days at least and is in all modern
Windows versions.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Threads vs Processes

2006-07-28 Thread [EMAIL PROTECTED]
mark wrote:
 On Wed, 26 Jul 2006 10:54:48 -0700, Carl J. Van Arsdall wrote:
  Alright, based a on discussion on this mailing list, I've started to
  wonder, why use threads vs processes.

 The debate should not be about threads vs processes, it should be
 about threads vs events.

Events serve a separate problem space.

Use event-driven state machine models for efficient multiplexing and
fast network I/O (e.g. writing an efficient static HTTP server)

Use multi-execution models for efficient multiprocessing.  No matter
how scalable your event-driven app is it's not going to take advantage
of multi-CPU systems, or modern multi-core processors.

Event-driven state machines can be harder to program and maintain than
multi-process solutions, but they are usually easier than
multi-threaded solutions.

On-topic: If your problem is one where event-driven state machines are
a good solution, Python generators can be a _huge_ help.
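
A tiny illustration of that point: a generator can hold the per-connection
state between events, so the event loop only has to send() it each chunk
of data as it arrives (a sketch only, not tied to any particular framework):

def line_collector(peer):
    # The local variable 'buf' survives between events; each send() is
    # one "data arrived" event delivered by the event loop.
    buf = ""
    while True:
        chunk = yield
        buf += chunk
        while "\n" in buf:
            line, buf = buf.split("\n", 1)
            print("%s said: %s" % (peer, line))

conn = line_collector("client-1")
next(conn)             # prime the generator
conn.send("hel")
conn.send("lo\nwor")   # prints: client-1 said: hello
conn.send("ld\n")      # prints: client-1 said: world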

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Threads vs Processes

2006-07-27 Thread [EMAIL PROTECTED]
John Henry wrote:
 
  Carl,
   OS writers provide much more tools for debugging, tracing, changing
  the priority of, sand-boxing processes than threads (in general) It
  *should* be easier to get a process based solution up and running
  andhave it be more robust, when compared to a threaded solution.
 
  - Paddy (who shies away from threads in C and C++ too ;-)

 That mythical process is more robust then thread application
 paradigm again.

 No wonder there are so many boring software applications around.

 Granted.  Threaded program forces you to think and design your
 application much more carefully (to avoid race conditions, dead-locks,
 ...) but there is nothing inherently *non-robust* about threaded
 applications.

Indeed.  Let's just get rid of all preemptive multitasking while we're
at it; MacOS9's cooperative, non-memory-protected system wasn't
inherently worse as long as every application was written properly.
There was nothing inherently non-robust about it!

The key difference between threads and processes is that threads share
all their memory, while processes have memory protection except with
particular segments of memory they choose to share.

The next most important difference is that certain languages have
different support for threads/procs.  If you're writing a Python
application, you need to be aware of the GIL and its implications on
multithreaded performance.  If you're writing a Java app, you're
handicapped by the lack of support for multiprocess solutions.

The third most important difference--and it's a very distant
difference--is the performance difference.  In practice, most
well-designed systems will be pooling threads/procs and so startup time
is not that critical.  For some apps, it may be.  Context switching
time may differ, and likewise that is not usually a sticking point but
for particular programs it can be.  On some OSes, launching a
copy-on-write process is difficult--that used to be a reason to choose
threads over procs on Windows, but nowadays all modern Windows OSes
offer a CreateProcessEx call that allows full-on COW processes.

In general, though, if you want to share _all_ memory or if you have
measured and context switching sucks on your OS and is a big factor in
your application, use threads.  In general, if you don't know exactly
why you're choosing one or the other, or if you want memory protection,
robustness in the face of programming errors, access to more 3rd-party
libraries, etc, then you should choose a multiprocess solution.

(OS designers spent years of hard work writing OSes with protected
memory--why voluntarily throw that out?)

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Threads vs Processes

2006-07-27 Thread [EMAIL PROTECTED]
Russell Warren wrote:
 This is something I have a streak of paranoia about (after discovering
 that the current xmlrpclib has some thread safety issues).  Is there a
 list maintained anywhere of the modules that are aren't thread safe?


It's much safer to work the other way: assume that libraries are _not_
thread safe unless they're listed as such.  Even things like the
standard C library on mainstream Linux distributions are only about 7
years into being thread-safe by default; anything at all esoteric you
should assume is not until you investigate and find documentation to
the contrary.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Threads vs Processes

2006-07-27 Thread [EMAIL PROTECTED]
[EMAIL PROTECTED] wrote:
 John Henry wrote:
  Granted.  Threaded program forces you to think and design your
  application much more carefully (to avoid race conditions, dead-locks,
  ...) but there is nothing inherently *non-robust* about threaded
  applications.

 Indeed.  Let's just get rid of all preemptive multitasking while we're
 at it

Also, race conditions and deadlocks are equally bad in multiprocess
solutions as in multithreaded ones.  Any time you're doing parallel
processing you need to consider them.

I'd actually submit that initially writing multiprocess programs
requires more design and forethought, since you need to determine
exactly what you want to share instead of just saying "what the heck,
everything's shared!"  The payoff in terms of getting _correct_
behavior more easily, having much easier maintenance down the line, and
being more robust in the face of program failures (or unforseen
environment issues) is usually well worth it, though there are
certainly some applications where threads are a better choice.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Threads vs Processes

2006-07-27 Thread Nick Craig-Wood
[EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
  Yes, someone can, and that someone might as well be you.
  How long does it take to create and clean up 100 trivial
  processes on your system? How about 100 threads? What
  portion of your user waiting time is that?

Here is test prog...

The results are on my 2.6GHz P4 linux system

  Forking
  1000 loops, best of 3: 546 usec per loop
  Threading
  1 loops, best of 3: 199 usec per loop

Indicating that starting up and tearing down new threads is 2.5 times
quicker than starting new processes under python.

This is probably irrelevant in the real world though!



"""
Time threads vs fork
"""

import os
import timeit
import threading

def do_child_stuff():
    """Trivial function for children to run"""
    # print "hello from child"
    pass

def fork_test():
    """Test forking"""
    pid = os.fork()
    if pid == 0:
        # child
        do_child_stuff()
        os._exit(0)
    # parent - wait for child to finish
    os.waitpid(pid, os.P_WAIT)

def thread_test():
    """Test threading"""
    t = threading.Thread(target=do_child_stuff)
    t.start()
    # wait for child to finish
    t.join()

def main():
    print "Forking"
    timeit.main(["-s", "from __main__ import fork_test", "fork_test()"])
    print "Threading"
    timeit.main(["-s", "from __main__ import thread_test", "thread_test()"])

if __name__ == "__main__":
    main()
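
For comparison on Pythons that have it, the same start/stop measurement
against multiprocessing.Process might look like this (a sketch, assuming
Python >= 2.6; the loop count is arbitrary and the numbers will not line
up exactly with the figures above):

import timeit
import multiprocessing

def do_child_stuff():
    """Trivial function for children to run"""
    pass

def mp_test():
    """Test multiprocessing"""
    p = multiprocessing.Process(target=do_child_stuff)
    p.start()
    # wait for child to finish
    p.join()

if __name__ == "__main__":
    print("multiprocessing")
    print(timeit.timeit("mp_test()",
                        setup="from __main__ import mp_test",
                        number=200))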

-- 
Nick Craig-Wood [EMAIL PROTECTED] -- http://www.craig-wood.com/nick
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Threads vs Processes

2006-07-27 Thread Steve Holden
Carl J. Van Arsdall wrote:
 Paul Rubin wrote:
 
Carl J. Van Arsdall [EMAIL PROTECTED] writes:
  

Processes seem fairly expensive from my research so far.  Each fork
copies the entire contents of memory into the new process.


No, you get two processes whose address spaces get the data.  It's
done with the virtual memory hardware.  The data isn't copied.  The
page tables of both processes are just set up to point to the same
physical pages.  Copying only happens if a process writes to one of
the pages.  The OS detects this using a hardware trap from the VM
system.
  
 
 Ah, alright.  So if that's the case, why would you use python threads 
 versus spawning processes?  If they both point to the same address space 
 and python threads can't run concurrently due to the GIL what are they 
 good for?
 

Well, of course they can interleave essentially independent 
computations, which is what threads (formerly lightweight processes) 
were traditionally defined for.

Further, some thread-safe extension (compiled) libraries will release 
the GIL during their work, allowing other threads to execute 
simultaneously - and even in parallel on multi-processor hardware.
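
A quick way to see that overlap with nothing but the standard library
(time.sleep releases the GIL while it blocks, so the ten sleeps really
do run at the same time):

import threading
import time

def wait_a_second():
    time.sleep(1)      # blocks without holding the GIL

start = time.time()
threads = [threading.Thread(target=wait_a_second) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("10 one-second sleeps took %.1f seconds" % (time.time() - start))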

regards
  Steve
-- 
Steve Holden   +44 150 684 7255  +1 800 494 3119
Holden Web LLC/Ltd  http://www.holdenweb.com
Skype: holdenweb   http://holdenweb.blogspot.com
Recent Ramblings http://del.icio.us/steve.holden

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Threads vs Processes

2006-07-27 Thread Gerhard Fiedler
On 2006-07-26 19:10:14, Carl J. Van Arsdall wrote:

 Ah, alright.  So if that's the case, why would you use python threads 
 versus spawning processes?  If they both point to the same address space 
 and python threads can't run concurrently due to the GIL what are they 
 good for?

Nothing runs concurrently on a single core processor (pipelining aside).
Processes don't run any more concurrently than threads. The scheduling is
different, but they still run sequentially.

Gerhard

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Threads vs Processes

2006-07-27 Thread Carl J. Van Arsdall
[EMAIL PROTECTED] wrote:
 Carl J. Van Arsdall wrote:
   
 Alright, based a on discussion on this mailing list, I've started to
 wonder, why use threads vs processes.
 

 In many cases, you don't have a choice. If your Python program
 is to run other programs, the others get their own processes.
 There's no threads option on that.

 If multiple lines of execution need to share Python objects,
 then the standard Python distribution supports threads, while
 processes would require some heroic extension. Don't confuse
 sharing memory, which is now easy, with sharing Python
 objects, which is hard.

   
Ah, alright, I think I understand, so threading works well for sharing 
Python objects.  Would a scenario for this be something like a job 
queue (say Queue.Queue), for example?  This is a situation in which each 
process/thread needs access to the Queue to get the next task it must 
work on.  Does that sound right?  Would the same apply to multiple 
threads needing access to a dictionary? A list?

Now, if you are just passing ints and strings around, use processes with 
some type of IPC -- does that sound right as well?  Or does the term 
"shared memory" mean something more low-level, like some bits that don't 
necessarily mean anything to Python but might mean something to your 
application?

Sorry if you guys think I'm beating this to death, just really trying to 
get a firm grasp on what you are telling me and again, thanks for taking 
the time to explain all of this to me!

-carl


-- 

Carl J. Van Arsdall
[EMAIL PROTECTED]
Build and Release
MontaVista Software

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Threads vs Processes

2006-07-27 Thread John Henry

[EMAIL PROTECTED] wrote:
 [EMAIL PROTECTED] wrote:
  John Henry wrote:
   Granted.  Threaded program forces you to think and design your
   application much more carefully (to avoid race conditions, dead-locks,
   ...) but there is nothing inherently *non-robust* about threaded
   applications.
 
  Indeed.  Let's just get rid of all preemptive multitasking while we're
  at it

 Also, race conditions and deadlocks are equally bad in multiprocess
 solutions as in multithreaded ones.  Any time you're doing parallel
 processing you need to consider them.


Only in the sense that you are far more likely to be dealing with
shared resources in a multi-threaded application.  When I start a
sub-process, I know I am doing that to *avoid* resource sharing.  So,
the chance of a dead-lock is less - only because I would do it far
less.

 I'd actually submit that initially writing multiprocess programs
 requires more design and forethought, since you need to determine
 exactly what you want to share instead of just saying what the heck,
 everything's shared!  The payoff in terms of getting _correct_
 behavior more easily, having much easier maintenance down the line, and
 being more robust in the face of program failures (or unforseen
 environment issues) is usually well worth it, though there are
 certainly some applications where threads are a better choice.

If you're sharing things, I would thread. I would not want to  pay the
expense of a process.

It's too bad that programmers are not threading more often.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Threads vs Processes

2006-07-27 Thread [EMAIL PROTECTED]
John Henry wrote:
 If you're sharing things, I would thread. I would not want to  pay the
 expense of a process.

This is generally a false cost.  There are very few applications where
thread/process startup time is at all a fast path, and there are
likewise few where the difference in context switching time matters at
all.  Indeed, in a Python program on a multiprocessor system, processes
are potentially faster than threads, not slower.

Moreover, to get at best a small performance gain you pay a huge cost
by sacrificing memory protection within the threaded process.

You can share things between processes, but you can't memory protect
things between threads.  So if you need some of each (some things
shared and others protected), processes are the clear choice.

Now, for a few applications threads make sense.  Usually that means
applications that have to share a great number of complex data
structures (and normally, making the choice for performance reasons
means your design is flawed and you could help performance greatly by
reworking it--though there may be some exceptions).  But the general
rule when choosing between them should be "use processes when you can,
and threads when you must".

Sadly, too many programmers greatly overuse threading.  That problem is
exacerbated by the number of beginner-level programming books that talk
about how to use threads without ever mentioning processes (and without
going into the design of multi-execution apps).

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Threads vs Processes

2006-07-27 Thread Shitiz Bansal
How do I share memory between processes?

"[EMAIL PROTECTED]" [EMAIL PROTECTED] wrote:
 snip

-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Threads vs Processes

2006-07-27 Thread John Henry
Nick Craig-Wood wrote:

 Here is test prog...


snip

Here's a more real-life-like program done in both single-threaded mode
and multi-threaded mode.  You'll need PythonCard to try this.  Just to
make the point, you will notice that the core code is identical between
the two (method on_menuFileStart_exe).  The only difference is in the
setup code.  I wanted to dismiss the myth that multi-threaded programs
are inherently *evil*, or that they're difficult to code, or that they're
unsafe (whatever dirty water people wish to throw at them).

Don't ask me to try this with processes!

To have fun, first run it in single threaded mode (change the main
program to invoke the MyBackground class, instead of the
MyBackgroundThreaded class):

Change:

app = model.Application(MyBackgroundThreaded)

to:

app = model.Application(MyBackground)

Start the process by selecting File->Start, and then try to stop the
program by clicking File->Stop.  Note the performance of the program.

Now, run it in multi-threaded mode.  Click File->Start several times
(up to 4) and then try to stop the program by clicking File->Stop.

If you want to show off, add several more StaticText items in the
resource file, add them to the textAreas list in MyBackgroundThreaded
class and let it rip!

BTW: This app also demonstrates the weakness in Python threads - the
threads don't get preempted equally (not even close).

:-)

Two files follow (test.py and test.rsrc.py):

#!/usr/bin/python


__version__ = "$Revision: 1.1 $"
__date__ = "$Date: 2004/10/24 19:21:46 $"


import wx
import threading
import thread
import time

from PythonCard import model

class MyBackground(model.Background):

    def on_initialize(self, event):
        # if you have any initialization
        # including sizer setup, do it here
        self.running(False)
        self.textAreas=(self.components.TextArea1,)
        return

    def on_menuFileStart_select(self, event):
        self.on_menuFileStart_exe(self.textAreas[0])
        return

    def on_menuFileStart_exe(self, textArea):
        textArea.visible=True
        self.running(True)
        for i in range(1000):
            textArea.text = "Got up to %d" % i
            ##print i
            for j in range(i):
                k = 0
                time.sleep(0)
                if not self.running(): break
            try:
                wx.SafeYield(self)
            except:
                pass
            if not self.running(): break
        textArea.text = "Finished at %d" % i
        return

    def on_menuFileStop_select(self, event):
        self.running(False)

    def on_Stop_mouseClick(self, event):
        self.on_menuFileStop_select(event)
        return

    def running(self, flag=None):
        if flag!=None:
            self.runningFlag=flag
        return self.runningFlag


class MyBackgroundThreaded(MyBackground):

    def on_initialize(self, event):
        # if you have any initialization
        # including sizer setup, do it here
        self.myLock=thread.allocate_lock()
        self.myThreadCount = 0
        self.running(False)
        self.textAreas=[self.components.TextArea1,
                        self.components.TextArea2,
                        self.components.TextArea3,
                        self.components.TextArea4]
        return

    def on_menuFileStart_select(self, event):
        res=MyBackgroundWorker(self).start()

    def on_menuFileStop_select(self, event):
        self.running(False)
        self.menuBar.setEnabled("menuFileStart", True)

    def on_Stop_mouseClick(self, event):
        self.on_menuFileStop_select(event)

    def running(self, flag=None):
        self.myLock.acquire()
        if flag!=None:
            self.runningFlag=flag
        flag=self.runningFlag
        self.myLock.release()
        return flag

class MyBackgroundWorker(threading.Thread):
    def __init__(self, parent):
        threading.Thread.__init__(self)
        self.parent=parent
        self.parent.myLock.acquire()
        threadCount=self.parent.myThreadCount
        self.parent.myLock.release()
        self.textArea=self.parent.textAreas[threadCount]

    def run(self):
        self.parent.myLock.acquire()
        self.parent.myThreadCount += 1
        if self.parent.myThreadCount==len(self.parent.textAreas):
            self.parent.menuBar.setEnabled("menuFileStart", False)
        self.parent.myLock.release()

        self.parent.on_menuFileStart_exe(self.textArea)

   

Re: Threads vs Processes

2006-07-27 Thread bryanjugglercryptographer

Carl J. Van Arsdall wrote:
 Ah, alright, I think I understand, so threading works well for sharing
 python objects.  Would a scenario for this be something like a a job
 queue (say Queue.Queue) for example.  This is a situation in which each
 process/thread needs access to the Queue to get the next task it must
 work on.  Does that sound right?

That's a reasonable and popular technique. I'm not sure what "this"
refers to in your question, so I can't say if it solves the
problem of which you are thinking.

  Would the same apply to multiple
 threads needed access to a dictionary? list?

The Queue class is popular with threads because it already has
locking around its basic methods. You'll need to serialize your
operations when sharing most kinds of objects.
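
A bare-bones version of the job-queue arrangement being discussed
(spelled for Python 3, where the module is called queue rather than the
Queue of the 2.x line in this thread; the shared results dict gets its
own explicit lock):

import queue
import threading

jobs = queue.Queue()
results = {}
results_lock = threading.Lock()

def worker():
    while True:
        n = jobs.get()
        if n is None:            # sentinel: no more work for this thread
            break
        with results_lock:       # serialize access to the plain dict
            results[n] = n * n
        jobs.task_done()

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for n in range(10):
    jobs.put(n)
for _ in threads:
    jobs.put(None)
for t in threads:
    t.join()
print(results)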

 Now if you are just passing ints and strings around, use processes with
 some type of IPC, does that sound right as well?

Also reasonable and popular. You can even pass many Python objects
by value using pickle, though you lose some safety.

  Or does the term
 shared memory mean something more low-level like some bits that don't
 necessarily mean anything to python but might mean something to your
 application?

"Shared memory" means the same memory appears in multiple processes,
possibly at different address ranges. What any of them writes to
the memory, they can all read. The standard Python distribution
now offers shared memory via the mmap module, but lacks cross-process
locks.

Python doesn't support allocating objects in shared memory, and
doing so would be difficult. That's what the POSH project is
about, but it looks stuck in alpha.
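
(On later Pythons, the multiprocessing module -- Python >= 2.6 -- fills
part of that gap: it can allocate simple ctypes values in shared memory
and bundles a cross-process lock with them. A sketch:)

import multiprocessing

def bump(counter):
    for _ in range(100000):
        with counter.get_lock():   # cross-process lock bundled with the value
            counter.value += 1

if __name__ == "__main__":
    counter = multiprocessing.Value("i", 0)   # a C int living in shared memory
    procs = [multiprocessing.Process(target=bump, args=(counter,))
             for _ in range(2)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print(counter.value)    # 200000, not a garbled partial count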


-- 
--Bryan

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Threads vs Processes

2006-07-27 Thread Grant Edwards
On 2006-07-27, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:

 If you're sharing things, I would thread. I would not want to
 pay the expense of a process.

 This is generally a false cost.  There are very few
 applications where thread/process startup time is at all a
 fast path,

Even if it were, on any sanely designed OS, there really isn't
any extra expense for a process over a thread.

 Moreover, to get at best a small performance gain you pay a
 huge cost by sacrificing memory protection within the threaded
 process.

Threading most certainly shouldn't be done in some attempt to
improve performance over a multi-process model.  It should be
done because it fits the algorithm better.  If the execution
contexts don't need to share data and can communicate in a
simple manner, then processes probably make more sense.  If the
contexts need to operate jointly on complex shared data, then
threads are usually easier.

-- 
Grant Edwards   grante Yow!  My life is a patio
  at   of fun!
   visi.com
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Threads vs Processes

2006-07-27 Thread Carl J. Van Arsdall
[EMAIL PROTECTED] wrote:
 Carl J. Van Arsdall wrote:
   
 Ah, alright, I think I understand, so threading works well for sharing
 python objects.  Would a scenario for this be something like a a job
 queue (say Queue.Queue) for example.  This is a situation in which each
 process/thread needs access to the Queue to get the next task it must
 work on.  Does that sound right?
 

 That's a reasonable and popular technique. I'm not sure what this
 refers to in your question, so I can't say if it solves the
 problem of which you are thinking.

   
  Would the same apply to multiple
 threads needed access to a dictionary? list?
 

 The Queue class is popular with threads because it already has
 locking around its basic methods. You'll need to serialize your
 operations when sharing most kinds of objects.

   
Yes yes, of course.  I was just making sure we are on the same page, and 
I think I'm finally getting there.


 Now if you are just passing ints and strings around, use processes with
 some type of IPC, does that sound right as well?
 

 Also reasonable and popular. You can even pass many Python objects
 by value using pickle, though you lose some safety.
   
I actually do use pickle (not for this, but for other things), could you 
elaborate on the safety issue? 


   
  Or does the term
 shared memory mean something more low-level like some bits that don't
 necessarily mean anything to python but might mean something to your
 application?
 

 Shared memory means the same memory appears in multiple processes,
 possibly at different address ranges. What any of them writes to
 the memory, they can all read. The standard Python distribution
 now offers shared memory via os.mmap(), but lacks cross-process
 locks.
   
 Python doesn't support allocating objects in shared memory, and
 doing so would be difficult. That's what the POSH project is
 about, but it looks stuck in alpha.


   


-- 

Carl J. Van Arsdall
[EMAIL PROTECTED]
Build and Release
MontaVista Software

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Threads vs Processes

2006-07-27 Thread bryanjugglercryptographer

Carl J. Van Arsdall wrote:
[...]
 I actually do use pickle (not for this, but for other things), could you
 elaborate on the safety issue?

From http://docs.python.org/lib/node63.html :

Warning: The pickle module is not intended to be secure
against erroneous or maliciously constructed data. Never
unpickle data received from an untrusted or unauthenticated
source.

A corrupted pickle can crash Python. An evil pickle could probably
hijack your process.


-- 
--Bryan

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Threads vs Processes

2006-07-27 Thread Carl J. Van Arsdall
[EMAIL PROTECTED] wrote:
 Carl J. Van Arsdall wrote:
 [...]
   
 I actually do use pickle (not for this, but for other things), could you
 elaborate on the safety issue?
 

 From http://docs.python.org/lib/node63.html :

 Warning: The pickle module is not intended to be secure
 against erroneous or maliciously constructed data. Never
 unpickle data received from an untrusted or unauthenticated
 source.

 A corrupted pickle can crash Python. An evil pickle could probably
 hijack your process.


   
Ah, if the data is coming from someone else.  I understand.  Thanks.

-- 

Carl J. Van Arsdall
[EMAIL PROTECTED]
Build and Release
MontaVista Software

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Threads vs Processes

2006-07-27 Thread mark
On Wed, 26 Jul 2006 10:54:48 -0700, Carl J. Van Arsdall wrote:
 Alright, based a on discussion on this mailing list, I've started to
 wonder, why use threads vs processes.

The debate should not be about threads vs processes, it should be
about threads vs events. Dr. John Ousterhout (creator of Tcl,
Professor of Comp Sci at UC Berkeley, etc), started a famous debate
about this 10 years ago with the following simple presentation.

http://home.pacbell.net/ouster/threads.pdf

That sentiment has largely been ignored and thread usage dominates but,
if you have been programming for as long as I have, and have used both
thread based architectures AND event/reactor/callback based
architectures, then that simple presentation above should ring very
true. Problem is, young people merely equate newer == better.

On large systems and over time, thread based architectures often tend
towards chaos. I have seen a few thread based systems where the
programmers become so frustrated with subtle timing issues etc, and they
eventually overlay so many mutexes etc, that the implementation becomes
single threaded in practice anyhow(!), and very inefficient.

BTW, I am fairly new to python but I have seen that the python Twisted
framework is a good example of the event/reactor design alternative to
threads. See

http://twistedmatrix.com/projects/core/documentation/howto/async.html .

Douglas Schmidt is a famous designer and author (ACE, Corba Tao, etc)
who has written much about reactor design patterns, see
"Pattern-Oriented Software Architecture", Vol 2, Wiley 2000, amongst
many other references of his.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Threads vs Processes

2006-07-27 Thread bryanjugglercryptographer
mark wrote:
 The debate should not be about threads vs processes, it should be
 about threads vs events.

We are so lucky as to have both debates.

 Dr. John Ousterhout (creator of Tcl,
 Professor of Comp Sci at UC Berkeley, etc), started a famous debate
 about this 10 years ago with the following simple presentation.

 http://home.pacbell.net/ouster/threads.pdf

The Ousterhout school finds multiple lines of execution
unmanageable, while the Tanenbaum school finds asynchronous I/O
unmanageable.

What's so hard about single-line-of-control (SLOC) event-driven
programming? You can't call anything that might block. You have to
initiate the operation, store all the state you'll need in order
to pick up where you left off, then return all the way back to the
event dispatcher.
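
For readers who have not written one, the shape being described looks
roughly like this (a toy non-blocking echo server built on select(); the
address is made up, and per-connection state has to live in the buffers
dict rather than on anyone's call stack):

import select
import socket

server = socket.socket()
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(("127.0.0.1", 8000))
server.listen(5)
server.setblocking(False)

sockets = [server]
buffers = {}          # per-connection state, parked between events

while True:
    readable, _, _ = select.select(sockets, [], [])
    for s in readable:
        if s is server:
            conn, _ = s.accept()
            conn.setblocking(False)
            sockets.append(conn)
            buffers[conn] = b""
        else:
            data = s.recv(4096)        # won't block: select said readable
            if not data:
                sockets.remove(s)
                del buffers[s]
                s.close()
            else:
                buffers[s] += data     # stash state, return to dispatcher
                s.send(data)           # echo the chunk back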

 That sentiment has largely been ignored and thread usage dominates but,
 if you have been programming for as long as I have, and have used both
 thread based architectures AND event/reactor/callback based
 architectures, then that simple presentation above should ring very
 true. Problem is, young people merely equate newer == better.

Newer? They're both old as the trees. That can't be why the whiz
kids like them. Threads and process rule because of their success.

 On large systems and over time, thread based architectures often tend
 towards chaos.

While large SLOC event-driven systems surely tend to chaos. Why?
Because they *must* be structured around where blocking operations
can happen, and that is not the structure anyone would choose for
clarity, maintainability and general chaos avoidance.

Even the simplest of modular structures, the procedure, gets
broken. Whether you can encapsulate a sequence of operations in a
procedure depends upon whether it might need to do an operation
that could block.

Going farther, consider writing a class supporting overriding of
some method. Easy; we Pythoneers do it all the time; that's what
O.O. inheritance is all about. Now what if the subclass's version
of the method needs to look up external data, and thus might
block? How does a method override arrange for the call chain to
return all the way back to the event loop, and to and pick up
again with the same call chain when the I/O comes in?

 I have seen a few thread based systems where the
 programmers become so frustrated with subtle timing issues etc, and they
 eventually overlay so many mutexes etc, that the implementation becomes
 single threaded in practice anyhow(!), and very inefficient.

While we simply do not see systems as complex as modern DBMS's
written in the SLOC event-driven style.

 BTW, I am fairly new to python but I have seen that the python Twisted
 framework is a good example of the event/reactor design alternative to
 threads. See

 http://twistedmatrix.com/projects/core/documentation/howto/async.html .

And consequently, to use Twisted you rewrite all your code as
those 'deferred' things.


-- 
--Bryan

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Threads vs Processes

2006-07-27 Thread Nick Vatamaniuc
It seems that both ways are here to stay. If one were so much inferior
and problem-prone, we wouldn't be talking about it now; it would have been
forgotten on the same shelf with a stack of punch cards.

The rule of thumb is 'the right tool for the right job.'

The threading model is very useful for long CPU-bound processing, as it
can potentially take advantage of multiple CPUs/cores (alas, not in
Python right now, because of the GIL). Events will not work as well
here. But note that if there is not much sharing of resources between
threads, processes could be used! It turns out that there are very few
cases where threads are simply indispensable.

The event model is usually well suited for I/O, or for cases where a
large number of shared resources would require lots of synchronization
if threads were used.

DBMSs are not a good example of a typical large system, so saying 'see,
DBMSs use threads -- therefore threads are better' doesn't make a good
argument. DBMSs are highly optimized; only a few of them actually manage
to successfully take advantage of multiple execution units. One could
as well cite a hundred other projects and say 'see, it uses an event
model -- therefore event models are better', and so on. Again, the right
tool for the right job. A good programmer should know both...

 And consequently, to use Twisted you rewrite all your code as
 those 'deferred' things.

Then, try re-writing Twisted using threads in the same number of lines,
with the same or better performance.  I bet you'll end up with a
whole bunch of 'locks', 'waits' and 'notify's instead of a bunch of
those 'deferred' things. Debugging all those threads should be a
project in and of itself.

-Nick


[EMAIL PROTECTED] wrote:
 mark wrote:
  The debate should not be about threads vs processes, it should be
  about threads vs events.

 We are so lucky as to have both debates.

  Dr. John Ousterhout (creator of Tcl,
  Professor of Comp Sci at UC Berkeley, etc), started a famous debate
  about this 10 years ago with the following simple presentation.
 
  http://home.pacbell.net/ouster/threads.pdf

 The Ousterhout school finds multiple lines of execution
 unmanageable, while the Tannenbaum school finds asynchronous I/O
 unmanageable.

 What's so hard about single-line-of-control (SLOC) event-driven
 programming? You can't call anything that might block. You have to
 initiate the operation, store all the state you'll need in order
 to pick up where you left off, then return all the way back to the
 event dispatcher.

  That sentiment has largely been ignored and thread usage dominates but,
  if you have been programming for as long as I have, and have used both
  thread based architectures AND event/reactor/callback based
  architectures, then that simple presentation above should ring very
  true. Problem is, young people merely equate newer == better.

 Newer? They're both old as the trees. That can't be why the whiz
 kids like them. Threads and process rule because of their success.

  On large systems and over time, thread based architectures often tend
  towards chaos.

 While large SLOC event-driven systems surely tend to chaos. Why?
 Because they *must* be structured around where blocking operations
 can happen, and that is not the structure anyone would choose for
 clarity, maintainability and general chaos avoidance.

 Even the simplest of modular structures, the procedure, gets
 broken. Whether you can encapsulate a sequence of operations in a
 procedure depends upon whether it might need to do an operation
 that could block.

 Going farther, consider writing a class supporting overriding of
 some method. Easy; we Pythoneers do it all the time; that's what
 O.O. inheritance is all about. Now what if the subclass's version
 of the method needs to look up external data, and thus might
 block? How does a method override arrange for the call chain to
 return all the way back to the event loop, and to and pick up
 again with the same call chain when the I/O comes in?

  I have seen a few thread based systems where the
  programmers become so frustrated with subtle timing issues etc, and they
  eventually overlay so many mutexes etc, that the implementation becomes
  single threaded in practice anyhow(!), and very inefficient.

 While we simply do not see systems as complex as modern DBMS's
 written in the SLOC event-driven style.

  BTW, I am fairly new to python but I have seen that the python Twisted
  framework is a good example of the event/reactor design alternative to
  threads. See
 
  http://twistedmatrix.com/projects/core/documentation/howto/async.html .

 And consequently, to use Twisted you rewrite all your code as
 those 'deferred' things.
 
 
 -- 
 --Bryan

-- 
http://mail.python.org/mailman/listinfo/python-list


Threads vs Processes

2006-07-26 Thread Carl J. Van Arsdall
Alright, based on a discussion on this mailing list, I've started to 
wonder: why use threads vs. processes?  So, if I have a system that has a 
large area of shared memory, which would be better?  I've been leaning 
towards threads, and I'm going to say why.

Processes seem fairly expensive from my research so far.  Each fork 
copies the entire contents of memory into the new process.  There's also 
a more expensive context switch between processes.  So if I have a 
system that would fork 50+ child processes, my memory usage would be huge 
and I'd burn more cycles than I have to.  I understand that there 
are ways of doing IPC, but aren't these also more expensive?

So threads seem faster and more efficient for this scenario.  That 
alone makes me want to stay with threads, but I get the feeling from 
people on this list that processes are better and that threads are 
overused.  I don't understand why, so can anyone shed any light on this?


Thanks,

-carl

-- 

Carl J. Van Arsdall
[EMAIL PROTECTED]
Build and Release
MontaVista Software

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Threads vs Processes

2006-07-26 Thread Chance Ginger
On Wed, 26 Jul 2006 10:54:48 -0700, Carl J. Van Arsdall wrote:

 Alright, based a on discussion on this mailing list, I've started to 
 wonder, why use threads vs processes.  So, If I have a system that has a 
 large area of shared memory, which would be better?  I've been leaning 
 towards threads, I'm going to say why.
 
 Processes seem fairly expensive from my research so far.  Each fork 
 copies the entire contents of memory into the new process.  There's also 
 a more expensive context switch between processes.  So if I have a 
 system that would fork 50+ child processes my memory usage would be huge 
 and I burn more cycles that I don't have to.  I understand that there 
 are ways of IPC, but aren't these also more expensive?
 
 So threads seems faster and more efficient for this scenario.  That 
 alone makes me want to stay with threads, but I get the feeling from 
 people on this list that processes are better and that threads are over 
 used.  I don't understand why, so can anyone shed any light on this?
 
 
 Thanks,
 
 -carl

Not quite that simple. In most modern OS's today there is something
called COW - copy on write. What happens is that when you fork a process
it will make an identical copy, but only when the forked process
writes will it actually copy the memory. So it isn't quite as bad.

Secondly, with context switching, if the OS is smart it might not 
flush the entire TLB. Since most applications are pretty local as
far as execution goes, it might very well be the case that the page (or 
pages) are already in memory. 

As far as Python goes, what you need to determine is how much 
real parallelism you want. Since there is a global lock in Python,
you will only execute a few (as in tens of) instructions before
switching to the next thread. In the case of true processes you 
have two independent Python virtual machines. That may make things
go much faster.

Another issue is the libraries you use. A lot of them aren't 
thread safe. So you need to watch out.

Chance
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Threads vs Processes

2006-07-26 Thread John Henry

Chance Ginger wrote:
 On Wed, 26 Jul 2006 10:54:48 -0700, Carl J. Van Arsdall wrote:

  Alright, based a on discussion on this mailing list, I've started to
  wonder, why use threads vs processes.  So, If I have a system that has a
  large area of shared memory, which would be better?  I've been leaning
  towards threads, I'm going to say why.
 
  Processes seem fairly expensive from my research so far.  Each fork
  copies the entire contents of memory into the new process.  There's also
  a more expensive context switch between processes.  So if I have a
  system that would fork 50+ child processes my memory usage would be huge
  and I burn more cycles that I don't have to.  I understand that there
  are ways of IPC, but aren't these also more expensive?
 
  So threads seems faster and more efficient for this scenario.  That
  alone makes me want to stay with threads, but I get the feeling from
  people on this list that processes are better and that threads are over
  used.  I don't understand why, so can anyone shed any light on this?
 
 
  Thanks,
 
  -carl

 Not quite that simple. In most modern OS's today there is something
 called COW - copy on write. What happens is when you fork a process
 it will make an identical copy. Whenever the forked process does
 write will it make a copy of the memory. So it isn't quite as bad.

 Secondly, with context switching if the OS is smart it might not
 flush the entire TLB. Since most applications are pretty local as
 far as execution goes, it might very well be the case the page (or
 pages) are already in memory.

 As far as Python goes what you need to determine is how much
 real parallelism you want. Since there is a global lock in Python
 you will only execute a few (as in tens) instructions before
 switching to the new thread. In the case of true process you
 have two independent Python virtual machines. That may make things
 go much faster.

 Another issue is the libraries you use. A lot of them aren't
 thread safe. So you need to watch out.

 Chance

It's all about performance (and sometimes the perception of
performance).  Even though the thread support (and performance) in
Python is fairly weak (as explained by Chance), it's nonetheless very
useful.  My applications thread a lot and it proves to be invaluable -
particularly with GUI-type applications.  I am the type of user that
gets annoyed very quickly and easily if the program doesn't respond to
me when I click something.  So, as a rule of thumb, if the code has to
do much of anything that takes, say, a tenth of a second or more, I
thread.

I posted a simple demo program yesterday to the PythonCard list to show
why somebody would want to thread an app.  You can probably see it in
the archive.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Threads vs Processes

2006-07-26 Thread Paul Rubin
Carl J. Van Arsdall [EMAIL PROTECTED] writes:
 Processes seem fairly expensive from my research so far.  Each fork
 copies the entire contents of memory into the new process.

No, you get two processes whose address spaces get the data.  It's
done with the virtual memory hardware.  The data isn't copied.  The
page tables of both processes are just set up to point to the same
physical pages.  Copying only happens if a process writes to one of
the pages.  The OS detects this using a hardware trap from the VM
system.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Threads vs Processes

2006-07-26 Thread Russell Warren
 Another issue is the libraries you use. A lot of them aren't
 thread safe. So you need to watch out.

This is something I have a streak of paranoia about (after discovering
that the current xmlrpclib has some thread safety issues).  Is there a
list maintained anywhere of the modules that aren't thread safe?

Russ

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Threads vs Processes

2006-07-26 Thread Carl J. Van Arsdall
Paul Rubin wrote:
 Carl J. Van Arsdall [EMAIL PROTECTED] writes:
   
 Processes seem fairly expensive from my research so far.  Each fork
 copies the entire contents of memory into the new process.
 

 No, you get two processes whose address spaces get the data.  It's
 done with the virtual memory hardware.  The data isn't copied.  The
 page tables of both processes are just set up to point to the same
 physical pages.  Copying only happens if a process writes to one of
 the pages.  The OS detects this using a hardware trap from the VM
 system.
   
Ah, alright.  So if that's the case, why would you use python threads 
versus spawning processes?  If they both point to the same address space 
and python threads can't run concurrently due to the GIL what are they 
good for?

-c

-- 

Carl J. Van Arsdall
[EMAIL PROTECTED]
Build and Release
MontaVista Software

-- 
http://mail.python.org/mailman/listinfo/python-list



Re: Threads vs Processes

2006-07-26 Thread Russell Warren
Oops - minor correction... xmlrpclib is fine (I think/hope).  It is
SimpleXMLRPCServer that currently has issues.  It uses
thread-unfriendly sys.exc_value and sys.exc_type... this is being
corrected.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Threads vs Processes

2006-07-26 Thread Paddy

Carl J. Van Arsdall wrote:
 Alright, based a on discussion on this mailing list, I've started to
 wonder, why use threads vs processes.  So, If I have a system that has a
 large area of shared memory, which would be better?  I've been leaning
 towards threads, I'm going to say why.

 Processes seem fairly expensive from my research so far.  Each fork
 copies the entire contents of memory into the new process.  There's also
 a more expensive context switch between processes.  So if I have a
 system that would fork 50+ child processes my memory usage would be huge
 and I burn more cycles that I don't have to.  I understand that there
 are ways of IPC, but aren't these also more expensive?

 So threads seems faster and more efficient for this scenario.  That
 alone makes me want to stay with threads, but I get the feeling from
 people on this list that processes are better and that threads are over
 used.  I don't understand why, so can anyone shed any light on this?


 Thanks,

 -carl

 --

 Carl J. Van Arsdall
 [EMAIL PROTECTED]
 Build and Release
 MontaVista Software

Carl,
 OS writers provide many more tools for debugging, tracing, changing
the priority of, and sand-boxing processes than threads (in general).  It
*should* be easier to get a process-based solution up and running,
and have it be more robust, when compared to a threaded solution.

- Paddy (who shies away from threads in C and C++ too ;-)

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Threads vs Processes

2006-07-26 Thread John Henry



 Carl,
  OS writers provide much more tools for debugging, tracing, changing
 the priority of, sand-boxing processes than threads (in general) It
 *should* be easier to get a process based solution up and running
 andhave it be more robust, when compared to a threaded solution.

 - Paddy (who shies away from threads in C and C++ too ;-)

That mythical "process is more robust than thread" application
paradigm again.

No wonder there are so many boring software applications around.

Granted, a threaded program forces you to think about and design your
application much more carefully (to avoid race conditions, dead-locks,
...) but there is nothing inherently *non-robust* about threaded
applications.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Threads vs Processes

2006-07-26 Thread Gerhard Fiedler
On 2006-07-26 21:02:59, John Henry wrote:

 Granted.  Threaded program forces you to think and design your
 application much more carefully (to avoid race conditions, dead-locks,
 ...) but there is nothing inherently *non-robust* about threaded
 applications.

You just need to make sure that every piece of code you're using is
thread-safe, while OTOH making sure they are all process-safe is the job
of the OS, so to speak :)

Gerhard

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Threads vs Processes

2006-07-26 Thread Joe Knapka
John Henry wrote:

 
Carl,
 OS writers provide much more tools for debugging, tracing, changing
the priority of, sand-boxing processes than threads (in general) It
*should* be easier to get a process based solution up and running
andhave it be more robust, when compared to a threaded solution.

- Paddy (who shies away from threads in C and C++ too ;-)
 
 
 That mythical process is more robust then thread application
 paradigm again.
 
 No wonder there are so many boring software applications around.
 
 Granted.  Threaded program forces you to think and design your
 application much more carefully (to avoid race conditions, dead-locks,
 ...) but there is nothing inherently *non-robust* about threaded
 applications.

In this particular case, the OP (in a different thread)
mentioned that his application will be extended by
random individuals who can't necessarily be trusted
to design their extensions correctly.  In that case,
segregating the untrusted code, at least, into
separate processes seems prudent.

The OP also mentioned that:

  If I have a system that has a large area of shared memory,
  which would be better?

IMO, if you're going to be sharing data structures with
code that can't be trusted to clean up after itself,
you're doomed. There's just no way to make that
scenario work reliably. The best you can do is insulate
that data behind an API (rather than giving untrusted
code direct access to the data -- IOW, don't use threads,
because if you do, they can go around your API and screw
things up), and ensure that each API call leaves the
data structures in a consistent state.

-- JK
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Threads vs Processes

2006-07-26 Thread bryanjugglercryptographer
Carl J. Van Arsdall wrote:
 Alright, based a on discussion on this mailing list, I've started to
 wonder, why use threads vs processes.

In many cases, you don't have a choice. If your Python program
is to run other programs, the others get their own processes.
There's no threads option on that.

If multiple lines of execution need to share Python objects,
then the standard Python distribution supports threads, while
processes would require some heroic extension. Don't confuse
sharing memory, which is now easy, with sharing Python
objects, which is hard.
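
For the "run other programs" case, the standard subprocess module is the
usual route; a trivial sketch (the command here is just an example):

import subprocess

# Run another program in its own process and collect its output.
proc = subprocess.Popen(["python", "--version"],
                        stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
out, _ = proc.communicate()
print("exit status %s, output %r" % (proc.returncode, out))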


 So, If I have a system that has a
 large area of shared memory, which would be better?  I've been leaning
 towards threads, I'm going to say why.

 Processes seem fairly expensive from my research so far.  Each fork
 copies the entire contents of memory into the new process.

As others have pointed out, not usually true with modern OS's.

 There's also
 a more expensive context switch between processes.  So if I have a
 system that would fork 50+ child processes my memory usage would be huge
 and I burn more cycles that I don't have to.

Again, not usually true. Modern OS's share code across
processes. There's no way to tell the size of 100
unspecified processes, but the number is nothing special.

 So threads seems faster and more efficient for this scenario.  That
 alone makes me want to stay with threads, but I get the feeling from
 people on this list that processes are better and that threads are over
 used.  I don't understand why, so can anyone shed any light on this?

Yes, someone can, and that someone might as well be you.
How long does it take to create and clean up 100 trivial
processes on your system? How about 100 threads? What
portion of your user waiting time is that?


-- 
--Bryan

-- 
http://mail.python.org/mailman/listinfo/python-list