James wrote:
> Thanks for the quick reply.
>
> Interesting. I'm a little overwhelmed with the different terminology
> (fork, spawn, thread, etc.). I'm under the impression that I'm
> supposed to use os.fork() or os.spawn() for something like what I'm
> trying to do (start multiple instances of the I/O utility from one
> Python script).
>

A fork is a fundamental system call in which the OS makes a nearly
identical copy of the running process. I know it's a kind of *-hole
thing to say, but... if you don't know why you'd want to fork your
process, you probably don't need to. A classic use of forking is
disassociating yourself from your parent process to become a daemon.
However, it's a basic function of the system and intrinsic to many
other higher-level actions.

One you don't mention is "exec", which replaces your running process
image with a new process image. You can do it from the shell: type
"exec somebinary" and that binary replaces your shell process, so when
the exec'd process exits, your session is terminated.

I mention that because when you combine a fork with an exec, you get a
spawn. Your parent process duplicates itself, but the child process
chooses to exec another program. So the child copy of the initial
process is replaced by the new running binary, and you have a spawned
process running as a child of the first.

Finally, a thread (sometimes referred to as a "lightweight process" or
"lwp") is kind of like a fork, except a fork duplicates everything
about the initial process (except the value fork returns and the
process ID), while a thread shares state with the parent process and
all its sibling threads.

The interesting thing about a Python thread is that, although it is
backed by a real OS-level thread, it is still controlled by the Python
interpreter: a global interpreter lock (GIL) lets only one thread run
Python code at a time. So, while a dual-processor computer can execute
two different processes or threads simultaneously, since there's only
one Python interpreter (per Python process), two Python threads in the
same process never execute Python code at the same moment. The
threading is more of a conceptual thing.

> However, from what I gather from the documentation, os.fork() is
> going to fork the Python script that calls the original fork
> (so fork won't directly start off the programs that I need). How
> would I go about forking + then executing an application?
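
The fork + exec combination described above can be sketched roughly as
follows. This is only an illustration, assuming a Unix-like system
(os.fork() is not available on Windows); the helper names spawn() and
wait_for() are made up for this example, and the child commands are
throwaway Python one-liners standing in for your I/O utility.

```python
import os
import sys


def spawn(argv):
    """Fork, then exec argv in the child. Returns the child's pid.

    Hypothetical helper for illustration; Unix-only because of os.fork().
    """
    pid = os.fork()
    if pid == 0:
        # Child: replace this copy of the Python process with the program.
        try:
            os.execvp(argv[0], argv)
        finally:
            # Only reached if exec failed (it never returns on success).
            # os._exit() skips the parent's cleanup handlers in the child.
            os._exit(127)
    # Parent: fork() returned the child's pid, and we carry on immediately.
    return pid


def wait_for(pid):
    """Block until the child exits and return its exit code."""
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)  # child was killed by a signal


if __name__ == "__main__":
    # Start three children first, so they all run at the same time...
    pids = [
        spawn([sys.executable, "-c", "import sys; sys.exit(%d)" % n])
        for n in range(3)
    ]
    # ...then collect their exit codes.
    print([wait_for(pid) for pid in pids])
```

Because every child is started before any of them is waited on, the OS
is free to schedule them on different CPUs. The standard library's
subprocess module packages up the same start-a-child-then-wait pattern
(subprocess.Popen(...) followed by .wait()), so it is usually the less
error-prone route.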
> Isn't this what spawn does? Or should I stick with fork + exec*?
>

However, what you are trying to do, i.e. spawn multiple concurrent
child processes, could actually take advantage of a multi-processor
system using Python threads. If you created multiple threads, each of
which spawned an independent subprocess, those subprocesses could be
executed concurrently by different processors, while their status was
still being coordinated via the Python thread model.

Just give it a go knowing that it is an efficient design, and drop us a
line if you have more questions or any problems.

Sincerely,
e.

> Lots to learn, I guess. ;)
>

Always. ;-) When you think there's nothing else to learn, you've
already become obsolete.

> .james
>
> On Sep 19, 2007, at 10:19 PM, Kent Johnson wrote:
>
>> James wrote:
>>
>>> Hi. :)
>>> I have a question regarding threading in Python. I'm trying to
>>> write a wrapper script in Python that will spin off multiple
>>> (lots!) of instances of an I/O benchmark/testing utility. I'm
>>> very interested in doing this in Python, but am unsure if this is
>>> a good idea. I thought I read somewhere online that because of
>>> the way Python was written, even if I spun off (forked off?)
>>> multiple instances of a program, all those child processes would
>>> be restricted to one CPU. Is this true?
>>>
>> Python *threads* are limited to a single CPU, or at least they will
>> not run faster on multiple CPUs. I don't think there is any such
>> restriction for forked processes.
>>
>>> I'm not sure if the utility I'm forking is CPU-intensive; it may
>>> very well be. Does Python indeed have this limitation?
>>>
>> I would think an I/O benchmark is more likely to be I/O bound...
>>
>>> Also, should I be using os.fork() for this kind of program?
>>>
>> There is a fair amount of activity these days around making Python
>> friendly to multi-processing.
>> See
>> http://wiki.python.org/moin/ParallelProcessing
>>
>> Kent
>>
>
_______________________________________________
Tutor maillist - Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor