On 10/30/2012 03:18 PM, richard kappler wrote:
> As I sit through the aftermath of Sandy, I have resumed my personal quest
> to learn python. One of the things I am struggling with is running multiple
> processes. I read the docs on threading and am completely lost so am
> turning to the most excellent tutors here (and thanks for all the help,
> past, present and future btw!).
>
> In what ways can one run multiple concurrent processes and which would be
> considered the "best" way or is that case dependent?
>
> Example:
>
> I'm working on programming a robot in Python. The bot has an Arduino board
> that receives sensor input and sends the data to a laptop which is the
> "brain" of the bot via pySerial and uses this incoming sensor data to help
> determine state. State is used in decision making. The laptop runs a
> program that we'll call the Master Control Program in a nod to Tron. The
> bot also has a chat program, computer vision, some AI it uses to mine the
> web for information, several other functions. Each of these concurrent
> programs (thus far all python programs) must run continuously and feed data
> to the MCP which receives the data, makes decisions and sends out
> appropriate action commands such as for movement, change of state,
> conversation direction, what to research, etc.
>
> So, to return to the original question, how does one run multiple
> concurrent processes within python?
>

I'm only guessing about your background, so please don't take offense at the simple level of the following. Before you can really understand how the language features work, and what the various terms mean, you need to understand the processor and the OS.

A decade or so ago, things were a bit simpler: if we wanted a faster machine, Intel would crank up the processor clock rate, and things were faster. But eventually it reached the point where increased clock rate became VERY expensive, and Intel (and others) came up with a different strategy.

I'm going to guess you're running on some variant of the Pentium processor. The processor (cpu) may have a feature called hyperthreading, meaning that for most operations it can do two things at once. It has two copies of the instruction pointer and two copies of most registers. As long as neither program uses the features that aren't replicated, you can run two programs completely independently. The two programs share physical memory, hard disk, keyboard and screen, but they probably won't slow each other down very much.

You may have a dual-core, or even a quad-core processor. And you may have more than one of those, if you're on a high-end server. So, as long as the processes are separate, you could run many of them at a time.

The other thing that affects all of this is the operating system you're running. It has to manage these multiple processes, and make sure that things that can't be shared are correctly serialized: one task grabs a resource and the others block, waiting for that resource. The most visible (but not the most important) way this occurs is that separate applications draw in different windows. They share the screen, but none of them writes to the raw device; all of them go through a window manager.

This is multiprocessing. And since one program can launch others, it's one way that a single "task" can be split up to use these multiple cores/cpus.

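To make "one program can launch others" concrete, here's a minimal sketch using the standard subprocess module, assuming Python 3.7 or later. The child script and the names "arduino" and "vision" are made up for illustration; your real sensor and vision programs would take their place. Each child is a completely separate process, so the OS is free to put them on different cores.

```python
import subprocess
import sys

# Hypothetical stand-in for a sensor or vision program: a tiny child
# script that prints one line and exits.
CHILD_CODE = "import sys; print('reading from ' + sys.argv[1])"


def launch(name):
    """Start a separate Python process and return without waiting for it."""
    return subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE, name],
        stdout=subprocess.PIPE,  # connect a pipe to the child's stdout
        text=True,               # give us str instead of bytes
    )


def run_all(names):
    """Launch every child first, then collect each one's output."""
    children = [(name, launch(name)) for name in names]
    results = {}
    for name, child in children:
        out, _ = child.communicate()  # waits for this child to finish
        results[name] = (child.returncode, out.strip())
    return results


if __name__ == "__main__":
    for name, (code, line) in run_all(["arduino", "vision"]).items():
        print(name, code, line)
```

Note that all the children are started before any of them is waited on, so they run at the same time. A long-running program such as your MCP would more likely read from each pipe as data arrives rather than wait for the children to exit, but the launching part looks the same.
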
The operating system deliberately keeps the separate processes very isolated, but provides a few ways for them to talk to each other. One program can launch another, passing it arguments and observing the return code. It can also use pipes to connect to the stdin and stdout of the other program. The two can open up queues or shared memory, or they can each read & write a common file. Such processes do NOT normally share variables, and function calls in one do not easily end up invoking code in the other.

But there is a second way that two cpus can work on the same "task." If a single process is multi-THREADED, then the threads do share variables and other resources, and communication between them is easy (so easy it's difficult to get right, actually). This is theoretically much gentler on system resources, but at the cost of making bugs a lot more likely.

Some operating systems have a feature called forking, which can theoretically give you the best of both worlds. But I'm not going to even try to explain that unless you tell me you're on a Linux or Unix type operating system. Besides, I don't know how Python makes use of fork; it hasn't turned out to be necessary information for me yet.

Now, with CPython in particular, multithreading has a serious problem: the Global Interpreter Lock (GIL). Since so much happens behind the scenes inside the interpreter and low-level library routines, and perhaps since most of that was written before multithreading was supported, there's a single lock that permits only one thread of a process to be executing Python code at a time. So if you break up a CPU-bound task into multiple threads, only one will run at a time, and chances are it'll run slower than if it only had one thread.

Two things happen to make the GIL less painful (it's really just two manifestations of the same thing). Many times when a thread is in C code, or when it is calling some system function that blocks (e.g. waiting for a network message), the GIL is deliberately released, and other threads CAN run. So writing a server that waits on many sockets, one per thread, can make good sense, both for code simplicity and for performance.

One other thing that's related. Most gui programs run with an event loop, which is another way of multitasking, one that does NOT use any special cpu or OS features. With an event loop, it's your job to make sure all transactions are reasonably small, and that each is triggered by some event. Once you understand event loops, it's simpler than either of the other approaches.

Note that sometimes two or three of these approaches are combined in one system.

Hope this helps, and that some of it was useful. I know that in places I oversimplified, but I think I caught the spirit of the tradeoffs.

-- 
DaveA
_______________________________________________
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor