Ryan Stuart <ryan.stuart...@gmail.com> writes:
> I'm not sure what else to say really. It's just a fact of life that
> Threads by definition run in the same memory space and hence always
> have the possibility of nasty unforeseen problems. They are unforeseen
> because it is extremely difficult (maybe impossible?) to try and map
> out and understand all the different possible mutations to state.

Sure, the shared memory introduces the possibility of some bad errors.
I'm just saying I've found that by staying with a certain
straightforward style, it doesn't seem difficult in practice to avoid
those errors.

> Sure, your code might not be making any mutations (that you know of),
> but malloc definitely is [1], and that's just the tip of the iceberg.
> Other things like buffers for stdin and stdout, DNS resolution etc.
> all have the same issue.

I don't understand what you mean about malloc. I looked at that code
and there's a mutex to make multi-threaded programs work right, and an
ifdef (maybe for better performance) to use different code if there are
no threads. IOW they spent a bunch of time handling threads. Are you
saying there's a bug?

Re stdin/stdout: obviously you can't have multiple threads messing with
the same fd's; that's the same thing as data sharing.

Re DNS: if gethostbyname isn't thread-safe I'd think of that as a pretty
bad bug. I do have a vague memory of running into an issue with this,
though. IIRC it took part of a morning to figure out what was going on:
annoying, but not a multi-month bug-hunt or anything like that. It
didn't happen on my workstation, only on the embedded target, which was
probably running an old or weird libc.

> To borrow from the original article I linked - "Nevertheless I still
> think it’s a bad idea to make things harder for ourselves if we can
> avoid it."

That article was interesting in some ways but confused in others. One
way it was interesting is that it said various non-thread approaches
(such as coroutines) had about the same problems as threads. Some ways
it was confused were:

1) thinking Haskell threads were like processes with separate address
   spaces. In fact they are in the same address space and programming
   with them isn't all that different from Python threads, though the
   synchronization primitives are a bit different.
   There is also an STM library available that is ingenious though
   apparently somewhat slow.

2) it has a weird story about the brass cockroach, that basically
   signified that they didn't have a robust enough testing system to be
   able to reproduce the bug. That is what they should have worked on.

3) It goes into various hazards of the balance transfer example without
   mentioning that STM (available in Haskell and Clojure) completely
   solves it.

4) It says: "eventually a system which communicates exclusively through
   non-blocking queues effectively becomes a set of communicating event
   loops, and its problems revert to those of an event-driven system; it
   doesn’t look like regular programming with threads any more."

   That is essentially what an Erlang program is, and it misses the fact
   that those low-level event loops can use blocking operations to their
   heart's content, without the inversion of control (callback
   spaghetti) of traditional evented systems (I haven't used asyncio
   yet). Also, the low-level loops can run in parallel on multiple
   cores, while an asyncio-style coroutine loop is sequential under the
   skin. In Erlang/OTP, you don't even see the event loops directly,
   since they are abstracted away by the OTP framework and it looks like
   RPC calls at the application level. But it helps to know what is
   going on underneath.

I'm realizing some people program Python in an ultra-dynamic style where
the mutability of modules, functions, etc. really comes into play, so
that makes threads much more dangerous. I've tended to write Python with
much less dynamism or even as if it were statically typed, so maybe that
helps.

Anyway, I got one thing out of this, which is that the multiprocessing
module looks pretty nice and I should try it even when I don't need
multicore parallelism, so thanks for that.
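
As a rough illustration of point 4, here is a minimal sketch of loops
that communicate only through queues and block freely, using just the
standard library's threading and queue modules. The names worker,
square_all and _STOP are hypothetical, made up for this example; nothing
here is taken from the article or from Erlang/OTP. Each worker is a
small loop that blocks on its inbox, and the caller sees something that
looks like an ordinary function call.

```python
import queue
import threading

_STOP = None  # hypothetical sentinel: tells a worker loop to exit


def worker(inbox, outbox):
    """Low-level loop: block on the inbox, handle one message, reply."""
    while True:
        msg = inbox.get()              # plain blocking call, no callbacks
        if msg is _STOP:
            break
        outbox.put((msg, msg * msg))   # reply is (request, result)


def square_all(numbers, n_workers=2):
    """Looks like a normal call; underneath it is message passing."""
    inbox, outbox = queue.Queue(), queue.Queue()
    threads = [threading.Thread(target=worker, args=(inbox, outbox))
               for _ in range(n_workers)]
    for t in threads:
        t.start()
    for n in numbers:
        inbox.put(n)
    # Collect exactly one reply per request; arrival order may vary.
    results = dict(outbox.get() for _ in numbers)
    for _ in threads:
        inbox.put(_STOP)               # one stop message per worker
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(square_all([1, 2, 3, 4]))
```

The workers share no state except the two queues. The multiprocessing
module offers Process and Queue classes with a very similar interface,
so the same structure should carry over to separate processes with few
changes.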

In reality though, Python is still my most productive language for
throwaway, non-concurrent scripting, but for more complicated concurrent
programs, alternatives like Haskell, Erlang, and Go all have significant
attractions.
--
https://mail.python.org/mailman/listinfo/python-list