> @GordonBGood - there have been issues with thread pool, parallel and spawn
> for years. I'm not sure what shape these modules are in now...
Yes, there have been issues of which @Araq has been well aware, as follows:

1. They were written at a time when Nim contributors and @Araq still believed that garbage collection was the way to handle memory management.
2. Worse, the default garbage collector chosen did not use a shared heap; each thread had its own managed memory space, so heap-allocated items needed to be copied between thread heaps by special means using shared memory.
3. The solution to these problems was "compiler magic" implementing "deep copying" as necessary, which was obscure, hard to maintain, and still had lots of bugs/problems after years of development.
4. Nothing much could be done about those things until we got a better default memory management system, which we now have in `--gc:arc/orc` (not really a garbage collector in the commonly understood sense).
5. The old `system.channels` and `system.threadpool` libraries have now been updated so that they can work with `--gc:arc/orc`, and if `ref` objects are sent to channels in `move` form (the send is their last use), they are no longer copied; however, all the old cruft using "compiler magic" to check the type of the sent object is still there, and it still automatically deep copies if one is not careful.
6. @Araq came up with/accepted the idea, borrowed from other languages such as Pony, of an `Isolated[T]` type in the `std/isolation` library that is move-only and cannot be copied, to be used as a message container when sending between threads. It has the advantage that all of the "compiler magic" is wrapped into its constructor proc, so it doesn't have to be scattered all through the new libraries using it.
7. A new `std/channels` library has been written that uses only the `isolation` library, to show that it works without all the cruft. It is API-similar to the old library but not quite a direct drop-in replacement (there is a `newChannel` constructor proc that constructs a channel in a channel cache, with the buffer size specified there rather than in the `open` proc; the `open/close` procs now return a `bool`; and `send/trySend` now must send an `Isolated[T]`, though there are templates that handle the wrapping for the programmer). One advantage of this new `std/channels` library is that a channel can be copied across threads (by reference counting) without having to resort to a global variable or to manually allocating and passing by pointer.
8. We are still missing a `std/threadpool` library to do the equivalent for `spawn`'ing "green threads", along with implementations of all the other procs related to monitoring spawned threads' completion status.

> In my experience with Nim, all of these new features and abstractions around
> concurrency, never really end up paying off.

Well, something had to be done to make it simpler and easier to maintain, and @Araq has done it. If you look into how easy it is to use `Isolated[T]` from the user-exposed interface (see the sketch below), you'll be amazed. I haven't looked into what it took to implement the constructor proc, but at least that's concentrated in one place in the compiler implementation and not spread across many of the libraries. As to whether it will pay off, we have to use it to see, but isn't confidence in @Araq one of the reasons we're here?

> They're either too poorly documented to grok, or they have significant edge
> cases they can't solve for.

I plan to take over the documentation thereof, and indeed it is currently a bit lacking.
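To give a taste of that user-exposed interface in the meantime, here is a minimal sketch of `std/isolation` as I understand it (the `Payload` type and the values are just made up for illustration):

```nim
import std/isolation

type Payload = ref object
  data: seq[int]

# `isolate` is where the "compiler magic" lives: it only compiles if the
# expression is provably an isolated subgraph (no outside aliases), and the
# value is moved into the `Isolated[T]` wrapper, never copied.
var iso: Isolated[Payload] = isolate(Payload(data: @[1, 2, 3]))

# An `Isolated[T]` itself can only be moved, not copied; `extract` moves the
# wrapped value back out on the receiving side.
let payload = extract iso
echo payload.data
```

That's about all there is to it from the user's side; failing the isolation check is a compile-time error rather than a runtime surprise.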
As to edge cases, that's our job: to try to identify them by using the libraries and filing issues when they don't work or could be improved.

> Being able to move isolated subgraphs between threads is great, but can't we
> already do this with move semantics?

Yes, a disciplined programmer can avoid the copying by wrapping everything (especially `seq`'s/`string`'s) in `ref` objects and structuring their code so that they are `move`'d (there is a small sketch of this pattern further below), but the deep-copying cruft is still there to bite the unwary. By using `Isolated[T]` to pass messages, anything sent must first pass the "isolatable" test of the `isolate` constructor, which guarantees that if the code compiles, everything sent is moved and not copied, whether the programmer is experienced or a newbie. Also, the runtime cost is only that of the object wrapping that we would likely be doing anyway (and it's possible that those get optimized away?).

> Allocations on a shared heap, pointers, and synchronization primitives are
> the best way I've found to address this style of programming in Nim.

Yes, that's what I have found, too, but that's an ugly C-like way of doing things, and it's the reason I've been away while waiting for arc/orc to stabilize. Now that it's getting there, hopefully we will find this leads to a better way.

> Maybe the new iteration of channels will synergize well with the existing
> thread pool implementation, and will offer a safer, easier to implement
> alternative.

Nope, the new iteration of channels also needs a new iteration of `threadpool` that works in a similar way, but yes, together they should offer a "safer, easier to implement alternative".

> I'm not surprised you were able to achieve similar performance to running
> green threads over a thread pool (and passing messages between them). Context
> switching is the power in green threads, and their ability to be paused and
> resumed on different system threads. I'm making some assumptions here, but if
> one is simply running coroutines over a thread pool without any kind of
> cooperative multitasking, you're probably just fighting contention issues and
> the performance characteristics of your implementation will be poor.

No, I was testing the old versus the new channel implementations with a proof-of-concept thread pool written similarly to the old `threadpool` library but using the new `Isolated[T]` library; message passing was done using the new `channels` implementation, so there weren't any contention issues other than whatever that implementation may have itself. The user interface looked a lot like calling GoLang goroutines and using their channels for inter-thread communication, as they recommend. One can't match GoLang's goroutine performance without doing all the "compiler magic" they do, but the performance was in line with that of other languages whose runtime-provided "green threads" are based on passing work units to OS threads already spun up and just waiting for work, such as Haskell's `forkIO` threads, DotNet's Task Parallel Library, etc. Thus, they aren't equivalent to GoLang's "green threads", which have the injected code necessary to very quickly suspend and resume tiny chunks of code, but hopefully that won't be necessary and @Araq doesn't want to go that route anyway.
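Coming back to the move-semantics point above, here is roughly what that disciplined old-style pattern looks like with the built-in `system` channels (my own minimal sketch; the `Msg` type is made up, compile with `--threads:on --gc:orc`):

```nim
# The built-in Channel must live in memory every thread can reach, so it is
# typically a global (or manually allocated shared memory passed by pointer).
type Msg = ref object
  data: seq[int]

var chan: Channel[Msg]        # global so the worker thread can see it

proc worker() {.thread.} =
  let msg = chan.recv()
  echo msg.data.len

var th: Thread[void]
open(chan)
createThread(th, worker)

var msg = Msg(data: @[1, 2, 3])
# As per point 5 above: under --gc:arc/orc a send that is the ref's last use
# is moved, not deep-copied; touch `msg` again after this and the old
# deep-copy machinery is back to bite you.
chan.send(move msg)
joinThread(th)
close(chan)
```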
On my test machine, I don't think one can run simple tasks through such a thread pool faster than on the order of hundreds of thousands per second, but that's a lot better than `createThread`'ing new OS threads, which is more than a hundred times slower. That's why we want a new `threadpool` library that makes this easy to do. If one wants to do inter-thread communication (whether between OS threads or thread pool threads) using shared memory allocations with `Lock`s and `Cond`s to handle the contention (`atomics` is still an unstable API), of course one can do so (roughly as sketched below), but it's so much easier just using channels.
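For completeness, that manual shared-memory style looks roughly like this (a minimal single-producer/single-consumer sketch; the `Shared` type and its field names are mine):

```nim
import std/locks

type Shared = object
  lock: Lock
  cond: Cond
  ready: bool
  value: int

proc worker(s: ptr Shared) {.thread.} =
  acquire s.lock
  while not s.ready:            # guard against spurious wakeups
    wait(s.cond, s.lock)
  echo s.value
  release s.lock

var th: Thread[ptr Shared]
let s = createShared(Shared)    # manually allocated on the shared heap
initLock s.lock
initCond s.cond
createThread(th, worker, s)

acquire s.lock
s.value = 42
s.ready = true
signal s.cond
release s.lock

joinThread(th)
deinitLock s.lock
deinitCond s.cond
freeShared s
```

It works, but every allocation, lock, and wakeup is the programmer's problem, which is exactly the kind of bookkeeping the channel-based approach hides.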
