> @GordonBGood \- there have been issues with thread pool, parallel and spawn 
> for years. I'm not sure what shape these modules are in now...

Yes, there have been issues of which @araq has been well aware, as follows:

  1. They were written at a time when Nim contributors and @Araq still believed that Garbage Collection was the way to handle memory management.
  2. Worse, the default Garbage Collector chosen did not use a shared heap: each thread had its own managed memory space, so heap-allocated items had to be copied between thread heaps by special means that went through shared memory.
  3. The solution to these problems was "compiler magic" implementing "deep 
copying" as necessary, which was obscure, hard to maintain, and still had lots 
of bugs/problems after years of development.
  4. Nothing much could be done about those things until we got a better default memory management system, which we now have in `--gc:arc/orc` (not really a garbage collector in the commonly understood sense).
  5. Now the old `system.channels` and `std/threadpool` libraries have been updated so that they work with `--gc:arc/orc`, and if `ref` objects are sent to channels in `move` form (their last use is the send), they are no longer copied; however, all the old cruft using "compiler magic" to check the type of the sent object is still there, and it still automatically deep copies if one is not careful.
  6. @araq came up with/accepted the idea, borrowed from other languages like Pony, of an `Isolated[T]` type in the `std/isolation` library that is move-only and cannot be copied, to be used as the message container when sending between threads. It has the advantage that all of the "compiler magic" is wrapped into its constructor proc, so it doesn't have to be scattered all through the new libraries that use it.
  7. A new `std/channels` library has been written to use only the `isolation` library, to show that it works without all the cruft (see the sketch after this list). It is API-similar to the old library but not quite a direct drop-in replacement: there is a `newChannel` constructor proc that constructs a channel in a channel cache, with the buffer size specified there rather than in the `open` proc; the `open`/`close` procs now return a `bool`; and `send`/`trySend` now must send an `Isolated[T]`, though there are templates to handle the wrapping for the programmer. One advantage of this new `std/channels` library is that the channel itself can be copied across threads (by reference counting) without having to resort to a global variable or to manually allocating it and passing it by pointer.
  8. We are still missing a matching new `threadpool` library to do the equivalent for `spawn`'ing "green threads", along with implementations of all the other procs related to monitoring spawned threads' completion status.



> In my experience with Nim, all of these new features and abstractions around 
> concurrency, never really end up paying off.

Well, something had to be done to make it simpler and easier to maintain, and 
@araq has done it. If you look at how easy it is to use `Isolated[T]` from the 
user-exposed interface, you'll be amazed. I haven't looked into what it took to 
implement the constructor proc, but at least that's concentrated in only one 
place in the compiler implementation and not spread across many of the 
libraries. As to whether it will pay off, we have to use it to see, but isn't 
confidence in @araq one of the reasons we're here?

> They're either too poorly documented to grok, or they have significant edge 
> cases they can't solve for.

I plan to take over the documentation thereof, and indeed it is currently a bit 
lacking. As to edge cases, it's our job to identify them by using the libraries 
and filing issues when they don't work or could be improved.

> Being able to move isolated subgraphs between threads is great, but can't we 
> already do this with move semantics?

Yes, a disciplined programmer can avoid the copying by wrapping everything 
(especially `seq`'s/`string`'s) in `ref` objects and structuring their code so 
that they are `move`'d, but the deep-copying cruft is still there to bite the 
unwary. By using `Isolated[T]` to pass messages, the code must pass the 
"isolatable" test of the `isolate` constructor before it will even compile, 
which guarantees that everything sent is moved, not copied, whether the 
programmer is experienced or a noobie. Also, the runtime cost is only that of 
the object wrapping we would likely be doing anyway (and it's possible those 
wrappers get optimized away?).
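
For illustration, a minimal sketch using only `std/isolation` as it exists today. The rejected case is commented out because the whole point is that it fails at compile time; the exact diagnostic depends on the compiler version.

```nim
import std/isolation

type Node = ref object
  payload: string

# Accepted: the subgraph is constructed in place, so nothing else can alias it,
# and `isolate` moves it into the wrapper.
var ok = isolate(Node(payload: "moved, not deep-copied"))
echo ok.extract().payload

# Rejected by the "isolatable" test: `shared` would still be reachable here,
# so instead of silently deep copying, the code simply does not compile.
# let shared = Node(payload: "aliased elsewhere")
# var bad = isolate(shared)
```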

> Allocations on a shared heap, pointers, and synchronization primitives are 
> the best way I've found to address this style of programming in Nim.

Yes, that's what I have found, too, but that's an ugly C-like way of doing 
things, and is the reason I've been away while waiting for arc/orc to 
stabilize. Now that it's getting there, hopefully we will find this leads to a 
better way.

> Maybe the new iteration of channels will synergize well with the existing 
> thread pool implementation, and will offer a safer, easier to implement 
> alternative.

Nope, the new iteration of channels also needs a new iteration of `threadpool` 
that works in a similar way, but yes, together they should offer a "safer, 
easier to implement alternative".

> I'm not surprised you were able to achieve similar performance to running 
> green threads over a thread pool (and passing messages between them). Context 
> switching is the power in green threads, and their ability to be paused and 
> resumed on different system threads. I'm making some assumptions here, but if 
> one is simply running coroutines over a thread pool without any kind of 
> cooperative multitasking, you're probably just fighting contention issues and 
> the performance characteristics of your implementation will be poor.

No, I was testing the old versus the new channel implementations with a Proof 
of Concept thread pool written similarly to the old `threadpool` library but 
using the new `std/isolation` library; message passing was done through the new 
`channels` implementation, so there weren't any contention issues beyond 
whatever the channel itself may have. The user interface looked much as if I 
were calling GoLang goroutines and using their channels for inter-thread 
communication, as Go recommends. One can't match GoLang's goroutine performance 
without doing all the "compiler magic" they do, but the performance was in line 
with that of other languages whose runtime-provided "green threads" are based 
on passing work units to OS threads that are already spun up and just waiting 
for work, such as Haskell's `forkIO` threads, DotNet's Task Parallel Library, 
etc. Thus, they aren't equivalent to GoLang's "green threads", which have the 
injected code necessary to suspend and resume tiny chunks of code very quickly, 
but hopefully that won't be necessary, and @araq doesn't want to go that route 
anyway.
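
The PoC itself isn't reproduced here, but the underlying shape it (and `forkIO`, the TPL, etc.) relies on is easy to show: a few OS threads spun up once, each blocking on a channel and waiting for work units. This generic sketch uses only the long-standing `system` channels and `createThread` (with the global channels that item 7 lets us avoid), not the new `std/channels`; compile with `--threads:on`.

```nim
var
  tasks: Channel[int]       # work units going into the pool
  results: Channel[int]     # finished results coming back

proc worker() {.thread.} =
  while true:
    let n = tasks.recv()
    if n < 0: break         # negative value used as a shutdown sentinel
    results.send(n * n)     # the "work": square the input

proc main =
  open(tasks)
  open(results)
  var pool: array[4, Thread[void]]
  for t in pool.mitems: createThread(t, worker)
  for i in 1 .. 100: tasks.send(i)
  var total = 0
  for _ in 1 .. 100: total += results.recv()
  for _ in pool: tasks.send(-1)   # one shutdown sentinel per worker
  joinThreads(pool)
  close(tasks)
  close(results)
  echo "sum of squares: ", total

main()
```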

On my test machine, I don't think one can push simple tasks through such a 
thread pool faster than on the order of hundreds of thousands per second, but 
that's a lot better than `createThread`'ing new OS threads, which is more than 
a hundred times slower. That's why we want a new `threadpool` library that 
makes this easy to do.

If one wants to do inter-thread communication (whether between OS threads or 
thread pool threads) using shared memory allocations with `Lock`'s and `Cond`'s 
to handle the contention (`atomics` is still an Unstable API), of course one 
can do so, but it's so much easier to just use channels.
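
For contrast, a minimal sketch of that manual style using only `std/locks`: a one-slot mailbox where every line of locking and signalling is bookkeeping a channel would do for you. Compile with `--threads:on`.

```nim
import std/locks

type Mailbox = object
  lock: Lock
  nonEmpty: Cond
  hasValue: bool
  value: int

var box: Mailbox                    # plain shared memory, no GC'd fields

proc producer() {.thread.} =
  acquire(box.lock)
  box.value = 42
  box.hasValue = true
  signal(box.nonEmpty)              # wake the waiting consumer
  release(box.lock)

proc consumer() {.thread.} =
  acquire(box.lock)
  while not box.hasValue:           # loop guards against spurious wakeups
    wait(box.nonEmpty, box.lock)
  echo "got ", box.value
  release(box.lock)

initLock(box.lock)
initCond(box.nonEmpty)
var threads: array[2, Thread[void]]
createThread(threads[0], consumer)
createThread(threads[1], producer)
joinThreads(threads)
deinitCond(box.nonEmpty)
deinitLock(box.lock)
```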
