Re: How many std.concurrency receivers?

Charles Hixson Thu, 11 Oct 2012 13:25:26 -0700

On 10/11/2012 11:04 AM, thedeemon wrote:

On Thursday, 11 October 2012 at 16:09:20 UTC, Charles Hixson wrote:

Hmmm...what I'm trying to build is basically a cross between a
weighted directed graph and a neural net, with some features of each,
but not much in common. Very light-weight processes would be ideal.
The only communication should be via message-passing. Each cell would
spend most of it's time sitting on a count-down timer waiting to be
rolled out to a database of inactive processes, but it needs to
maintain local state (weights of links, activation level, etc. nothing
fancy.)

If I were doing this sequentially, I'd want to use structs for the
cells, because class instances would be too heavy. And I'd store them
in a hash table keyed by cell-id#.

Unfortunately, I don't see any reasonable way of chunking the pieces,
so that I can chunk them into 100 relatively independent sets. Or even
1000. 10,000 is probably about the right size for active-at-one-time
cells. And if it would handle that, std.concurrency seemed ideal.

Do you have any suggestions as to what would be a reasonable better
choice? (Outside of going back to sequential.)


Here's how I would try to approach a task of having thousands of
independent agents with current std.concurrency. Each agent (cell) is
represented by some data structure and its main function which gets one
message as input, reacts (possibly changing its state and sending other
messages) and returns without blocking. Then I'd create say 16 threads
(or 8, anyway a power of 2 which is close to actual number of cores),
each of them will have its own message queue, that's given by
std.concurrency. Let's say each cell has its own id. I would place cell
with id N to the thread number N mod 16. Each thread will have an array
of cells mapped to it. Then if some cell sends a message to cell X, it
makes sure the message contains cell id of recipient and then sends it
to thread X mod 16. Each worker thread runs a loop where it receives
next message from its queue, finds the target cell by its id in this
thread's array of cells (we can use X / 16 as index) and calls its
reaction function. This way all agents are evenly distributed between
threads, we're using just 16 threads and 16 queues which work in
parallel, and it all acts as if thousands of agents work independently.
However this approach does not guarantee even work distribution between
cores.

=------------below is my second thoughts.

If I could do things that way, it would certainly be a faster designthan what I'm considering now. But I'm really concerned abouteverything fitting into RAM. I'm going to need to think about this.I've got about 8GB of RAM, and I'm on a 64 bit system. So maybe myconcerns about things fitting into memory are out of date. (I'm stillused to thinking of a 64KB computer as being one with a lot of RAM.)And I notice my disk swap space is totally unused. Hmmmm... Maybe Ishould even replace the database with a sequential file.

Unless D has some limits that I can't recall reading about, that lookslike the right way to go, even if it feels wrong. Probably because Ilearned programming way back when... but reasonably it looks like theright answer.

P.S.: There's no way to guarantee that the cores will be used evenly,because the cells definitely AREN'T even in their use. And while thedistribution of use isn't random, it also isn't predictable...and variesover time. So don't worry about this approach not guaranteeing equaldistribution of work.


=------------below is my first impressions

That's a nice approach, though I can't use a vector of cells in eachthread, because the cells roll in and out depending on their level ofactivity, and all active (i.e. ram-resident) cells will need to beaccessed occasionally to age their activity, so that will need to be ahash table (i.e. associative array). Also, I only have about8-hyperthreads. So I guess what I'll do is run all the cells in onethread (to simplify the logic) and in other threads do things likemanage the database, etc. Not what I was hoping for, but probably amuch more reasonable match to the hardware. (Also, I'll want to have afew extra threads available for things like background e-mail polling,etc. Or even debuggers.)

I guess that a part of the problem (i.e., why I can't adopt yoursuggestion) is that there's no way all the cells would fit into RAM.(Or maybe I'm wrong. There will probably be only a few million total.And each one will probably be less than a kilobyte in size. [You'llnote I don't have very precise estimates yet. That will take months toyears to develop.])

Still, if I adopt this serialized variation, it will be relatively easyto split it several ways in the future if I get fancier hardware, and ifI decide that all the nodes WILL fit into RAM. So I guess what I shoulddo is build the serial version, but ensure that it remains feasible toconvert it into the chunked-parallel version that you described.Certainly if I could replace the associative array by a simple vectorthat would speed up lots of parts of it, and so would eliminating therolling in and out of cells.

Re: How many std.concurrency receivers?

Reply via email to