As you mentioned, what you're battling right now is contention. In your existing code the threads always conflict. No matter what, only one thread's update can succeed, the others will fail and have to re-run. With the refs-in-a-vector approach as long as two threads don't touch the same cell (ref) they won't conflict, and so your chances of parallelism will be higher. Note however, that based on your algorithm, you still may have a lot of conflicts if all your threads are trying to do something like "find the first free slot", and since they're all starting at the same start point (the start of the vector) performance won't improve much. To improve this look at ways of offsetting the access of the threads. Perhaps start each thread searching the vector at an offset of (vector_size / threads) * thread_index. Or perhaps the threads will conflict at first, then diverge into different parts of the vector.
As far as scaling there's a few things you can do for that. Firstly refs aren't that heavy, so you should be able to create hundreds or thousands without issue. Also if the threads never need to read from one ref and update another in the same transaction you could look at using Atoms. Atoms don't involve locks (or STM), so via `swap!` it could be possible to get even better performance. On Mon, Jan 30, 2017 at 7:03 PM, Brian Craft <craft.br...@gmail.com> wrote: > Would this not scale badly? When the vector is hundreds of thousands, or > millions? > > On Monday, January 30, 2017 at 5:54:32 PM UTC-8, tbc++ wrote: >> >> Instead of looking at the state as a ref with a vector in it, think of it >> as a vector of refs. That then allows multiple refs to be modified at once >> without stepping on other unrelated refs. >> >> On Mon, Jan 30, 2017 at 5:26 PM, Brian Craft <craft...@gmail.com> wrote: >> >>> I'm experimenting with ref, dosync, and alter to run some code in >>> parallel, but so far haven't been able to figure out how to structure it >>> such that it runs faster, instead of slower. >>> >>> The problem looks something like this. >>> >>> The current state is a long vector. Input is a long sequence of values. >>> At each step some positions in the vector are populated, if empty, based on >>> computations over the next value in the sequence. >>> >>> It's like placing puzzle pieces: search for positions that satisfy a >>> constraint, and fill them. >>> >>> Using threads, I'd like to place the next several values in parallel. >>> But running in parallel it's possible that two threads would find solutions >>> that conflict. In this case, the latter one must continue searching. >>> >>> Is there a way to express this with clojure? The problem is that I can't >>> tell 'dosync' when a previous transaction invalidates the current >>> transaction: it just always retries. >>> >>> e.g. if each thread is doing >>> >>> >>> (dosync >>> (let [positions (find-positions next-value @state)] >>> (alter state (fn [s] (update-state s positions)))))) >>> >>> they interfere with each other constantly, because any write to 'state' >>> causes the other threads to retry, even if their positions are still free. >>> >>> One really wants to reverse the order of find-positions and dosync here, >>> so it computes positions, then tries to commit them, checking in the >>> transaction that the positions are still free, and continuing the search if >>> they're not. >>> >>> I suppose there's some solution involving exceptions. Is there a better >>> way to think about this? >>> >>> -- >>> You received this message because you are subscribed to the Google >>> Groups "Clojure" group. >>> To post to this group, send email to clo...@googlegroups.com >>> Note that posts from new members are moderated - please be patient with >>> your first post. >>> To unsubscribe from this group, send email to >>> clojure+u...@googlegroups.com >>> For more options, visit this group at >>> http://groups.google.com/group/clojure?hl=en >>> --- >>> You received this message because you are subscribed to the Google >>> Groups "Clojure" group. >>> To unsubscribe from this group and stop receiving emails from it, send >>> an email to clojure+u...@googlegroups.com. >>> For more options, visit https://groups.google.com/d/optout. >>> >> >> >> >> -- >> “One of the main causes of the fall of the Roman Empire was that–lacking >> zero–they had no way to indicate successful termination of their C >> programs.” >> (Robert Firth) >> > -- > You received this message because you are subscribed to the Google > Groups "Clojure" group. > To post to this group, send email to clojure@googlegroups.com > Note that posts from new members are moderated - please be patient with > your first post. > To unsubscribe from this group, send email to > clojure+unsubscr...@googlegroups.com > For more options, visit this group at > http://groups.google.com/group/clojure?hl=en > --- > You received this message because you are subscribed to the Google Groups > "Clojure" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to clojure+unsubscr...@googlegroups.com. > For more options, visit https://groups.google.com/d/optout. > -- “One of the main causes of the fall of the Roman Empire was that–lacking zero–they had no way to indicate successful termination of their C programs.” (Robert Firth) -- You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en --- You received this message because you are subscribed to the Google Groups "Clojure" group. To unsubscribe from this group and stop receiving emails from it, send an email to clojure+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/d/optout.