As you mentioned, what you're battling right now is contention. In your
existing code the threads always conflict. No matter what, only one
thread's update can succeed, the others will fail and have to re-run. With
the refs-in-a-vector approach as long as two threads don't touch the same
cell (ref) they won't conflict, and so your chances of parallelism will be
higher. Note however, that based on your algorithm, you still may have a
lot of conflicts if all your threads are trying to do something like "find
the first free slot", and since they're all starting at the same start
point (the start of the vector) performance won't improve much. To improve
this look at ways of offsetting the access of the threads. Perhaps start
each thread searching the vector at an offset of (vector_size / threads) *
thread_index. Or perhaps the threads will conflict at first, then diverge
into different parts of the vector.

As far as scaling there's a few things you can do for that. Firstly refs
aren't that heavy, so you should be able to create hundreds or thousands
without issue. Also if the threads never need to read from one ref and
update another in the same transaction you could look at using Atoms. Atoms
don't involve locks (or STM), so via `swap!` it could be possible to get
even better performance.


On Mon, Jan 30, 2017 at 7:03 PM, Brian Craft <craft.br...@gmail.com> wrote:

> Would this not scale badly? When the vector is hundreds of thousands, or
> millions?
>
> On Monday, January 30, 2017 at 5:54:32 PM UTC-8, tbc++ wrote:
>>
>> Instead of looking at the state as a ref with a vector in it, think of it
>> as a vector of refs. That then allows multiple refs to be modified at once
>> without stepping on other unrelated refs.
>>
>> On Mon, Jan 30, 2017 at 5:26 PM, Brian Craft <craft...@gmail.com> wrote:
>>
>>> I'm experimenting with ref, dosync, and alter to run some code in
>>> parallel, but so far haven't been able to figure out how to structure it
>>> such that it runs faster, instead of slower.
>>>
>>> The problem looks something like this.
>>>
>>> The current state is a long vector. Input is a long sequence of values.
>>> At each step some positions in the vector are populated, if empty, based on
>>> computations over the next value in the sequence.
>>>
>>> It's like placing puzzle pieces: search for positions that satisfy a
>>> constraint, and fill them.
>>>
>>> Using threads, I'd like to place the next several values in parallel.
>>> But running in parallel it's possible that two threads would find solutions
>>> that conflict. In this case, the latter one must continue searching.
>>>
>>> Is there a way to express this with clojure? The problem is that I can't
>>> tell 'dosync' when a previous transaction invalidates the current
>>> transaction: it just always retries.
>>>
>>> e.g. if each thread is doing
>>>
>>>
>>>     (dosync
>>>       (let [positions (find-positions next-value @state)]
>>>        (alter state (fn [s] (update-state s positions))))))
>>>
>>> they interfere with each other constantly, because any write to 'state'
>>> causes the other threads to retry, even if their positions are still free.
>>>
>>> One really wants to reverse the order of find-positions and dosync here,
>>> so it computes positions, then tries to commit them, checking in the
>>> transaction that the positions are still free, and continuing the search if
>>> they're not.
>>>
>>> I suppose there's some solution involving exceptions. Is there a better
>>> way to think about this?
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "Clojure" group.
>>> To post to this group, send email to clo...@googlegroups.com
>>> Note that posts from new members are moderated - please be patient with
>>> your first post.
>>> To unsubscribe from this group, send email to
>>> clojure+u...@googlegroups.com
>>> For more options, visit this group at
>>> http://groups.google.com/group/clojure?hl=en
>>> ---
>>> You received this message because you are subscribed to the Google
>>> Groups "Clojure" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to clojure+u...@googlegroups.com.
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>>
>>
>> --
>> “One of the main causes of the fall of the Roman Empire was that–lacking
>> zero–they had no way to indicate successful termination of their C
>> programs.”
>> (Robert Firth)
>>
> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clojure@googlegroups.com
> Note that posts from new members are moderated - please be patient with
> your first post.
> To unsubscribe from this group, send email to
> clojure+unsubscr...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en
> ---
> You received this message because you are subscribed to the Google Groups
> "Clojure" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to clojure+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>



-- 
“One of the main causes of the fall of the Roman Empire was that–lacking
zero–they had no way to indicate successful termination of their C
programs.”
(Robert Firth)

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
"Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Reply via email to