I'm walking a seq of many millions of floats, encoding them for the
persistence layer, and getting sequence ids from the db. So, conceptually
there are two parts: the slow part (encode) and the side-effecting part
(get-ids). Vaguely like

(map get-ids (map encode float-seq))

which is later reduced while writing to disk. In the get-ids step the order 
matters. My first attempt to make the slow part parallel was to use pmap,

(map get-ids (pmap encode float-seq))

However, that's actually slower. I expect this is because even though
"encode" is the bottleneck overall, a single call to it is still cheaper
than pmap's per-element overhead. I next tried pmap over groups of floats,
a bit like

(map get-ids
     (flatten (pmap #(map encode %)
                    (partition-all 20000 float-seq))))

(Sorry for any typos, I'm just pseudo-coding here.) This was still slower,
which surprised me. I understand the first pmap result, but this one is
puzzling to me. Even if I partition into chunks half the length of the seq
(so in theory it can run two threads, each of which will run five or six
seconds), it's no faster than map. Part of this seems to be the overhead of
creating more intermediate seqs. Perhaps I'm misunderstanding what's
happening during partition-all.
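
One possible explanation for the chunked result, sketched under assumptions: map is lazy, so #(map encode %) hands back an unrealized seq almost immediately, and the actual encoding then runs on whichever thread consumes it (here, the one running flatten and the outer map). The snippet below uses a hypothetical stand-in for encode and small made-up inputs to contrast a lazy inner map with an eager mapv.

```clojure
;; Hypothetical stand-in for the real encoder; assumed pure and CPU-bound.
(defn encode [x] (* 2.0 x))

;; Small made-up input; the real seq has many millions of floats.
(def float-seq (map double (range 100)))

;; Lazy inner map: each pmap task returns an unrealized lazy seq,
;; so no encoding has happened on the worker thread yet.
(def lazy-chunks
  (pmap #(map encode %) (partition-all 20 float-seq)))

;; Eager inner mapv: each chunk is fully encoded inside its pmap task.
(def eager-chunks
  (pmap #(mapv encode %) (partition-all 20 float-seq)))

(println "lazy chunk realized by worker?" (realized? (first lazy-chunks)))
(println "eager chunk is a vector?" (vector? (first eager-chunks)))
```

If this is what is happening, swapping the inner map for mapv (or wrapping it in doall) would move the encoding onto the worker threads. Whether that accounts for the whole slowdown would need measuring.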

Is there some obvious way to approach this scenario? I looked briefly at
the reducers library, but it was unclear to me how to deal with the
side-effecting portion of the operation. The second (fast) map operation
needs to be done in order.
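
A minimal sketch of one way this could be structured, not a definitive answer: encode chunks in parallel with pmap (which returns results in input order), realize each chunk eagerly inside its task, then run the side-effecting step sequentially on the calling thread. The encode and get-ids bodies, the counter atom, the encode-parallel helper, and the chunk size are all hypothetical placeholders.

```clojure
;; Hypothetical stand-in for the slow, pure encoding step.
(defn encode [x] (* 2.0 x))

;; Hypothetical stand-in for the db call: hands out increasing ids,
;; so the order in which it is called matters.
(def counter (atom 0))
(defn get-ids [encoded] [(swap! counter inc) encoded])

(defn encode-parallel
  "Encodes xs in chunks of chunk-size, one pmap task per chunk.
  mapv forces each chunk inside its task; pmap keeps chunk order."
  [chunk-size xs]
  (->> (partition-all chunk-size xs)
       (pmap #(mapv encode %))
       (apply concat)))

;; The side-effecting step stays sequential and sees items in input order.
(def result
  (doall (map get-ids (encode-parallel 1000 (map double (range 5000))))))
```

The chunk size here is arbitrary; a suitable value would depend on how expensive a single encode call is relative to the cost of scheduling a task.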

-- 
-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en