On 15/01/13 09:25, Marko Topolnik wrote:
The order in which you are polling is not very relevant given the fact
that /doall/ won't return until *all* futures are realized. It's just
an internal detail.
I finally fully grasped what you were saying... So yes, you're right: as
long as I'm forcing realisation at the end there is nothing to be
gained. However, what if I submit jobs eagerly and poll for results
lazily? Then there must be some gain from using the completion
service, which will bring back the results in the order they finished...
Some basic testing:
(defn pool-map
  "A saner, more disciplined version of pmap. Submits jobs eagerly but
  polls for results lazily.
  Don't use if original ordering of 'coll' matters."
  [f coll]
  (let [cpu-no  (.. Runtime getRuntime availableProcessors)
        exec    (java.util.concurrent.Executors/newFixedThreadPool cpu-no)
        pool    (java.util.concurrent.ExecutorCompletionService. exec)
        futures (doall (for [x coll] (.submit pool #(f x))))] ;;submit everything up front
    (try
      ;;'for' is lazy, so 'finally' runs before any result is taken;
      ;;.shutdown still lets the already-submitted jobs finish
      (for [_ futures] (.. pool take get))
      (finally (.shutdown exec)))))
;;your version is 'pool-map1'
;;weirdly enough 'pool-map1' doesn't behave lazily (even though it has a call to 'map'!)!!!
user=> (def dummy-times [3000 10 9 8 7 6 5 4 3 2 1])
#'user/dummy-times
user=> (time (pmap #(do (Thread/sleep %) %) dummy-times))
"Elapsed time: 16.213366 msecs"
(3000 10 9 8 7 6 5 4 3 2 1) ;;here you waited 3s before sleeping for 0.01 s
user=> (time (pool-map #(do (Thread/sleep %) %) dummy-times))
"Elapsed time: 21.004979 msecs"
(10 9 8 7 6 5 4 3 2 1 3000) ;;here you've not waited at all - sleeping for 3s finished last and is last
user=> (time (pool-map1 #(do (Thread/sleep %) %) dummy-times))
"Elapsed time: 3008.174631 msecs" ;;non-lazy?
(3000 10 9 8 7 6 5 4 3 2 1) ;;again your version will wait for the first item to finish before proceeding
I think what you are trying to get across is that the overall timings (if
we do realise the result) will not differ much, as all jobs have to
finish eventually. In other words, sleeping for 3 s first and for 1 s
later is the same thing as sleeping for 1 s and then for 3 s...
and of course this is generally true! However, there is no
real benefit in waiting for the 1st task to finish when we don't mind
about ordering. You'll get the first item whenever it finishes, in
whatever position... This MUST be good, but perhaps it needs to be paired
with laziness to witness any effect?
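
To make the "paired with laziness" point concrete, here is a minimal sketch
of what I have in mind. The names 'completion-order' and 'sleep-then-return'
are made up for this illustration, and I am assuming one thread per task so
that the outcome does not depend on the number of cores. The result is built
with 'repeatedly', so each element is taken from the completion service only
when it is asked for:

```clojure
(import '(java.util.concurrent ExecutorCompletionService Executors))

(defn sleep-then-return
  "Hypothetical stand-in for a job that takes 'ms' milliseconds."
  [ms]
  (Thread/sleep (long ms))
  ms)

(defn completion-order
  "Hypothetical helper: submits (f x) for every x up front and returns a
  lazy seq of the results in the order the jobs finish."
  [f coll]
  (let [items (vec coll)
        ;; assumption: one thread per task, so no job queues behind another
        exec  (Executors/newFixedThreadPool (max 1 (count items)))
        pool  (ExecutorCompletionService. exec)]
    (doseq [x items]
      (.submit pool #(f x)))
    ;; no new jobs are accepted, but the submitted ones still run
    (.shutdown exec)
    ;; each element blocks only until the *next* job has finished
    (repeatedly (count items) #(.get (.take pool)))))

;; asking for just the first result should not have to wait for the slow job
(first (completion-order sleep-then-return [300 10 100]))
```

With an ordered approach such as pmap, asking for the first element means
waiting for the first job submitted, however slow it is. Here the first
element is simply whichever job finished first.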
Taking into account all that was said, /pool-map/ can't offer much more
than /pmap/. You can't know which tasks will take less time until they
are already done. It is theoretically impossible to pre-order them
according to execution time, thereby harvesting the results of the
fastest ones earlier, eventually promoting total concurrency.
hmmm... so the completion service is useless? It can't be... You say
that 'You can't know which tasks will take less time until they are
already done', but the way I see it you don't need to... all you need to
know at any given time is whether one or more futures have completed. If
one has indeed completed, you invoke .get for the result. If it hasn't
finished and you do .get, it will block until it finishes, just like
deref-ing in Clojure... I honestly don't see why harvesting the results
of the fastest ones earlier requires knowing the execution times up
front! As you go along you can ask the futures whether they have finished
or not, can't you?
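
Roughly what I mean by asking the futures, as a sketch only: 'submit-all'
and 'done-results' are names I made up, and the pool again has one thread
per task (my assumption, to keep the example deterministic). Nothing here
needs to know the execution times in advance; .isDone is checked at
whatever moment you choose to look:

```clojure
(import '(java.util.concurrent Callable ExecutorService Executors Future))

(defn submit-all
  "Hypothetical helper: submits (f x) for each x and returns the Futures
  in submission order."
  [f coll]
  (let [items (vec coll)
        ^ExecutorService exec (Executors/newFixedThreadPool (max 1 (count items)))
        futures (mapv (fn [x]
                        ;; the hint picks submit(Callable) over submit(Runnable)
                        (let [^Callable task (fn [] (f x))]
                          (.submit exec task)))
                      items)]
    (.shutdown exec) ; submitted jobs still run to completion
    futures))

(defn done-results
  "Hypothetical helper: results of the futures that have already finished,
  skipping (not blocking on) the ones still running."
  [futures]
  (vec (for [^Future fut futures
             :when (.isDone fut)]
         (.get fut))))

(def futures (submit-all (fn [ms] (Thread/sleep (long ms)) ms) [400 10 20]))
(Thread/sleep 150)
(done-results futures)            ; only the jobs that have finished so far
(mapv #(.get ^Future %) futures)  ; blocks until every job is done
```

The completion service does this bookkeeping for you: instead of scanning
the futures with .isDone, .take hands back whichever one finished next.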
I am in no way trying to contradict you, I'm just trying to set things
straight so we are all on the same page... again, thanks for your time and
comments! :)
Jim