I've been moving house for the last week or so, but I'll give the benchmark another look as well. My initial profiling seemed to show that the parallel version was spending a significant amount of time in java.lang.Class.isArray. clojush.pushstate/stack-ref calls nth on the result of cons; since that result isn't an instance of clojure.lang.Indexed, nth falls back to a series of type tests (including isArray) before returning the value.
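
For anyone who wants to check this at a REPL, here is a small sketch. The stack contents are made up for illustration and are not taken from clojush; it only shows that the result of cons is not Indexed, while a vector built from it is.

```clojure
;; Illustrative only: these values stand in for a Push stack.
(def consed (cons 3 [2 1]))   ; cons returns a clojure.lang.Cons
(def as-vec (vec consed))     ; vec copies it into a PersistentVector

;; Cons does not implement clojure.lang.Indexed; PersistentVector does.
(def consed-indexed? (instance? clojure.lang.Indexed consed))
(def vec-indexed?    (instance? clojure.lang.Indexed as-vec))

;; nth returns the same element either way; only the lookup path differs.
(def same-result? (= (nth consed 1) (nth as-vec 1)))

(println consed-indexed? vec-indexed? same-result?)
;; prints: false true true
```

The lookup on the vector goes straight through the Indexed interface, whereas the lookup on the Cons has to walk the seq after the type tests fail.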
I changed the clojush.pushstate/push-item implementation to

    (assoc state type (vec (cons value (type state)))) ;; vec added to ensure result is indexable

This slowed down my single-threaded version a bit, but improved my parallel speedup from 2x to 3x on a machine with 8 physical cores. We could easily improve on this by replacing the cons with conj and updating the code that pops the state, or by implementing a more efficient indexable stack with deftype.

After the change above, clojush.interpreter/execute_instruction is looking like a hotspot: clojure.lang.ArraySeq creation seems to spend more time in java.lang.Class.getComponentType() in the parallel version than in the serial one.

Cameron.