No worries. Thanks, I'll give that a try as well!

On Thursday, November 19, 2015 at 1:04:04 AM UTC+9, tbc++ wrote:
>
> Oh, then I completely misunderstood the problem at hand here. If that's
> the case then do the following:
>
> Change "atom" to "volatile!" and "swap!" to "vswap!". See if that changes
> anything.
>
> Timothy
>
>
> On Wed, Nov 18, 2015 at 9:00 AM, David Iba <davi...@gmail.com> wrote:
>
>> Timothy: Each thread (call of f2) creates its own "local" atom, so I
>> don't think there should be any swap retries.
>>
>> Gianluca: Good idea! I've only tried OpenJDK, but I will look into
>> trying Oracle and report back.
>>
>> Andy: jvisualvm was showing pretty much all of the memory allocated in
>> the eden space and a little in the first survivor (no major/full GCs),
>> and total GC time was very minimal.
>>
>> I'm in the middle of running some more tests and will report back when I
>> get a chance today or tomorrow. Thanks for all the feedback on this!
>>
>> On Thursday, November 19, 2015 at 12:38:55 AM UTC+9, tbc++ wrote:
>>>
>>> This sort of code is somewhat the worst-case situation for atoms (or
>>> really for CAS). Clojure's swap! is based on the "compare-and-swap" or
>>> CAS operation that most x86 CPUs have as an instruction. If we expand
>>> swap! it looks something like this:
>>>
>>> ;; compare-and-set! is the CAS primitive Clojure exposes for atoms
>>> (loop [old-val @x*]
>>>   (let [new-val (assoc old-val :k i)]
>>>     (if (compare-and-set! x* old-val new-val)
>>>       new-val
>>>       (recur @x*))))
>>>
>>> Compare-and-swap can be defined as "updates the content of the reference
>>> to new-val only if the current value of the reference is equal to
>>> old-val".
>>>
>>> So in essence, only one core can be modifying the contents of an atom at
>>> a time. If the atom is modified during the execution of the swap! call,
>>> then swap! will continue to re-run your function until it's able to
>>> update the atom without it being modified during the function's
>>> execution.
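The retry behaviour described just above can be observed directly. Here is a minimal sketch, assuming several futures share one atom (unlike f2, where each call has its own). The names contended-swaps, n-threads and n-incs are made up for illustration and are not from the thread:

```clojure
;; Sketch only: contended-swaps and its parameters are illustrative names.
(defn contended-swaps
  "Runs n-threads futures that each call swap! n-incs times on ONE shared
  atom. Returns the final atom value and how many times the update function
  was actually invoked (retries included)."
  [n-threads n-incs]
  (let [a     (atom 0)
        calls (java.util.concurrent.atomic.AtomicLong. 0)
        ;; The side effect inside the swap! fn exists only to count calls.
        f     (fn [v] (.incrementAndGet calls) (inc v))
        ;; doall forces the lazy seq so every future starts right away.
        futs  (doall (for [_ (range n-threads)]
                       (future (dotimes [_ n-incs] (swap! a f)))))]
    (doseq [fut futs] @fut)
    {:final @a :calls (.get calls)}))

(contended-swaps 4 100000)
;; :final is always (* 4 100000); :calls is at least that, and any excess
;; is work that swap! threw away and re-ran.
```

With a single thread, or with one atom per thread as in f2, :calls should equal :final, which is one way to check that retries are not what is slowing f2 down.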
>>>
>>> So let's say you have some super long task that you need to integrate
>>> into an atom. Here's one way to do it, but probably not the best:
>>>
>>> (let [a (atom 0)]
>>>   (dotimes [x 18]
>>>     (future
>>>       (swap! a long-operation-on-score some-param))))
>>>
>>>
>>> In this case long-operation-on-score will need to be re-run every time
>>> another thread modifies the atom. However, if our function only needs the
>>> state of the atom to add to it, then we can do something like this
>>> instead:
>>>
>>> (let [a (atom 0)]
>>>   (dotimes [x 18]
>>>     (future
>>>       (let [score (long-operation-on-score some-param)]
>>>         (swap! a + score)))))
>>>
>>> Now we only have a simple addition inside the swap! and we will have
>>> less contention between the CPUs because they will most likely be
>>> spending more time inside 'long-operation-on-score' instead of inside
>>> the swap.
>>>
>>> *TL;DR*: do as little work as possible inside swap!; the more you have
>>> inside it, the higher the chance of throwing away work due to swap!
>>> retries.
>>>
>>> Timothy
>>>
>>> On Wed, Nov 18, 2015 at 8:13 AM, gianluca torta <giat...@gmail.com>
>>> wrote:
>>>
>>>> By the way, have you tried both Oracle and OpenJDK with the same
>>>> results?
>>>> Gianluca
>>>>
>>>> On Tuesday, November 17, 2015 at 8:28:49 PM UTC+1, Andy Fingerhut wrote:
>>>>>
>>>>> David, you say "Based on jvisualvm monitoring, doesn't seem to be
>>>>> GC-related".
>>>>>
>>>>> What is jvisualvm showing you related to GC and/or memory allocation
>>>>> when you tried the 18-core version with 18 threads in the same process?
>>>>>
>>>>> Even memory allocation could become a point of contention, depending
>>>>> upon how the memory allocation works with many threads, e.g. it depends
>>>>> on whether a thread gets a large chunk of memory on a global lock, and
>>>>> then locally carves it up into the small pieces it needs for each
>>>>> individual Java 'new' allocation, or gets a global lock for every 'new'.
>>>>> The latter would give terrible performance as the number of cores
>>>>> increases, but I don't know how to tell whether that is the case,
>>>>> except by knowing more about how the memory allocator is implemented in
>>>>> your JVM. Maybe digging through OpenJDK source code in the right place
>>>>> would tell?
>>>>>
>>>>> Andy
>>>>>
>>>>> On Tue, Nov 17, 2015 at 2:00 AM, David Iba <davi...@gmail.com> wrote:
>>>>>
>>>>>> Correction: that "do" should be a "doall". (My actual test code was
>>>>>> a bit different, but each run printed some info when it started, so
>>>>>> it doesn't have to do with delayed evaluation of lazy seqs or
>>>>>> anything.)
>>>>>>
>>>>>>
>>>>>> On Tuesday, November 17, 2015 at 6:49:16 PM UTC+9, David Iba wrote:
>>>>>>>
>>>>>>> Andy: Interesting. Thanks for educating me on the fact that atom
>>>>>>> swaps don't use the STM. Your theory seems plausible... I will try
>>>>>>> those tests next time I launch the 18-core instance, but yeah, not
>>>>>>> sure how illuminating the results will be.
>>>>>>>
>>>>>>> Niels: along the lines of this (so that each thread prints its time
>>>>>>> as well as printing the overall time):
>>>>>>>
>>>>>>> (time
>>>>>>>   (let [f f1
>>>>>>>         n-runs 18
>>>>>>>         futs (do (for [i (range n-runs)]
>>>>>>>                    (future (time (f)))))]
>>>>>>>     (doseq [fut futs]
>>>>>>>       @fut)))
>>>>>>>
>>>>>>>
>>>>>>> On Tuesday, November 17, 2015 at 5:33:01 PM UTC+9, Niels van
>>>>>>> Klaveren wrote:
>>>>>>>>
>>>>>>>> Could you also show how you are running these functions in parallel
>>>>>>>> and timing them? The way you start the functions can have as much
>>>>>>>> impact as the functions themselves.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Niels
>>>>>>>>
>>>>>>>> On Tuesday, November 17, 2015 at 6:38:39 AM UTC+1, David Iba wrote:
>>>>>>>>>
>>>>>>>>> I have functions f1 and f2 below, and let's say they run in T1 and
>>>>>>>>> T2 amount of time when running a single instance/thread.
>>>>>>>>> The issue I'm facing is that parallelizing f2 across 18 cores takes
>>>>>>>>> anywhere from 2-5X T2, and for more complex funcs takes absurdly
>>>>>>>>> long.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> (defn f1 []
>>>>>>>>>   (apply + (range 2e9)))
>>>>>>>>>
>>>>>>>>> ;; Note: each call to (f2) makes its own x* atom, so the
>>>>>>>>> ;; 'swap!' should never retry.
>>>>>>>>> (defn f2 []
>>>>>>>>>   (let [x* (atom {})]
>>>>>>>>>     (loop [i 1e9]
>>>>>>>>>       (when-not (zero? i)
>>>>>>>>>         (swap! x* assoc :k i)
>>>>>>>>>         (recur (dec i))))))
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Of note:
>>>>>>>>> - On a 4-core machine, both f1 and f2 parallelize well (roughly T1
>>>>>>>>>   and T2 for 4 runs in parallel).
>>>>>>>>> - Running 18 f1's in parallel on the 18-core machine also
>>>>>>>>>   parallelizes well.
>>>>>>>>> - Disabling hyperthreading doesn't help.
>>>>>>>>> - Based on jvisualvm monitoring, doesn't seem to be GC-related.
>>>>>>>>> - Also tried on a dedicated 18-core EC2 instance with the same
>>>>>>>>>   issues, so not shared-tenancy-related.
>>>>>>>>> - If I make a jar that runs a single f2 and launch 18 in parallel,
>>>>>>>>>   it parallelizes well (so I don't think it's machine/AWS-related).
>>>>>>>>>
>>>>>>>>> Could it be that the 18 f2's in parallel on a single JVM instance
>>>>>>>>> are overworking the STM with all the swaps? Any other theories?
>>>>>>>>>
>>>>>>>>> Thanks!
>>>>>>>>>
>>>
>>> --
>>> “One of the main causes of the fall of the Roman Empire was that–lacking
>>> zero–they had no way to indicate successful termination of their C
>>> programs.”
>>> (Robert Firth)
>>>
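
For reference, the change suggested at the top of the thread (atom -> volatile!, swap! -> vswap!) might look roughly like the sketch below. This is an assumption-laden sketch rather than the code that was actually run: f2-volatile and run-parallel are illustrative names, the iteration count is a parameter (coerced to a long) so it can be kept small for a quick check, and volatile!/vswap! need Clojure 1.7 or later.

```clojure
;; Sketch only: f2-volatile and run-parallel are hypothetical names.
(defn f2-volatile
  "Same loop shape as f2, but with a thread-local volatile! instead of an
  atom, so there is no CAS at all. Returns the final map."
  [n]
  (let [x* (volatile! {})]
    (loop [i (long n)]
      (when-not (zero? i)
        (vswap! x* assoc :k i)
        (recur (dec i))))
    @x*))

(defn run-parallel
  "Starts n-runs futures of (f) eagerly (doall, so the lazy seq from `for`
  cannot delay them), waits for all of them, and returns their results."
  [n-runs f]
  (let [futs (doall (for [_ (range n-runs)]
                      (future (f))))]
    (mapv deref futs)))

;; Example: time 18 parallel runs of a shortened loop.
(time (run-parallel 18 #(f2-volatile 1e6)))
```

A volatile! is only safe here because each call creates and mutates its own box on a single thread. If the slowdown persists with this variant, CAS on the atom is unlikely to be the cause.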
--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with
your first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
---
You received this message because you are subscribed to the Google Groups
"Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.