Absolutely
On Fri, Dec 7, 2012 at 1:29 PM, László Török <ltoro...@gmail.com> wrote: > Hi, > > shouldn't > > (fn [groups a] > (assoc groups (f a) (conj (get groups a []) a))) > > be > > (fn [groups a] > (let [k (f a)] > (assoc groups k (conj (get groups k []) a)))) > > ? > > Las > > 2012/12/7 Christophe Grand <christo...@cgrand.net> > >> Hi, >> >> There are indeed too much allocationfoing on in your r/map. >> You don't need the rmap, >> >> Start from a plain old reduce like your reduce-by-naive, replace reduce >> by r/fold, remove the seed and add the combine-fn (which now provides the >> seed): >> >> (defn group-by-red [f coll] >> (r/fold >> (partial merge-with concat) >> (fn [groups a] >> (assoc groups (f a) (conj (get groups a []) a))) >> coll)) >> >> >> Crude benchmark; >> (let [v (vec (range 1000000))] >> (dotimes [n 5] >> (println "Run" n) >> (time (group-by #(mod % 29) v)) >> (time (group-by-naive #(mod % 29) v)) >> (time (group-by-red #(mod % 29) v)))) >> Run 0 >> "Elapsed time: 938.878 msecs" >> "Elapsed time: 778.161 msecs" >> "Elapsed time: 1035.997 msecs" >> Run 1 >> "Elapsed time: 715.352 msecs" >> "Elapsed time: 771.913 msecs" >> "Elapsed time: 840.264 msecs" >> Run 2 >> "Elapsed time: 656.588 msecs" >> "Elapsed time: 700.114 msecs" >> "Elapsed time: 777.323 msecs" >> Run 3 >> "Elapsed time: 731.781 msecs" >> "Elapsed time: 713.547 msecs" >> "Elapsed time: 875.801 msecs" >> Run 4 >> "Elapsed time: 689.711 msecs" >> "Elapsed time: 710.728 msecs" >> "Elapsed time: 1112.609 msecs" >> nil >> >> Let's play with the granularity (defaults to 512) >> (defn group-by-red [f coll] >> (r/fold 2048 >> (partial merge-with concat) >> (fn [groups a] >> (assoc groups (f a) (conj (get groups a []) a))) >> coll)) >> >> Re-benchmark: >> Run 0 >> "Elapsed time: 810.676 msecs" >> "Elapsed time: 736.764 msecs" >> "Elapsed time: 547.651 msecs" >> Run 1 >> "Elapsed time: 681.249 msecs" >> "Elapsed time: 850.046 msecs" >> "Elapsed time: 521.27 msecs" >> Run 2 >> "Elapsed time: 669.385 msecs" >> "Elapsed time: 712.15 msecs" >> "Elapsed time: 518.502 msecs" >> Run 3 >> "Elapsed time: 673.108 msecs" >> "Elapsed time: 745.688 msecs" >> "Elapsed time: 542.837 msecs" >> Run 4 >> "Elapsed time: 654.196 msecs" >> "Elapsed time: 723.074 msecs" >> "Elapsed time: 506.861 msecs" >> >> Note that you are not exactly comparing apples to apples since >> group-by-red is returning maps whose values are unrealized lazy sequence >> (concats of concats of ocncats of vactors) instead of vectors >> >> (defn group-by-red [f coll] >> (r/fold 2048 >> (partial merge-with into) >> (fn [groups a] >> (assoc groups (f a) (conj (get groups a []) a))) >> coll)) >> >> Run 0 >> "Elapsed time: 763.455 msecs" >> "Elapsed time: 763.501 msecs" >> "Elapsed time: 681.075 msecs" >> Run 1 >> "Elapsed time: 645.52 msecs" >> "Elapsed time: 731.545 msecs" >> "Elapsed time: 476.381 msecs" >> Run 2 >> "Elapsed time: 660.775 msecs" >> "Elapsed time: 728.19 msecs" >> "Elapsed time: 475.543 msecs" >> Run 3 >> "Elapsed time: 657.255 msecs" >> "Elapsed time: 725.995 msecs" >> "Elapsed time: 494.038 msecs" >> Run 4 >> "Elapsed time: 647.53 msecs" >> "Elapsed time: 731.085 msecs" >> "Elapsed time: 538.649 msecs" >> >> hth, >> >> Christophe >> >> >> On Fri, Dec 7, 2012 at 10:30 AM, Balint Erdi <balint.e...@gmail.com>wrote: >> >>> BTW I understood the most about reducers (still not quite there yet, >>> though :) ) from Rich Hickey's talk at EuroClojure: >>> >>> https://vimeo.com/45561411 >>> >>> >>> On Friday, December 7, 2012 10:21:59 AM UTC+1, Balint Erdi wrote: >>>> >>>> Hey, >>>> >>>> Reducers is fascinating and quite complex at the same time. I feel like >>>> "best practices" around it has not quite solidified yet. >>>> >>>> Here is how I made your example work: >>>> >>>> (ns group-by-reducers.core >>>> (:require [clojure.core.reducers :as r :only [fold reduce map]]) >>>> (:require [criterium.core :as c])) >>>> >>>> (defn group-by-naive [f coll] >>>> (reduce >>>> (fn [groups a] >>>> (assoc groups (f a) (conj (get groups a []) a))) >>>> {} >>>> coll)) >>>> >>>> (defn group-by-red [f coll] >>>> (letfn [(reduce-groups >>>> ([] {}) >>>> ([g1 g2] >>>> (merge-with concat g1 g2)))] >>>> (r/fold reduce-groups (r/map #(hash-map (f %) [%]) coll)))) >>>> >>>> (defn -main >>>> [& args] >>>> (c/quick-bench (group-by #(mod % 29) (range 10000))) >>>> (c/quick-bench (group-by-naive #(mod % 29) (range 10000))) >>>> (c/quick-bench (group-by-red #(mod % 29) (range 10000)))) >>>> >>>> (available as a gist here: >>>> https://gist.github.com/**4232044<https://gist.github.com/4232044> >>>> ) >>>> >>>> The core and naive versions perform roughly equally (~4ms) while the >>>> reducers version takes 10x as much time (~40ms) on my two-core laptop. >>>> >>>> I'm absolutely sure I'm doing something wrong (there is probably too >>>> much memory allocated by the map function) and would like to hear the >>>> opinion of someone more knowledgeable with reducers. I just did not want >>>> the subject to be buried. >>>> >>>> Hope that still helps a bit, >>>> Balint >>>> >>>> ps. There are some tips and tricks in this post: >>>> http://www.thebusby.com/**2012/07/tips-tricks-with-** >>>> clojure-reducers.html<http://www.thebusby.com/2012/07/tips-tricks-with-clojure-reducers.html> >>>> >>>> On Monday, December 3, 2012 5:04:17 PM UTC+1, Las wrote: >>>>> >>>>> Hi, >>>>> >>>>> As I was trying to wrap my head around the reducers library[1], I >>>>> thought implementing group-by would be a good exercise to gain some >>>>> insight. >>>>> >>>>> After spending a few hours with it, I'm still pretty much clueless, so >>>>> hope to find someone here to help me out: >>>>> >>>>> So if I understood the reducer lingo introduced in [2],[3] and >>>>> group-by correctly, it reduces the following reducing function on a >>>>> collection >>>>> >>>>> (fn group-by-reducef [keyfn ret x] >>>>> (let [k (keyfn x)] >>>>> (assoc ret k (conj (get ret k []) x)))) >>>>> >>>>> where keyfn is provided by a partial function application. >>>>> >>>>> fold needs a combining function that takes two result maps that have >>>>> already been grouped and merges them. >>>>> A naive implementation could look like >>>>> >>>>> (defn group-by-combinef >>>>> ([] {}) >>>>> ([g1 g2] >>>>> (persistent! >>>>> (reduce (fn [res k v] >>>>> (assoc! res k (into (get res k []) v))) >>>>> (transient g1) g2)))) >>>>> >>>>> (defn group-by [f coll] >>>>> (fold (partial gr-by-reducef f) gr-by-combinef coll)) >>>>> >>>>> Now couple of questions: >>>>> 1) I expected fold to actually perform the operation, how can I force >>>>> it to give me the result? >>>>> 2) Can somehow the actual reducing at the leaf nodes still take >>>>> advantage of transient collections? >>>>> 3) I took a look at flatten as it seems the "closest" match. Again, if >>>>> I call (flatten [[1 2] [2 4]]), I don't actually get the result. How do I >>>>> get to the result? >>>>> >>>>> >>>>> Thanks! >>>>> >>>>> [1] https://github.com/**clojure/clojure/blob/master/** >>>>> src/clj/clojure/core/reducers.**clj<https://github.com/clojure/clojure/blob/master/src/clj/clojure/core/reducers.clj> >>>>> [2] http://clojure.com/blog/**2012/05/08/reducers-a-library-** >>>>> and-model-for-collection-**processing.html<http://clojure.com/blog/2012/05/08/reducers-a-library-and-model-for-collection-processing.html> >>>>> [3] >>>>> http://clojure.com/blog/**2012/05/15/anatomy-of-reducer.**html<http://clojure.com/blog/2012/05/15/anatomy-of-reducer.html> >>>>> -- >>>>> László Török >>>>> >>>>> -- >>> You received this message because you are subscribed to the Google >>> Groups "Clojure" group. >>> To post to this group, send email to clojure@googlegroups.com >>> Note that posts from new members are moderated - please be patient with >>> your first post. >>> To unsubscribe from this group, send email to >>> clojure+unsubscr...@googlegroups.com >>> For more options, visit this group at >>> http://groups.google.com/group/clojure?hl=en >>> >> >> >> >> -- >> On Clojure http://clj-me.cgrand.net/ >> Clojure Programming http://clojurebook.com >> Training, Consulting & Contracting http://lambdanext.eu/ >> >> -- >> You received this message because you are subscribed to the Google >> Groups "Clojure" group. >> To post to this group, send email to clojure@googlegroups.com >> Note that posts from new members are moderated - please be patient with >> your first post. >> To unsubscribe from this group, send email to >> clojure+unsubscr...@googlegroups.com >> For more options, visit this group at >> http://groups.google.com/group/clojure?hl=en >> > > > > -- > László Török > > -- > You received this message because you are subscribed to the Google > Groups "Clojure" group. > To post to this group, send email to clojure@googlegroups.com > Note that posts from new members are moderated - please be patient with > your first post. > To unsubscribe from this group, send email to > clojure+unsubscr...@googlegroups.com > For more options, visit this group at > http://groups.google.com/group/clojure?hl=en > -- On Clojure http://clj-me.cgrand.net/ Clojure Programming http://clojurebook.com Training, Consulting & Contracting http://lambdanext.eu/ -- You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en