Hey,

Reducers is fascinating and quite complex at the same time. I feel like 
"best practices" around it has not quite solidified yet.

Here is how I made your example work:

(ns group-by-reducers.core
  (:require [clojure.core.reducers :as r :only [fold reduce map]])
  (:require [criterium.core :as c]))

(defn group-by-naive [f coll]
  (reduce
    (fn [groups a]
      (assoc groups (f a) (conj (get groups a []) a)))
    {}
    coll))

(defn group-by-red [f coll]
  (letfn [(reduce-groups
            ([] {})
            ([g1 g2]
              (merge-with concat g1 g2)))]
    (r/fold reduce-groups (r/map #(hash-map (f %) [%]) coll))))

(defn -main
  [& args]
  (c/quick-bench (group-by #(mod % 29) (range 10000)))
  (c/quick-bench (group-by-naive #(mod % 29) (range 10000)))
  (c/quick-bench (group-by-red #(mod % 29) (range 10000))))

(available as a gist here: https://gist.github.com/4232044)

The core and naive versions perform roughly equally (~4ms) while the 
reducers version takes 10x as much time (~40ms) on my two-core laptop.

I'm absolutely sure I'm doing something wrong (there is probably too much 
memory allocated by the map function) and would like to hear the opinion of 
someone more knowledgeable with reducers. I just did not want the subject 
to be buried.

Hope that still helps a bit,
Balint

ps. There are some tips and tricks in this 
post: http://www.thebusby.com/2012/07/tips-tricks-with-clojure-reducers.html

On Monday, December 3, 2012 5:04:17 PM UTC+1, Las wrote:
>
> Hi,
>
> As I was trying to wrap my head around the reducers library[1], I thought 
> implementing group-by would be a good exercise to gain some insight.
>
> After spending a few hours with it, I'm still pretty much clueless, so 
> hope to find someone here to help me out:
>
> So if I understood the reducer lingo introduced in [2],[3] and group-by 
> correctly, it reduces the following reducing function on a collection
>
> (fn group-by-reducef [keyfn ret x]
>      (let [k (keyfn x)]
>         (assoc ret k (conj (get ret k []) x))))
>
> where keyfn is provided by a partial function application.
>
> fold needs a combining function that takes two result maps that have 
> already been grouped and merges them.
> A naive implementation could look like
>
> (defn group-by-combinef
>   ([] {})
>   ([g1 g2]
>      (persistent!
>       (reduce (fn [res k v]
>                 (assoc! res k (into (get res k []) v)))
>               (transient g1) g2))))
>
> (defn group-by [f coll]
>   (fold (partial gr-by-reducef f) gr-by-combinef coll))
>
> Now couple of questions:
> 1) I expected fold to actually perform the operation, how can I force it 
> to give me the result?
> 2) Can somehow the actual reducing at the leaf nodes still take advantage 
> of transient collections?
> 3) I took a look at flatten as it seems the "closest" match. Again, if I 
> call (flatten [[1 2] [2 4]]), I don't actually get the result. How do I get 
> to the result?
>
>
> Thanks!
>
> [1] 
> https://github.com/clojure/clojure/blob/master/src/clj/clojure/core/reducers.clj
> [2] 
> http://clojure.com/blog/2012/05/08/reducers-a-library-and-model-for-collection-processing.html
> [3] http://clojure.com/blog/2012/05/15/anatomy-of-reducer.html
> -- 
> László Török
>
> 

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Reply via email to