Ghadi,
Thanks for you posting - please see the reply I have just posted.
I'll give foldcat a look - sounds interesting.
With regards to my code for splicing maps/sets together efficiently, I
spent some time looking at how that could be integrated with core.reducers
and did some timings :
(def a (into [] (range 1000000)))
(do (time (reduce conj #{} a)) nil) ;; 9225 ms
(do (time (persistent! (reduce conj! (transient #{}) a))) nil) ;; 5981 ms
(do (time (into #{} a)) nil) ;; 6056 ms
(def n (/ (count a) (.availableProcessors (Runtime/getRuntime)))) ;; =
625000
(do (time (r/fold (r/monoid into hash-set) conj a)) nil) ;; 9639 ms
(do (time (r/fold n (r/monoid into hash-set) conj a)) nil) ;; 6859 ms
(do (time (r/fold (r/monoid (fn [r l](clojure.lang.PersistentHashSet/splice
r l)) hash-set) conj a)) nil) ;; 3654 ms
(do (time (r/fold n (r/monoid (fn [r
l](clojure.lang.PersistentHashSet/splice r l)) hash-set) conj a)) nil) ;;
3288 ms
(= (into #{} a) (r/fold n (r/monoid (fn [r
l](clojure.lang.PersistentHashSet/splice r l)) hash-set) conj a)) ;; = true
I am happy to note that using all cores I can reduce into a set faster than
I can sequentially - 3288ms vs 5981ms - what you would hope for, but not
always as easy to achieve as you would expect.
regards
Jules
On Friday, 21 February 2014 01:22:09 UTC, Ghadi Shayban wrote:
>
> Jules,
> For recombination of parallel reductions into a vector, have you looked at
> foldcat<https://github.com/clojure/clojure/blob/master/src/clj/clojure/core/reducers.clj#L314-L318>?
>
> It works really well, and it's one of those wonderful gems in clojure.core
> waiting to be noticed. It gives you back a structure that is counted,
> seqable, and re-foldable - which covers probably 99% of use cases. The
> returned structure is not indexed, that is intentional.
>
> It's definitely conceivable to have the equivalent of a 'transient merge'
> op for maps, and a design doc on clojure
> wiki<http://dev.clojure.org/dashboard.action>is definitely warranted.
>
> I echo the recommendation for RRB-trees when you need efficient
> combination that returns an indexed structure. If you don't, foldcat =
> delicious. Jean is right about the many ways a superstructure can degrade.
>
> Ghadi
>
> On Thursday, February 20, 2014 8:09:26 PM UTC-5, Jean Niklas L'orange
> wrote:
>>
>> Hi Jules,
>>
>> On Thursday, February 20, 2014 11:59:03 PM UTC+1, Jules wrote:
>>>
>>> Subvec provides a view on a subseq of a vector that behaves like a full
>>> vector. Supervec provides a similar view that makes two vectors behave like
>>> a single one
>>>
>>
>> This data structure ("supervec") is usually known as a rope, and your
>> implementation degrades the *~O(1)* to worst case *O(k)* time for
>> lookups, assoc, conj and pop, where *k* is the amount of concatenations
>> performed. A good rope implementation can reduce that factor down to *O(log
>> k)* through balancing, but it will still break the performance
>> guarantees a persistent vector is supposed to have. If you try to use this
>> in the reducers library, the amount of concatenations fork/join performs
>> might actually show notable performance degradation for those operations.
>> Further, passing it around to code which might perform more concatenations
>> will lead to even worse performance, which may be hard to debug for people
>> not knowing Clojure internals.
>>
>> However, it is a very nice solution if you know there will be limited
>> concatenations, that the concatenations result in a balanced tree, and you
>> don't pass this vector around to other people. :)
>>
>> For a small discussion on this vs. RRB-trees, see Section 1.1 in
>> http://infoscience.epfl.ch/record/169879/files/RMTrees.pdf.
>>
>> Hopefully I'm not destroying your fun playing around with these
>> things—that is not the intent at all. I'm just saying that these things are
>> (sadly?) harder than it first looks like.
>>
>> -- JN
>>
>
--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to [email protected]
Note that posts from new members are moderated - please be patient with your
first post.
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
---
You received this message because you are subscribed to the Google Groups
"Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
For more options, visit https://groups.google.com/d/optout.