I've posted code for faster, multi-argument clojure.set functions
here:

http://paste.lisp.org/display/74534

Basic algorithms:
 intersection: start with smallest set, and remove elements from it
that aren't in each other set
                     (always iterate through result-in-progress, which
will always be smaller)
 union: start with biggest set, and add elements to it that are in
other sets
                     (always iterate through other set, which will be
smaller than result-in-progress)
 difference: start with first set, and remove elements from all other
sets
                     (for each other set, choose on the fly which to
iterate through based on size)

Suggestions/comments/optimizations welcome.

Below are some timings comparing them to the current clojure.set
versions.  I've run a number of experiments, and these are the fastest
versions I could come up with.  They will be a bit slower than
"reduce"ing with the current ones (50% or so) when passed several tiny
(i.e., 1-element) sets due to the overhead of extracting the big/small
element, but as you can see, can be much much faster when there are
both big and small sets involved.

-Jason

(Rich, if you don't want multi-arg versions for now, let me know; I
figured since there was already an issue for that, I might as well
combine them.)

user> (let [small #{1}, big1 (set (range 10000)), big2 (set (range
5000 15000))]
        (doseq [f [clojure.set/union union clojure.set/difference difference
clojure.set/intersection intersection]]
          (prn f)
          (time (dotimes [_ 10000000] (f small small)))
          (time (dotimes [_ 1000] (f small big1)))
          (time (dotimes [_ 1000] (f big1 small)))
          (time (dotimes [_ 1000] (f big1 big2)))
          (prn)))

#<set$union__5561 clojure.set$union__5...@c347ea>
"Elapsed time: 5170.231 msecs"
"Elapsed time: 23515.553 msecs"
"Elapsed time: 1.232 msecs"
"Elapsed time: 15733.047 msecs"

#<user$union__1166 user$union__1...@c4a6d8>
"Elapsed time: 7046.297 msecs"
"Elapsed time: 1.266 msecs"
"Elapsed time: 1.004 msecs"
"Elapsed time: 13562.658 msecs"

#<set$difference__5564 clojure.set$difference__5...@fb004>
"Elapsed time: 7032.965 msecs"
"Elapsed time: 3326.449 msecs"
"Elapsed time: 6.747 msecs"
"Elapsed time: 12569.736 msecs"

#<user$difference__1189 user$difference__1...@8c75f4>
"Elapsed time: 12001.778 msecs"
"Elapsed time: 41.649 msecs"
"Elapsed time: 20.441 msecs"
"Elapsed time: 16588.469 msecs"

#<set$intersection__5567 clojure.set$intersection__5...@69f31d>
"Elapsed time: 9972.448 msecs"
"Elapsed time: 3128.244 msecs"
"Elapsed time: 16500.68 msecs"
"Elapsed time: 20531.455 msecs"

#<user$intersection__1178 user$intersection__1...@1e6003>
"Elapsed time: 7990.208 msecs"
"Elapsed time: 1.4 msecs"
"Elapsed time: 2.463 msecs"
"Elapsed time: 12463.575 msecs"

(These may be a big screwy, but should be in the right ballpark.)
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Clojure" group.
To post to this group, send email to clojure@googlegroups.com
To unsubscribe from this group, send email to 
clojure+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/clojure?hl=en
-~----------~----~----~----~------~----~------~--~---

Reply via email to