Hi, I need to build a data structure to store aliases of arbitrary
values (e.g. URIs), basically an undirected graph. Because I need a
fast lookup, I thought using a map with bi-directional mappings could
work well, e.g. stating "X" sameas "A" becomes this:
;; x == a : {a #{x} x #{a}}
Continuing further...
;; a == b : {a #{b x} b #{a} x #{a}}
;; c == y : {a #{b x} b #{a} c #{y} x #{a} y #{c}}
;; y == b : {a #{b x} b #{a y} c #{y] x #{a} y #{b c}}
I've written the functions below and they produce the correct result:
(def ->set (fnil conj #{}))
(defn register-same-as--map
[aliases x y]
(-> aliases
(update-in [x] ->set y)
(update-in [y] ->set x)))
(defn same-as?--map
"Takes a map of same-as rels and two keys, returns last if both keys
are considered equal."
([aliases x y] (or (= x y) (same-as?--map aliases [x] y #{})))
([aliases [x & queue] y visited]
(or
(get-in aliases [x y])
(when-let [q' (->> x
aliases
(filter (complement visited))
(into queue)
set
seq)]
(recur aliases q' y (conj visited x))))))
(time
(dotimes [i 100000]
(def am
(reduce
(fn [a [x y]] (register-same-as--map a x y))
{} '[[x d] [a b] [b d] [e f] [d y] [c y] [y b] [g e]]))))
;; "Elapsed time: 1976.503 msecs"
(same-as?--map am 'x 'y)
;; y
(same-as?--map am 'a 'c)
;; c
(same-as?--map am 'x 'g)
;; nil
(time (dotimes [i 100000] (same-as?--map am 'x 'y)))
;; "Elapsed time: 353.338 msecs"
Since in most of my use cases I won't have to recurse over more than a
few edges per lookup this approach should work okay. However, I was
still interested in other more compact/faster options. For example,
provided there're aren't too many distinct sets of unrelated nodes,
using sets like below would provide (~4x) faster lookups, however am
not sure if that's really the best way merging those sets efficiently
whenever needed and new edges are added. I guess another level of
branching in `register-same-as` would help to avoid the merging, but
are there any fundamentally better approaches than those two listed?
;; x->a #{#{x a}}
;; x->b #{#{x a b}}
;; c->y #{#{x a b} #{c y}}
;; y->b #{#{x a b c y}} ; merge
(defn register-same-as--set
[aliases x y]
(let [matches (filter #(or (% x) (% y)) aliases)]
(if (seq matches)
(let [a' (reduce disj aliases matches)
m' (reduce #(into % %2) matches)]
(conj a' (conj m' x y)))
(conj aliases #{x y}))))
(defn same-as?--set
[aliases x y] (some #(and (% x) (% y)) aliases))
(time
(dotimes [i 100000]
(def as
(reduce
(fn [a [x y]] (register-same-as--set a x y))
#{} '[[x d] [a b] [b d] [e f] [d y] [c y] [y b] [g e]]))))
;; "Elapsed time: 2308.634 msecs"
(time (dotimes [i 100000] (same-as?--set as 'x 'y)))
;; "Elapsed time: 89.197 msecs"
Thoughts? & Thanks! :)
--
Karsten Schmidt
http://postspectacular.com | http://toxiclibs.org | http://toxi.co.uk
--
--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to [email protected]
Note that posts from new members are moderated - please be patient with your
first post.
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
---
You received this message because you are subscribed to the Google Groups
"Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
For more options, visit https://groups.google.com/groups/opt_out.