Intersecting two sets seems to be very slow. As a test case, I take a 
wordlist and find the words whose reversals are again in the list (e.g. 
"desserts" and "stressed").

 julia> dict = Set{UTF8String}(map!(chomp, open(readlines, "wordlist"))) ; 
length(dict)
651357
julia> @time result = intersect(dict, map(reverse, dict))

I interrupted the computation after 15 minutes. In contrast building a hash 
to check if a word is in the list and filtering the reversed list takes 7 
seconds.

julia> check(C) = let H = {x => true for x in C} ;  x -> get(H, x, false) ; 
end
julia> @time show(sort(filter(check(dict), map(reverse, dict)), by = 
length))
...
elapsed time: 7.482893039 seconds (168793408 bytes allocated)

Why is intersect so slow? Even if it is O(N log N) I wouldn't expect it to 
be more than 100x slower than a traversal (here N~600000 and log N ~ 13).

Reply via email to