Hi Christian,

hasch looks nice, I might end up just using it. I will be hashing smaller
collections (maps where keys are keywords and values are atomic data like
integers).

Collisions BTW are not such a big deal for my use case. I will have a
limited number of fragments (buckets, index pages, etc.) anyway. 65536 of
them perhaps. The more I think about the problem the more I realize I am
implementing some sort of hash map.
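
For illustration, here is a minimal sketch of reducing such a map to one of
65536 buckets. It does not use hasch. The name bucket-index is hypothetical,
and canonicalising the map by sorting its entries and printing them with
pr-str is an assumption that only holds for flat maps with keyword keys and
atomic values (and default printer settings). The first two bytes of the MD5
digest give a 16-bit bucket number that does not depend on key order or on
the JVM's object hashes.

```clojure
(import 'java.security.MessageDigest)

(defn bucket-index
  "Hypothetical helper: maps a flat map (keyword keys, atomic values)
  to a bucket number in the range 0..65535."
  [m]
  (let [;; Sort entries by key so that insertion order does not matter.
        canonical (pr-str (sort-by key m))
        digest    (.digest (MessageDigest/getInstance "MD5")
                           (.getBytes canonical "UTF-8"))
        ;; Bytes are signed on the JVM; mask them to 0..255.
        hi        (bit-and (aget digest 0) 0xff)
        lo        (bit-and (aget digest 1) 0xff)]
    (+ (* 256 hi) lo)))

;; Both calls yield the same bucket regardless of key order.
(bucket-index {:foo "Bar" :baz 5})
(bucket-index {:baz 5 :foo "Bar"})
```

With only 65536 buckets, distinct maps will regularly share a bucket, which
is fine as long as each bucket is treated as a list of candidates rather than
a unique key.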


On Mon, Aug 10, 2015 at 3:49 PM, Christian Weilbach <
whitesp...@polyc0l0r.net> wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hi,
>
> I am the author of https://github.com/whilo/hasch
>
> Would calling hasch.core/edn-hash satisfy your performance
> requirements? I tried hard to make the recursion of the protocol
> performant, but hashing a value is slower than the time needed to
> write the data to disk for big collections. You should pick a faster
> message-digest like you suggested, e.g. MD5:
>
> ;; requires (import 'java.security.MessageDigest)
> (defn ^MessageDigest md5-message-digest []
>   (MessageDigest/getInstance "MD5"))
>
> (edn-hash {:foo "Bar" :baz 5} md5-message-digest)
>
> You can use the criterium benchmarking snippets in platform.clj to do
> benchmarks. Object.hashCode() is a lot faster still and caches the
> result, I am not sure how much overhead the protocol dispatch causes.
>
> Note that if some collisions are ok for you, you might find a better
> tradeoff, since at the moment commutative collections like maps and sets
> are hashed key-value wise and then XOR'd for safety. I am interested in
> your findings and decision, especially if you pick something else.
>
> Christian
>
> On 10.08.2015 09:00, Atamert Ölçgen wrote:
> > Hi,
> >
> > I need a way to reduce a compound value, say {:foo "bar"}, into a
> > number (like 693d9a0698aff95c in hex). I don't necessarily need a
> > very large hash space, 7 hex digits is good enough for my purposes.
> > But I need this hash to be consistent between runs and JVM versions
> > etc. So I guess that rules out standard object hashes.
> >
> > I would like to find a sufficiently fast way to do this. I can live
> > with MD5, but are there faster alternatives (that produce smaller
> > hashes)? (clj-digest <https://github.com/tebeka/clj-digest>
> > provides a nice interface to what Java provides, but there are only
> > the usual suspects AFAICS
> > <http://docs.oracle.com/javase/7/docs/technotes/guides/security/StandardNames.html#MessageDigest>.)
> >
> > I will be dealing with unordered collections, but it seems hashing
> > is consistent when the input order is changed:
> >
> > user=> (.hashCode {:foo "Bar" :baz 5})
> > 2040536238
> > user=> (.hashCode {:baz 5 :foo "Bar"})
> > 2040536238
> >
> >
> > (It even gave the same hash code in different runs.)
> >
> > I will use these hashes to build index tables. My data, which
> > contains the things I hash, is a set. I will store it as an
> > ordered set and keep an index pointing to where records from this
> > hash to that hash live. This is all Clojure, but I can't keep all
> > my data in memory. (So Clojure's persistent data structures are out
> > of the picture. Life would've been much simpler if I could.)
> >
> > Thanks for reading. Any insight is appreciated.
> >
>
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1
>
> iQEcBAEBAgAGBQJVyJ3vAAoJEKel+aujRZMkbhMIAJ61DGUWM9JoN/JcIxvh2Jph
> VohlWbr1yw69D+x4guGOk5AXUh7HMAkmlbuc+YRRnYqGhZtc3r/6C/d/aa5faBAh
> NdIeDa8yNHTAuYERDktfviy+q5a/blJRdvIIe7ntyjpDZyd2gD1AwUGYOKctXipS
> wMPan7v7yPfPlFfnl+VVXfP8yx/LWyZbwfu0Ugv2B2NhvqPMu8joyondOz7GPcLd
> P7EgpIrvfQAElA4c4+UB0BEeJkn+fnpYF3QLJIy5oQny5QwbVtxgVuUNES8EolYl
> HkpFY1ECV/M65fvP6wrcYPihuphSYQoPkfY4ZQfzWCq9mo+3Aj1Jq2u7QfG9HxM=
> =1UE6
> -----END PGP SIGNATURE-----
>
> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clojure@googlegroups.com
> Note that posts from new members are moderated - please be patient with
> your first post.
> To unsubscribe from this group, send email to
> clojure+unsubscr...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en
> ---
> You received this message because you are subscribed to the Google Groups
> "Clojure" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to clojure+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>



-- 
Kind Regards,
Atamert Ölçgen

◻◼◻
◻◻◼
◼◼◼

www.muhuk.com
