I think we can take some public algos like lookup3 / murmurhash2/3, and stuff them into Lucene utils.
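To give a feel for how small such a utility would be, here is a rough sketch of the 32-bit MurmurHash3 variant in plain Java. It is meant to follow the public-domain x86_32 reference algorithm, but treat it as an illustration only: the class name `Murmur3Sketch` and the `hash32` method names are made up for this example and are not an existing Lucene API, and the output has not been checked against the C++ reference here.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper class for illustration; not part of Lucene. */
final class Murmur3Sketch {
    private Murmur3Sketch() {}

    /** 32-bit MurmurHash3 (x86_32 style) over data[offset .. offset+len). */
    static int hash32(byte[] data, int offset, int len, int seed) {
        final int c1 = 0xcc9e2d51;
        final int c2 = 0x1b873593;

        int h1 = seed;
        int roundedEnd = offset + (len & ~3); // end of the last full 4-byte block

        // Body: mix one little-endian 4-byte block at a time.
        for (int i = offset; i < roundedEnd; i += 4) {
            int k1 = (data[i] & 0xff)
                   | ((data[i + 1] & 0xff) << 8)
                   | ((data[i + 2] & 0xff) << 16)
                   | (data[i + 3] << 24);
            k1 *= c1;
            k1 = Integer.rotateLeft(k1, 15);
            k1 *= c2;

            h1 ^= k1;
            h1 = Integer.rotateLeft(h1, 13);
            h1 = h1 * 5 + 0xe6546b64;
        }

        // Tail: mix in the remaining 1 to 3 bytes, if any.
        int k1 = 0;
        switch (len & 3) {
            case 3:
                k1 = (data[roundedEnd + 2] & 0xff) << 16;
                // fall through
            case 2:
                k1 |= (data[roundedEnd + 1] & 0xff) << 8;
                // fall through
            case 1:
                k1 |= data[roundedEnd] & 0xff;
                k1 *= c1;
                k1 = Integer.rotateLeft(k1, 15);
                k1 *= c2;
                h1 ^= k1;
                break;
            default:
                break;
        }

        // Finalization: fold in the length and avalanche the bits.
        h1 ^= len;
        h1 ^= h1 >>> 16;
        h1 *= 0x85ebca6b;
        h1 ^= h1 >>> 13;
        h1 *= 0xc2b2ae35;
        h1 ^= h1 >>> 16;
        return h1;
    }

    /** Convenience overload: hashes the UTF-8 bytes of a string. */
    static int hash32(String s, int seed) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        return hash32(bytes, 0, bytes.length, seed);
    }
}
```

A call such as `Murmur3Sketch.hash32("some term", 0)` needs no `MessageDigest` lookup, no crypto provider and no allocation beyond the byte array.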
Java implementations (very simple and fast ones) exist for both of these:

* lookup3: done by Yonik ( http://people.apache.org/~yonik/code/hash/ )
* murmurhash2: by Andrzej Bialecki ( http://www.getopt.org/murmur/MurmurHash.java ); used in Hadoop? not compatible with the original?
* murmurhash2: a version by Derek Young ( http://dmy999.com/article/50/murmurhash-2-java-port )
* murmurhash2: a version by Ted Dunning for Mahout
* murmurhash2: a version I found via Google, by Viliam Holub ( http://d3s.mff.cuni.cz/~holub/sw/javamurmurhash/ )

Available hash widths:

* lookup3 - 32/64bit
* murmurhash2 - 32/64bit
* murmurhash3 - 32/64/128bit

They are just generally useful for many text-processing apps.

On Sat, Dec 25, 2010 at 16:20, Robert Muir <rcm...@gmail.com> wrote:
> On Sat, Dec 25, 2010 at 4:04 AM, Uwe Schindler <u...@thetaphi.de> wrote:
>> Md5 is guaranteed to be there (like utf8 as charset). This is documented in
>> crypto Api, which algorithms are available for digest.
>>
>
> where is this documented? its not in the javadocs.
>
> anyway, we shouldn't be doing this:
> * this algorithm might not exist on J2ME etc (still java), you need to
> install an extra crypto add-on.
> * we shouldnt start up an expensive PKI infrastructure on mac os X,
> including spawning a new thread, just to hash a string. thats absurd.
> * we pay all these costs ... for md5! its not even a good hash!
>

--
Kirill Zakharenko/Кирилл Захаренко (ear...@gmail.com)
Phone: +7 (495) 683-567-4
ICQ: 104465785

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org