I think we can take some publicly available algorithms such as lookup3 or
murmurhash2/3 and put them into Lucene utils.

Java implementations (very simple and fast ones) exist for both of these.

For example:
* lookup3, by Yonik ( http://people.apache.org/~yonik/code/hash/ )
* murmurhash2, by Andrzej Bialecki (
  http://www.getopt.org/murmur/MurmurHash.java ); possibly the one used
  in Hadoop, and possibly not compatible with the original (I'm not sure
  on either point)
* murmurhash2, a version by Derek Young (
  http://dmy999.com/article/50/murmurhash-2-java-port )
* a version by Ted Dunning for Mahout
* a version I found via Google, by Viliam Holub (
  http://d3s.mff.cuni.cz/~holub/sw/javamurmurhash/ )

lookup3 defines versions returning 32- and 64-bit hashes; murmurhash2,
32- and 64-bit; murmurhash3, 32-, 64- and 128-bit.

They are just generally useful for many text-processing apps.
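
To give a sense of how little code is involved, here is a rough sketch of
the 32-bit murmurhash3 variant in Java. The class name Murmur3Sketch and
the String convenience overload are placeholders I made up for
illustration, not an existing Lucene API, and I have not checked the
output against any of the ports listed above.

```java
import java.nio.charset.StandardCharsets;

/** Sketch of 32-bit MurmurHash3 (x86_32 variant); hypothetical class, not a Lucene API. */
final class Murmur3Sketch {
    private static final int C1 = 0xcc9e2d51;
    private static final int C2 = 0x1b873593;

    private Murmur3Sketch() {}

    /** Hashes len bytes of data starting at offset, using the given seed. */
    static int hash32(byte[] data, int offset, int len, int seed) {
        int h = seed;
        int end = offset + (len & ~3); // end of the last complete 4-byte block

        // Body: mix one little-endian 32-bit block at a time.
        for (int i = offset; i < end; i += 4) {
            int k = (data[i] & 0xff)
                  | ((data[i + 1] & 0xff) << 8)
                  | ((data[i + 2] & 0xff) << 16)
                  | ((data[i + 3] & 0xff) << 24);
            k *= C1;
            k = Integer.rotateLeft(k, 15);
            k *= C2;
            h ^= k;
            h = Integer.rotateLeft(h, 13);
            h = h * 5 + 0xe6546b64;
        }

        // Tail: the remaining 1 to 3 bytes, if any.
        int k = 0;
        switch (len & 3) {
            case 3:
                k ^= (data[end + 2] & 0xff) << 16; // fall through
            case 2:
                k ^= (data[end + 1] & 0xff) << 8;  // fall through
            case 1:
                k ^= (data[end] & 0xff);
                k *= C1;
                k = Integer.rotateLeft(k, 15);
                k *= C2;
                h ^= k;
                break;
            default:
                break;
        }

        // Finalization: fold in the length and force the bits to avalanche.
        h ^= len;
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    /** Convenience overload (an assumption of this sketch): hashes the UTF-8 bytes of s. */
    static int hash32(String s, int seed) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        return hash32(bytes, 0, bytes.length, seed);
    }
}
```

Usage would be something like Murmur3Sketch.hash32("some term", 0). There
is no provider lookup and no checked exception to handle, unlike going
through MessageDigest.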

On Sat, Dec 25, 2010 at 16:20, Robert Muir <rcm...@gmail.com> wrote:
> On Sat, Dec 25, 2010 at 4:04 AM, Uwe Schindler <u...@thetaphi.de> wrote:
>> MD5 is guaranteed to be there (like UTF-8 as a charset). The crypto API
>> documents which digest algorithms are available.
>>
>
> where is this documented? it's not in the javadocs.
>
> anyway, we shouldn't be doing this:
> * this algorithm might not exist on J2ME etc (still java), you need to
> install an extra crypto add-on.
> * we shouldn't start up an expensive PKI infrastructure on Mac OS X,
> including spawning a new thread, just to hash a string. that's absurd.
> * we pay all these costs ... for md5! it's not even a good hash!
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>



-- 
Kirill Zakharenko/Кирилл Захаренко (ear...@gmail.com)
Phone: +7 (495) 683-567-4
ICQ: 104465785
