----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/21618/#review47722 -----------------------------------------------------------
Sorry for the delay in reviewing this, it fell off my radar. I like how you've created a separate UDF for random hashing. Overall this is looking pretty good to me. By the way, the HashTests doesn't apply cleanly now (according to RB at least). Can you try updating the patch? Maybe there were some other changes to this file since you submitted the change. datafu-pig/src/main/java/datafu/pig/hash/Hasher.java <https://reviews.apache.org/r/21618/#comment83897> Shouldn't this be like below instead? this("murmur3-32") datafu-pig/src/main/java/datafu/pig/hash/Hasher.java <https://reviews.apache.org/r/21618/#comment83898> Comment is wrong. There is no seed. datafu-pig/src/main/java/datafu/pig/hash/Hasher.java <https://reviews.apache.org/r/21618/#comment83899> remove extra line datafu-pig/src/main/java/datafu/pig/hash/Hasher.java <https://reviews.apache.org/r/21618/#comment83900> Error includes all hash functions, where here it should only include those that take seeds. datafu-pig/src/main/java/datafu/pig/hash/HasherRand.java <https://reviews.apache.org/r/21618/#comment83901> Needs doc string, which should distinguish this from Hasher. datafu-pig/src/main/java/datafu/pig/hash/HasherRand.java <https://reviews.apache.org/r/21618/#comment83902> Should we have a default constructor of murmur3-32 like Hasher? - Matthew Hayes On May 20, 2014, 9:19 a.m., Philip (flip) Kromer wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/21618/ > ----------------------------------------------------------- > > (Updated May 20, 2014, 9:19 a.m.) > > > Review request for DataFu and Matthew Hayes. > > > Bugs: DATAFU-47 > https://issues.apache.org/jira/browse/DATAFU-47 > > > Repository: datafu > > > Description > ------- > > Accompanies DATAFU-47 https://issues.apache.org/jira/browse/DATAFU-47 -- make > sure to apply the patch from DATAFU-46 too first > > Questions for reviewers: > > * If we upgrade Guava, we'd get sip24 (a fast cryptographically secure hash), > crc32 and adler32 (occasionally useful checksums). I can put the update in as > another patch. Should we upgrade? > * This UDF provides the same hashes as MD5 and SHA udfs. Should those be > deprecated in favor of this? I can add the binhex functionality so that > nothing is lost. > * If there's a standard way to do the dependency injection of a fixed random > number generator for the tests please advise. > > > Diffs > ----- > > datafu-pig/src/main/java/datafu/pig/hash/Hasher.java PRE-CREATION > datafu-pig/src/main/java/datafu/pig/hash/HasherRand.java PRE-CREATION > datafu-pig/src/test/java/datafu/test/pig/hash/HashTests.java 7ff8fb9 > datafu-pig/src/test/java/datafu/test/pig/hash/HasherRandForTesting.java > PRE-CREATION > > Diff: https://reviews.apache.org/r/21618/diff/ > > > Testing > ------- > > ./gradlew :datafu-pig:test -Dtest.single=HashTests > > > Thanks, > > Philip (flip) Kromer > >