[ https://issues.apache.org/jira/browse/CASSANDRA-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073310#comment-13073310 ]

Brian Lindauer commented on CASSANDRA-2975:
-------------------------------------------

Summary:

{code}
Mean fp_ratio for MurmurHash version 2:
LongBloomFilterTest: 0.997967059178744
LongLegacyBloomFilterTest: 0.997908061594203

Mean fp_ratio for MurmurHash version 3:
LongBloomFilterTest: 0.998045621980676
LongLegacyBloomFilterTest: 0.998863888888889
{code}


Details:

{code}
Version 2:
     [echo] running long tests
    [junit] WARNING: multiple versions of ant detected in path for junit
    [junit]          jar:file:/usr/share/ant/lib/ant.jar!/org/apache/tools/ant/Project.class
    [junit]      and jar:file:/Users/jbl/git/cassandra/build/lib/jars/ant-1.6.5.jar!/org/apache/tools/ant/Project.class
    [junit] Testsuite: org.apache.cassandra.utils.LongBloomFilterTest
    [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 106.213 sec
    [junit] 
    [junit] ------------- Standard Error -----------------
    [junit] fp_ratio = 0.9973043478260869
    [junit] fp_ratio = 0.9965793478260869
    [junit] fp_ratio = 0.9996123188405797
    [junit] fp_ratio = 1.0004746376811595
    [junit] fp_ratio = 0.998409420289855
    [junit] fp_ratio = 0.9920978260869565
    [junit] fp_ratio = 0.9979420289855072
    [junit] fp_ratio = 0.9940797101449276
    [junit] fp_ratio = 0.9983913043478261
    [junit] fp_ratio = 1.0006159420289855
    [junit] fp_ratio = 1.0000362318840579
    [junit] fp_ratio = 1.0000615942028985
    [junit] ------------- ---------------- ---------------
mean      = 0.997967059178744

    [junit] Testsuite: org.apache.cassandra.utils.LongLegacyBloomFilterTest
    [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 61.721 sec
    [junit] 
    [junit] ------------- Standard Error -----------------
    [junit] fp_ratio = 0.998095652173913
    [junit] fp_ratio = 0.9982576086956522
    [junit] fp_ratio = 0.999159420289855
    [junit] fp_ratio = 1.0001340579710145
    [junit] fp_ratio = 1.0011557971014493
    [junit] fp_ratio = 0.9967717391304348
    [junit] fp_ratio = 0.9955978260869566
    [junit] fp_ratio = 0.9989673913043479
    [junit] fp_ratio = 0.9966231884057971
    [junit] fp_ratio = 0.9973514492753623
    [junit] fp_ratio = 0.9969855072463768
    [junit] fp_ratio = 0.9957971014492754
    [junit] ------------- ---------------- ---------------
mean      = 0.997908061594203


Version 3:
     [echo] running long tests
    [junit] WARNING: multiple versions of ant detected in path for junit
    [junit]          jar:file:/usr/share/ant/lib/ant.jar!/org/apache/tools/ant/Project.class
    [junit]      and jar:file:/Users/jbl/git/cassandra/build/lib/jars/ant-1.6.5.jar!/org/apache/tools/ant/Project.class
    [junit] Testsuite: org.apache.cassandra.utils.LongBloomFilterTest
    [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 75.994 sec
    [junit] 
    [junit] ------------- Standard Error -----------------
    [junit] fp_ratio = 0.9986532608695652
    [junit] fp_ratio = 0.997158695652174
    [junit] fp_ratio = 0.9995797101449275
    [junit] fp_ratio = 0.9995
    [junit] fp_ratio = 0.9984565217391305
    [junit] fp_ratio = 0.9987101449275362
    [junit] fp_ratio = 0.9979528985507247
    [junit] fp_ratio = 0.9998224637681159
    [junit] fp_ratio = 0.9938876811594203
    [junit] fp_ratio = 0.9993623188405797
    [junit] fp_ratio = 0.9953369565217391
    [junit] fp_ratio = 0.9981268115942029
    [junit] ------------- ---------------- ---------------
mean      = 0.998045621980676

    [junit] Testsuite: org.apache.cassandra.utils.LongLegacyBloomFilterTest
    [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 60.999 sec
    [junit] 
    [junit] ------------- Standard Error -----------------
    [junit] fp_ratio = 0.998095652173913
    [junit] fp_ratio = 0.9983760869565217
    [junit] fp_ratio = 0.9993043478260869
    [junit] fp_ratio = 0.9996159420289855
    [junit] fp_ratio = 0.9980217391304348
    [junit] fp_ratio = 1.0016920289855074
    [junit] fp_ratio = 0.9953623188405797
    [junit] fp_ratio = 0.9968188405797102
    [junit] fp_ratio = 0.9947173913043478
    [junit] fp_ratio = 1.000695652173913
    [junit] fp_ratio = 1.0030760869565218
    [junit] fp_ratio = 1.0005905797101449
    [junit] ------------- ---------------- ---------------
mean      = 0.998863888888889
{code}
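
For anyone who wants to re-check the "mean" lines, here is a minimal sketch of how they could be recomputed from the junit output. It is not the script that produced the numbers above; the class and method names (FpRatioMean, mean) are hypothetical, and it assumes every relevant log line contains the literal text "fp_ratio = " followed by a decimal number.

```java
import java.util.List;

// Hypothetical helper, not part of Cassandra: averages the fp_ratio
// values found in captured junit stderr lines.
final class FpRatioMean {
    private static final String MARKER = "fp_ratio = ";

    static double mean(List<String> logLines) {
        double sum = 0.0;
        int count = 0;
        for (String line : logLines) {
            int pos = line.indexOf(MARKER);
            if (pos < 0) {
                continue; // not an fp_ratio line (banner, warning, blank, ...)
            }
            // Everything after the marker is assumed to be the number.
            sum += Double.parseDouble(line.substring(pos + MARKER.length()).trim());
            count++;
        }
        if (count == 0) {
            throw new IllegalArgumentException("no fp_ratio lines found");
        }
        return sum / count;
    }
}
```

Feeding it the twelve fp_ratio lines of one test suite at a time should give the per-suite mean reported for that suite.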

> Upgrade MurmurHash to version 3
> -------------------------------
>
>                 Key: CASSANDRA-2975
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2975
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.8.3
>            Reporter: Brian Lindauer
>            Priority: Trivial
>              Labels: lhf
>
> MurmurHash version 3 was finalized on June 3. It provides an enormous speedup 
> and increased robustness over version 2, which is implemented in Cassandra. 
> Information here:
> http://code.google.com/p/smhasher/
> The reference implementation is here:
> http://code.google.com/p/smhasher/source/browse/trunk/MurmurHash3.cpp?spec=svn136&r=136
> I have already done the work to port the (public domain) reference 
> implementation to Java in the MurmurHash class and updated the BloomFilter 
> class to use the new implementation:
> https://github.com/lindauer/cassandra/commit/cea6068a4a3e5d7d9509335394f9ef3350d37e93
> Apart from the faster hash time, the new version requires only one call to 
> hash() rather than two, since it returns 128 bits of hash instead of 64.
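
To illustrate why a single 128-bit result can replace two hash calls, here is a minimal sketch that assumes the filter derives its k bucket indexes by double hashing (index_i = h1 + i * h2, reduced modulo the number of buckets). That scheme is an assumption for illustration, not something taken from the linked commit, and the names (DoubleHashingSketch, bucketIndexes) are hypothetical. The two inputs stand for the two 64-bit halves of one 128-bit hash.

```java
// Hypothetical sketch, not the code from the linked commit.
final class DoubleHashingSketch {
    /**
     * Derives k bucket indexes from two 64-bit values, e.g. the two halves
     * of a single 128-bit hash, using index_i = (h1 + i * h2) mod numBuckets.
     */
    static long[] bucketIndexes(long h1, long h2, int k, long numBuckets) {
        if (k < 0 || numBuckets <= 0) {
            throw new IllegalArgumentException("k must be >= 0 and numBuckets > 0");
        }
        long[] indexes = new long[k];
        for (int i = 0; i < k; i++) {
            // Overflow simply wraps around, which is fine for hashing purposes.
            long combined = h1 + (long) i * h2;
            // floorMod keeps the result in [0, numBuckets) even when combined is negative.
            indexes[i] = Math.floorMod(combined, numBuckets);
        }
        return indexes;
    }
}
```

With a 64-bit hash, h1 and h2 would have to come from two separate hash() calls; with a 128-bit hash, both halves are available after one call.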

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
