[ https://issues.apache.org/jira/browse/HADOOP-2588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12558535#action_12558535 ]
Hadoop QA commented on HADOOP-2588:
-----------------------------------

+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12373071/patch.txt
against trunk revision r611727.

    @author +1. The patch does not contain any @author tags.

    javadoc +1. The javadoc tool did not generate any warning messages.

    javac +1. The applied patch does not generate any new compiler warnings.

    findbugs +1. The patch does not introduce any new Findbugs warnings.

    core tests +1. The patch passed core unit tests.

    contrib tests +1. The patch passed contrib unit tests.

Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1577/testReport/
Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1577/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1577/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1577/console

This message is automatically generated.

> org.onelab.filter.BloomFilter class uses 8X the memory it should be using
> -------------------------------------------------------------------------
>
>                 Key: HADOOP-2588
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2588
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: contrib/hbase
>    Affects Versions: 0.16.0
>         Environment: n/a
>            Reporter: Ian Clarke
>            Priority: Trivial
>             Fix For: 0.16.0
>
>         Attachments: patch.txt
>
>
> The org.onelab.filter.BloomFilter uses a boolean[] to store the filter,
> however in most Java implementations this will use a byte per bit stored,
> meaning that 8X the actual used memory is required. This is unfortunate as
> the whole point of a BloomFilter is to save memory.
> As a sidebar, the implementation looks a bit shaky in other ways, for example
> the way hashes are generated from a SHA1 digest in the Filter class, where the
> hash() method just assumes the digestBytes array will be long enough.
> I discovered this while looking for a good Bloom Filter implementation to use
> in my own project. In the end I went ahead and implemented my own; it's very
> simple and pretty elegant (even if I do say so myself ;) - you are welcome to
> use it:
> http://locut.us/blog/2008/01/12/a-decent-stand-alone-java-bloom-filter-implementation/

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
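For readers following the memory argument in the issue description, here is a minimal sketch of a bit-packed filter backed by java.util.BitSet instead of a boolean[]. It is not the attached patch, nor the org.onelab.filter.BloomFilter code, nor the reporter's implementation. The class name BitSetBloomSketch, its methods, and the hashing scheme (double hashing derived from String.hashCode()) are assumptions made only to keep the example short.

```java
import java.util.BitSet;

/**
 * Hypothetical sketch of a bit-packed Bloom filter.
 * Not the HADOOP-2588 patch; names and hashing scheme are illustrative assumptions.
 */
final class BitSetBloomSketch {
    private final BitSet bits;    // packs 64 filter bits into each backing long
    private final int numBits;    // number of addressable filter positions
    private final int numHashes;  // number of positions probed per key

    BitSetBloomSketch(int numBits, int numHashes) {
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    // Assumption: simple double hashing from String.hashCode(), not SHA1.
    private int index(String key, int i) {
        int h1 = key.hashCode();
        int h2 = (h1 >>> 16) | 1;  // force an odd step so it is never zero
        return Math.floorMod(h1 + i * h2, numBits);
    }

    void add(String key) {
        for (int i = 0; i < numHashes; i++) {
            bits.set(index(key, i));
        }
    }

    // May return true for a key that was never added, but never returns
    // false for a key that was added.
    boolean mightContain(String key) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(index(key, i))) {
                return false;
            }
        }
        return true;
    }

    // Bytes of bit storage currently allocated by the BitSet
    // (excludes object headers).
    int storageBytes() {
        return bits.size() / 8;
    }
}
```

With 1,000,000 filter positions, the BitSet allocates 125,000 bytes of bit storage. A boolean[] of the same length would need roughly 1,000,000 bytes on JVMs that store one byte per element, which is the 8X difference the issue describes.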