weijietong commented on a change in pull request #1600: DRILL-6947: fix RuntimeFilter memory leak URL: https://github.com/apache/drill/pull/1600#discussion_r250454504
########## File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/filter/RuntimeFilterRecordBatch.java ########## @@ -224,21 +224,39 @@ private void applyRuntimeFilter() throws SchemaChangeException { setupHashHelper(); //To make each independent bloom filter work together to construct a final filter result: BitSet. BitSet bitSet = new BitSet(originalRecordCount); - for (int i = 0; i < toFilterFields.size(); i++) { - BloomFilter bloomFilter = bloomFilters.get(i); - String fieldName = toFilterFields.get(i); - computeBitSet(field2id.get(fieldName), bloomFilter, bitSet); - } + + int filterSize = toFilterFields.size(); int svIndex = 0; - for (int i = 0; i < originalRecordCount; i++) { - boolean contain = bitSet.get(i); - if (contain) { - sv2.setIndex(svIndex, i); - svIndex++; - } else { - filteredRows++; + if (filterSize == 1) { + BloomFilter bloomFilter = bloomFilters.get(0); + String fieldName = toFilterFields.get(0); + int fieldId = field2id.get(fieldName); + for (int rowIndex = 0; rowIndex < originalRecordCount; rowIndex++) { + long hash = hash64.hash64Code(rowIndex, 0, fieldId); + boolean contain = bloomFilter.find(hash); + if (contain) { + sv2.setIndex(svIndex, rowIndex); + svIndex ++; + } + } + } else { + for (int i = 0; i < toFilterFields.size(); i++) { + BloomFilter bloomFilter = bloomFilters.get(i); + String fieldName = toFilterFields.get(i); + computeBitSet(field2id.get(fieldName), bloomFilter, bitSet); + } + for (int i = 0; i < originalRecordCount; i++) { + boolean contain = bitSet.get(i); + if (contain) { + sv2.setIndex(svIndex, i); + svIndex++; + } else { + filteredRows++; + } Review comment: As a vector memory format, I prefer one calculation per column execution model which is data local friendly. This also applies to the HashJoin's hashing phase which currently is not that way. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services