wgtmac commented on code in PR #1020:
URL: https://github.com/apache/parquet-mr/pull/1020#discussion_r1070853151


##########
parquet-column/src/main/java/org/apache/parquet/column/values/bloomfilter/BlockSplitBloomFilter.java:
##########
@@ -394,4 +395,24 @@ public long hash(float value) {
   public long hash(Binary value) {
     return hashFunction.hashBytes(value.getBytes());
   }
+
+  @Override
+  public void merge(BloomFilter otherBloomFilter) throws IOException {
+    Preconditions.checkArgument(otherBloomFilter != null, "Cannot merge a null 
BloomFilter");
+    Preconditions.checkArgument((getAlgorithm() == 
otherBloomFilter.getAlgorithm()),
+      "BloomFilters must have the same algorithm (%s != %s)",
+        getAlgorithm(), otherBloomFilter.getAlgorithm());
+    Preconditions.checkArgument((getHashStrategy() == 
otherBloomFilter.getHashStrategy()),
+      "BloomFilters must have the same hashStrategy (%s != %s)",
+        getHashStrategy(), otherBloomFilter.getHashStrategy());
+    Preconditions.checkArgument((getBitsetSize() == 
otherBloomFilter.getBitsetSize()),

Review Comment:
   Maybe we should put this check to the 1st place to make it fail fast. 
Usually parameters above do not change.



##########
parquet-column/src/main/java/org/apache/parquet/column/values/bloomfilter/BlockSplitBloomFilter.java:
##########
@@ -394,4 +395,24 @@ public long hash(float value) {
   public long hash(Binary value) {
     return hashFunction.hashBytes(value.getBytes());
   }
+
+  @Override
+  public void merge(BloomFilter otherBloomFilter) throws IOException {

Review Comment:
   I still think it is helpful to extract the check logic to a separate method 
like `boolean canMergeFrom(BloomFilter other)` because user has to check these 
parameters anyway before calling the `merge` method. Then `merge` 
implementation can simply call `canMergeFrom`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@parquet.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to