nsivabalan commented on code in PR #10457:
URL: https://github.com/apache/hudi/pull/10457#discussion_r3269711715
##########
hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieFileWriterFactory.java:
##########
@@ -128,4 +128,14 @@ public static BloomFilter createBloomFilter(HoodieConfig
config) {
config.getIntOrDefault(HoodieStorageConfig.BLOOM_FILTER_DYNAMIC_MAX_ENTRIES),
config.getStringOrDefault(HoodieStorageConfig.BLOOM_FILTER_TYPE));
}
+
+ /**
+ * Check if need to enable bloom filter.
+ */
+ public static boolean enableBloomFilter(boolean populateMetaFields,
HoodieConfig config) {
+ return populateMetaFields &&
(config.getBooleanOrDefault(HoodieStorageConfig.PARQUET_WITH_BLOOM_FILTER_ENABLED)
+ // HoodieIndexConfig is located in the package hudi-client-common, and
the package hudi-client-common depends on the package hudi-common,
+ // so the class HoodieIndexConfig cannot be accessed in hudi-common,
otherwise there will be a circular dependency problem
+ || (config.contains("hoodie.index.type") &&
config.getString("hoodie.index.type").contains("BLOOM")));
Review Comment:
hey @waitingF :
We generally wanted to have bloom filter enabled irrespective of the index
type.
why can't we flip the conditions here.
for bucket index, why can't we go ahead and disable by default.
for all other indexes, we can leave it enabled by default unless someone
explicitly overrides via
`HoodieStorageConfig.PARQUET_WITH_BLOOM_FILTER_ENABLED`
@yihua @danny0405 : what do you guys think
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]