[ https://issues.apache.org/jira/browse/HADOOP-19673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
AMC-team updated HADOOP-19673:
------------------------------
    Attachment: HADOOP-19673.000.patch
 Fix Version/s: 2.8.5
        Status: Patch Available  (was: Open)

> BloomMapFile: invalid io.mapfile.bloom.error.rate (≤0 or ≥1) causes NaN/zero
> vector size and writer construction failure
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-19673
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19673
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: common, io
>    Affects Versions: 2.8.5
>            Reporter: AMC-team
>            Priority: Major
>             Fix For: 2.8.5
>
>         Attachments: HADOOP-19673.000.patch
>
>
> {{BloomMapFile.Writer#initBloomFilter(Configuration)}} computes the Bloom
> filter vector size as:
> {code:java}
> int numKeys = conf.getInt(IO_MAPFILE_BLOOM_SIZE_KEY,
>     IO_MAPFILE_BLOOM_SIZE_DEFAULT);
> float errorRate = conf.getFloat(IO_MAPFILE_BLOOM_ERROR_RATE_KEY,
>     IO_MAPFILE_BLOOM_ERROR_RATE_DEFAULT);
> int vectorSize = (int) Math.ceil(
>     (double)(-HASH_COUNT * numKeys) /
>     Math.log(1.0 - Math.pow(errorRate, 1.0 / HASH_COUNT))
> );
> {code}
> When {{io.mapfile.bloom.error.rate}} is *≤ 0* or *≥ 1*:
> * {{Math.pow(errorRate, 1/k)}} produces *NaN* (negative base with
> non-integer exponent) or an invalid value;
> * {{Math.log(1 - NaN)}} becomes *NaN*;
> * {{Math.ceil(NaN)}} cast to {{int}} yields *0*, so {{vectorSize == 0}};
> * constructing {{DynamicBloomFilter}} subsequently fails, and
> {{BloomMapFile.Writer}} construction fails (observed as assertion failure in
> tests).
>
> The code misses input validation for {{io.mapfile.bloom.error.rate}}, which
> should be strictly within {{(0, 1)}}. With invalid values, the math
> silently degrades to NaN/0 and fails at runtime.
>
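The arithmetic in the description can be checked in isolation, without a Hadoop dependency. The sketch below copies only the vector-size expression into a hypothetical class, {{BloomVectorSizeDemo}}. The value {{HASH_COUNT = 5}} is assumed to match {{BloomMapFile}}, and {{numKeys = 1024}} is an arbitrary illustrative value. Tracing the expression under Java's floating-point rules, a negative rate and a rate above 1 both reach NaN and cast to 0, and a rate of exactly 1 gives {{log(0) = -Infinity}} and also ends at 0. A rate of exactly 0 appears to take a different path: the divisor is {{log(1) = 0}}, the quotient is {{-Infinity}}, and the cast gives {{Integer.MIN_VALUE}}. Every out-of-range input therefore ends in a non-positive vector size.

```java
// Hypothetical standalone class; not part of Hadoop or of HADOOP-19673.000.patch.
final class BloomVectorSizeDemo {
    // Assumption: mirrors the hash-function count used by BloomMapFile.
    static final int HASH_COUNT = 5;

    // Same expression as quoted in the issue, with the configuration
    // lookups replaced by plain parameters.
    static int vectorSize(int numKeys, float errorRate) {
        return (int) Math.ceil(
            (double) (-HASH_COUNT * numKeys)
                / Math.log(1.0 - Math.pow(errorRate, 1.0 / HASH_COUNT)));
    }

    public static void main(String[] args) {
        // One in-range rate followed by four out-of-range rates.
        float[] rates = {0.005f, 0.0f, -1.0f, 1.0f, 2.0f};
        for (float rate : rates) {
            System.out.println("errorRate=" + rate
                + " -> vectorSize=" + vectorSize(1024, rate));
        }
    }
}
```

Running {{main}} prints one line per rate. Only the first rate, which lies inside {{(0, 1)}}, should produce a positive size.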
> *Reproduction*
> Injected values: {{io.mapfile.bloom.error.rate = 0, -1}}
> Test: {{org.apache.hadoop.io.TestBloomMapFile#testBloomMapFileConstructors}}
> {code:java}
> [INFO] Running org.apache.hadoop.io.TestBloomMapFile
> [ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.358 s <<< FAILURE! - in org.apache.hadoop.io.TestBloomMapFile
> [ERROR] org.apache.hadoop.io.TestBloomMapFile.testBloomMapFileConstructors  Time elapsed: 0.272 s  <<< FAILURE!
> java.lang.AssertionError: testBloomMapFileConstructors error !!!
>     at org.apache.hadoop.io.TestBloomMapFile.testBloomMapFileConstructors(TestBloomMapFile.java:287)
> {code}
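For comparison, a range check of the kind the description asks for would reject both injected values before any arithmetic runs. The following is a minimal sketch only: the class {{BloomErrorRateCheck}} and the method {{requireValidErrorRate}} are hypothetical names, and nothing here is taken from HADOOP-19673.000.patch. Whether the actual fix throws, logs a warning, or falls back to the default rate is not stated in this message.

```java
// Hypothetical helper; names and behaviour are illustrative assumptions.
final class BloomErrorRateCheck {
    // Returns the rate unchanged if it lies strictly inside (0, 1);
    // otherwise throws. The negated conjunction also rejects NaN,
    // because every comparison involving NaN evaluates to false.
    static float requireValidErrorRate(float errorRate) {
        if (!(errorRate > 0.0f && errorRate < 1.0f)) {
            throw new IllegalArgumentException(
                "io.mapfile.bloom.error.rate must be in (0, 1), got " + errorRate);
        }
        return errorRate;
    }
}
```

Called on the value read from the configuration, a guard like this would turn the silent NaN/0 degradation into an immediate, descriptive error for the injected values {{0}} and {{-1}}.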