[ https://issues.apache.org/jira/browse/HIVE-21397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16790006#comment-16790006 ]
Denys Kuzmenko commented on HIVE-21397:
---------------------------------------

Hi [~gopalv]. This patch was not intended for Hive; it is ORC-related (see https://issues.apache.org/jira/browse/ORC-477).

https://github.com/apache/orc/pull/373 (branch-1.5)
https://github.com/apache/orc/pull/374 (master)

On the Hive side, we would need to update the orc-core version to 1.5.5 once it is released.

> BloomFilter for hive Managed [ACID] table does not work as expected
> -------------------------------------------------------------------
>
>                 Key: HIVE-21397
>                 URL: https://issues.apache.org/jira/browse/HIVE-21397
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive, HiveServer2, Transactions
>    Affects Versions: 3.1.1
>            Reporter: vaibhav
>            Assignee: Denys Kuzmenko
>            Priority: Blocker
>         Attachments: HIVE-21397.1.patch, OrcUtils.patch, orc_file_dump.out, orc_file_dump.q
>
>
> Steps to Reproduce this issue :
> -----------------------------------------
> 1. Create a Hive managed table as below :
> -----------------------------------------
> {code:java}
> CREATE TABLE `bloomTest`(
>   `msisdn` string,
>   `imsi` varchar(20),
>   `imei` bigint,
>   `cell_id` bigint)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
> LOCATION
>   'hdfs://c1162-node2.squadron-labs.com:8020/warehouse/tablespace/managed/hive/bloomTest'
> TBLPROPERTIES (
>   'bucketing_version'='2',
>   'orc.bloom.filter.columns'='msisdn,cell_id,imsi',
>   'orc.bloom.filter.fpp'='0.02',
>   'transactional'='true',
>   'transactional_properties'='default',
>   'transient_lastDdlTime'='1551206683') {code}
> -----------------------------------------
> 2. Insert a few rows.
> -----------------------------------------
> -----------------------------------------
> 3. Check if bloom filters are active : [ It does not show bloom filters for hive managed tables ]
> -----------------------------------------
> {code:java}
> [hive@c1162-node2 root]$ hive --orcfiledump hdfs://c1162-node2.squadron-labs.com:8020/warehouse/tablespace/managed/hive/bloomTest/delta_0000001_0000001_0000 | grep -i bloom
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in [jar:file:/usr/hdp/3.1.0.0-78/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/usr/hdp/3.1.0.0-78/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
> Processing data file hdfs://c1162-node2.squadron-labs.com:8020/warehouse/tablespace/managed/hive/bloomTest/delta_0000001_0000001_0000/bucket_00000 [length: 791]
> Structure for hdfs://c1162-node2.squadron-labs.com:8020/warehouse/tablespace/managed/hive/bloomTest/delta_0000001_0000001_0000/bucket_00000
> {code}
> -----------------------------------------
> On the other hand, for Hive external tables it works :
> -----------------------------------------
> {code:java}
> CREATE external TABLE `ext_bloomTest`(
>   `msisdn` string,
>   `imsi` varchar(20),
>   `imei` bigint,
>   `cell_id` bigint)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
> TBLPROPERTIES (
>   'bucketing_version'='2',
>   'orc.bloom.filter.columns'='msisdn,cell_id,imsi',
>   'orc.bloom.filter.fpp'='0.02') {code}
> -----------------------------------------
> {code:java}
> [hive@c1162-node2 root]$ hive --orcfiledump hdfs://c1162-node2.squadron-labs.com:8020/warehouse/tablespace/external/hive/ext_bloomTest/000000_0 | grep -i bloom
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in [jar:file:/usr/hdp/3.1.0.0-78/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/usr/hdp/3.1.0.0-78/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
> Processing data file hdfs://c1162-node2.squadron-labs.com:8020/warehouse/tablespace/external/hive/ext_bloomTest/000000_0 [length: 755]
> Structure for hdfs://c1162-node2.squadron-labs.com:8020/warehouse/tablespace/external/hive/ext_bloomTest/000000_0
>
> Stream: column 1 section BLOOM_FILTER_UTF8 start: 41 length 110
> Stream: column 2 section BLOOM_FILTER_UTF8 start: 178 length 114
> Stream: column 4 section BLOOM_FILTER_UTF8 start: 340 length 109 {code}

--
This message was sent by Atlassian JIRA (v7.6.3#76005)