Mustafa İman created HIVE-24531:
-----------------------------------

             Summary: Vectorized table scan ignores binary column
                 Key: HIVE-24531
                 URL: https://issues.apache.org/jira/browse/HIVE-24531
             Project: Hive
          Issue Type: Bug
            Reporter: Mustafa İman


The over1k dataset in the Hive codebase contains a binary field. The vectorized 
table scan ignores this field and passes it as NULL in every row. The issue also 
affects insert queries, on both external and managed tables, when 
"hive.stats.autogather=false". 

To reproduce:

Add "set hive.stats.autogather=false;" at the top of "vector_data_types.q".

Run "mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile=vector_data_types.q".

Observe that the "bin" column is all NULL when querying any of the tables.

 

Below is a simplified version of the same test:
{code:java}
set hive.mapred.mode=nonstrict;
set hive.explain.user=false;
set hive.fetch.task.conversion=none;
set hive.stats.autogather=false;

DROP TABLE over1k_n8;
DROP TABLE over1korc_n1;

-- data setup
CREATE TABLE over1k_n8(t tinyint,
           si smallint,
           i int,
           b bigint,
           f float,
           d double,
           bo boolean,
           s string,
           ts timestamp,
           `dec` decimal(4,2),
           bin binary)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
STORED AS TEXTFILE;

LOAD DATA LOCAL INPATH '../../data/files/over1k' OVERWRITE INTO TABLE over1k_n8;
analyze table over1k_n8 compute statistics;
analyze table over1k_n8 compute statistics for columns;

select * from over1k_n8 limit 10;
select count(1) from over1k_n8 where bin is null;

CREATE TABLE over1korc_n1(t tinyint,
           si smallint,
           i int,
           b bigint,
           f float,
           d double,
           bo boolean,
           s string,
           ts timestamp,
           `dec` decimal(4,2),
           bin binary)
STORED AS ORC;

explain vectorization detail
INSERT INTO TABLE over1korc_n1 SELECT * FROM over1k_n8;

INSERT INTO TABLE over1korc_n1 SELECT * FROM over1k_n8;

select count(1) from over1korc_n1 where bin is null;

select * from over1korc_n1 limit 10;
{code}
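
One way to check that the NULLs come from the vectorized path rather than from the data is to rerun the count with vectorization switched off. The sketch below is not part of the original test. It assumes the over1k_n8 table from the script above already exists and uses the standard "hive.vectorized.execution.enabled" setting. If the problem is specific to vectorized execution, the two counts should differ.
{code:sql}
-- Assumption: over1k_n8 has been created and loaded as in the script above.
set hive.fetch.task.conversion=none;
set hive.stats.autogather=false;

-- Vectorized path (where this report sees bin read as NULL).
set hive.vectorized.execution.enabled=true;
select count(1) from over1k_n8 where bin is null;

-- Non-vectorized path, for comparison.
set hive.vectorized.execution.enabled=false;
select count(1) from over1k_n8 where bin is null;
{code}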



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
