[ https://issues.apache.org/jira/browse/HIVE-9050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14239108#comment-14239108 ]
Usein Faradzhev commented on HIVE-9050:
---------------------------------------
Example to reproduce:
CREATE TABLE master AS SELECT id FROM default.dual LATERAL VIEW
explode(split('1,2', ',')) s AS id;
CREATE TABLE detail(id int, str string) STORED AS ORC
TBLPROPERTIES("orc.compress"="SNAPPY");
INSERT INTO TABLE detail SELECT 1 AS id, str FROM default.dual LATERAL VIEW
explode(split(',', ',')) s AS str;
Without a join, the values are empty strings:
SELECT
d.*,
(CASE WHEN d.str IS NULL THEN 'IS_NULL'
WHEN d.str = '' THEN 'IS_EMPTY'
ELSE 'NOT_EMPTY'
END) value_type
FROM detail d;
d.id    d.str   value_type
1               IS_EMPTY
1               IS_EMPTY
With a join, the values are NULL instead of empty strings:
SELECT
d.*,
(CASE WHEN d.str IS NULL THEN 'IS_NULL'
WHEN d.str = '' THEN 'IS_EMPTY'
ELSE 'NOT_EMPTY'
END) value_type
FROM detail d
JOIN master m ON m.id = d.id;
d.id    d.str   value_type
1       NULL    IS_NULL
1       NULL    IS_NULL
With the TEXTFILE format, both queries return empty strings:
CREATE TABLE detail_txt(id int, str string) STORED AS TEXTFILE;
INSERT INTO TABLE detail_txt SELECT 1 AS id, str FROM default.dual LATERAL VIEW
explode(split(',', ',')) s AS str;
SELECT
d.*,
(CASE WHEN d.str IS NULL THEN 'IS_NULL'
WHEN d.str = '' THEN 'IS_EMPTY'
ELSE 'NOT_EMPTY'
END) value_type
FROM detail_txt d;
d.id    d.str   value_type
1               IS_EMPTY
1               IS_EMPTY
SELECT
d.*,
(CASE WHEN d.str IS NULL THEN 'IS_NULL'
WHEN d.str = '' THEN 'IS_EMPTY'
ELSE 'NOT_EMPTY'
END) value_type
FROM detail_txt d
JOIN master m ON m.id = d.id;
d.id    d.str   value_type
1               IS_EMPTY
1               IS_EMPTY
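The four queries above can be condensed into one side-by-side check. This is only a sketch against the detail, detail_txt and master tables created above; the aliases fmt, null_cnt, empty_cnt and u are names introduced here for illustration. The UNION ALL is wrapped in a subquery because older Hive releases may not accept it at the top level, and it is not confirmed that the NULLs still appear in this exact query shape.

```sql
-- Count NULL vs. empty-string values of str after the join, per storage format.
SELECT u.fmt,
       SUM(CASE WHEN u.str IS NULL THEN 1 ELSE 0 END) AS null_cnt,
       SUM(CASE WHEN u.str = '' THEN 1 ELSE 0 END) AS empty_cnt
FROM (
  -- ORC-backed table joined with master
  SELECT 'orc' AS fmt, d.str AS str
  FROM detail d
  JOIN master m ON m.id = d.id
  UNION ALL
  -- TEXTFILE-backed table joined with master
  SELECT 'textfile' AS fmt, d.str AS str
  FROM detail_txt d
  JOIN master m ON m.id = d.id
) u
GROUP BY u.fmt;
```

If the behavior shown in the results above carries over, the orc row would be expected to report null_cnt = 2 and empty_cnt = 0, and the textfile row null_cnt = 0 and empty_cnt = 2.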
> NULL values for empty strings when joining with ORC table
> ---------------------------------------------------------
>
> Key: HIVE-9050
> URL: https://issues.apache.org/jira/browse/HIVE-9050
> Project: Hive
> Issue Type: Bug
> Components: SQL
> Affects Versions: 0.13.0
> Environment: CentOS release 6.4 (Final), Hortonworks 2.1, Tez
> Hive 0.13.0.2.1.3.0-563
> Subversion
> git://ip-10-0-0-91/grid/0/jenkins/workspace/BIGTOP-HDP_RPM_REPO-baikal-GA-centos6/bigtop/build/hive/rpm/BUILD/hive-0.13.0.2.1.3.0 -r a738a76c72d6d9dd304691faada57a94429256bc
> Compiled by jenkins on Thu Jun 26 18:28:50 EDT 2014
> From source with checksum 4dbd99dd254f0c521ad8ab072045325d
> Reporter: Usein Faradzhev
>
> When an ORC table contains empty strings and the SQL query contains at least
> one join, Hive returns NULL instead of empty strings.