Wrong delimiter is getting picked up for structs inside an array.
-----------------------------------------------------------------
Key: HIVE-2443
URL: https://issues.apache.org/jira/browse/HIVE-2443
Project: Hive
Issue Type: Bug
Components: Serializers/Deserializers
Reporter: Thulasi Ram Naidu P
Priority: Minor
I am trying to create a table with multiple levels of delimiters, but the
default LazySimpleSerDe does not pick up the second delimiter, which I
specified using COLLECTION ITEMS TERMINATED BY, for a struct inside an array.
My table looks like this:
create external table if not exists mytable(col1 bigint, col2 string,
col3 string, col4 double, col5 double, col6 double, col7 double, col8
array<struct<id1:string, id2:string, id3:string, id4:string,
id5:int>>)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
COLLECTION ITEMS TERMINATED BY ',:'
Location '<FILEPATH>';
Input data:
123456 XYZ1 RANDOM 1 1 1 1
x1:y1:z1:w1:5,x2:y2:z2:w1:5
When I run "select * from mytable", I expect the output to be:
123456 XYZ1 RANDOM 1.0 1.0 1.0 1.0
[{"id1":"x1","id2":"y1","id3":"z1","id4":"w1","id5":5},{"id1":"x2","id2":"y2","id3":"z2","id4":"w1","id5":5}]
However, it returns:
123456 XYZ1 RANDOM 1.0 1.0 1.0 1.0
[{"id1":"x1:y1:z1:w1:5","id2":null,"id3":null,"id4":null,"id5":null},{"id1":"x2:y2:z2:w1:5","id2":null,"id3":null,"id4":null,"id5":null}]
But when I change the table schema to:
create external table if not exists mytable(col1 bigint, col2 string,
col3 string, col4 double, col5 double, col6 double, col7 double, col8
array<struct<id1:string, id2:string, id3:string, id4:string,
id5:int>>)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
COLLECTION ITEMS TERMINATED BY ','
MAP KEYS TERMINATED BY ':'
the select query returns the values correctly.
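One possible reading of this behavior is sketched below in Python. It is an
illustrative model, not Hive's code, and it rests on assumptions that are not
stated in this report: that the SerDe uses one single-character separator per
nesting level (fields, then collection items, then map keys), that only the
first character of ',:' is kept, and that the map-key separator defaults to
\003 when none is given. The names parse_value, COL8, SEPS_FIRST and
SEPS_SECOND are hypothetical.

```python
# Illustrative model only; none of these names are Hive APIs.

def parse_value(text, spec, seps, level):
    """Parse text according to spec, splitting on seps[level] at this depth."""
    kind = spec[0]
    if kind == "array":
        # Array items are split on this level's separator; each item is
        # parsed one level deeper.
        return [parse_value(item, spec[1], seps, level + 1)
                for item in text.split(seps[level])]
    if kind == "struct":
        parts = text.split(seps[level])
        result = {}
        for i, (name, field_spec) in enumerate(spec[1]):
            # Struct fields with no matching part in the data become None (null).
            if i < len(parts):
                result[name] = parse_value(parts[i], field_spec, seps, level + 1)
            else:
                result[name] = None
        return result
    if kind == "int":
        try:
            return int(text)
        except ValueError:
            return None
    return text  # string

# col8 from the table definition: array<struct<id1..id4:string, id5:int>>
COL8 = ("array", ("struct", [
    ("id1", ("string",)), ("id2", ("string",)), ("id3", ("string",)),
    ("id4", ("string",)), ("id5", ("int",)),
]))

DATA = "x1:y1:z1:w1:5,x2:y2:z2:w1:5"

# Assumed separators by nesting level: [fields, collection items, map keys].
SEPS_FIRST = ["\t", ",", "\x03"]   # ',:' reduced to ','; map keys left at assumed default
SEPS_SECOND = ["\t", ",", ":"]     # second table: MAP KEYS TERMINATED BY ':'

# col8 is a top-level column, so its contents start at nesting level 1.
first = parse_value(DATA, COL8, SEPS_FIRST, 1)
second = parse_value(DATA, COL8, SEPS_SECOND, 1)
print(first)
print(second)
```

Under these assumptions the first definition leaves each struct unsplit, so
id1 holds the whole item and the remaining fields are null, while the second
definition splits the struct fields on ':'. If the model is right, the struct
inside the array is using the third-level (map key) delimiter rather than a
second character of the COLLECTION ITEMS delimiter.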