Mala Chikka Kempanna created PARQUET-26:
-------------------------------------------

             Summary: Parquet doesn't recognize the nested Array type in MAP as ArrayWritable.
                 Key: PARQUET-26
                 URL: https://issues.apache.org/jira/browse/PARQUET-26
             Project: Parquet
          Issue Type: Bug
            Reporter: Mala Chikka Kempanna
         Attachments: test.dat

When inserting Hive data of type MAP<string, array<int>> into a Parquet
table, the following error is thrown:

Caused by: parquet.io.ParquetEncodingException: This should be an ArrayWritable or MapWritable: org.apache.hadoop.hive.ql.io.parquet.writable.BinaryWritable@c644ef1c
	at org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriter.writeData(DataWritableWriter.java:86)
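The exception suggests a type-dispatch problem: when writing the value side of the MAP, DataWritableWriter receives a BinaryWritable where it expects an ArrayWritable for the nested array. A minimal sketch of that kind of instanceof-based dispatch, using hypothetical stand-in classes (not the real Hadoop/Hive classes):

```java
// Hypothetical stand-ins for the Writable hierarchy -- for illustration only,
// not the actual Hadoop or Hive implementations.
interface Writable {}
class BinaryWritable implements Writable {}
class ArrayWritable implements Writable {}
class MapWritable implements Writable {}

public class WriteDispatchSketch {
    // Mirrors the shape of a writeData-style dispatch: any complex value
    // that is not an ArrayWritable or MapWritable is rejected outright.
    static String writeData(Writable value) {
        if (value instanceof ArrayWritable) {
            return "wrote array";
        }
        if (value instanceof MapWritable) {
            return "wrote map";
        }
        throw new RuntimeException(
            "This should be an ArrayWritable or MapWritable: " + value);
    }

    public static void main(String[] args) {
        // A top-level array is accepted.
        System.out.println(writeData(new ArrayWritable()));
        // A nested array inside a MAP that arrives wrapped as a
        // BinaryWritable triggers the exception reported in this issue.
        try {
            writeData(new BinaryWritable());
        } catch (RuntimeException e) {
            System.out.println("rejected nested value");
        }
    }
}
```

If this reading is right, the fix would be to unwrap or convert the nested array value to an ArrayWritable before the dispatch, rather than passing the raw binary-wrapped value through.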

The problem is reproducible with the following steps; the relevant test data (test.dat) is attached.

1. 
CREATE TABLE test_hive (
node string,
stime string,
stimeutc string,
swver string,
moid MAP <string,string>,
pdfs MAP <string,array<int>>,
utcdate string,
motype string)
ROW FORMAT DELIMITED
    FIELDS TERMINATED BY '|'
    COLLECTION ITEMS TERMINATED BY ','
    MAP KEYS TERMINATED BY '=';


2.
LOAD DATA LOCAL INPATH '/root/38388/test.dat' INTO TABLE test_hive; 

3.

CREATE TABLE test_parquet(
pdfs MAP <string,array<int>>
)
STORED AS PARQUET;

4.

INSERT INTO TABLE test_parquet SELECT pdfs FROM test_hive;



--
This message was sent by Atlassian JIRA
(v6.2#6252)
