Csaba Ringhofer created IMPALA-12927:
----------------------------------------

             Summary: Support reading BINARY columns in JSON tables
                 Key: IMPALA-12927
                 URL: https://issues.apache.org/jira/browse/IMPALA-12927
             Project: IMPALA
          Issue Type: Sub-task
          Components: Backend
            Reporter: Csaba Ringhofer


Currently Impala cannot correctly read BINARY columns in JSON files written by 
Hive. Every non-empty value is returned as NULL and a conversion warning is 
reported for it:

{code}

select * from functional_json.binary_tbl;
+----+--------------+------------+
| id | string_col   | binary_col |
+----+--------------+------------+
| 1  | ascii        | NULL       |
| 2  | ascii        | NULL       |
| 3  | null         | NULL       |
| 4  | empty        |            |
| 5  | valid utf8   | NULL       |
| 6  | valid utf8   | NULL       |
| 7  | invalid utf8 | NULL       |
| 8  | invalid utf8 | NULL       |
+----+--------------+------------+
WARNINGS: Error converting column: functional_json.binary_tbl.binary_col, type: 
STRING, data: 'binary1'
Error parsing row: file: 
hdfs://localhost:20500/test-warehouse/binary_tbl_json/000000_0, before offset: 
481
Error converting column: functional_json.binary_tbl.binary_col, type: STRING, 
data: 'binary2'
Error parsing row: file: 
hdfs://localhost:20500/test-warehouse/binary_tbl_json/000000_0, before offset: 
481
Error converting column: functional_json.binary_tbl.binary_col, type: STRING, 
data: 'árvíztűrőtükörfúró'
Error parsing row: file: 
hdfs://localhost:20500/test-warehouse/binary_tbl_json/000000_0, before offset: 
481
Error converting column: functional_json.binary_tbl.binary_col, type: STRING, 
data: '你好hello'
Error parsing row: file: 
hdfs://localhost:20500/test-warehouse/binary_tbl_json/000000_0, before offset: 
481
Error converting column: functional_json.binary_tbl.binary_col, type: STRING, 
data: '��'
Error parsing row: file: 
hdfs://localhost:20500/test-warehouse/binary_tbl_json/000000_0, before offset: 
481
Error converting column: functional_json.binary_tbl.binary_col, type: STRING, 
data: '�D3"'
Error parsing row: file: 
hdfs://localhost:20500/test-warehouse/binary_tbl_json/000000_0, before offset: 
481

{code}

The single file in the table looks like this:

{code}

 hdfs://localhost:20500/test-warehouse/binary_tbl_json/000000_0

{"id":1,"string_col":"ascii","binary_col":"binary1"}
{"id":2,"string_col":"ascii","binary_col":"binary2"}
{"id":3,"string_col":"null","binary_col":null}
{"id":4,"string_col":"empty","binary_col":""}
{"id":5,"string_col":"valid utf8","binary_col":"árvíztűrőtükörfúró"}
{"id":6,"string_col":"valid utf8","binary_col":"你好hello"}
{"id":7,"string_col":"invalid utf8","binary_col":"\u0000�\u0000�"}
{"id":8,"string_col":"invalid utf8","binary_col":"�D3\"\u0011\u0000"}

{code}
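
For reference, the result one would expect can be sketched outside Impala. The snippet below is only an illustration, not Impala's scanner code: it assumes that Hive stores the column bytes directly as a JSON string, so that re-encoding the parsed string as UTF-8 gives the bytes back. The helper name read_binary_col is hypothetical. Rows 7 and 8 are left out because their invalid UTF-8 bytes are not preserved in the listing above.

```python
import json

# Rows copied from the file listing above (valid UTF-8 rows only).
SAMPLE_LINES = [
    '{"id":1,"string_col":"ascii","binary_col":"binary1"}',
    '{"id":3,"string_col":"null","binary_col":null}',
    '{"id":4,"string_col":"empty","binary_col":""}',
    '{"id":6,"string_col":"valid utf8","binary_col":"你好hello"}',
]


def read_binary_col(line):
    """Hypothetical helper: return (id, bytes or None) for one JSON row."""
    row = json.loads(line)
    value = row["binary_col"]
    # Assumption: the JSON string holds the raw column bytes, so encoding
    # it as UTF-8 recovers them. Only a JSON null maps to NULL (None).
    return row["id"], None if value is None else value.encode("utf-8")


rows = [read_binary_col(line) for line in SAMPLE_LINES]
for row_id, data in rows:
    print(row_id, data)
```

Under that assumption, only row 3 yields NULL; rows 1 and 6 yield non-empty byte strings and row 4 yields an empty one.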

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)