Csaba Ringhofer created IMPALA-12927:
----------------------------------------

             Summary: Support reading BINARY columns in JSON tables
                 Key: IMPALA-12927
                 URL: https://issues.apache.org/jira/browse/IMPALA-12927
             Project: IMPALA
          Issue Type: Sub-task
          Components: Backend
            Reporter: Csaba Ringhofer


Currently, Impala cannot correctly read BINARY columns in JSON files written by Hive. Non-empty values are returned as NULL and the query reports conversion errors:
{code}
select * from functional_json.binary_tbl;
+----+--------------+------------+
| id | string_col   | binary_col |
+----+--------------+------------+
| 1  | ascii        | NULL       |
| 2  | ascii        | NULL       |
| 3  | null         | NULL       |
| 4  | empty        |            |
| 5  | valid utf8   | NULL       |
| 6  | valid utf8   | NULL       |
| 7  | invalid utf8 | NULL       |
| 8  | invalid utf8 | NULL       |
+----+--------------+------------+
WARNINGS: Error converting column: functional_json.binary_tbl.binary_col, type: STRING, data: 'binary1'
Error parsing row: file: hdfs://localhost:20500/test-warehouse/binary_tbl_json/000000_0, before offset: 481
Error converting column: functional_json.binary_tbl.binary_col, type: STRING, data: 'binary2'
Error parsing row: file: hdfs://localhost:20500/test-warehouse/binary_tbl_json/000000_0, before offset: 481
Error converting column: functional_json.binary_tbl.binary_col, type: STRING, data: 'árvíztűrőtükörfúró'
Error parsing row: file: hdfs://localhost:20500/test-warehouse/binary_tbl_json/000000_0, before offset: 481
Error converting column: functional_json.binary_tbl.binary_col, type: STRING, data: '你好hello'
Error parsing row: file: hdfs://localhost:20500/test-warehouse/binary_tbl_json/000000_0, before offset: 481
Error converting column: functional_json.binary_tbl.binary_col, type: STRING, data: '��'
Error parsing row: file: hdfs://localhost:20500/test-warehouse/binary_tbl_json/000000_0, before offset: 481
Error converting column: functional_json.binary_tbl.binary_col, type: STRING, data: '�D3"'
Error parsing row: file: hdfs://localhost:20500/test-warehouse/binary_tbl_json/000000_0, before offset: 481
{code}

The single file in the table looks like this:
{code}
hdfs://localhost:20500/test-warehouse/binary_tbl_json/000000_0
{"id":1,"string_col":"ascii","binary_col":"binary1"}
{"id":2,"string_col":"ascii","binary_col":"binary2"}
{"id":3,"string_col":"null","binary_col":null}
{"id":4,"string_col":"empty","binary_col":""}
{"id":5,"string_col":"valid utf8","binary_col":"árvíztűrőtükörfúró"}
{"id":6,"string_col":"valid utf8","binary_col":"你好hello"}
{"id":7,"string_col":"invalid utf8","binary_col":"\u0000�\u0000�"}
{"id":8,"string_col":"invalid utf8","binary_col":"�D3\"\u0011\u0000"}
{code}