Adam Tamas created IMPALA-10145:
-----------------------------------

Summary: UnicodeDecodeError in Thrift 0.11.0 generated files
Key: IMPALA-10145
URL: https://issues.apache.org/jira/browse/IMPALA-10145
Project: IMPALA
Issue Type: Bug
Reporter: Adam Tamas

If the query results contain a string with bytes that cannot be decoded as UTF-8, fetching the results fails with a UnicodeDecodeError when the Python files generated by Thrift 0.11.0 are in use. The error surfaces in different places depending on which protocol impala-shell uses.

Examples for the hs2-http and hs2 protocols:
{code:java}
[localhost:28000] default> select unhex('aa');
Query: select unhex('aa')
Query submitted at: 2020-09-04 12:41:14 (Coordinator: http://tadam-OptiPlex-7070:25000)
Query progress can be monitored at: http://tadam-OptiPlex-7070:25000/query_plan?query_id=d041ab999f597fec:46a8b51800000000
Caught exception 'utf8' codec can't decode byte 0xaa in position 0: invalid start byte, type=<type 'exceptions.UnicodeDecodeError'> in FetchResults.
Unknown Exception : 'utf8' codec can't decode byte 0xaa in position 0: invalid start byte
Traceback (most recent call last):
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/impala_shell.py", line 1183, in _execute_stmt
    for rows in rows_fetched:
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/lib/impala_client.py", line 781, in fetch
    resp = self._do_hs2_rpc(FetchResults)
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/lib/impala_client.py", line 942, in _do_hs2_rpc
    return rpc()
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/lib/impala_client.py", line 778, in FetchResults
    return self.imp_service.FetchResults(req)
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/gen-py/TCLIService/TCLIService.py", line 717, in FetchResults
    return self.recv_FetchResults()
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/gen-py/TCLIService/TCLIService.py", line 736, in recv_FetchResults
    result.read(iprot)
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/gen-py/TCLIService/TCLIService.py", line 3593, in read
    self.success.read(iprot)
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/gen-py/TCLIService/ttypes.py", line 5888, in read
    self.results.read(iprot)
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/gen-py/TCLIService/ttypes.py", line 2670, in read
    _elem115.read(iprot)
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/gen-py/TCLIService/ttypes.py", line 2556, in read
    self.stringVal.read(iprot)
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/gen-py/TCLIService/ttypes.py", line 2352, in read
    _elem95 = iprot.readString().decode('utf-8') if sys.version_info[0] == 2 else iprot.readString()
  File "/usr/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xaa in position 0: invalid start byte
[Not connected] >
{code}
{code:java}
[localhost:21050] default> select unhex('aa');
Query: select unhex('aa')
Query submitted at: 2020-09-04 12:42:22 (Coordinator: http://tadam-OptiPlex-7070:25000)
Query progress can be monitored at: http://tadam-OptiPlex-7070:25000/query_plan?query_id=3a481e2a0581ea7c:a6e1901800000000
Caught exception 'utf8' codec can't decode byte 0xaa in position 0: invalid start byte, type=<type 'exceptions.UnicodeDecodeError'> in FetchResults.
Unknown Exception : 'utf8' codec can't decode byte 0xaa in position 0: invalid start byte
Traceback (most recent call last):
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/impala_shell.py", line 1183, in _execute_stmt
    for rows in rows_fetched:
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/lib/impala_client.py", line 781, in fetch
    resp = self._do_hs2_rpc(FetchResults)
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/lib/impala_client.py", line 942, in _do_hs2_rpc
    return rpc()
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/lib/impala_client.py", line 778, in FetchResults
    return self.imp_service.FetchResults(req)
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/gen-py/TCLIService/TCLIService.py", line 717, in FetchResults
    return self.recv_FetchResults()
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/gen-py/TCLIService/TCLIService.py", line 736, in recv_FetchResults
    result.read(iprot)
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/gen-py/TCLIService/TCLIService.py", line 3583, in read
    iprot._fast_decode(self, iprot, [self.__class__, self.thrift_spec])
UnicodeDecodeError: 'utf8' codec can't decode byte 0xaa in position 0: invalid start byte
{code}
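
The failure in the tracebacks can be reproduced outside impala-shell with plain Python. The sketch below is only an illustration, not the fix applied in Impala: `decode_strict` mirrors the `.decode('utf-8')` call seen in the generated `ttypes.py`, while `decode_lenient` is a hypothetical helper that assumes replacing undecodable bytes with U+FFFD would be acceptable for display purposes.

```python
# Minimal reproduction of the decode failure, independent of Thrift and Impala.
# decode_lenient is a hypothetical helper, not part of the generated code.

def decode_strict(raw):
    """Strict UTF-8 decode, as in the Python 2 branch of the generated code."""
    return raw.decode('utf-8')


def decode_lenient(raw):
    """Assumed alternative: substitute U+FFFD for bytes that are not valid UTF-8."""
    return raw.decode('utf-8', errors='replace')


raw = b'\xaa'  # the single byte returned by select unhex('aa')

try:
    decode_strict(raw)
    strict_failed = False
except UnicodeDecodeError as exc:
    strict_failed = True
    print("strict decode failed: %s" % exc)

print(repr(decode_lenient(raw)))
```

Whether a lenient decode is appropriate depends on how the shell is expected to present binary data; returning the raw bytes unchanged is another option that this sketch does not cover.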