Joe McDonnell has submitted this change and it was merged.
( http://gerrit.cloudera.org:8080/18576 )

Change subject: IMPALA-11325: Fix UnicodeDecodeError for shell file output
......................................................................

IMPALA-11325: Fix UnicodeDecodeError for shell file output

When using the --output_file command-line option for
impala-shell, the shell fails with UnicodeDecodeError
if the output contains non-ASCII Unicode characters.

For example, running this command:
impala-shell -B -q "select '引'" --output_file=output.txt
fails with:
UnicodeDecodeError : 'ascii' codec can't decode byte 0xe5 in position 0: ordinal not in range(128)

This happens because OutputStream::write() calls encode('utf-8')
on a string that is already UTF-8 encoded. In Python 2, calling
encode() on a byte string first decodes it implicitly with the
'ascii' codec, which fails on non-ASCII bytes. This change skips
the encode('utf-8') call for Python 2. Python 3 uses a text
string there and still needs the encode call.
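
The idea can be sketched roughly as follows. This is an illustration
only, not the code in shell/shell_output.py: the helper names
to_output_bytes and ascii_decode_error are hypothetical, and the
sketch assumes that any byte string reaching the writer is already
UTF-8 encoded.

```python
def to_output_bytes(data):
    """Hypothetical helper: return UTF-8 bytes ready for a binary file."""
    if isinstance(data, bytes):
        # Python 2 str (or Python 3 bytes). Assumed to be UTF-8 already,
        # so it is passed through unchanged. On Python 2, calling
        # encode('utf-8') here would trigger an implicit ASCII decode.
        return data
    # Text (Python 3 str, Python 2 unicode): encode exactly once.
    return data.encode('utf-8')


def ascii_decode_error(data):
    """Hypothetical helper: mimic the implicit ASCII decode that Python 2
    performs before str.encode(), returning the error text or None."""
    try:
        data.decode('ascii')
    except UnicodeDecodeError as err:
        return str(err)
    return None


# u'\u5f15' is the character from the example query above.
encoded = to_output_bytes(u'\u5f15')
# Passing the already-encoded bytes through again leaves them unchanged.
unchanged = to_output_bytes(encoded)
# The explicit ASCII decode reproduces the reported error message.
message = ascii_decode_error(encoded)
```

Checking the type at the point of writing keeps the sketch valid on
both Python 2 and Python 3, at the cost of leaving the bytes/text
contract implicit.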

This is mostly a pragmatic fix to make the code a little bit
more functional, and there is more work to be done to have
clear contracts for the format() methods and clear points
of conversion to/from bytes.

Testing:
 - Ran shell tests with Python 2 and Python 3 on Ubuntu 18
 - Added a shell test that outputs a Unicode character
   to an output file. Without the fix, this test fails.
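
The actual test added in tests/shell/test_shell_commandline.py runs
impala-shell and is not reproduced here. As a rough, self-contained
sketch of the kind of round-trip check involved, assuming the output
file is written as UTF-8 bytes (write_and_read_back is a hypothetical
name, not part of the Impala test suite):

```python
import io
import os
import tempfile


def write_and_read_back(text):
    """Write text to a temporary file as UTF-8 bytes and return the
    contents decoded again as UTF-8."""
    fd, path = tempfile.mkstemp(suffix='.txt')
    os.close(fd)
    try:
        # Binary mode: the caller is responsible for encoding once.
        with io.open(path, 'wb') as out:
            out.write(text.encode('utf-8'))
        # Decode explicitly so the check does not depend on the locale.
        with io.open(path, 'r', encoding='utf-8') as result:
            return result.read()
    finally:
        os.remove(path)


round_tripped = write_and_read_back(u'\u5f15\n')
```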

Change-Id: Ic40be3d530c2694465f7bd2edb0e0586ff0e1fba
Reviewed-on: http://gerrit.cloudera.org:8080/18576
Reviewed-by: Michael Smith <michael.sm...@cloudera.com>
Reviewed-by: Quanlong Huang <huangquanl...@gmail.com>
Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
---
M shell/shell_output.py
M tests/shell/test_shell_commandline.py
2 files changed, 26 insertions(+), 1 deletion(-)

Approvals:
  Michael Smith: Looks good to me, but someone else must approve
  Quanlong Huang: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/18576

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ic40be3d530c2694465f7bd2edb0e0586ff0e1fba
Gerrit-Change-Number: 18576
Gerrit-PatchSet: 4
Gerrit-Owner: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Michael Smith <michael.sm...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>