Joe McDonnell has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/18576 )
Change subject: IMPALA-11325: Fix UnicodeDecodeError for shell file output
......................................................................

IMPALA-11325: Fix UnicodeDecodeError for shell file output

When using the --output_file commandline option for impala-shell,
the shell fails with UnicodeDecodeError if the output contains
Unicode characters.

For example, if running this command:
impala-shell -B -q "select '引'" --output_file=output.txt
This fails with:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe5 in position 0: ordinal not in range(128)

This happens due to an encode('utf-8') call happening in
OutputStream::write() on a string that is already UTF-8 encoded.
This changes the code to skip the encode('utf-8') call for Python 2.
Python 3 is using a string and still needs the encode call.

This is mostly a pragmatic fix to make the code a little bit more
functional, and there is more work to be done to have clear
contracts for the format() methods and clear points of
conversion to/from bytes.

Testing:
 - Ran shell tests with Python 2 and Python 3 on Ubuntu 18
 - Added a shell test that outputs a Unicode character to an
   output file. Without the fix, this test fails.

Change-Id: Ic40be3d530c2694465f7bd2edb0e0586ff0e1fba
Reviewed-on: http://gerrit.cloudera.org:8080/18576
Reviewed-by: Michael Smith <michael.sm...@cloudera.com>
Reviewed-by: Quanlong Huang <huangquanl...@gmail.com>
Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
---
M shell/shell_output.py
M tests/shell/test_shell_commandline.py
2 files changed, 26 insertions(+), 1 deletion(-)

Approvals:
  Michael Smith: Looks good to me, but someone else must approve
  Quanlong Huang: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/18576
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ic40be3d530c2694465f7bd2edb0e0586ff0e1fba
Gerrit-Change-Number: 18576
Gerrit-PatchSet: 4
Gerrit-Owner: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Michael Smith <michael.sm...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
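
For readers following along, the version-dependent encoding described in the commit message can be sketched roughly as below. This is only an illustration, not the code merged into shell/shell_output.py: the names to_output_bytes, write_formatted_output and demo are hypothetical, and the sketch assumes that on Python 2 the formatted output arrives as an already UTF-8 encoded byte str, while on Python 3 it arrives as a text str.

```python
# -*- coding: utf-8 -*-
import os
import sys
import tempfile


def to_output_bytes(formatted):
    """Return bytes suitable for a file opened in binary mode.

    Hypothetical helper; the real logic lives in shell/shell_output.py.
    """
    if sys.version_info.major == 2:
        # Assumption: on Python 2 the value is a byte str that is already
        # UTF-8 encoded. Calling encode('utf-8') on it would first decode
        # it implicitly as ASCII and raise UnicodeDecodeError on 0xe5.
        return formatted
    # On Python 3 the value is a text str, so it still has to be encoded.
    return formatted.encode('utf-8')


def write_formatted_output(path, formatted):
    """Append one formatted result plus a newline to the output file."""
    with open(path, 'ab') as out:
        out.write(to_output_bytes(formatted))
        out.write(b'\n')


def demo():
    """Write the character from the commit message and return the raw bytes."""
    fd, path = tempfile.mkstemp(suffix='.txt')
    os.close(fd)
    try:
        if sys.version_info.major == 2:
            row = u'\u5f15'.encode('utf-8')  # already-encoded byte str
        else:
            row = u'\u5f15'  # text str
        write_formatted_output(path, row)
        with open(path, 'rb') as f:
            return f.read()
    finally:
        os.remove(path)
```

The character U+5F15 encodes to the three bytes 0xe5 0xbc 0x95 in UTF-8, which is why the traceback quoted in the commit message complains about byte 0xe5 at position 0.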