[issue1293741] doctest runner cannot handle non-ascii characters
Terry J. Reedy tjre...@udel.edu added the comment:

OP: "The doctest module fails when the expected result string has non-ascii characters even if the # -*- coding: XXX -*- line is properly set."

I believe the claim in msg70907 of #2811 is correct: the file encoding only affects the conversion of *unicode* literals to unicode objects. It does not affect the conversion of byte literals to byte string objects, nor does it affect the later interpretation of byte strings by testmod. As msg26299 also says, make the docstring a unicode string, not a byte string, for the encoding cookie to take effect. So the original bug claim is invalid.

That aside, the issue was fixed in 3.0 by making text unicode. Seriously, issues like this were part of the motivation for 3.0.

That aside, test modules are not revised in bugfix releases without severe reason.

Closing for all these reasons: invalid, out-of-date, fixed; take one's pick.

----------
nosy: +tjreedy
resolution:  -> out of date
status: open -> closed
type:  -> feature request
versions: +Python 2.7 -Python 2.4, Python 2.5, Python 2.6

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1293741
___
___
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
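The "fixed in 3.0" point can be checked with a small experiment. What follows is a minimal sketch for Python 3, not code taken from this issue: it runs a docstring with non-ASCII expected output through doctest.DocTestParser and doctest.DocTestRunner. The names DOCSTRING, run_docstring and "non_ascii_example" are illustrative assumptions.

```python
import doctest

# Raw string, so the backslashes in the expected output reach doctest unchanged.
DOCSTRING = r"""
>>> print('缺陷')
缺陷
>>> '缺陷'.encode('utf-8')
b'\xe7\xbc\xba\xe9\x99\xb7'
>>> '\u7f3a\u9677'
'缺陷'
"""


def run_docstring(docstring):
    """Run the examples found in `docstring`; return (failed, attempted)."""
    parser = doctest.DocTestParser()
    # Arguments: text, globals, test name, file name, line number.
    test = parser.get_doctest(docstring, {}, "non_ascii_example", "<sketch>", 0)
    runner = doctest.DocTestRunner(verbose=False)
    captured = []  # failure reports would be collected here instead of printed
    results = runner.run(test, out=captured.append)
    return results.failed, results.attempted


if __name__ == "__main__":
    print(run_docstring(DOCSTRING))
```

Because every str in Python 3 is text, the expected output and the actual output are compared as the same type, and the byte-versus-unicode mismatch described above does not arise.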
[issue1293741] doctest runner cannot handle non-ascii characters
Changes by Éric Araujo mer...@netwok.org:

----------
nosy: +Merwok
[issue1293741] doctest runner cannot handle non-ascii characters
Wolodja Wentland wentl...@cl.uni-heidelberg.de added the comment:

Here is some more information.

--- snip ---

Normal behaviour

$ locale
LANG=en_US.UTF-8
LC_CTYPE=en_US.UTF-8
LC_NUMERIC=POSIX
LC_TIME=en_GB.UTF-8
LC_COLLATE=en_GB.UTF-8
LC_MONETARY=de_DE.UTF-8
LC_MESSAGES=en_US.UTF-8
LC_PAPER=de_DE.UTF-8
LC_NAME=en_US.UTF-8
LC_ADDRESS=de_DE.UTF-8
LC_TELEPHONE=de_DE.UTF-8
LC_MEASUREMENT=de_DE.UTF-8
LC_IDENTIFICATION=de_DE.UTF-8
LC_ALL=

$ python2.6
Python 2.6.3 (r263:75183, Oct 6 2009, 17:19:56)
[GCC 4.3.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> print '缺陷'
缺陷
>>> print u'缺陷'
缺陷
>>> '缺陷'
'\xe7\xbc\xba\xe9\x99\xb7'
>>> u'缺陷'
u'\u7f3a\u9677'
>>> '缺陷'.decode('utf8')
u'\u7f3a\u9677'
>>> u'\u7f3a\u9677'
u'\u7f3a\u9677'

$ cat unicode_bug.py
#!/usr/bin/env python
# -*- coding: UTF-8 -*-

def print_string():
    """
    >>> print '缺陷'
    缺陷
    """
    pass

def print_unicode():
    """
    >>> print u'缺陷'
    缺陷
    """
    pass

def string_repr():
    """
    >>> '缺陷'
    '\xe7\xbc\xba\xe9\x99\xb7'
    """
    pass

def unicode_repr():
    """
    >>> u'缺陷'
    u'\u7f3a\u9677'
    """
    pass

def decode():
    """
    >>> '缺陷'.decode('utf8')
    u'\u7f3a\u9677'
    """
    pass

def unicode_escape_repr():
    """
    >>> u'\u7f3a\u9677'
    u'\u7f3a\u9677'
    """
    pass

if __name__ == "__main__":
    import doctest
    doctest.testmod()

$ python2.5 unicode_bug.py
/usr/lib/python2.5/doctest.py:1460: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
  if got == want:
/usr/lib/python2.5/doctest.py:1480: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
  if got == want:
Traceback (most recent call last):
  File "unicode_bug.py", line 48, in <module>
    doctest.testmod()
  File "/usr/lib/python2.5/doctest.py", line 1815, in testmod
    runner.run(test)
  File "/usr/lib/python2.5/doctest.py", line 1361, in run
    return self.__run(test, compileflags, out)
  File "/usr/lib/python2.5/doctest.py", line 1277, in __run
    self.report_failure(out, test, example, got)
  File "/usr/lib/python2.5/doctest.py", line 1141, in report_failure
    self._checker.output_difference(example, got, self.optionflags))
  File "/usr/lib/python2.5/doctest.py", line 1565, in output_difference
    return 'Expected:\n%sGot:\n%s' % (_indent(want), _indent(got))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe7 in position 14: ordinal not in range(128)

$ python2.6 unicode_bug.py
/usr/local/lib/python2.6/doctest.py:1475: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
  if got == want:
/usr/local/lib/python2.6/doctest.py:1495: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
  if got == want:
Traceback (most recent call last):
  File "unicode_bug.py", line 48, in <module>
    doctest.testmod()
  File "/usr/local/lib/python2.6/doctest.py", line 1830, in testmod
    runner.run(test)
  File "/usr/local/lib/python2.6/doctest.py", line 1374, in run
    return self.__run(test, compileflags, out)
  File "/usr/local/lib/python2.6/doctest.py", line 1290, in __run
    self.report_failure(out, test, example, got)
  File "/usr/local/lib/python2.6/doctest.py", line 1154, in report_failure
    self._checker.output_difference(example, got, self.optionflags))
  File "/usr/local/lib/python2.6/doctest.py", line 1580, in output_difference
    return 'Expected:\n%sGot:\n%s' % (_indent(want), _indent(got))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe7 in position 14: ordinal not in range(128)

$ nosetests -V
nosetests version 0.11.1

$ nosetests --with-doctest -v unicode_bug.py
Doctest: unicode_bug.decode ... ok
Doctest: unicode_bug.print_string ... ok
Doctest: unicode_bug.print_unicode ... /usr/local/lib/python2.6/doctest.py:1475: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
  if got == want:
/usr/local/lib/python2.6/doctest.py:1495: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
  if got == want:
ERROR
Doctest: unicode_bug.string_repr ... FAIL
Doctest: unicode_bug.unicode_escape_repr ... ok
Doctest: unicode_bug.unicode_repr ... FAIL

======================================================================
ERROR: Doctest: unicode_bug.print_unicode
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/lib/python2.6/doctest.py", line 2140, in runTest
    test, out=new.write, clear_globs=False)
  File "/usr/local/lib/python2.6/doctest.py", line 1374, in run
    return self.__run(test, compileflags, out)
  File "/usr/local/lib/python2.6/doctest.py", line 1290, in __run
    self.report_failure(out, test, example, got)
  File "/usr/local/lib/python2.6/doctest.py", line 1154, in
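One plausible reading of the tracebacks above is that output_difference() formats a byte-string `want` together with a unicode `got`, so Python 2 implicitly decodes the bytes as ASCII. The sketch below is an assumption-laden reconstruction in Python 3, not the doctest code itself: it builds the start of the report as bytes and performs that decode explicitly. The function name reproduce_report_error is hypothetical.

```python
def reproduce_report_error():
    """Return the UnicodeDecodeError raised by an ASCII decode of the report text."""
    # Expected output as Python 2 saw it in a UTF-8 source file: raw bytes.
    want = "缺陷\n".encode("utf-8")
    # doctest's _indent() prefixes each line with four spaces by default.
    indented = b"    " + want
    # Beginning of the 'Expected:\n%sGot:\n%s' message built in the traceback.
    report = b"Expected:\n" + indented
    try:
        # Explicit stand-in for the implicit str -> unicode coercion of Python 2.
        report.decode("ascii")
    except UnicodeDecodeError as exc:
        return exc
    return None


if __name__ == "__main__":
    print(reproduce_report_error())
```

The offset is consistent with the reported error: "Expected:\n" is 10 characters and the indent is 4, so the first byte of the expected text, 0xe7, sits at position 14.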
[issue1293741] doctest runner cannot handle non-ascii characters
Changes by Wolodja Wentland wentl...@cl.uni-heidelberg.de:

Added file: http://bugs.python.org/file15174/unicode_bug_literals.py
[issue1293741] doctest runner cannot handle non-ascii characters
Christoph Burgmer cburg...@ira.uka.de added the comment:

My last patch only changed the encoding used in DocTestRunner.run(). This new patch applies the same change to DocTestCase.runTest().

----------
Added file: http://bugs.python.org/file14422/doctest.unicode.patch
[issue1293741] doctest runner cannot handle non-ascii characters
Christoph Burgmer cburg...@ira.uka.de added the comment:

See the attached patch, which works for error reporting and verbose output.

----------
keywords: +patch
nosy: +christoph
Added file: http://bugs.python.org/file14407/doctest.unicode.patch
[issue1293741] doctest runner cannot handle non-ascii characters
Changes by Rodrigo Bernardo Pimentel r...@isnomore.net:

----------
nosy: +rbp
[issue1293741] doctest runner cannot handle non-ascii characters
Luciano Ramalho luci...@ramalho.org added the comment:

I have confirmed everything that akaihola reports in Python 2.4, 2.5 and 2.6, but the problem is not limited to non-matching test output. It also happens with doctests that have zero failures when the module is run with the -v command-line switch, or when testmod is called with verbose=True.

The attached file shows a work-around: handle the UnicodeEncodeError thrown by testmod, and display the object attribute of the exception to see exactly where the problem is.

----------
nosy: +luciano
title: doctest runner cannot handle non-ascii characters -> doctest runner cannot handle non-ascii characters
versions: +Python 2.5, Python 2.6
Added file: http://bugs.python.org/file12684/issue1293741.py
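The attached issue1293741.py is not reproduced in this thread, so the following is only a sketch of the idea described, written for Python 3. The helper run_reporting_unicode_errors and the stand-in fake_testmod are hypothetical; in real use the callable would be something like doctest.testmod. The `object`, `start`, `end` and `encoding` attributes are standard on UnicodeEncodeError.

```python
def run_reporting_unicode_errors(func):
    """Call func(); if it raises UnicodeEncodeError, return details of the failure."""
    try:
        func()
    except UnicodeEncodeError as exc:
        return {
            "encoding": exc.encoding,                   # codec that failed
            "object": exc.object,                       # full string being encoded
            "start": exc.start,                         # index of first bad character
            "bad_text": exc.object[exc.start:exc.end],  # the offending slice
        }
    return None


def fake_testmod():
    # Hypothetical stand-in for doctest.testmod(verbose=True) failing while
    # writing non-ASCII text to an ASCII-only stream.
    "doc: 缺陷".encode("ascii")


if __name__ == "__main__":
    print(run_reporting_unicode_errors(fake_testmod))
```

Inspecting `object` together with `start` shows exactly which piece of doctest output could not be encoded, which is the diagnostic step the work-around relies on.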