[issue2301] [Py3k] No text shown when SyntaxError (when not UTF8)
Hirokazu Yamamoto [EMAIL PROTECTED] added the comment:

Hello. I traced through the source code and found where the error text is set.

Index: Parser/parsetok.c
===================================================================
--- Parser/parsetok.c   (revision 61411)
+++ Parser/parsetok.c   (working copy)
@@ -218,7 +218,7 @@
            assert(tok->cur - tok->buf < INT_MAX);
            err_ret->offset = (int)(tok->cur - tok->buf);
            len = tok->inp - tok->buf;
-           text = PyTokenizer_RestoreEncoding(tok, len, &err_ret->offset);
+/*         text = PyTokenizer_RestoreEncoding(tok, len, &err_ret->offset); */
            if (text == NULL) {
                text = (char *) PyObject_MALLOC(len + 1);
                if (text != NULL) {

It seems tok->buf is encoded in UTF-8, and PyTokenizer_RestoreEncoding() restores it to the original encoding of the source file. With the patch above, the output was as expected for cp932 and euc_jp source files.

Maybe this function is not needed in py3k? I cannot find any other place where it is used.

# Probably PyErr_ProgramText() needs more effort to be fixed.

__________________________________
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue2301
__________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
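The message above says that the tokenizer buffer holds UTF-8 and that PyTokenizer_RestoreEncoding() converts the line back to the source file's encoding. The following is only a rough Python model of that idea, based on the description in the message and not on the C implementation. The helper name restore_encoding and its parameters are hypothetical.

```python
def restore_encoding(utf8_line: bytes, source_encoding: str, utf8_offset: int):
    """Hypothetical model: re-encode a UTF-8 line into the source encoding
    and translate a byte offset into the line accordingly.

    Assumes utf8_offset falls on a character boundary.
    """
    text = utf8_line.decode("utf-8")
    # Text that precedes the error position in the UTF-8 buffer.
    prefix = utf8_line[:utf8_offset].decode("utf-8")
    restored = text.encode(source_encoding)
    # The same prefix usually has a different byte length in another encoding.
    restored_offset = len(prefix.encode(source_encoding))
    return restored, restored_offset


line = "print 年".encode("utf-8")        # 9 bytes: 年 takes 3 bytes in UTF-8
restored, offset = restore_encoding(line, "cp932", len(line))
print(restored, offset)
```

If the restored bytes are later decoded as UTF-8 again, non-ASCII characters no longer decode correctly, which fits the observation that skipping the call gives the expected output.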
[issue2301] [Py3k] No text shown when SyntaxError (when not UTF8)
Martin v. Löwis [EMAIL PROTECTED] added the comment:

You are probably right about the source of the problem; I was confusing it with a regular exception, e.g.

    print(年,a)

However, I also fail to reproduce the problem on OSX. I get

      File "a.py", line 3
        print �N
               ^
    SyntaxError: invalid syntax

I'm not quite sure what the N is doing in there, but the first character is the replacement character (hopefully, the tracker will reproduce it correctly); I get that because pythonrun uses the "replace" error handler. I guess you are not seeing it because the replacement character cannot actually be output to your terminal. Please try

    print("\ufffd")

to see what that does.
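One possible explanation for the stray N, offered as an assumption rather than a confirmed diagnosis: if the offending line contained 年 and its bytes were restored to cp932 before being decoded as UTF-8 with the "replace" handler, the result is exactly a replacement character followed by N.

```python
# Assumption: the source character is 年 and the file encoding is cp932.
raw = "年".encode("cp932")
print(raw)

# 0x94 is not a valid UTF-8 start byte, so it becomes U+FFFD;
# 0x4E is plain ASCII "N" and survives unchanged.
shown = raw.decode("utf-8", errors="replace")
print(ascii(shown))
```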
[issue2301] [Py3k] No text shown when SyntaxError (when not UTF8)
Hirokazu Yamamoto [EMAIL PROTECTED] added the comment:

> I was confusing it with a regular exception, e.g. print(年,a)

I'm now investigating this problem. It has a different cause. Please look at fp_setreadl in Parser/tokenizer.c. This function opens the file using a codec and doesn't seek to the current position. (fp_setreadl is used when the codec is neither utf-8 nor iso-8859-1; tok->decoding_state == STATE_NORMAL)

So

    # coding: ascii
    # 1
    # 2
    # 3
    raise RuntimeError("a")
    # 4
    # 5
    # 6

outputs

    C:\Documents and Settings\WhiteRabbit>py3k ascii.py
    Traceback (most recent call last):
      File "ascii.py", line 6, in <module>
        # 4
    RuntimeError: a
    [22821 refs]

# One line shifted.

And

    # dummy
    # coding: ascii
    # 1
    # 2
    # 3
    raise RuntimeError("a")
    # 4
    # 5
    # 6

outputs

    C:\Documents and Settings\WhiteRabbit>py3k ascii.py
    Traceback (most recent call last):
      File "ascii.py", line 8, in <module>
        # 5
    RuntimeError: a
    [22821 refs]

# Two lines shifted.
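The shift described above can be reproduced with a small simulation. This is a toy model of the reported behaviour, not the tokenizer's code: the function reported_line and its seek_to_consumed flag are hypothetical, and the coding cookie check is deliberately simplistic.

```python
import io
from typing import Optional

SOURCE = (b"# coding: ascii\n# 1\n# 2\n# 3\n"
          b"raise RuntimeError('a')\n# 4\n# 5\n# 6\n")


def reported_line(source: bytes, seek_to_consumed: bool) -> Optional[int]:
    """Return the line number a naive tokenizer would report for `raise`."""
    raw = io.BytesIO(source)
    lineno = 0
    # Phase 1: read raw lines until a coding cookie shows up (first two lines).
    for _ in range(2):
        line = raw.readline()
        lineno += 1
        if b"coding" in line:
            break
    consumed = raw.tell()

    # Phase 2: reopen through a codec reader, as fp_setreadl is said to do.
    reopened = io.BytesIO(source)
    if seek_to_consumed:
        reopened.seek(consumed)  # skip the lines that were already counted
    reader = io.TextIOWrapper(reopened, encoding="ascii")
    for text_line in reader:
        lineno += 1              # the counter keeps running across the reopen
        if text_line.startswith("raise"):
            return lineno
    return None


print(reported_line(SOURCE, seek_to_consumed=False))
print(reported_line(SOURCE, seek_to_consumed=True))
print(reported_line(b"# dummy\n" + SOURCE, seek_to_consumed=False))
```

Without the seek, every line consumed while looking for the cookie is counted twice, which matches the one-line and two-line shifts in the two examples.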
[issue2301] [Py3k] No text shown when SyntaxError (when not UTF8)
Hirokazu Yamamoto [EMAIL PROTECTED] added the comment:

> However, I also fail to reproduce the problem on OSX. I get
>
>   File "a.py", line 3
>     print �N
>            ^
> SyntaxError: invalid syntax

Hmm, strange... I get the correct output even when using euc_jp (my terminal, the command prompt, cannot output euc_jp strings directly, AFAIK).

> print("\ufffd")

>>> print("\ufffd")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "e:\python-dev\py3k\lib\io.py", line 1247, in write
    b = encoder.encode(s)
UnicodeEncodeError: 'cp932' codec can't encode character '\ufffd' in position 0: illegal multibyte sequence
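The traceback above can be reproduced without a cp932 terminal by encoding the replacement character directly. The sketch below also shows two standard error handlers that avoid the exception; whether either is appropriate for the interpreter's own output is a separate question that this thread does not settle.

```python
replacement = "\ufffd"

# Strict encoding fails: cp932 has no mapping for U+FFFD.
try:
    replacement.encode("cp932")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# Lenient handlers substitute something the codec can represent.
as_question_mark = replacement.encode("cp932", errors="replace")
as_escape = replacement.encode("cp932", errors="backslashreplace")
print(strict_failed, as_question_mark, as_escape)
```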
[issue2301] [Py3k] No text shown when SyntaxError (when not UTF8)
Hirokazu Yamamoto [EMAIL PROTECTED] added the comment:

> I'm now investigating this problem. This comes from another reason.

Of course, even if this line number problem is fixed, the encoding problem still remains. I'll probably look at it next.
[issue2301] [Py3k] No text shown when SyntaxError (when not UTF8)
Martin v. Löwis [EMAIL PROTECTED] added the comment:

The original issue is now fixed in r61462. Please open another issue for the case of regular exceptions.

----------
resolution:  -> fixed
status: open -> closed
[issue2301] [Py3k] No text shown when SyntaxError (when not UTF8)
Changes by Hirokazu Yamamoto [EMAIL PROTECTED]:

----------
title: [Py3k] -> [Py3k] No text shown when SyntaxError (when not UTF8)
[issue2301] [Py3k] No text shown when SyntaxError (when not UTF8)
Hirokazu Yamamoto [EMAIL PROTECTED] added the comment:

Probably the same problem exists in PyErr_ProgramText().
[issue2301] [Py3k] No text shown when SyntaxError (when not UTF8)
Martin v. Löwis [EMAIL PROTECTED] added the comment:

This will involve quite some work to fix. When fetching the code, the source encoding must be recognized. Contributions are welcome.

(I personally consider this issue minor, as I would encourage users to use UTF-8 as the source encoding anyway.)

----------
nosy: +loewis
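To illustrate what "recognizing the source encoding when fetching the code" means, here is a minimal Python-level sketch. It relies on tokenize.open() from the standard library of later Python 3 releases, which did not exist when this thread was written, and it is not how PyErr_ProgramText() is implemented in C. The helper name program_text is hypothetical.

```python
import os
import tempfile
import tokenize
from typing import Optional


def program_text(filename: str, lineno: int) -> Optional[str]:
    """Return source line `lineno` (1-based), decoded with the file's own encoding."""
    # tokenize.open() detects the encoding from a BOM or a coding cookie.
    with tokenize.open(filename) as source:
        for current, line in enumerate(source, start=1):
            if current == lineno:
                return line.rstrip("\n")
    return None


# Usage: a euc_jp source file whose second line contains a non-ASCII character.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "a.py")
    with open(path, "wb") as f:
        f.write("# coding: euc_jp\nprint('年')\n".encode("euc_jp"))
    shown = program_text(path, 2)

print(shown)
```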