[issue3594] PyTokenizer_FindEncoding() never succeeds

2008-09-03 Thread Antoine Pitrou
Antoine Pitrou [EMAIL PROTECTED] added the comment: I don't understand the whole decoding machinery in the tokenizer, but the patch looks ok to me. (tested in debug mode under Linux and Windows) -- nosy: +pitrou ___ Python tracker [EMAIL PROTECTED]

[issue3594] PyTokenizer_FindEncoding() never succeeds

2008-09-03 Thread Benjamin Peterson
Benjamin Peterson [EMAIL PROTECTED] added the comment: The patch also looks pretty harmless to me. :) -- nosy: +benjamin.peterson ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3594 ___

[issue3594] PyTokenizer_FindEncoding() never succeeds

2008-09-03 Thread Hye-Shik Chang
Hye-Shik Chang [EMAIL PROTECTED] added the comment: pitrou, that's because Python source code can't be correctly tokenized when it's encoded in few odd encodings like iso-2022 or shift-jis which utilizes \, (, ) and as second byte of two-byte character sequence. For example, '\x81\\' is

[issue3594] PyTokenizer_FindEncoding() never succeeds

2008-09-03 Thread Brett Cannon
Brett Cannon [EMAIL PROTECTED] added the comment: Committed in r66209. -- resolution: - accepted status: open - closed ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3594 ___

[issue3594] PyTokenizer_FindEncoding() never succeeds

2008-08-21 Thread Brett Cannon
Changes by Brett Cannon [EMAIL PROTECTED]: -- keywords: +needs review ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3594 ___ ___ Python-bugs-list

[issue3594] PyTokenizer_FindEncoding() never succeeds

2008-08-18 Thread Brett Cannon
New submission from Brett Cannon [EMAIL PROTECTED]: Turns out that PyTokenizer_FindEncoding() never properly succeeds because the tok_state used by it does not have tok-filename set, which is an error condition in the tokenizer. This error has been masked by the one place the function is used,

[issue3594] PyTokenizer_FindEncoding() never succeeds

2008-08-18 Thread Brett Cannon
Brett Cannon [EMAIL PROTECTED] added the comment: I have not bothered to check if this exists in 2.6, but I don't see why it would be any different. -- type: - behavior ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3594

[issue3594] PyTokenizer_FindEncoding() never succeeds

2008-08-18 Thread Brett Cannon
Brett Cannon [EMAIL PROTECTED] added the comment: Turns out that the NULL return value can signal an error that manifests itself as SyntaxError(encoding problem: with BOM) thanks to the lack of tok-filename being set in Parser/tokenizer.c:fp_setreadl() which is called by check_coding_spec() and

[issue3594] PyTokenizer_FindEncoding() never succeeds

2008-08-18 Thread Brett Cannon
Brett Cannon [EMAIL PROTECTED] added the comment: Attached is a patch that fixes where the error occurs. By opening the file by either file name or file descriptor, the problem goes away. Once this patch is accepted then PyErr_Occurred() should be added to all uses of PyTokenizer_FindEncoding().