[issue2541] Unicode escape sequences not parsed in raw strings.
Georg Brandl [EMAIL PROTECTED] added the comment: Please apply the patch, but rename "Unicode escapes" to "\u and \U escapes" first. -- assignee: georg.brandl -> benjamin.peterson resolution: rejected -> fixed __ Tracker [EMAIL PROTECTED] http://bugs.python.org/issue2541 __ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
Benjamin Peterson [EMAIL PROTECTED] added the comment: Fixed in r62568. -- status: open -> closed
Neal Norwitz [EMAIL PROTECTED] added the comment: What is the status of this bug? AFAICT, the code is now correct. Have the doc changes been applied? The resolution on this report should be updated too; it is currently "rejected". -- nosy: +nnorwitz
Benjamin Peterson [EMAIL PROTECTED] added the comment: It's rejected because the OP wanted unicode escapes to be applied in unicode strings, and I haven't applied the docs because nobody has told me I should.
Marc-Andre Lemburg [EMAIL PROTECTED] added the comment: While that's true for cPickle, it is not for pickle. The pickle protocol itself is defined in terms of the raw-unicode-escape codec (see pickle.py). Besides, you cannot assume that the Python interpreter itself is the only use case for these codecs. The raw-unicode-escape codec is well suited to other purposes where you need a compact way of encoding Unicode, especially if your strings are mostly Latin-1 and only include non-UCS2 code points every now and then. That's also the reason why pickle uses it.
Marc-Andre Lemburg [EMAIL PROTECTED] added the comment: You can't change the codec - it's being used in other places as well, e.g. for use cases where you need to have a 7-bit encoded readable version of a Unicode object. Adding a new codec would be fine, though I don't know how this would map raw Unicode strings with non-ASCII characters in them to an 8-bit string. Perhaps this is not needed at all in Py3k. -- nosy: +lemburg
Marc-Andre Lemburg [EMAIL PROTECTED] added the comment: You can't change the codec - it's being used in other places as well, e.g. for use cases where you need to have an 8-bit encoded readable version of a Unicode object (which happens to be Latin-1 + Unicode escapes for all non-Latin-1 characters, due to Unicode being a superset of Latin-1). Adding a new codec would be fine, though I don't know how this would map raw Unicode strings with non-Latin-1 characters in them to an 8-bit string. Perhaps this is not needed at all in Py3k.
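For readers following along, here is a minimal sketch of the "Latin-1 + Unicode escapes" output described above, compared with the unicode-escape codec. It assumes a current Python 3 interpreter; the sample string and variable names are illustrative and do not come from the thread.

```python
# Sample text: "é" (U+00E9) is in Latin-1, "€" (U+20AC) is not.
text = "caf\u00e9 costs 3\u20ac"

# raw-unicode-escape: Latin-1 characters become single bytes,
# everything else becomes a \uXXXX (or \UXXXXXXXX) escape.
raw = text.encode("raw_unicode_escape")

# unicode-escape: all non-ASCII characters are escaped, so the
# result is pure 7-bit ASCII.
esc = text.encode("unicode_escape")

print(raw)  # b'caf\xe9 costs 3\\u20ac'
print(esc)  # b'caf\\xe9 costs 3\\u20ac'

# Both codecs round-trip this particular string.
assert raw.decode("raw_unicode_escape") == text
assert esc.decode("unicode_escape") == text
```

The raw variant is one byte per Latin-1 character, which is the compactness argument made in the comment; the price is that the output is 8-bit rather than 7-bit clean.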
Amaury Forgeot d'Arc [EMAIL PROTECTED] added the comment: Isn't unicode-escape enough for this purpose?
Marc-Andre Lemburg [EMAIL PROTECTED] added the comment: What do you mean by "enough"? The raw-unicode-escape codec is used in Python 2.x to convert literal strings of the form ur"..." to Unicode objects. It's a variant of the unicode-escape codec. The codec is also being used in cPickle, pickle, variants of pickle, Python code generators, etc. It serves its purpose, just like unicode-escape and all the other codecs in Python.
Amaury Forgeot d'Arc [EMAIL PROTECTED] added the comment: I mean: now that raw strings cannot represent all Unicode code points (or more precisely, they need the file encoding to do so), is there a use case for raw-unicode-escape that cannot be filled by the unicode-escape codec? Note that pickle does not use raw-unicode-escape as is: it replaces backslashes with \u005c. This has the nice effect that pickled strings can also be decoded by unicode-escape. That's why I propose to completely remove raw-unicode-escape, and use unicode-escape instead.
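The backslash replacement described above can be sketched roughly as follows. This assumes a current Python 3 interpreter; `protocol0_style_encode` is a hypothetical helper written for illustration, not pickle's actual function, and it only performs the backslash step mentioned in the comment.

```python
import pickle


def protocol0_style_encode(text):
    """Hypothetical helper: escape backslashes, then raw-unicode-escape."""
    # Literal backslashes are written as \u005c so they cannot be
    # confused with the start of a \uXXXX escape.
    return text.replace("\\", "\\u005c").encode("raw_unicode_escape")


sample = "a\\b \u20ac"  # contains one literal backslash and a euro sign
encoded = protocol0_style_encode(sample)
print(encoded)  # b'a\\u005cb \\u20ac'

# Because no bare backslash is left, both codecs decode it identically.
assert encoded.decode("raw_unicode_escape") == sample
assert encoded.decode("unicode_escape") == sample

# The real pickle module writes the same escape at protocol 0.
assert b"\\u005c" in pickle.dumps("a\\b", protocol=0)
```

Without the replacement step, a string containing the literal text `\u20ac` would not round-trip through raw-unicode-escape, since the decoder would read it as an escape.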
Amaury Forgeot d'Arc [EMAIL PROTECTED] added the comment: pickle still uses it when protocol=0 (and cPickle as well, but in trunk/ only of course)
Benjamin Peterson [EMAIL PROTECTED] added the comment: Sorry, Guido said this is not allowed: http://mail.python.org/pipermail/python-3000/2008-April/012952.html. I reverted it in r62165. -- resolution: fixed -> rejected
Guido van Rossum [EMAIL PROTECTED] added the comment: The docs still need to be updated! An entry in "What's New in 3.0" should also be added. -- assignee: -> georg.brandl components: +Documentation -Unicode nosy: +georg.brandl, gvanrossum status: closed -> open
Benjamin Peterson [EMAIL PROTECTED] added the comment: How's this? -- keywords: +patch Added file: http://bugs.python.org/file9947/py3k_raw_strings_unicode_escapes.patch
Guido van Rossum [EMAIL PROTECTED] added the comment: Instead of "ignored" (which might be read ambiguously) how about "not treated specially"? You also still need to add some words to whatsnew.
Benjamin Peterson [EMAIL PROTECTED] added the comment: "not treated specially" it is! Added file: http://bugs.python.org/file9948/py3k_raw_strings_unicode_escapes2.patch
Georg Brandl [EMAIL PROTECTED] added the comment: The segment "use different rules for interpreting backslash escape sequences." should be killed entirely, and the whole rule told here. Also, a few paragraphs later there are more references to raw strings, e.g. "When an ``'r'`` or ``'R'`` prefix is used in a string literal", which need to be fixed too.
Benjamin Peterson [EMAIL PROTECTED] added the comment: I made the requested improvements and mentioned it in NEWS. Is it worth putting in the tutorial, since it mentions Unicode strings and raw strings? Added file: http://bugs.python.org/file9952/py3k_raw_strings_unicode_escapes3.patch
Amaury Forgeot d'Arc [EMAIL PROTECTED] added the comment: What about the raw-unicode-escape codec? Can we leave it different from raw string literals?
Benjamin Peterson [EMAIL PROTECTED] added the comment: You use the ur string mode. print ur"\u0020" -- nosy: +benjamin.peterson resolution: -> invalid status: open -> closed
Benjamin Peterson [EMAIL PROTECTED] added the comment: Thanks for noticing, Amaury, and your patch works for me. -- priority: -> critical
Amaury Forgeot d'Arc [EMAIL PROTECTED] added the comment: No, it's about python 3.0. I confirm the problem, and propose a patch:

--- Python/ast.c.original 2008-04-03 15:12:15.548389400 +0200
+++ Python/ast.c 2008-04-03 15:12:28.359475800 +0200
@@ -3232,7 +3232,7 @@
 return NULL;
 }
 }
-if (!*bytesmode && !rawmode) {
+if (!*bytesmode) {
 return decode_unicode(s, len, rawmode, encoding);
 }
 if (*bytesmode) {

-- nosy: +amaury.forgeotdarc resolution: invalid -> status: closed -> open
Benjamin Peterson [EMAIL PROTECTED] added the comment: Fixed in r62128. -- resolution: -> fixed status: open -> closed
New submission from John Millikin [EMAIL PROTECTED]: According to http://docs.python.org/dev/3.0/reference/lexical_analysis.html#id9, raw strings with \u and \U escape sequences should have these sequences parsed as usual. However, they are currently escaped.

>>> r'\u0020'
'\\u0020'

Expected:

>>> r'\u0020'
' '

-- components: Unicode messages: 64890 nosy: jmillikin severity: normal status: open title: Unicode escape sequences not parsed in raw strings. type: behavior versions: Python 3.0
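As a quick check of where the thread ended up, the following sketch shows what a current Python 3 interpreter does with the literal from the report: \u is not treated specially in a raw string. The variable names are illustrative, and the explicit decode at the end is one possible way to get the behaviour the reporter expected, not something proposed in the thread.

```python
raw = r'\u0020'      # raw literal: backslash, 'u', '0', '0', '2', '0'
cooked = '\u0020'    # ordinary literal: the escape yields one space

print(repr(raw))     # '\\u0020'
print(repr(cooked))  # ' '

assert len(raw) == 6
assert cooked == ' '

# If the escape should be interpreted after all, it can be decoded
# explicitly (safe here because the text is pure ASCII).
decoded = raw.encode('ascii').decode('unicode_escape')
assert decoded == ' '
```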