Marc-Andre Lemburg m...@egenix.com added the comment:
I think it's better to close the ticket as won't fix.
--
resolution:  -> wont fix
status: open -> closed
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue7961
Ori Avtalion o...@avtalion.name added the comment:
> In which specific case did you find the problem you mentioned?
I didn't. I only pointed out the inconsistency.
I'm happy with rejecting this bug, if it's not seen as a problem.
--
Mark Dickinson dicki...@gmail.com added the comment:
Specifically, the behaviour comes from an early check for empty strings in the
PyUnicode_FromEncodedObject function:
    /* Convert to Unicode */
    if (len == 0) {
        Py_INCREF(unicode_empty);
        v = (PyObject *)unicode_empty;
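That C fast path is observable from Python: on a normal release build, decoding empty bytes never consults the codec registry, so even a made-up encoding name "succeeds", while the same name raises LookupError for non-empty input. A minimal sketch of the inconsistency (the codec name is deliberately bogus; under `python -X dev` the encoding argument is validated even for empty input, so the first line would raise there):

```python
# The len == 0 fast path skips codec lookup entirely, so the bogus
# encoding name is never checked for empty input.
empty = b''.decode('no-such-codec')      # no lookup happens
print(repr(empty))

try:
    b'x'.decode('no-such-codec')         # lookup happens, and fails
except LookupError as exc:
    print('non-empty input raises:', exc)
```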
Ori Avtalion o...@avtalion.name added the comment:
OK.
The attached patch removes the empty string check before decoding.
I'm not sure where tests should go, since I can only find them in Lib/test/ and
this is not a library change.
--
keywords: +patch
Added file:
Mark Dickinson dicki...@gmail.com added the comment:
Thanks for the patch.
Rather than remove that optimization entirely, I'd consider pushing it into
PyUnicode_Decode.
All tests (whether for the standard library or for the core) go into Lib/test,
so that would be the right place.
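For illustration, a regression test of the kind Mark describes might look like this (the class and method names are hypothetical; only the non-empty case is asserted, since the empty-input behaviour is exactly what is under discussion):

```python
import unittest

class InvalidCodecTest(unittest.TestCase):
    def test_unknown_codec_nonempty_input(self):
        # The codec name is deliberately bogus; lookup must fail.
        with self.assertRaises(LookupError):
            b'x'.decode('this-codec-does-not-exist')

# Run the case programmatically so the sketch is self-contained.
result = unittest.TextTestRunner().run(
    unittest.defaultTestLoader.loadTestsFromTestCase(InvalidCodecTest))
```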
Mark Dickinson dicki...@gmail.com added the comment:
I take that back: test_codecs_errors isn't the right function to add these
tests to. I actually don't see any current tests for invalid codecs. Part of
the problem would be coming up with an invalid codec name in the first place:
as I
Mark Dickinson dicki...@gmail.com added the comment:
And PyUnicode_Decode doesn't look up the encoding in the registry either:
that's somewhere in PyCodec_Decode.
I'm going to butt out now and leave this to those who know the code better. :)
--
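The Python-level mirror of that registry lookup is codecs.lookup(), which is where a nonexistent name actually fails. A short sketch (codec names here are just illustrative):

```python
import codecs

# A successful lookup returns a CodecInfo with the canonical name.
info = codecs.lookup('utf-8')
print(info.name)                      # 'utf-8'

# An unknown name raises LookupError at lookup time -- this is the
# check that the empty-string fast path bypasses.
try:
    codecs.lookup('no-such-codec')
except LookupError as exc:
    print('lookup failed:', exc)
```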
Ori Avtalion o...@avtalion.name added the comment:
Ignoring the custom utf-8/latin-1 conversion functions, the actual checking if
a codec exists is done in Python/codecs.c's PyCodec_Decode.
Is that where I should move the aforementioned optimization to?
Is it safe to assume that the decoded