[issue7961] Py3k: decoding empty bytestring with invalid encoding throws no error

2010-02-22 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment:
I think it's better to close the ticket as won't fix.
--
resolution: -> wont fix
status: open -> closed
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue7961

[issue7961] Py3k: decoding empty bytestring with invalid encoding throws no error

2010-02-20 Thread Ori Avtalion
Ori Avtalion o...@avtalion.name added the comment:
> In which specific case did you find the problem you mentioned?
I didn't. I only pointed out the inconsistency. I'm happy with rejecting this bug if it's not seen as a problem.

[issue7961] Py3k: decoding empty bytestring with invalid encoding throws no error

2010-02-19 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment:
Mark Dickinson wrote:
> Mark Dickinson dicki...@gmail.com added the comment:
> Thanks for the patch. Rather than remove that optimization entirely, I'd consider pushing it into PyUnicode_Decode. All tests (whether for the standard

[issue7961] Py3k: decoding empty bytestring with invalid encoding throws no error

2010-02-19 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment:
Ori Avtalion wrote:
> Ori Avtalion o...@avtalion.name added the comment:
> Ignoring the custom utf-8/latin-1 conversion functions, the actual check for whether a codec exists is done in Python/codecs.c's PyCodec_Decode. Is that where I

[issue7961] Py3k: decoding empty bytestring with invalid encoding throws no error

2010-02-18 Thread Mark Dickinson
Mark Dickinson dicki...@gmail.com added the comment: Specifically, the behaviour comes from an early check for empty strings in the PyUnicode_FromEncodedObject function:

    /* Convert to Unicode */
    if (len == 0) {
        Py_INCREF(unicode_empty);
        v = (PyObject *)unicode_empty;
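
The inconsistency can be observed from Python. The sketch below is illustrative only: the helper try_decode and the codec name no_such_codec_7961 are made up for this example and do not come from the tracker discussion. The result for the empty input may depend on the interpreter version and on how it is run, so no output is shown for it.

```python
def try_decode(data: bytes, encoding: str):
    """Return ("ok", text) on success, or ("LookupError", message) for an unknown codec."""
    try:
        return ("ok", data.decode(encoding))
    except LookupError as exc:
        return ("LookupError", str(exc))


BOGUS = "no_such_codec_7961"  # hypothetical name, assumed not to be registered

# Non-empty input has to reach the codec machinery, so the unknown name is reported.
print(try_decode(b"abc", BOGUS))

# Empty input may take the early-return path quoted above and never look the codec up.
print(try_decode(b"", BOGUS))
```

Comparing the two printed tuples shows whether the interpreter at hand validates the codec name for empty input.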

[issue7961] Py3k: decoding empty bytestring with invalid encoding throws no error

2010-02-18 Thread Ori Avtalion
Ori Avtalion o...@avtalion.name added the comment: OK. The attached patch removes the empty string check before decoding. I'm not sure where tests should go, since I can only find them in Lib/test/ and this is not a library change. -- keywords: +patch Added file:

[issue7961] Py3k: decoding empty bytestring with invalid encoding throws no error

2010-02-18 Thread Mark Dickinson
Mark Dickinson dicki...@gmail.com added the comment: Thanks for the patch. Rather than remove that optimization entirely, I'd consider pushing it into PyUnicode_Decode. All tests (whether for the standard library or for the core) go into Lib/test, so that would be the right place.

[issue7961] Py3k: decoding empty bytestring with invalid encoding throws no error

2010-02-18 Thread Mark Dickinson
Mark Dickinson dicki...@gmail.com added the comment: I take that back: test_codecs_errors isn't the right function to add these tests to. I actually don't see any current tests for invalid codecs. Part of the problem would be coming up with an invalid codec name in the first place: as I
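
One possible way for a test to obtain an invalid codec name is to probe the registry rather than hard-code a guess. This is only a sketch: the helper unregistered_codec_name and its prefix are hypothetical and are not part of Lib/test.

```python
import codecs


def unregistered_codec_name(prefix: str = "no_such_codec") -> str:
    """Return a name for which codecs.lookup() raises LookupError."""
    for i in range(1000):
        candidate = f"{prefix}_{i}"
        try:
            codecs.lookup(candidate)
        except LookupError:
            # No registered search function recognises this name.
            return candidate
    raise RuntimeError("could not find an unregistered codec name")


name = unregistered_codec_name()
print(name)
```

A test could then pass the returned name to the decoding call under test and check that LookupError is raised.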

[issue7961] Py3k: decoding empty bytestring with invalid encoding throws no error

2010-02-18 Thread Mark Dickinson
Mark Dickinson dicki...@gmail.com added the comment: And PyUnicode_Decode doesn't look up the encoding in the registry either: that's somewhere in PyCodec_Decode. I'm going to butt out now and leave this to those who know the code better. :) --
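
At the Python level, the registry lookup is exposed as codecs.lookup. The sketch below models the ordering under discussion: validate the codec name first, then apply the empty-input shortcut. It is a Python analogy, not the C change itself, and decode_checked is a hypothetical helper name.

```python
import codecs


def decode_checked(data: bytes, encoding: str = "utf-8", errors: str = "strict") -> str:
    """Decode data, always rejecting unknown codec names, even for empty input."""
    # The registry lookup raises LookupError for an unknown encoding name.
    codecs.lookup(encoding)
    # The shortcut for empty input is applied only after the name has been validated.
    if not data:
        return ""
    return data.decode(encoding, errors)


print(decode_checked(b"", "utf-8"))
```

With this ordering the shortcut still avoids running the decoder for empty input, at the cost of one registry lookup per call.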

[issue7961] Py3k: decoding empty bytestring with invalid encoding throws no error

2010-02-18 Thread Ori Avtalion
Ori Avtalion o...@avtalion.name added the comment: Ignoring the custom utf-8/latin-1 conversion functions, the actual check for whether a codec exists is done in Python/codecs.c's PyCodec_Decode. Is that where I should move the aforementioned optimization to? Is it safe to assume that the decoded