Re: [Python-Dev] Draft PEP: Deprecate codecs.StreamReader and codecs.StreamWriter
Benjamin Peterson benjamin at python.org writes:

> 2011/7/6 Nick Coghlan ncoghlan at gmail.com:
>> The API of the resulting object is the same (i.e. they're file-like objects). The behavioural differences are due to cases where the codec-specific classes are currently broken.
>
> Yes, but as we all know too well, people are surely relying on whatever behavior there is, broken or not.

There's also the fact that code which currently runs under 2.x and 3.x would stop working if codecs.StreamReader/StreamWriter were to go away. Of course, if the codecs interfaces were re-implemented using io module code, the only portability issues would be because of people relying on broken aspects of the existing codecs code - which is unlikely to be all (or even most) of the people using codecs.StreamReader/StreamWriter.

Regards,

Vinay Sajip

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Draft PEP: Deprecate codecs.StreamReader and codecs.StreamWriter
Victor Stinner wrote:

> Hi,
>
> Last May, I proposed to deprecate open() function, StreamWriter and StreamReader classes of the codecs module. I accepted to keep open() after the discussion on python-dev. Here is a more complete proposition as a PEP. It is a draft and I expect a lot of comments :)

The PEP's arguments for deprecating two essential codec design components are very one-sided, by comparing issues to features.

Please add all the comments I've made on the subject to the PEP. The most important one missing is the fact and major difference that TextIOWrapper does not work on a per-codec basis, but only on a per-stream basis.

By removing the StreamReader and StreamWriter API parts of the codec design, you essentially make it impossible to add per-codec variations and optimizations that require full access to the stream interface.

As mentioned before, many improvements are possible and lots of those can build on TextIOWrapper and the incremental codec parts.

That said, I'm not really up for a longer discussion on this. We've already had the discussion and decided against removing those parts of the codec API. Redirecting codecs.open() to open() should be investigated.

For the issues you mention in the PEP, please open tickets or add ticket references to the PEP.

Thanks,

-- 
Marc-Andre Lemburg
eGenix.com

eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math.
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/

> Victor
>
> ---
>
> PEP: xxx
> Title: Deprecate codecs.StreamReader and codecs.StreamWriter
> Version: $Revision$
> Last-Modified: $Date$
> Author: Victor Stinner
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 28-May-2011
> Python-Version: 3.3
>
> Abstract
> ========
>
> io.TextIOWrapper and codecs.StreamReaderWriter offer the same API [#f1]_. TextIOWrapper has more features and is faster than StreamReaderWriter. Duplicate code means that bugs should be fixed twice and that we may have subtle differences between the two implementations.
>
> The codecs module was introduced in Python 2.0 (see PEP 100). The io module was introduced in Python 2.6 and 3.0 (see PEP 3116), and reimplemented in C in Python 2.7 and 3.1.
>
> Motivation
> ==========
>
> When the Python I/O model was updated for 3.0, the concept of a "stream-with-known-encoding" was introduced in the form of io.TextIOWrapper. As this class is critical to the performance of text-based I/O in Python 3, this module has an optimised C version which is used by CPython by default. Many corner cases in handling buffering, stateful codecs and universal newlines have been dealt with since the release of Python 3.0.
>
> This new interface overlaps heavily with the legacy codecs.StreamReader, codecs.StreamWriter and codecs.StreamReaderWriter interfaces that were part of the original codec interface design in PEP 100. These interfaces are organised around the principle of an encoding with an associated stream (i.e. the reverse of the arrangement in the io module), so the original PEP 100 design required that codec writers provide appropriate StreamReader and StreamWriter implementations in addition to the core codec encode() and decode() methods. This places a heavy burden on codec authors providing these specialised implementations to correctly handle many of the corner cases that have now been dealt with by io.TextIOWrapper.
> While deeper integration between the codec and the stream allows for additional optimisations in theory, these optimisations have in practice either not been carried out or else the associated code duplication means that the corner cases that have been fixed in io.TextIOWrapper are still not handled correctly in the various StreamReader and StreamWriter implementations.
>
> Accordingly, this PEP proposes that:
>
> * codecs.open() be updated to delegate to the builtin open() in Python 3.3;
> * the legacy codecs.Stream* interfaces, including the streamreader and streamwriter attributes of codecs.CodecInfo be deprecated in Python 3.3 and removed in Python 3.4.
>
> Rationale
> =========
>
> StreamReader and StreamWriter issues
>
> * StreamReader is unable to translate newlines.
> * StreamReaderWriter handles reads using StreamReader and writes using StreamWriter. These two classes may be inconsistent. To stay consistent, flush() must be
Re: [Python-Dev] Draft PEP: Deprecate codecs.StreamReader and codecs.StreamWriter
On Thu, Jul 7, 2011 at 1:51 PM, Benjamin Peterson benja...@python.org wrote:

> 2011/7/6 Nick Coghlan ncogh...@gmail.com:
>> On Thu, Jul 7, 2011 at 11:16 AM, Benjamin Peterson benja...@python.org wrote:
>>> 2011/7/6 Victor Stinner victor.stin...@haypocalc.com:
>>>> codecs.open() will be changed to reuse the builtin open() function (TextIOWrapper).
>>>
>>> This doesn't strike me as particularly backwards compatible, since you've just enumerated the differences between StreamWriter/Reader and TextIOWrapper.
>>
>> The API of the resulting object is the same (i.e. they're file-like objects). The behavioural differences are due to cases where the codec-specific classes are currently broken.
>
> Yes, but as we all know too well, people are surely relying on whatever behavior there is, broken or not.

True, but that's why changes like this are always a judgement call - is the gain in correctness worth the risk of breakage? We sometimes break workarounds when we fix bugs, too. From the discussion last time around, that particular change wasn't very controversial, which is why it is already in the 3.3 development tree.

Unless somebody steps forward to fix them, the Stream* classes have to go (albeit with a suitable period of deprecation). They're *actively harmful* in their current state, so retaining the status quo is not a viable option in this case.

Cheers,
Nick.

-- 
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
Re: [Python-Dev] Draft PEP: Deprecate codecs.StreamReader and codecs.StreamWriter
Nick Coghlan ncoghlan at gmail.com writes:

> Unless somebody steps forward to fix them, the Stream* classes have to go (albeit with a suitable period of deprecation). They're *actively harmful* in their current state, so retaining the status quo is not a viable option in this case.

I can understand that there might be specific issues with them, but isn't "actively harmful" a little strong? I don't see who is being actively harmed by them, nor how.

Regards,

Vinay Sajip
Re: [Python-Dev] Draft PEP: Deprecate codecs.StreamReader and codecs.StreamWriter
On Thu, Jul 7, 2011 at 6:49 PM, Vinay Sajip vinay_sa...@yahoo.co.uk wrote:

> Nick Coghlan ncoghlan at gmail.com writes:
>> Unless somebody steps forward to fix them, the Stream* classes have to go (albeit with a suitable period of deprecation). They're *actively harmful* in their current state, so retaining the status quo is not a viable option in this case.
>
> I can understand that there might be specific issues with them, but isn't "actively harmful" a little strong? I don't see who is being actively harmed by them, nor how.

Anyone forward porting codecs.open based code will get subpar IO in Python 3 *because* they're trying to do the right thing in Python 2. That's actively harmful in my book.

Codec writers are also told to implement utterly unnecessary functionality just because PEP 100 says so. Significantly less common, but still harmful.

The bare minimum change needed is for codecs.open() to do the right thing in Py3k and be a wrapper around builtin open() and the main IO stack. Once that happens, the legacy Stream* APIs become redundant cruft that should be deprecated (although that part is significantly less important than fixing codecs.open() itself).

Cheers,
Nick.

-- 
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
Re: [Python-Dev] Draft PEP: Deprecate codecs.StreamReader and codecs.StreamWriter
On 07/07/2011 03:16, Benjamin Peterson wrote:

> 2011/7/6 Victor Stinner victor.stin...@haypocalc.com:
>> codecs.open() will be changed to reuse the builtin open() function (TextIOWrapper).
>
> This doesn't strike me as particularly backwards compatible, since you've just enumerated the differences between StreamWriter/Reader and TextIOWrapper.

Which kind of differences are you thinking about? I only listed two attributes specific to StreamReaderWriter (.reader and .writer). You mean that these attributes are used? There are maybe other subtle differences between Stream* and TextIOWrapper, but I don't think that anyone relies on them. Should I try to list all differences in the PEP?

If I understood the previous discussion correctly, an important point is to be able to write Python 2 code which just works on Python 3 without using 2to3. If you don't rely on the subtle implementation details of Stream*, it's possible (to use codecs.open in Python 3, even if codecs.open is implemented to reuse TextIOWrapper via open). If you rely on the differences, I bet that it is easy to stop using them (and continue to be compatible with Python 2). For example, you can replace f.reader.read() by f.read(), it's just the same.

The two classical usages of codecs.open() (of text files) are:

- read the whole file content
- read the file line by line

For these two use cases, the API is exactly the same. Using f=codecs.open() or f=open() (in Python 3, or f=io.open() in Python 2), you can use:

- for line in f: ...
- while True: line = f.readline(); if not line: break; ...
- f.read()

I'm not saying that my PEP doesn't break the compatibility, it *does* break the backward compatibility. That's why we need a PEP. That's why there is a section called "Backwards Compatibility" in the PEP. I'm trying to say that I bet that nobody will notice.

The most impacting change will be (at the end) the removal of the StreamReader/StreamWriter API.
If a program uses these classes directly, it will have to be updated to use TextIOWrapper (or codecs.open() or open() maybe).

I wrote in a previous email:

> I did search for usage of these classes on the Internet, and except projects implementing their own codecs (and so implement their StreamReader/StreamWriter classes, even if they don't use it), I only found one project using directly StreamReader: Pygments (*). I searched quickly, so don't trust these results :-) StreamReader and friends are used indirectly through codecs.open(). My patch changes codecs.open() to make it reuse open (io.TextIOWrapper), so the deprecation of StreamReader would not be noticed by most users.
>
> (*) I also found Sphinx, but I was wrong: it doesn't use StreamReader, it just has a full copy of the UTF-8-SIG codec which has a StreamReader class. I don't think that the class is used.

Victor
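The three reading idioms listed in this message can be tried with a small sketch. It is only an illustration under stated assumptions: `read_three_ways` and `demo` are hypothetical helper names, the sample text is made up, and the opener is called with a keyword `encoding` argument, a calling style accepted by io.open() and, as far as the signatures quoted in this thread show, by codecs.open() as well.

```python
import io
import os
import tempfile


def read_three_ways(opener, path, encoding="utf-8"):
    """Read `path` with the three idioms from the message above."""
    # 1. Iterate over the file object line by line.
    with opener(path, "r", encoding=encoding) as f:
        by_iteration = [line for line in f]

    # 2. Call readline() until it returns an empty string.
    by_readline = []
    with opener(path, "r", encoding=encoding) as f:
        while True:
            line = f.readline()
            if not line:
                break
            by_readline.append(line)

    # 3. Read the whole content in one call.
    with opener(path, "r", encoding=encoding) as f:
        whole = f.read()

    return by_iteration, by_readline, whole


def demo():
    # Hypothetical sample data written to a temporary file.
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        with io.open(path, "w", encoding="utf-8") as f:
            f.write(u"caf\u00e9\nth\u00e9\n")
        return read_three_ways(io.open, path)
    finally:
        os.remove(path)
```

Passing a different opener to `read_three_ways` is a quick way to compare two implementations on the same file; differences, if any, would be expected in corner cases such as newline handling rather than in these three calls.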
Re: [Python-Dev] Draft PEP: Deprecate codecs.StreamReader and codecs.StreamWriter
On 07/07/2011 05:26, Nick Coghlan wrote:

> Victor, could you please check this into the PEPs repo? It's easier to reference once it has a real number.

How do I upload it? Should I contact a PEP editor? How?

Victor
Re: [Python-Dev] Draft PEP: Deprecate codecs.StreamReader and codecs.StreamWriter
On 07/07/2011 10:07, M.-A. Lemburg wrote:

> The PEP's arguments for deprecating two essential codec design components are very one-sided, by comparing issues to features.

Yes, please help me to write an unbiased PEP. I don't know which tool is more appropriate to write a PEP with many authors. Can I upload it to the peps repository? According to PEP 1, only a PEP editor can do that.

> Please add all the comments I've made on the subject to the PEP.

I tried to incorporate all of your comments, but because the discussion on the bug tracker and on python-dev was long, I may have missed some (important) points. Sorry about that, and please tell me which points should be added to the PEP.

> The most important one missing is the fact and major difference that TextIOWrapper does not work on a per-codec basis, but only on a per-stream basis.

Yeah, it's not clear in the PEP, I should detail this point.

> By removing the StreamReader and StreamWriter API parts of the codec design, you essentially make it impossible to add per-codec variations and optimizations that require full access to the stream interface.
>
> As mentioned before, many improvements are possible and lots of those can build on TextIOWrapper and the incremental codec parts.

I wrote that in the "Possible improvements of StreamReader and StreamWriter" section: "A codec can implement variants which are optimized for the specific encoding ..." and "It would be possible to change StreamReader and StreamWriter to make them use IncrementalDecoder and IncrementalEncoder."

> For the issues you mention in the PEP, please open tickets or add ticket references to the PEP.

Ok, I will do that. There are other Stream* issues, a recent example: http://bugs.python.org/issue12508

Victor
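For readers unfamiliar with the incremental codec parts mentioned in this message, here is a minimal sketch of an incremental decoder being fed byte chunks that split a multi-byte character. It uses only codecs.getincrementaldecoder() from the standard library; `decode_chunks` is a hypothetical helper name and the chunk boundaries are chosen purely for illustration. It does not show how StreamReader or TextIOWrapper are implemented.

```python
import codecs


def decode_chunks(chunks, encoding="utf-8"):
    """Decode an iterable of byte chunks, keeping state between chunks."""
    decoder = codecs.getincrementaldecoder(encoding)(errors="strict")
    pieces = []
    for chunk in chunks:
        # Incomplete trailing bytes are buffered inside the decoder
        # until the next chunk completes the character.
        pieces.append(decoder.decode(chunk))
    # final=True makes the decoder raise if bytes are still pending.
    pieces.append(decoder.decode(b"", final=True))
    return u"".join(pieces)


# U+00E9 is two bytes and U+20AC is three bytes in UTF-8.
data = u"\u00e9\u20ac".encode("utf-8")
# Split in the middle of both characters on purpose.
chunks = [data[:1], data[1:3], data[3:]]
text = decode_chunks(chunks)
```

The same state-keeping is what any stream wrapper needs when reads do not line up with character boundaries.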
Re: [Python-Dev] Draft PEP: Deprecate codecs.StreamReader and codecs.StreamWriter
Nick Coghlan ncoghlan at gmail.com writes:

> Anyone forward porting codecs.open based code will get subpar IO in Python 3 *because* they're trying to do the right thing in Python 2. That's actively harmful in my book.

I see. Presumably if they're doing a porting exercise, then it's easy enough for them to convert codecs.open() calls to open(), if they don't want the performance to be sub-optimal. But I thought the main thrust of this was about deprecation of the Stream* classes, not open() vs. codecs.open().

> Codec writers are also told to implement utterly unnecessary functionality just because PEP 100 says so. Significantly less common, but still harmful.

Presumably only an issue for anyone writing new codecs for 2.x - I'm not sure how many cases that'd be.

> The bare minimum change needed is for codecs.open() to do the right thing in Py3k and be a wrapper around builtin open() and the main IO stack. Once that happens, the legacy Stream* APIs become redundant cruft that should be deprecated (although that part is significantly less important than fixing codecs.open() itself)

I've no issue with telling people to use open() rather than codecs.open() when moving code from 2.x to 3.x. But in 2.x, is there any other API which allows you to wrap arbitrary streams? If not, then ISTM that removing the Stream* classes would give 2.x->3.x porting projects more trouble than codecs.open() -> open().

Regards,

Vinay Sajip
Re: [Python-Dev] Draft PEP: Deprecate codecs.StreamReader and codecs.StreamWriter
On Jul 07, 2011, at 12:26 PM, Victor Stinner wrote:

> On 07/07/2011 05:26, Nick Coghlan wrote:
>> Victor, could you please check this into the PEPs repo? It's easier to reference once it has a real number.
>
> How do I upload it? Should I contact a PEP editor? How?

Email p...@python.org

Cheers,
-Barry
Re: [Python-Dev] Draft PEP: Deprecate codecs.StreamReader and codecs.StreamWriter
On Thu, Jul 7, 2011 at 9:12 PM, Barry Warsaw ba...@python.org wrote:

> On Jul 07, 2011, at 12:26 PM, Victor Stinner wrote:
>> On 07/07/2011 05:26, Nick Coghlan wrote:
>>> Victor, could you please check this into the PEPs repo? It's easier to reference once it has a real number.
>>
>> How do I upload it? Should I contact a PEP editor? How?
>
> Email p...@python.org

Or just check it in to hg.python.org/peps (claiming the next number in sequence - 400 at the time of writing this email).

I asked if that approach was OK quite some time ago and David said yes - PEP 1 is written the way it is because not everyone that writes a PEP has commit privileges for the python.org repos.

Cheers,
Nick.

-- 
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
Re: [Python-Dev] Draft PEP: Deprecate codecs.StreamReader and codecs.StreamWriter
On Thu, Jul 7, 2011 at 8:53 PM, Vinay Sajip vinay_sa...@yahoo.co.uk wrote:

> I've no issue with telling people to use open() rather than codecs.open() when moving code from 2.x to 3.x. But in 2.x, is there any other API which allows you to wrap arbitrary streams? If not, then ISTM that removing the Stream* classes would give 2.x->3.x porting projects more trouble than codecs.open() -> open().

No, using the io module is a far more robust way to wrap arbitrary streams than using the codecs module.

It's unfortunate that nobody pointed out the redundancy when PEP 3116 was discussed and implemented, as I expect PEP 100 would have been updated and the Stream* APIs would have been either reused or officially jettisoned as part of the Py3k migration.

However, we're now in a situation where we have:

1. A robust Unicode capable IO implementation (the io module, based on PEP 3116) that is available in both 2.x and 3.x that is designed to minimise the amount of work involved in writing new codecs
2. A legacy IO implementation (the codecs module) that is available in both 2.x and 3.x, but requires additional work on the part of codec authors and isn't as robust as the PEP 3116 implementation

So the options are:

A. Bring the codecs module IO implementation up to the standard of the io module implementation (less the C acceleration) and maintain the requirement that codec authors provide StreamReader and StreamWriter implementations.

B. Retain the full codecs module API, but reimplement relevant parts in terms of the io module.

C. Deprecate the codecs.Stream* interfaces and make codecs.open() a simple wrapper around the builtin open() function. Formally drop the requirement that codec authors provide StreamReader/Writer instances (since they are not used by the core IO implementation)

Currently, nobody has stepped forward to do the work of maintaining the codecs IO implementation independently of the io module, so the only two options seriously on the table are B and C.
That may change if someone actually goes through and *fixes* all the error cases that are handled correctly by the io module but not by the codecs module and credibly promises to continue to do so for at least the life of 3.3.

A 2to3 fixer that simply changes codecs.open to open is not viable, as the function signatures are not compatible (the buffering parameter appears in a different location):

    codecs.open(): open(filename, mode='rb', encoding=None, errors='strict', buffering=1)
    3.x builtin open(): open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True)

Now, the backported io module does make it possible to write correct code as far back as 2.6 that can be forward ported cleanly to 3.x without requiring code modifications. However, it would be nice to transparently upgrade code that uses codecs.open to the core IO implementation in 3.x. For people new to Python, the parallel (and currently deficient) alternative IO implementation also qualifies at the very least as an attractive nuisance.

Now, it may be that this PEP runs afoul of Guido's stated preference not to introduce any more backwards incompatibilities between 2.x and 3.x that aren't absolutely essential. In that case, it may be reasonable to add an option D to the mix, where we just add documentation notes telling people not to use the affected codecs module APIs and officially declare that bug reports on those APIs will be handled with "don't use these, use the io module instead", as that would also deal with the maintenance problem. It's pretty ugly from an end user's point of view, though.

Regards,
Nick.

-- 
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
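The positional mismatch described in this message can be seen directly. The sketch below assumes Python 3 and a caller that passes the encoding as the third positional argument, as the codecs.open() signature quoted above allows; `naive_rename_fails` is a hypothetical name used only for this illustration.

```python
import os
import tempfile


def naive_rename_fails():
    """Return True if builtin open() rejects codecs.open()-style arguments."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        try:
            # In codecs.open() order the third positional argument is the
            # encoding. Builtin open() reads it as `buffering`, which must
            # be an integer, so a mechanical rename breaks this call.
            f = open(path, "r", "utf-8")
        except TypeError:
            return True
        f.close()
        return False
    finally:
        os.remove(path)


result = naive_rename_fails()
```

A call that passes `encoding` by keyword would survive the rename, which is why the problem is limited to positional callers.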
Re: [Python-Dev] Draft PEP: Deprecate codecs.StreamReader and codecs.StreamWriter
On 07/07/2011 13:43, Nick Coghlan wrote:

> Or just check it in to hg.python.org/peps (claiming the next number in sequence - 400 at the time of writing this email).
>
> I asked if that approach was OK quite some time ago and David said yes - PEP 1 is written the way it is because not everyone that writes a PEP has commit privileges for the python.org repos.

Ok, done:

http://www.python.org/dev/peps/pep-0400/
http://hg.python.org/peps/file/tip/pep-0400.txt

I started to include Marc-Andre's suggestions.

Victor
Re: [Python-Dev] Draft PEP: Deprecate codecs.StreamReader and codecs.StreamWriter
On 07/07/2011 12:53, Vinay Sajip wrote:

> I've no issue with telling people to use open() rather than codecs.open() when moving code from 2.x to 3.x. But in 2.x, is there any other API which allows you to wrap arbitrary streams?

Yes, io.TextIOWrapper.

Victor
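As a minimal sketch of that answer, the following wraps an in-memory binary stream with io.TextIOWrapper. The io.BytesIO object stands in for an arbitrary binary stream (a socket file or pipe, for instance), and `read_lines_from_bytes` is a hypothetical helper name.

```python
import io


def read_lines_from_bytes(payload, encoding="utf-8"):
    """Decode a bytes payload through io.TextIOWrapper and return its lines."""
    raw = io.BytesIO(payload)  # stand-in for any binary stream
    # newline=None enables universal newlines: "\r\n" is read back as "\n".
    text = io.TextIOWrapper(raw, encoding=encoding, newline=None)
    try:
        return text.readlines()
    finally:
        text.close()  # also closes the wrapped stream


lines = read_lines_from_bytes(u"first\r\nsecond \u00e9\n".encode("utf-8"))
```

The wrapper needs only the buffered binary interface of the underlying object, so the same function body would apply to other binary streams that provide it.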
Re: [Python-Dev] Draft PEP: Deprecate codecs.StreamReader and codecs.StreamWriter
> B. Retain the full codecs module API, but reimplement relevant parts in terms of the io module.

This solution would not break backward compatibility, or would break it less than my PEP. I didn't try to implement this solution. It should be possible for StreamReader (-> TextIOWrapper), StreamWriter (-> TextIOWrapper) and StreamReaderWriter (-> TextIOWrapper), but not for EncodedFile (by the way, who uses this horrible class? :-)).

I would prefer solution C to have only one obvious way to read-write text files in Python 3(.4).

> A 2to3 fixer that simply changes codecs.open to open is not viable, as the function signatures are not compatible (the buffering parameter appears in a different location):
>
>     codecs.open(): open(filename, mode='rb', encoding=None, errors='strict', buffering=1)
>     3.x builtin open(): open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True)

What? The prototypes are very close, you just have to invert some arguments. Why do you think that we cannot implement that?

> Now, the backported io module does make it possible to write correct code as far back as 2.6 that can be forward ported cleanly to 3.x without requiring code modifications. However, it would be nice to transparently upgrade code that uses codecs.open to the core IO implementation in 3.x. For people new to Python, the parallel (and currently deficient) alternative IO implementation also qualifies at the very least as an attractive nuisance.

Use codecs.open() if you would like to support Python 3 and Python 2 older than 2.6. If you don't care about Python 2.5, use the io module directly (you just have to know that it is slow in Python 2.6).

Victor
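The argument reordering discussed in this message could look roughly like the sketch below. It is not the patch from the PEP: `open_codecs_style` and `roundtrip_demo` are hypothetical names, the defaults `mode="r"` and `buffering=-1` are assumptions that differ from the codecs.open() defaults quoted above, and dropping the "b" flag means newline handling follows io.open() rather than the legacy binary-mode behaviour.

```python
import io
import os
import tempfile


def open_codecs_style(filename, mode="r", encoding=None,
                      errors="strict", buffering=-1):
    """Accept arguments in codecs.open() order and delegate to io.open()."""
    if encoding is None:
        # No encoding requested: pass mode and buffering through unchanged.
        return io.open(filename, mode, buffering)
    # io.open() rejects an encoding in binary mode, so drop the "b" flag.
    text_mode = mode.replace("b", "")
    return io.open(filename, text_mode, buffering,
                   encoding=encoding, errors=errors)


def roundtrip_demo():
    # Hypothetical sample data written to a temporary file.
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        with open_codecs_style(path, "w", "utf-8") as f:
            f.write(u"na\u00efve\n")
        # Positional arguments in codecs.open() order, legacy "rb" mode.
        with open_codecs_style(path, "rb", "utf-8", "strict") as f:
            return f.read()
    finally:
        os.remove(path)
```

Whether differences such as the default buffering or newline translation matter to existing callers is exactly the compatibility question debated in the thread; the sketch only shows that the reordering itself is mechanical.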
Re: [Python-Dev] Draft PEP: Deprecate codecs.StreamReader and codecs.StreamWriter
On Thu, 7 Jul 2011 06:53:50 +0000 (UTC) Vinay Sajip vinay_sa...@yahoo.co.uk wrote:

> Benjamin Peterson benjamin at python.org writes:
>> 2011/7/6 Nick Coghlan ncoghlan at gmail.com:
>>> The API of the resulting object is the same (i.e. they're file-like objects). The behavioural differences are due to cases where the codec-specific classes are currently broken.
>>
>> Yes, but as we all know too well, people are surely relying on whatever behavior there is, broken or not.
>
> There's also the fact that code which currently runs under 2.x and 3.x would stop working if codecs.StreamReader/StreamWriter were to go away.

That's a fact of life for any deprecation. But it only stops working *after* the deprecation period has expired. And deprecated stuff can actually stay in for a long time, depending on its popularity.

The main point of the PEP, IMO, is actually the deprecation itself. By deprecating, we signal that something isn't actively maintained anymore, and that an (allegedly better) alternative is available. I think that's a very reasonable thing to do, regardless of whether or not the thing actually gets removed in a later version.

Regards

Antoine.
Re: [Python-Dev] Draft PEP: Deprecate codecs.StreamReader and codecs.StreamWriter
On Thu, 07 Jul 2011 10:07:38 +0200 M.-A. Lemburg m...@egenix.com wrote:

> That said, I'm not really up for a longer discussion on this. We've already had the discussion and decided against removing those parts of the codec API.

I don't remember any such decision. We decided against unilaterally removing them without a PEP, which is quite different. Now we have a PEP and the matter can be discussed using the appropriate procedure.

Regards

Antoine.
Re: [Python-Dev] Draft PEP: Deprecate codecs.StreamReader and codecs.StreamWriter
On Thu, 7 Jul 2011 22:08:45 +1000 Nick Coghlan ncogh...@gmail.com wrote:

> Currently, nobody has stepped forward to do the work of maintaining the codecs IO implementation independently of the io module, so the only two options seriously on the table are B and C.

Since nobody has stepped up to implement option B either, I think the only option seriously on the table is C.

> Now, it may be that this PEP runs afoul of Guido's stated preference not to introduce any more backwards incompatibilities between 2.x and 3.x that aren't absolutely essential.

Well, a deprecation isn't an incompatibility in itself. Especially when deprecation warnings are hidden by default.

Regards

Antoine.
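To make the point about deprecation concrete, here is a minimal sketch using the standard warnings module. `legacy_stream_api` is a hypothetical stand-in for a deprecated function, not an actual codecs API, and the warning text is invented for the example.

```python
import warnings


def legacy_stream_api():
    """Hypothetical deprecated function: it warns but keeps working."""
    warnings.warn(
        "legacy_stream_api() is deprecated; use the io module instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return "still works"


def call_and_collect():
    """Call the deprecated function and record any warnings it emits."""
    with warnings.catch_warnings(record=True) as caught:
        # Opt in to seeing every warning, whatever the default filters say.
        warnings.simplefilter("always")
        result = legacy_stream_api()
    return result, [w.category for w in caught]


result, categories = call_and_collect()
```

The call still returns its value, which is the sense in which a deprecation alone does not break existing code; a user has to opt in, as `simplefilter("always")` does here, to be sure of seeing the warning.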
Re: [Python-Dev] Draft PEP: Deprecate codecs.StreamReader and codecs.StreamWriter
On 7/7/2011 7:28 AM, Antoine Pitrou wrote:

> The main point of the PEP, IMO, is actually the deprecation itself. By deprecating, we signal that something isn't actively maintained anymore, and that an (allegedly better) alternative is available. I think that's a very reasonable thing to do, regardless of whether or not the thing actually gets removed in a later version.

Yes, the final decision could be "deprecate now, remove in 4.0", as happened during the 2.x series.
Re: [Python-Dev] Draft PEP: Deprecate codecs.StreamReader and codecs.StreamWriter
On 07/07/2011 19:33, Terry Reedy wrote:

> On 7/7/2011 7:28 AM, Antoine Pitrou wrote:
>> The main point of the PEP, IMO, is actually the deprecation itself. By deprecating, we signal that something isn't actively maintained anymore, and that an (allegedly better) alternative is available. I think that's a very reasonable thing to do, regardless of whether or not the thing actually gets removed in a later version.
>
> Yes, the final decision could be "deprecate now, remove in 4.0", as happened during the 2.x series.

Python 4? Are you serious?

Victor
Re: [Python-Dev] Draft PEP: Deprecate codecs.StreamReader and codecs.StreamWriter
On Thu, Jul 7, 2011 at 11:43, Victor Stinner victor.stin...@haypocalc.com wrote:

> On 07/07/2011 19:33, Terry Reedy wrote:
>> On 7/7/2011 7:28 AM, Antoine Pitrou wrote:
>>> The main point of the PEP, IMO, is actually the deprecation itself. By deprecating, we signal that something isn't actively maintained anymore, and that an (allegedly better) alternative is available. I think that's a very reasonable thing to do, regardless of whether or not the thing actually gets removed in a later version.
>>
>> Yes, the final decision could be "deprecate now, remove in 4.0", as happened during the 2.x series.
>
> Python 4? Are you serious?

Yes he is, as are others who would support that position (not me; I prefer two releases of pending deprecation, one release deprecation, then removal). When I was organizing the stdlib reorg, one viewpoint that came up was to never actually remove module code but simply deprecate it so that those who care to use the module can continue to do so, but otherwise let it bit-rot so that pre-existing code does not necessarily break.

-Brett
Re: [Python-Dev] Draft PEP: Deprecate codecs.StreamReader and codecs.StreamWriter
On Fri, Jul 8, 2011 at 4:43 AM, Victor Stinner victor.stin...@haypocalc.com wrote:
> On 07/07/2011 19:33, Terry Reedy wrote:
>> On 7/7/2011 7:28 AM, Antoine Pitrou wrote:
>>> The main point of the PEP, IMO, is actually the deprecation itself. By deprecating, we signal that something isn't actively maintained anymore, and that an (allegedly better) alternative is available. I think that's a very reasonable thing to do, regardless of whether or not the thing actually gets removed in a later version.
>>
>> Yes, the final decision could be deprecate now, remove in 4.0, as happened during the 2.x series.
>
> Python 4? Are you serious?

Py3k was a mythological "some time in the dim distant future" target for backwards incompatible changes for a long time before it became a real project that people were actually working on building. Py4k is now a similarly mythological beast :)

However, like Brett, I don't think it's actually needed in this particular case. Deprecation in 3.3, removal in 3.5 is a time frame completely in line with the desire to avoid a repeat of the PyCObject/PyCapsule related incompatibility problems.

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
[Python-Dev] Draft PEP: Deprecate codecs.StreamReader and codecs.StreamWriter
Hi,

Last May, I proposed deprecating the open() function and the StreamWriter and StreamReader classes of the codecs module. I agreed to keep open() after the discussion on python-dev.

Here is a more complete proposal as a PEP. It is a draft and I expect a lot of comments :)

Victor

---

PEP: xxx
Title: Deprecate codecs.StreamReader and codecs.StreamWriter
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 28-May-2011
Python-Version: 3.3

Abstract
========

io.TextIOWrapper and codecs.StreamReaderWriter offer the same API [#f1]_. TextIOWrapper has more features and is faster than StreamReaderWriter. Duplicate code means that bugs must be fixed twice and that we may have subtle differences between the two implementations.

The codecs module was introduced in Python 2.0 (see PEP 100). The io module was introduced in Python 2.6 and 3.0 (see PEP 3116), and reimplemented in C in Python 2.7 and 3.1.

Motivation
==========

When the Python I/O model was updated for 3.0, the concept of a "stream-with-known-encoding" was introduced in the form of io.TextIOWrapper. As this class is critical to the performance of text-based I/O in Python 3, this module has an optimised C version which is used by CPython by default. Many corner cases in handling buffering, stateful codecs and universal newlines have been dealt with since the release of Python 3.0.

This new interface overlaps heavily with the legacy codecs.StreamReader, codecs.StreamWriter and codecs.StreamReaderWriter interfaces that were part of the original codec interface design in PEP 100. These interfaces are organised around the principle of an encoding with an associated stream (i.e. the reverse of the arrangement in the io module), so the original PEP 100 design required that codec writers provide appropriate StreamReader and StreamWriter implementations in addition to the core codec encode() and decode() methods.
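The two arrangements described in the Motivation can be put side by side. This is a minimal sketch, not part of the draft PEP: the sample data and variable names are made up for illustration, and only documented `io` and `codecs` calls are used.

```python
import codecs
import io

# Sample bytes, chosen arbitrarily for this illustration.
data = "caf\u00e9\n".encode("utf-8")

# io model: a stream that knows its encoding.
text_stream = io.TextIOWrapper(io.BytesIO(data), encoding="utf-8")

# PEP 100 model: the codec supplies a StreamReader class, which is then
# instantiated around a byte stream.
reader_class = codecs.getreader("utf-8")
codec_stream = reader_class(io.BytesIO(data))

text_from_io = text_stream.read()
text_from_codecs = codec_stream.read()
```

For plain UTF-8 input with UNIX newlines both objects yield the same text; the differences listed in the Rationale show up with other newline conventions, buffering and stateful codecs.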
This places a heavy burden on codec authors providing these specialised implementations to correctly handle many of the corner cases that have now been dealt with by io.TextIOWrapper. While deeper integration between the codec and the stream allows for additional optimisations in theory, these optimisations have in practice either not been carried out or else the associated code duplication means that the corner cases that have been fixed in io.TextIOWrapper are still not handled correctly in the various StreamReader and StreamWriter implementations.

Accordingly, this PEP proposes that:

* codecs.open() be updated to delegate to the builtin open() in Python 3.3;
* the legacy codecs.Stream* interfaces, including the streamreader and streamwriter attributes of codecs.CodecInfo, be deprecated in Python 3.3 and removed in Python 3.4.

Rationale
=========

StreamReader and StreamWriter issues
''''''''''''''''''''''''''''''''''''

* StreamReader is unable to translate newlines.
* StreamReaderWriter handles reads using StreamReader and writes using StreamWriter. These two classes may be inconsistent. To stay consistent, flush() must be called after each write, which slows down interlaced read-write.
* StreamWriter doesn't support line buffering (flush if the input text contains a newline).
* StreamReader classes of the CJK encodings (e.g. GB18030) don't support universal newlines, only UNIX newlines ('\\n').
* StreamReader and StreamWriter are stateful codecs but don't expose functions to control their state (getstate() or setstate()). Each codec has to implement corner cases, see Issue with stateful codecs.
* StreamReader and StreamWriter are very similar to IncrementalDecoder and IncrementalEncoder; some code is duplicated for stateful codecs (e.g. UTF-16).
* Each codec has to reimplement its own StreamReader and StreamWriter class, even if it's trivial (just call the encoder/decoder).
* codecs.open(filename, "r") creates an io.TextIOWrapper object.
* No codec implements an optimized method in StreamReader or StreamWriter based on the specificities of the codec.

TextIOWrapper features
''''''''''''''''''''''

* TextIOWrapper supports any kind of newline, including translating newlines (to UNIX newlines), for both reading and writing.
* TextIOWrapper reuses incremental encoders and decoders (no duplication of code).
* The io module (TextIOWrapper) is faster than the codecs module (StreamReader). It is implemented in C, whereas codecs is implemented in Python.
* TextIOWrapper has a readahead algorithm which speeds up small reads: reading character by character or line by line (io is 10x to 25x faster than codecs on these operations).
* TextIOWrapper has a write buffer.
* TextIOWrapper.tell() is optimized.
* TextIOWrapper supports random access (read+write) using a single class, which makes it possible to optimize interlaced read-write (but no such optimization is implemented).

Possible
Re: [Python-Dev] Draft PEP: Deprecate codecs.StreamReader and codecs.StreamWriter
2011/7/6 Victor Stinner victor.stin...@haypocalc.com:
> codecs.open() will be changed to reuse the builtin open() function (TextIOWrapper).

This doesn't strike me as particularly backwards compatible, since you've just enumerated the differences between StreamWriter/Reader and TextIOWrapper.

--
Regards,
Benjamin
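One of the enumerated differences, newline translation, is easy to observe. The snippet below is an illustrative sketch rather than anything posted in the thread; the sample bytes are made up, and `codecs.getreader()` is used to obtain the same StreamReader class that the codec registers.

```python
import codecs
import io

# Sample input with Windows line endings, chosen for illustration.
raw = b"first\r\nsecond\r\n"

# The legacy StreamReader decodes the bytes but leaves line endings untouched.
legacy = codecs.getreader("utf-8")(io.BytesIO(raw))
legacy_text = legacy.read()

# TextIOWrapper (what the builtin open() returns in text mode) applies
# universal newlines by default and translates "\r\n" to "\n".
modern = io.TextIOWrapper(io.BytesIO(raw), encoding="utf-8")
modern_text = modern.read()
```

Code that relies on seeing the original "\r\n" sequences would therefore observe a change if codecs.open() started returning a TextIOWrapper with default settings.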
Re: [Python-Dev] Draft PEP: Deprecate codecs.StreamReader and codecs.StreamWriter
On Thu, Jul 7, 2011 at 11:16 AM, Benjamin Peterson benja...@python.org wrote:
> 2011/7/6 Victor Stinner victor.stin...@haypocalc.com:
>> codecs.open() will be changed to reuse the builtin open() function (TextIOWrapper).
>
> This doesn't strike me as particularly backwards compatible, since you've just enumerated the differences between StreamWriter/Reader and TextIOWrapper.

The API of the resulting object is the same (i.e. they're file-like objects). The behavioural differences are due to cases where the codec-specific classes are currently broken.

Victor, could you please check this into the PEPs repo? It's easier to reference once it has a real number.

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
Re: [Python-Dev] Draft PEP: Deprecate codecs.StreamReader and codecs.StreamWriter
2011/7/6 Nick Coghlan ncogh...@gmail.com:
> On Thu, Jul 7, 2011 at 11:16 AM, Benjamin Peterson benja...@python.org wrote:
>> 2011/7/6 Victor Stinner victor.stin...@haypocalc.com:
>>> codecs.open() will be changed to reuse the builtin open() function (TextIOWrapper).
>>
>> This doesn't strike me as particularly backwards compatible, since you've just enumerated the differences between StreamWriter/Reader and TextIOWrapper.
>
> The API of the resulting object is the same (i.e. they're file-like objects). The behavioural differences are due to cases where the codec-specific classes are currently broken.

Yes, but as we all know too well, people are surely relying on whatever behavior there is, broken or not.

--
Regards,
Benjamin