[issue19548] 'codecs' module docs improvements

2015-01-06 Thread Roundup Robot

Roundup Robot added the comment:

New changeset 0646eee8296a by Nick Coghlan in branch '3.4':
Issue 19548: update codecs module documentation
https://hg.python.org/cpython/rev/0646eee8296a

New changeset 4d00d0109147 by Nick Coghlan in branch 'default':
Merge issue 19548 changes from 3.4
https://hg.python.org/cpython/rev/4d00d0109147

--
nosy: +python-dev

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19548
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19548] 'codecs' module docs improvements

2015-01-06 Thread Nick Coghlan

Nick Coghlan added the comment:

Thanks for the work on this, folks: Jan for the feedback, Martin for the 
writing, and everyone else for their comments.

I don't believe we addressed all of Jan's comments, but I'd like to request 
that any further comments be filed as separate issues, now that the larger 
restructure of the content is out of the way.

--
resolution:  -> fixed
stage: commit review -> resolved
status: open -> closed




[issue19548] 'codecs' module docs improvements

2015-01-06 Thread Martin Panter

Martin Panter added the comment:

Thanks Nick. Here is a small followup patch for the default (3.5) branch to 
keep things consistent.

--
Added file: http://bugs.python.org/file37618/default-branch-followup.patch




[issue19548] 'codecs' module docs improvements

2015-01-06 Thread Roundup Robot

Roundup Robot added the comment:

New changeset 20a5a56ce090 by Nick Coghlan in branch 'default':
Issue #19548: clean up merge issues in codecs docs
https://hg.python.org/cpython/rev/20a5a56ce090

--




[issue19548] 'codecs' module docs improvements

2015-01-06 Thread Nick Coghlan

Nick Coghlan added the comment:

Thanks for the follow-up patch Martin - I missed those when I did the merge 
forward from 3.4.

--




[issue19548] 'codecs' module docs improvements

2015-01-05 Thread Martin Panter

Martin Panter added the comment:

Adding patch v5, for the 3.4 branch. There is at least one reference that still 
needs fixing in the default branch that is not applicable to the 3.4 branch. 
Main changes from Nick’s patch:

* Removed sentence now redundant with introduction to open() and EncodedFile()
* Fixed wording to allow for missing surrogateescape_errors() etc
* Changed heading to clarify Codec objects are stateless
* Restored relaxation for StreamWriter writing to text stream
* New wording under “Encodings and Unicode”
* Update cross references to new “Error Handlers” section

--
Added file: http://bugs.python.org/file37610/issue19548-codecs-doc.v5.py3.4.patch




[issue19548] 'codecs' module docs improvements

2014-12-29 Thread Nick Coghlan

Nick Coghlan added the comment:

I started making a few edits based on Zuo and Walter's comments while getting 
this patch ready for merging, and decided the end result could benefit from an 
additional round of feedback before committing it.

This particular patch is also aimed at the Python 3.4 maintenance branch rather 
than at trunk - the introduction of the new namereplace error handler in 3.5 
means that the previous patch didn't apply cleanly to the maintenance branch.

While Zoinkity's feedback is also valid (i.e. multibyte codecs aren't 
documented properly, custom codec registration is both harder than it really 
should be and not well documented), I think those are better filed and handled 
as separate issues, rather than trying to handle them here as part of the 
general effort to bring the codecs module documentation up to date with the 
current state of Python 3.

--
assignee: docs@python -> ncoghlan
stage: patch review -> commit review
Added file: http://bugs.python.org/file37553/issue19548-codecs-doc.py34.patch




[issue19548] 'codecs' module docs improvements

2014-12-22 Thread Martin Panter

Martin Panter added the comment:

Adding patch v2 after learning how to compile the docs and fixing my errors. I 
also simplified the descriptions of the CodecInfo attributes by deferring the 
constructor signatures to where they are fully defined under “Codec base 
classes”, and merged the list of error handlers there as well.

A side effect of merging error handler lists is that “surrogatepass” is now 
defined for codecs in general, not just Codec.encode() and decode().

Also I noticed that “unicode_escape” actually does Latin-1 decoding.
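
For illustration, a minimal sketch of the two behaviours mentioned above, as I understand them in Python 3: the “surrogatepass” handler lets a lone surrogate through the UTF-8 codec, and “unicode_escape” decodes non-escape bytes as if they were Latin-1. The variable names are only for this example.

```python
# A lone surrogate cannot be encoded with the default "strict" handler.
lone = "\ud800"
try:
    lone.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# "surrogatepass" lets the surrogate code point through in both directions.
passed = lone.encode("utf-8", "surrogatepass")
roundtrip = passed.decode("utf-8", "surrogatepass")

# "unicode_escape" treats a non-escape byte such as 0xE9 like Latin-1,
# while still interpreting backslash escapes.
latin1_like = b"caf\xe9".decode("unicode_escape")
escaped = b"\\u20ac".decode("unicode_escape")
```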

--
Added file: http://bugs.python.org/file37525/codecs-doc.v2.patch




[issue19548] 'codecs' module docs improvements

2014-12-22 Thread Nick Coghlan

Nick Coghlan added the comment:

Thanks for those drafts, Martin - they look like a strong improvement to me. 
While I still had plenty of comments/questions on v2, I think that's more a 
reflection of how long it has been since we gave these docs a thorough overall 
review than a reflection on the proposed changes.

Victor - I added you to the nosy list for this one, as I'd specifically like 
your comments on the StreamReader/Writer docs updates. I'd like to make it 
clear that these are distinct from the text encoding only APIs in the io 
module, while still accurately describing the behaviour of the standard codecs.

--
nosy: +haypo
versions:  -Python 2.7




[issue19548] 'codecs' module docs improvements

2014-12-22 Thread Ezio Melotti

Changes by Ezio Melotti ezio.melo...@gmail.com:


--
nosy: +ezio.melotti




[issue19548] 'codecs' module docs improvements

2014-12-22 Thread Martin Panter

Martin Panter added the comment:

New patch version addressing many of the comments; thanks for reviewing! Also 
adds and extends some unit tests to confirm some of the corner cases I am 
documenting.

--
Added file: http://bugs.python.org/file37530/codecs-doc.v3.patch




[issue19548] 'codecs' module docs improvements

2014-12-17 Thread Serhiy Storchaka

Changes by Serhiy Storchaka storch...@gmail.com:


--
stage:  -> needs patch
type:  -> enhancement
versions: +Python 2.7, Python 3.5 -Python 3.3




[issue19548] 'codecs' module docs improvements

2014-12-17 Thread Martin Panter

Martin Panter added the comment:

Here is a patch addressing many of the points raised. Please have a look and 
give any feedback. Beware that I am not very familiar with the reStructuredText 
markup and haven’t tried compiling it.

1. Mentioned bytes-to-bytes and text-to-text in general right at the top. Any 
APIs (e.g. see Issue 20132) that don't support them should be pointed out as 
exceptions to the rule.

8. The underlying mode is forced to binary, so 'r' is the same as 'rb'. I 
removed the 'b' from the signature for clarity.
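
As a rough check of point 8, the sketch below opens a temporary file with codecs.open() in mode 'r' and inspects the mode of the underlying file object. The file name is hypothetical, and the warnings filter is only there because newer Python releases may emit a warning for codecs.open().

```python
import codecs
import os
import tempfile
import warnings

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")  # hypothetical file name
    with open(path, "wb") as raw:
        raw.write("h\xe9llo\r\n".encode("utf-8"))

    with warnings.catch_warnings():
        # Assumption: silence any warning codecs.open() may raise on newer versions.
        warnings.simplefilter("ignore")
        with codecs.open(path, "r", encoding="utf-8") as stream:
            # The wrapped file object is opened in binary mode even though 'r' was passed.
            underlying_mode = stream.stream.mode
            # No newline translation happens, because the file is binary underneath.
            text = stream.read()
```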

## Jan’s points not yet addressed: ##

3. I expect the built-in open() function would already be much more obvious and 
advertised, so I didn't add any cross-reference from codecs.open().

5. Both points still need addressing:
  * Lack of requirement for implementing incremental codecs
  * Responsibility of implementing error handlers

9. First point left unaddressed:
  * register_error() error_handler replacement data type (unsure of details)

## Numbering Nick’s points: ##

12. Codec name normalization: Not addressed; what should be written?

[13. Registration not reversible: Added in patch]

[14. Added CodecInfo class, pulling out some existing details from register().]

15. “encodings” module: not done

16. Import system: not done

## My (Martin’s) point: ##

[17. IncrementalEncoder.reset(): done]

## Zoinkity’s points, not addressed: ##

18. Multibyte codecs

19. register() usage example

## Some new points of my own that need fixing: ##

20. The doc string for register() says the search function is also allowed to 
return a tuple of functions, but the reference manual does not mention this. 
Which is more accurate? (I notice CodecInfo is a subclass of “tuple”.)

21. EncodedFile() seems to return StreamRecoder instances. Perhaps move them 
closer together? Should probably warn that EncodedFile's data_encoding is 
handled by a stateless codec.

22. The Codec.encode() and decode() methods return a length consumed, but I 
suspect they have to consume everything they are supplied because the code I 
have seen ignores this return value.
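
To make points 21 and 22 concrete, here is a small sketch (not taken from the patch) showing the (output, length consumed) pair returned by a stateless encoder, and the type of object EncodedFile() hands back. The in-memory stream stands in for a real file.

```python
import codecs
import io

# Stateless encoders return a tuple: the encoded output and the amount of input consumed.
output, consumed = codecs.getencoder("utf-8")("h\xe9llo")

# EncodedFile() wraps a byte stream; data written as UTF-8 is stored as Latin-1.
raw = io.BytesIO()
recoder = codecs.EncodedFile(raw, data_encoding="utf-8", file_encoding="latin-1")
recoder.write("\xe9".encode("utf-8"))

is_recoder = isinstance(recoder, codecs.StreamRecoder)
stored = raw.getvalue()
```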

--
keywords: +patch
Added file: http://bugs.python.org/file37485/codecs-doc.patch




[issue19548] 'codecs' module docs improvements

2014-12-17 Thread Berker Peksag

Changes by Berker Peksag berker.pek...@gmail.com:


--
nosy: +berker.peksag
stage: needs patch -> patch review




[issue19548] 'codecs' module docs improvements

2014-06-05 Thread Zoinkity .

Zoinkity . added the comment:

One glaring omission is any information about multibyte codecs--the class, its 
methods, and how to even define one.  

Also, the primary use for codecs.register would be to append a single codec to 
the lookup registry.  Simple usage of the method only provides lookup for the 
provided codecs and will not include regularly-accessible ones such as utf-8. 
It would be enormously helpful to provide an example of proper, safe usage.
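
One possible shape for such an example, offered as a sketch rather than as the documented recipe: the codec name "demo_swapcase", the helper functions and the search function are all hypothetical. The search function returns None for names it does not know, so lookups of the standard codecs continue to work through the other registered search functions.

```python
import codecs

CODEC_NAME = "demo_swapcase"  # hypothetical codec name, chosen for this example


def _encode(text, errors="strict"):
    # Swap the case, then store the result as Latin-1 bytes.
    data, _ = codecs.latin_1_encode(text.swapcase(), errors)
    return data, len(text)


def _decode(data, errors="strict"):
    # Reverse of _encode: read Latin-1 bytes, then swap the case back.
    text, consumed = codecs.latin_1_decode(bytes(data), errors)
    return text.swapcase(), consumed


def search(name):
    # Only answer for our own codec; returning None lets other search functions run.
    if name == CODEC_NAME:
        return codecs.CodecInfo(name=CODEC_NAME, encode=_encode, decode=_decode)
    return None


codecs.register(search)

encoded = "Hello".encode(CODEC_NAME)
decoded = encoded.decode(CODEC_NAME)
utf8_still_works = "\xe9".encode("utf-8")
```

Note that in the Python versions discussed in this issue the registration cannot be undone, so a search function like this is best registered once at import time.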

--
nosy: +Zoinkity..




[issue19548] 'codecs' module docs improvements

2014-02-03 Thread Mark Lawrence

Changes by Mark Lawrence breamore...@yahoo.co.uk:


--
nosy:  -BreamoreBoy




[issue19548] 'codecs' module docs improvements

2014-01-05 Thread Martin Panter

Martin Panter added the comment:

Addition to the list of improvements:

* Under codecs.IncrementalEncoder.reset() it mentions calling encode('', 
final=True). This call does not work as written for the byte encoders in my 
experience, because they do not accept empty text strings. Perhaps it should 
just say to use the final=True flag with no data.
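
A short sketch of the behaviour described above, assuming the "base64" bytes-to-bytes codec is available under that alias: an empty text string is fine for a text encoder, but the byte encoder wants an empty bytes object instead.

```python
import codecs

# Text encoder: an empty str with final=True is accepted.
text_encoder = codecs.getincrementalencoder("utf-8")()
text_tail = text_encoder.encode("", final=True)

# Bytes-to-bytes encoder: an empty str is rejected, an empty bytes object works.
bytes_encoder = codecs.getincrementalencoder("base64")()
try:
    bytes_encoder.encode("", final=True)
    empty_str_accepted = True
except TypeError:
    empty_str_accepted = False
bytes_tail = bytes_encoder.encode(b"", final=True)
```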

--
nosy: +vadmium




[issue19548] 'codecs' module docs improvements

2014-01-05 Thread Serhiy Storchaka

Changes by Serhiy Storchaka storch...@gmail.com:


--
nosy: +doerwalter




[issue19548] 'codecs' module docs improvements

2013-11-16 Thread Nick Coghlan

Nick Coghlan added the comment:

Another big one: the encodings module API is not documented in the prose docs, 
nor is the interface between the default search function and the individual 
encoding definitions. There's some decent info in help(encodings) though.

The interaction with the import system could also be documented better - you 
can actually blacklist codecs by manipulating sys.modules and the encodings 
namespace, and you can search additional locations for codec modules by 
manipulating encodings.__path__ (even without it being declared as a namespace 
package)
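
As a small, hedged illustration of the default search function at work (these are implementation details of the encodings package and may change): an alias is mapped to a module name via encodings.aliases, and the module found on encodings.__path__ supplies the CodecInfo.

```python
import codecs
import encodings
import encodings.aliases

# The alias table maps a normalized alias to the name of a codec module.
module_name = encodings.aliases.aliases.get("latin1")

# Looking up the alias goes through the default search function.
info = codecs.lookup("latin1")
canonical_name = info.name

# The package path is an ordinary list of directories searched for codec modules.
path_is_list = isinstance(encodings.__path__, list)
```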

--




[issue19548] 'codecs' module docs improvements

2013-11-16 Thread Marc-Andre Lemburg

Marc-Andre Lemburg added the comment:

On 16.11.2013 14:25, Nick Coghlan wrote:
> 
> Nick Coghlan added the comment:
> 
> Another big one: the encodings module API is not documented in the prose 
> docs, nor is the interface between the default search function and the 
> individual encoding definitions. There's some decent info in help(encodings) 
> though.
> 
> The interaction with the import system could also be documented better - you 
> can actually blacklist codecs by manipulating sys.modules and the encodings 
> namespace, and you can search additional locations for codec modules by 
> manipulating encodings.__path__ (even without it being declared as a 
> namespace package)

Those were not documented on purpose, since they are an implementation
detail of the encodings package search function.

If you document them now, you'll set the implementation in stone,
making future changes to the logic difficult. I'd advise against
this to stay flexible, unless you want to open up the encodings
package as namespace package - then you'd have to add documentation
for the import interface.

--
nosy: +lemburg




[issue19548] 'codecs' module docs improvements

2013-11-16 Thread Mark Lawrence

Mark Lawrence added the comment:

Could they be documented with a massive warning in red, "CPython implementation 
detail - subject to change without notice"?  Or documented in a place that is 
only accessible to developers and not users?  Or...???

--
nosy: +BreamoreBoy




[issue19548] 'codecs' module docs improvements

2013-11-16 Thread Marc-Andre Lemburg

Marc-Andre Lemburg added the comment:

On 16.11.2013 15:03, Mark Lawrence wrote:
> 
> Mark Lawrence added the comment:
> 
> Could they be documented with a massive warning in red, "CPython 
> implementation detail - subject to change without notice"?  Or documented in 
> a place that is only accessible to developers and not users?  Or...???

The API is documented in encodings/__init__.py for developers.

--




[issue19548] 'codecs' module docs improvements

2013-11-16 Thread Nick Coghlan

Nick Coghlan added the comment:

On 16 November 2013 23:33, Marc-Andre Lemburg rep...@bugs.python.org wrote:
> On 16.11.2013 14:25, Nick Coghlan wrote:
> Those were not documented on purpose, since they are an implementation
> detail of the encodings package search function.
>
> If you document them now, you'll set the implementation in stone,
> making future changes to the logic difficult. I'd advise against
> this to stay flexible, unless you want to open up the encodings
> package as namespace package - then you'd have to add documentation
> for the import interface.

Yes, that was what got me thinking along those lines, but to make that
possible, the contents of encodings/__init__.py would need to be moved
somewhere else. So this probably isn't on the table for 3.4.

--




[issue19548] 'codecs' module docs improvements

2013-11-15 Thread Terry J. Reedy

Changes by Terry J. Reedy tjre...@udel.edu:


--
versions:  -Python 3.2




[issue19548] 'codecs' module docs improvements

2013-11-10 Thread Jan Kaliszewski

New submission from Jan Kaliszewski:

When learning about the 'codecs' module I encountered several places in the 
docs of the module that, I believe, could be improved to be clearer and easier 
for codecs-begginers: 

1. Ad `codecs.encode` and `codecs.decode` descriptions: I believe it would be 
worth mentioning that, unlike str.encode()/bytes.decode(), these functions (and 
all their counterparts in the classes the module contains) support not only 
traditional str/bytes encodings, but also bytes-to-bytes as well as 
str-to-str encodings. 

2. Ad 'codecs.register': in two places there is text such as: `These have to be 
factory functions providing the following interface: factory([...] 
errors='strict')` -- `errors='strict'` may be confusing (at first sight it 
may suggest that the only valid value is 'strict'; maybe `factory(errors=<error 
handler label>)` with an appropriate description below would be better?).

3. Ad `codecs.open`: I believe there should be a reference to the built-in 
open() as an alternative that is better in most cases.

4. Ad `codecs.BOM*`: `These constants define various encodings of the Unicode 
byte order mark (BOM).` -- the world `encodings` seems to be confusing here; 
maybe `These constants define various byte sequences being Unicode byte order 
marks (BOMs) for several encodings. They are used...` would be better?

5. Ad `7.2.1. Codec Base Classes` + 
`codecs.IncrementalEncoder`/`codecs.IncrementalDecoder`:
  * `Each codec has to define four interfaces to make it usable as codec in 
Python: stateless encoder, stateless decoder, stream reader and stream writer` 
-- only four? Not six? What about the incremental encoder/decoder?
  * Comparing the fragments (and tables) about error handling methods (Codec 
Base Classes, IncrementalEncoder, IncrementalDecoder) with the similar fragment 
in the `codecs.register` description and with the `codecs.register_error` 
description, I was confused: is it up to a particular codec implementation or 
up to a registered error handler to implement a particular way of error 
handling? I believe it would be worth describing clearly the relations between 
these elements of the API. A more detailed description of the differences 
between error handling for encoding, decoding, and translation would also be a 
good thing. 

6. Ad `7.2.1.6. StreamReaderWriter Objects` and `7.2.1.7. StreamRecoder 
Objects`: It would be worth saying explicitly that, contrary to the previously 
described abstract classes (IncrementalEncoder/Decoder, StreamReader/Writer), 
these classes are *concrete* ones (if I understand it correctly).

7. Ad `7.2.4. Python Specific Encodings`:
  * `raw_unicode_escape` -- see: ticket #19539.
  * `unicode_escape` -- `Produce a string that is suitable as Unicode literal 
in Python source code` but it is *not* a string; it's a *bytes* object (which 
could be used in source code using an `ascii`-compatible encoding).
  * `bytes-to-bytes` and `str-to-str` encodings -- maybe it would be nice to 
mention that these encodings cannot be used with the str.encode()/bytes.decode() 
methods (and to mention again that they *can* be used with the functions/methods 
provided by the `codecs` module).
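
To illustrate points 1 and 7, a minimal sketch assuming Python 3.4 or later, where the "hex" and "rot13" aliases are available: the module-level functions accept bytes-to-bytes and str-to-str codecs, while str.encode() refuses them.

```python
import codecs

# bytes-to-bytes codec through the module-level functions
hex_bytes = codecs.encode(b"hello", "hex")
original = codecs.decode(hex_bytes, "hex")

# str-to-str codec through the module-level function
rot13_text = codecs.encode("hello", "rot13")

# The str method only accepts text encodings and rejects rot13.
try:
    "hello".encode("rot13")
    str_method_ok = True
except LookupError:
    str_method_ok = False
```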

--
assignee: docs@python
components: Documentation
messages: 202593
nosy: docs@python, zuo
priority: normal
severity: normal
status: open
title: 'codecs' module docs improvements
versions: Python 2.6, Python 2.7, Python 3.1, Python 3.2, Python 3.3, Python 3.4




[issue19548] 'codecs' module docs improvements

2013-11-10 Thread Jan Kaliszewski

Jan Kaliszewski added the comment:

s/world/word
s/begginers/beginners

(sorry, it's late night here)

--




[issue19548] 'codecs' module docs improvements

2013-11-10 Thread Jan Kaliszewski

Jan Kaliszewski added the comment:

8. Again ad `codecs.open`: the default file mode is actually 'rb', not 'r'.

9. Several places in the docs -- ad: `codecs.register_error`, `codecs.open`, 
`codecs.EncodedFile`, `Codec.encode/decode`, `codecs.StreamWriter/StreamReader` 
-- do not cover cases of using bytes-to-bytes and/or str-to-str encodings 
(especially when using `string`/`bytes` and `text`/`binary` terms).

10. `codecs.replace_errors` -- `bytestring` should be replaced with `bytes-like 
object` (as in other places).
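
For reference while reading points 9 and 10, a sketch of how a registered error handler and some of the built-in handlers behave for an ordinary str-to-bytes codec. The handler name "demo_underscore" and the function are hypothetical; the bytes-to-bytes and str-to-str cases are not covered here.

```python
import codecs


def underscore_handler(exc):
    """Hypothetical handler: replace each unencodable character with '_'."""
    if isinstance(exc, UnicodeEncodeError):
        # Return the replacement text and the position at which to resume.
        return ("_" * (exc.end - exc.start), exc.end)
    raise exc


codecs.register_error("demo_underscore", underscore_handler)

custom = "na\xefve caf\xe9".encode("ascii", "demo_underscore")
builtin_replace = "caf\xe9".encode("ascii", "replace")
builtin_xml = "caf\xe9".encode("ascii", "xmlcharrefreplace")
builtin_backslash = "caf\xe9".encode("ascii", "backslashreplace")
```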

--




[issue19548] 'codecs' module docs improvements

2013-11-10 Thread Jan Kaliszewski

Jan Kaliszewski added the comment:

11. Ad encoding 'undefined': The sentence `Can be used as the system encoding 
if no automatic coercion between byte and Unicode strings is desired.` was 
suitable for Python 2.x, but not for Python 3.x. I believe this sentence 
should be removed.

--




[issue19548] 'codecs' module docs improvements

2013-11-10 Thread Jan Kaliszewski

Changes by Jan Kaliszewski z...@chopin.edu.pl:


--
versions:  -Python 2.6, Python 2.7, Python 3.1




[issue19548] 'codecs' module docs improvements

2013-11-10 Thread Ned Deily

Changes by Ned Deily n...@acm.org:


--
nosy: +ncoghlan




[issue19548] 'codecs' module docs improvements

2013-11-10 Thread Nick Coghlan

Nick Coghlan added the comment:

A few more:

- codec name normalisation (lower case, space to hyphen) is not mentioned
in the codecs.register description

- search function registration is not reversible, which doesn't play well
with module reloading

- codecs.CodecInfo init signature is not covered
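
A quick sketch of what a lookup returns, for context on the first and last points; only the case-insensitivity of the name is shown, since the exact handling of spaces and hyphens may differ between versions.

```python
import codecs

# Lookups are case-insensitive: both spellings resolve to the same codec name.
upper = codecs.lookup("UTF-8")
lower = codecs.lookup("utf-8")

# CodecInfo is a tuple subclass with the four classic entries,
# plus named attributes for every interface.
is_tuple = isinstance(upper, tuple)
tuple_length = len(upper)
first_is_encode = upper[0] is upper.encode
has_incremental = upper.incrementalencoder is not None
```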

--
