[issue13997] Clearly explain the bare minimum Python 3 users should know about Unicode

2013-01-27 Thread Ezio Melotti
Ezio Melotti added the comment: OK, I'm going to close this then. I'll take a look at the links and see if what they say can be included in the HOWTO. As I mentioned in an earlier post, I gave a few talks about Unicode and encodings, so I will take some material from there too. Depending on t

2013-01-27 Thread Nick Coghlan
Nick Coghlan added the comment: Include a couple of "See Also" links out to my essay and Ned's article and that sounds good to me. (Assuming I've adjusted the DNS settings correctly, this alternate URL for my essay should start working soon: http://python-notes.curiousefficiency.org/en/latest

2013-01-27 Thread Ezio Melotti
Ezio Melotti added the comment: If we agree on this, I can propose a patch in #4153 and this issue can be closed.

2013-01-27 Thread Terry J. Reedy
Terry J. Reedy added the comment: I basically agree with Ezio. The doc currently starts with
  Introduction to Unicode
    History of Character Codes
    ...
It ends with
  Tips for Writing Unicode-aware Programs
    ...
The most important tip is: Software should only work with Unicode strings intern
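As an illustration of the tip quoted above from the Unicode HOWTO (work with str internally, decode input early, encode output late), here is a minimal sketch. The function name `shout` and the default `utf-8` encoding are assumptions made for the example, not something proposed in the thread.

```python
def shout(data: bytes, encoding: str = "utf-8") -> bytes:
    """Hypothetical helper: decode at the boundary, work on str, encode at the end."""
    text = data.decode(encoding)     # bytes -> str as soon as the data arrives
    result = text.upper()            # all processing happens on Unicode text
    return result.encode(encoding)   # str -> bytes only when the data leaves


raw_input_bytes = "héllo".encode("utf-8")
output = shout(raw_input_bytes)

# For contrast: bytes.upper() only touches ASCII letters, so the
# non-ASCII character is left in lower case.
bytes_only = raw_input_bytes.upper()
```

Working on the decoded text upper-cases the accented letter as well, while the bytes-only variant leaves it unchanged, which is one reason the tip recommends str for internal processing.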

2013-01-27 Thread Ezio Melotti
Ezio Melotti added the comment: Maybe the Unicode HOWTO could be reorganized so that it first introduces the bare minimum and then expands the concepts for whoever wants to know more? Or should we have a "basic" and an "advanced" Unicode HOWTO?

2013-01-26 Thread Nick Coghlan
Nick Coghlan added the comment: Current status:
#14015 is still valid (i.e. surrogateescape is not well documented)
#4153: the Unicode HOWTO still covers more than the bare minimum people need to know
Ned Batchelder's "Pragmatic Unicode" is one of the best intros to the topic I have seen: http

2013-01-26 Thread Ezio Melotti
Ezio Melotti added the comment: What's the status of this? Issue #4153 might also be related.

2012-07-14 Thread Eli Bendersky
Changes by Eli Bendersky:
nosy: -eli.bendersky

2012-03-31 Thread Chris Rebert
Chris Rebert added the comment: Links to the "rambling Unicode thread"s for posterity and convenience:
Gets into several issues, among them, Unicode: http://mail.python.org/pipermail/python-ideas/2012-February/013665.html
Unicode-specific offshoot of the above: http://mail.python.org/pipermail

2012-02-18 Thread Terry J. Reedy
Terry J. Reedy added the comment: Yes, the 'how to' alternatives, with + and -, should be included in the doc addition. I thought it the best thing to come out of the python-ideas thread.

2012-02-18 Thread Nick Coghlan
Nick Coghlan added the comment: The other thing that came out of the rambling Unicode thread on python-ideas is that we should clearly articulate the options for processing files in a task-based fashion and describe the trade-offs for the different alternatives. I started writing up my notes

2012-02-17 Thread Ezio Melotti
Ezio Melotti added the comment: FWIW I recently gave a talk at PyCon Finland called "Understanding Encodings" that goes through the things you mentioned in the last message. I could turn that into a patch for the Unicode Howto.

2012-02-17 Thread Terry J. Reedy
Terry J. Reedy added the comment: I agree with no new builtin and appreciate that being taken off the table. I think the place is the Unicode How-to. I think that document should be renamed Encodings and Unicode How-to. The reasons are 1) one has to first understand the concept of encoding ch

2012-02-14 Thread Jim Jewett
Jim Jewett added the comment: See bugs.python.org/issue14015 for one reason that surrogateescape isn't better known.
nosy: +Jim.Jewett

2012-02-13 Thread Tshepang Lekhonkhobe
Changes by Tshepang Lekhonkhobe:
nosy: +tshepang

2012-02-13 Thread Giampaolo Rodola'
Changes by Giampaolo Rodola':
nosy: +giampaolo.rodola

2012-02-12 Thread Antoine Pitrou
Antoine Pitrou added the comment:
> My mental model here is text editors, which let you open any file, do
> their best to display as much as they can and allow you to manipulate
> it without damaging the bits you don't change. I don't see any reason
> why people shouldn't be able to write Python

2012-02-12 Thread Florent Xicluna
Changes by Florent Xicluna:
nosy: +flox

2012-02-12 Thread Paul Moore
Paul Moore added the comment: A better example in terms of "intended to be text" might be ChangeLog files. These are clearly text files, but of sufficiently standard format that they can be manipulated programmatically. Consider a program to get a list of all authors who changed a particular

2012-02-12 Thread Nick Coghlan
Nick Coghlan added the comment: If such use cases are indeed better handled as bytes, then that's what should be documented. However, there are some text processing assumptions that no longer hold when using bytes instead of strings (such as "x[0:1] == x[0]"). You also can't safely pass such
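The assumption mentioned above, "x[0:1] == x[0]", can be checked directly. This is a small sketch using an arbitrary ASCII sample value: in Python 3, indexing a str yields a length-1 str, while indexing a bytes object yields an int.

```python
s = "abc"
b = b"abc"

# For str, a one-character slice and an index give the same value.
str_holds = (s[0:1] == s[0])

# For bytes, the slice is a bytes object but the index is an int (97 for "a"),
# so the same comparison is False.
bytes_holds = (b[0:1] == b[0])

first_byte = b[0]      # an int
first_slice = b[0:1]   # a length-1 bytes object
```

Code that needs the str-like behaviour on bytes can slice (`b[i:i+1]`) instead of indexing.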

2012-02-12 Thread Nadeem Vawda
Changes by Nadeem Vawda:
nosy: +nadeem.vawda

2012-02-12 Thread STINNER Victor
STINNER Victor added the comment: Why do you use Unicode with the ugly surrogateescape error handler in this case? Bytes are just fine for such a use case. The surrogateescape error handler produces unusual characters in the range U+DC80-U+DCFF which cannot be printed to a console because sys.stdout u
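To make the behaviour described above concrete, the following sketch decodes a few non-ASCII bytes with the surrogateescape error handler. The sample bytes are arbitrary and chosen only for illustration.

```python
raw = b"caf\xe9 \xff"  # contains two bytes that are not valid ASCII

# Each undecodable byte 0xNN is mapped to the lone surrogate U+DCNN.
text = raw.decode("ascii", errors="surrogateescape")
escaped = [hex(ord(ch)) for ch in text if ord(ch) > 0x7F]

# Encoding with the same handler restores the original bytes exactly.
roundtrip = text.encode("ascii", errors="surrogateescape")

# A strict encoder rejects the lone surrogates, which is why such
# strings cannot simply be written to a strictly encoded stream.
try:
    text.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False
```

The escaped code points all fall in the U+DC80-U+DCFF range, and the round trip is lossless as long as the same handler is used on output.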

2012-02-12 Thread Nick Coghlan
Nick Coghlan added the comment: Usually because the file may contain certain ASCII markers (or you're inserting such markers), but beyond that, you only care that it's in a consistent ASCII compatible encoding. Parsing log files from sources that aren't set up correctly often falls into this

2012-02-12 Thread STINNER Victor
STINNER Victor added the comment:
> A common programming task is "I want to process this text file,
> I know it's in an ASCII compatible encoding, I don't know which
> one specifically, but I'm only manipulating the ASCII parts
> so it doesn't matter".
Can you give more detail about this use ca
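One possible way to handle the quoted task is sketched below: open the file as ASCII with the surrogateescape error handler, edit only the ASCII parts, and write it back with the same settings. The helper name `replace_ascii_marker`, the marker strings, and the Latin-1 sample data are assumptions made for the example; the thread also discusses plain bytes as an alternative.

```python
import os
import tempfile


def replace_ascii_marker(path, old, new):
    """Hypothetical helper: replace an ASCII marker, preserving all other bytes."""
    # newline="" disables newline translation so line endings are untouched.
    with open(path, encoding="ascii", errors="surrogateescape", newline="") as f:
        text = f.read()
    with open(path, "w", encoding="ascii", errors="surrogateescape", newline="") as f:
        f.write(text.replace(old, new))


# Sample file in an ASCII-compatible encoding the program does not know about.
original = "ERROR: caf\xe9 closed\n".encode("latin-1")
fd, tmp_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(original)

replace_ascii_marker(tmp_path, "ERROR", "WARN")

with open(tmp_path, "rb") as f:
    result = f.read()
os.remove(tmp_path)
```

The non-ASCII byte passes through unchanged because it is escaped on input and unescaped on output; this only works for encodings that are ASCII-compatible.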

2012-02-12 Thread Ezio Melotti
Changes by Ezio Melotti:
components: +Unicode
nosy: +ezio.melotti
stage: -> needs patch
type: -> enhancement

2012-02-11 Thread Eli Bendersky
Eli Bendersky added the comment: If the concept is accepted, I see no better place for this than the Unicode HOWTO. If it's too long, then a TL;DR section should be added at the beginning detailing "the bare minimum". No need to scatter such information in bits and pieces around the documentati

2012-02-11 Thread Nick Coghlan
Changes by Nick Coghlan:
assignee: -> docs@python
components: +Documentation
nosy: +docs@python
title: Add open_ascii() builtin -> Clearly explain the bare minimum Python 3 users should know about Unicode
versions: +Python 3.2, Python 3.3