Chris Angelico <ros...@gmail.com>:

> On Fri, Jul 14, 2017 at 6:53 PM, Marko Rauhamaa <ma...@pacujo.net> wrote:
>> Chris Angelico <ros...@gmail.com>:
>> Then, why bother with Unicode to begin with? Why not just use bytes?
>> After all, Python3's strings have the very same pitfalls:
>>
>>   - you don't know the length of a text in characters
>>   - chr(n) doesn't return a character
>>   - you can't easily find the 7th character in a piece of text
>
> First you have to define "character".

I'm referring to the

    Grapheme clusters, a.k.a. real characters
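
To make the three pitfalls above concrete, here is a minimal sketch that uses only the standard library's unicodedata module. The sample strings are my own illustration, and "character" here means what a reader perceives, not a code point:

```python
import unicodedata

# Two spellings of the same perceived character "é".
composed = "\u00e9"        # LATIN SMALL LETTER E WITH ACUTE (one code point)
decomposed = "e\u0301"     # "e" followed by COMBINING ACUTE ACCENT

# len() counts code points, not perceived characters.
lengths = (len(composed), len(decomposed))

# chr(n) can return a combining mark, which is not a character on its own.
mark = chr(0x301)
is_combining = unicodedata.combining(mark) != 0

# Indexing is by code point too: in decomposed "cafés" the fifth
# perceived character is "s", but index 4 holds the accent.
word = "cafe\u0301s"
fifth_code_point = word[4]

print(lengths, is_combining, fifth_code_point == "\u0301")
```

Segmenting text into grapheme clusters needs more than this; the standard library has no function for it as far as I know.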

>>   - you can't compare the equality of two pieces of text
>>   - you can't use a piece of text as a reliable dict key
>
> (Dict key usage is defined in terms of equality, so these two are the
> same concern.)

Ideally, yes. However, someone might say, "don't use == to compare
equality; use unicode.textually_equal() instead". That advice might
satisfy the first requirement but not the second.
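
A sketch of what I mean. Note that textually_equal() is a hypothetical helper written for this example (no such function exists in Python), and I am assuming it compares NFC-normalized forms:

```python
import unicodedata

def textually_equal(a, b):
    # Hypothetical helper: equality after NFC normalization.
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

composed = "\u00e9"      # "é" as one code point
decomposed = "e\u0301"   # "é" as "e" + combining acute accent

# The helper satisfies the first requirement...
equal_by_helper = textually_equal(composed, decomposed)

# ...but dicts still use == and hash(), so the lookup misses.
table = {composed: "found"}
lookup = table.get(decomposed)

print(equal_by_helper, lookup)
```

The helper reports the two strings as equal, yet the dict treats them as different keys.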

> Yes, you can. For most purposes, textual equality should be defined in
> terms of NFC or NFD normalization. Python already gives you that. You
> could argue that a string should always be stored NFC (or NFD, take
> your pick), and then the equality operator would handle this; but I'm
> not sure the benefit is worth it.

As I said, Python3's strings are neither here nor there. They don't
quite solve the problem Python2's strings had. They will push the
internationalization problems a bit farther out but fall short of the
mark.

The developer still has to worry a lot. Unicode seemingly solved one
problem only to present the developer with a bagful of new problems.

And if Python3's strings are a half-measure, why not stick to bytes?

> If you're trying to use strings as identifiers in any way (say, file
> names, or document lookup references), using the NFC/NFD normalized
> form of the string should be sufficient.

Show me ten Python3 database applications, and I'll show you ten Python3
database applications that don't normalize their primary keys.
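
A minimal sketch of the failure mode, using an in-memory SQLite database. The table names and the nfc() helper are made up for this example, and I am assuming the default column collation, which compares the stored text byte by byte:

```python
import sqlite3
import unicodedata

def nfc(text):
    # Hypothetical helper: normalize a key before it reaches the database.
    return unicodedata.normalize("NFC", text)

composed = "\u00e9"      # "é" as one code point
decomposed = "e\u0301"   # "é" as "e" + combining acute accent

conn = sqlite3.connect(":memory:")

# Without normalization, both spellings are accepted as distinct keys.
conn.execute("CREATE TABLE users_raw (name TEXT PRIMARY KEY)")
conn.execute("INSERT INTO users_raw VALUES (?)", (composed,))
conn.execute("INSERT INTO users_raw VALUES (?)", (decomposed,))
raw_rows = conn.execute("SELECT COUNT(*) FROM users_raw").fetchone()[0]

# With normalization at the boundary, the second insert is a duplicate.
conn.execute("CREATE TABLE users_nfc (name TEXT PRIMARY KEY)")
conn.execute("INSERT INTO users_nfc VALUES (?)", (nfc(composed),))
try:
    conn.execute("INSERT INTO users_nfc VALUES (?)", (nfc(decomposed),))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(raw_rows, duplicate_rejected)
conn.close()
```

The application has to remember to call something like nfc() on every path that touches the key; nothing in str or the database driver does it for you.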


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list