Re: Grapheme clusters, a.k.a. real characters

Marko Rauhamaa Fri, 14 Jul 2017 01:59:19 -0700

Chris Angelico:

> On Fri, Jul 14, 2017 at 6:15 PM, Marko Rauhamaa wrote:
>> Furthermore, you only dismissed my question about
>>
>>    len(text)
>>
>> What about
>>
>>    text[-1]
>>    re.match("a.c", text)
>
> The considerations and concerns in the second half of my paragraph -
> the bit you didn't quote - directly address these two.


I guess you are referring to:

   These kinds of linguistic considerations shouldn't be codified into
   the core of the language.

Then why bother with Unicode to begin with? Why not just use bytes?
After all, Python 3's strings have the very same pitfalls:

  - you don't know the length of a text in characters

  - chr(n) doesn't necessarily return a character

  - you can't easily find the 7th character in a piece of text

  - you can't reliably compare two pieces of text for equality

  - you can't use a piece of text as a reliable dict key

etc.
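
A minimal sketch of these pitfalls, assuming text that contains an
accented letter in decomposed form (a base letter followed by a
combining mark). Note that rough_clusters is a hypothetical helper made
up for this illustration, not a standard library function, and it only
approximates grapheme clusters by attaching combining marks to the
preceding code point:

```python
import re
import unicodedata

# "é" spelled two ways: one precomposed code point vs. "e" + combining acute.
nfc = "caf\u00e9"    # 4 code points
nfd = "cafe\u0301"   # 5 code points, normally rendered identically


def rough_clusters(text):
    """Hypothetical helper: group each code point with the combining
    marks that follow it. A simplification for illustration only; real
    grapheme cluster segmentation (UAX #29) also covers emoji
    sequences, Hangul jamo and more."""
    clusters = []
    for ch in text:
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


# len() counts code points, not user-perceived characters.
assert len(nfc) == 4
assert len(nfd) == 5
assert len(rough_clusters(nfd)) == 4

# chr(n) can return a lone combining mark.
assert unicodedata.combining(chr(0x301)) != 0

# Indexing addresses code points, so text[-1] may be just the accent.
assert nfc[-1] == "\u00e9"
assert nfd[-1] == "\u0301"

# The 7th code point is not the 7th character once marks are involved.
word = "re\u0301sume\u0301s"
assert word[6] == "e"
assert rough_clusters(word)[6] == "s"

# "." matches one code point, so "a.c" fails on a decomposed middle letter.
assert re.match("a.c", "a\u00e9c") is not None
assert re.match("a.c", "ae\u0301c") is None

# Equality and dict lookups distinguish the two spellings...
assert nfc != nfd
table = {nfc: "coffee"}
assert nfd not in table

# ...unless both sides are normalized to the same form first.
assert unicodedata.normalize("NFC", nfd) == nfc
assert unicodedata.normalize("NFC", nfd) in table
```

Normalizing both sides to the same form addresses the comparison and
dict key cases for this particular example, but it does nothing for
length, indexing or regular expression matching.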


Marko
