On 09/06/2018 12:40 PM, Chris wrote:
To avoid this you have to normalize and recompose any decomposed characters. I remember that Mac OS X used (and still uses?) decomposed characters by default, so when you typed 'á' into your CLI, it would automatically decompose it to 'a' + acute. `string`, however, returns a length of 2 for the composed character too. If you do a lot of string handling, it will come back to bite you sooner or later.
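
A minimal D sketch of what Chris describes, using only the standard library: `std.uni.normalize` does the recomposition, and `.length` counts UTF-8 code units either way.

    import std.stdio;
    import std.uni : normalize, NFC;

    void main()
    {
        string composed = "\u00E1";    // precomposed 'á': 1 code point, 2 code units
        string decomposed = "a\u0301"; // 'a' + combining acute: 2 code points, 3 code units

        writeln(composed == decomposed);                // false
        writeln(normalize!NFC(decomposed) == composed); // true after recomposition

        writeln(composed.length);   // 2 -- .length counts UTF-8 code units
        writeln(decomposed.length); // 3
    }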

You say that D users shouldn't need a "Unicode license" before they do anything with strings. And you say that Python 3 gets it right (or at least gets it less wrong than D).

But here we see that Python requires a similar amount of Unicode knowledge. Without your Unicode license, you couldn't make sense of `len` giving different results for two strings that look the same.
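
To make that concrete: Python 3's `len` counts code points, and counting code points in D shows the same 1-vs-2 split for two visually identical strings. A small sketch:

    import std.stdio;
    import std.range : walkLength;

    void main()
    {
        string composed = "\u00E1";    // renders as 'á'
        string decomposed = "a\u0301"; // also renders as 'á'

        // Counting code points, as Python 3's len does, gives different
        // answers for two strings that look the same on screen.
        writeln(composed.walkLength);   // 1
        writeln(decomposed.walkLength); // 2
    }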

So both D and Python require a Unicode license. But on top of that, D also requires an auto-decoding license. You need to know that `string` is both a range of code points and an array of code units. And you need to know that `.length` belongs to the array side, not the range side. Once you know that (and more), things start making sense in D.
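
Here's a small sketch of that dual nature, again using only the standard library:

    import std.stdio;
    import std.algorithm : count;
    import std.range : front;

    void main()
    {
        string s = "\u00E1"; // precomposed 'á': 1 code point, 2 UTF-8 code units

        // Array side: .length (and indexing) sees immutable(char) code units.
        writeln(s.length); // 2

        // Range side: the range primitives auto-decode to dchar code points.
        writeln(s.front);  // 'á' as a dchar
        writeln(s.count);  // 1

        // And built-in foreach stays on the array side by default:
        foreach (c; s)
            writef("%02x ", c); // c3 a1 -- the two code units
        writeln();
    }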

My point is: D doesn't require more Unicode knowledge than Python does. But auto-decoding gives D's `string` a dual nature, and that can certainly be confusing. It's part of why everybody dislikes auto-decoding.

(Not saying that Python is free from such pitfalls. I simply don't know the language well enough.)
