Re: The Case Against Autodecode

Dmitry Olshansky via Digitalmars-d Fri, 03 Jun 2016 12:17:06 -0700

On 02-Jun-2016 23:27, Walter Bright wrote:

On 6/2/2016 12:34 PM, deadalnix wrote:

On Thursday, 2 June 2016 at 19:05:44 UTC, Andrei Alexandrescu wrote:

Pretty much everything. Consider s and s1 string variables with possibly
different encodings (UTF8/UTF16).


* s.all!(c => c == 'ö') works only with autodecoding. It always returns
false without.


False. Many characters can be represented by different sequences of
codepoints. For instance, ê can be ê as one codepoint or ^ as a modifier
followed by e. ö is one such character.


There are 3 levels of Unicode support. What Andrei is talking about is
Level 1.

http://unicode.org/reports/tr18/tr18-5.1.html

I wonder what rationale there is for Unicode to have two different
sequences of codepoints be treated as the same. It's madness.

Yeah, Unicode was not meant to be easy, it seems. Or this is what happens with an evolutionary design that started with "everything is a 16-bit character".
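
To make the disagreement above concrete, here is a minimal sketch in D. The helper names (allByCodePoint, allByCodeUnit, allAfterNFC) are made up for illustration and are not Phobos APIs; the sketch assumes a UTF-8 source file and a Phobos that still autodecodes narrow strings. It compares the precomposed form of ö (U+00F6) with the decomposed form (o followed by U+0308) at the code unit, code point, and normalized levels.

```d
import std.algorithm.searching : all;
import std.range : walkLength;
import std.uni : byGrapheme, normalize, NFC;
import std.utf : byCodeUnit;

// Hypothetical helper: iterates by code point (autodecoded dchar).
bool allByCodePoint(string s)
{
    return s.all!(c => c == 'ö');
}

// Hypothetical helper: iterates raw UTF-8 code units, no decoding.
bool allByCodeUnit(string s)
{
    return s.byCodeUnit.all!(c => c == 'ö');
}

// Hypothetical helper: normalizes to NFC first, then compares code points.
bool allAfterNFC(string s)
{
    return normalize!NFC(s).all!(c => c == 'ö');
}

void main()
{
    string precomposed = "\u00F6"; // ö as the single code point U+00F6
    string decomposed = "o\u0308"; // o followed by U+0308 COMBINING DIAERESIS

    // Code point comparison matches only the precomposed form.
    assert(allByCodePoint(precomposed));
    assert(!allByCodePoint(decomposed));

    // No UTF-8 code unit of ö equals 0xF6, so this never matches.
    assert(!allByCodeUnit(precomposed));

    // After NFC normalization the decomposed form matches as well.
    assert(allAfterNFC(decomposed));

    // Both spellings are a single user-perceived character (grapheme).
    assert(decomposed.byGrapheme.walkLength == 1);
}
```

So the code point comparison is right for one spelling of ö and wrong for the other, which is the gap between the two positions quoted above; normalizing first (or working on graphemes) is one way to close it, at a cost.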


--
Dmitry Olshansky
