Re: The Case Against Autodecode

Walter Bright via Digitalmars-d Fri, 03 Jun 2016 03:06:04 -0700

On 6/3/2016 1:05 AM, H. S. Teoh via Digitalmars-d wrote:

> However, this
> meant that some precomposed characters were "redundant": they
> represented character + diacritic combinations that could equally well
> be expressed separately. Normalization was the inevitable consequence.

It is not inevitable. Simply disallow the 2 codepoint sequences - the single one has to be used instead.

There is precedent. Some characters can be encoded with more than one UTF-8 sequence, and the longer sequences were declared invalid. Simple.
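
To illustrate the UTF-8 precedent, here is a minimal Python sketch. The helper name is_valid_utf8 and the two constants are made up for this example. The byte pair C0 AF is an overlong two-byte encoding of '/' (U+002F), whose only valid encoding is the single byte 2F; a conforming decoder is expected to reject the longer form.

```python
# Sketch only: is_valid_utf8, SHORTEST and OVERLONG are illustrative names.

SHORTEST = b"\x2f"      # the one valid UTF-8 encoding of '/'
OVERLONG = b"\xc0\xaf"  # overlong 2-byte form of the same code point


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8, False otherwise."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    print(is_valid_utf8(SHORTEST))  # shortest form is accepted
    print(is_valid_utf8(OVERLONG))  # overlong form is rejected
```
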

I.e. have the normalization up front when the text is created rather than everywhere else.
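
A rough sketch of what "normalization up front" could look like, in Python. The function accept_text is hypothetical, and NFC is used only as a stand-in for the rule proposed here: Unicode itself does not forbid the decomposed form, and not every base + diacritic pair has a precomposed equivalent. The idea is that text is checked once at the point it enters the system, so later code can compare strings directly.

```python
import unicodedata

PRECOMPOSED = "\u00e9"   # 'é' as one code point
DECOMPOSED = "e\u0301"   # 'e' followed by COMBINING ACUTE ACCENT


def accept_text(text: str) -> str:
    """Hypothetical entry-point check: reject text that is not already NFC."""
    if unicodedata.normalize("NFC", text) != text:
        raise ValueError("text is not in precomposed (NFC) form")
    return text


if __name__ == "__main__":
    accept_text(PRECOMPOSED)          # passes unchanged
    try:
        accept_text(DECOMPOSED)       # rejected at creation time
    except ValueError as err:
        print("rejected:", err)
```

A gentler variant would normalize on entry instead of rejecting; either way the cost is paid once, at creation.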
