Re: The Case For Autodecode

ag0aep6g via Digitalmars-d Fri, 03 Jun 2016 12:56:43 -0700

On 06/03/2016 09:09 PM, Steven Schveighoffer wrote:

Except many chars *do* properly convert. This should work:


char c = 'a';
dchar d = c;
assert(d == 'a');

Yeah, that's what I meant by "standalone code unit". Code units that on their own represent a code point would not be touched.

As I mentioned in my earlier reply, some kind of "bounds checking" for
the conversion could be a possibility.

Hm... an interesting possibility:

dchar _dchar_convert(char c)
{
    return cast(int)cast(byte)c; // get sign extension for non-ASCII
}

So when the char's most significant bit is set, this fills the upper bits of the dchar with 1s, right? And a set most significant bit in a char means it's part of a multibyte sequence, while in a dchar it means that the dchar is invalid, because they only go up to U+10FFFF. Huh. Neat.
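
To make that concrete, here is a minimal sketch of the sign-extension idea. The name signExtend is hypothetical, and the explicit cast(dchar) on the return value is an assumption added here so the conversion is spelled out; treat it as an illustration, not as the proposed implementation.

```d
// Hypothetical helper mirroring the sign-extension trick quoted above.
// The outer cast(dchar) is an addition for this sketch.
dchar signExtend(char c)
{
    // char -> byte reinterprets the 8 bits as signed,
    // byte -> int sign-extends, int -> dchar keeps the 32-bit pattern.
    return cast(dchar) cast(int) cast(byte) c;
}

void main()
{
    // ASCII code units (most significant bit clear) pass through unchanged.
    assert(signExtend('a') == 'a');

    // 0xC3 is the lead byte of a two-byte UTF-8 sequence (e.g. "ä" is C3 A4).
    // Its most significant bit is set, so the upper 24 bits become 1s.
    dchar d = signExtend(cast(char) 0xC3);
    assert(d == cast(dchar) 0xFFFF_FFC3);

    // The result lies above U+10FFFF, so it is not a valid code point.
    assert(d > 0x10FFFF);
}
```

Any code unit from 0x80 to 0xFF ends up in the range 0xFFFFFF80 to 0xFFFFFFFF this way, which is how the invalid values stay distinguishable from real code points.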


Does it work for char -> wchar, too?
