On Monday, 28 March 2016 at 22:49:28 UTC, Jack Stouffer wrote:
On Monday, 28 March 2016 at 22:43:26 UTC, Anon wrote:
On Monday, 28 March 2016 at 22:34:31 UTC, Jack Stouffer wrote:
void main () {
    import std.range.primitives;
    char[] val = ['1', '0', 'h', '3', '6', 'm', '2', '8', 's'];
    pragma(msg, ElementEncodingType!(typeof(val)));
    pragma(msg, typeof(val.front));
}
prints
char
dchar
Why?
Unicode! `char` is a UTF-8 code unit, which means a single code
point can take from 1 to 4 chars. val.front gives a `dchar`
(UTF-32), consuming however many of those bytes make up the next
code point and giving you a sensible value.
But the value fits into a char;
The compiler doesn't know that, and it isn't true in general. You
could have, for example, U+3042 in your char[]. That would be
encoded as three chars. It wouldn't make sense (or be correct)
for val.front to yield '\xe3' (the first byte of U+3042 in UTF-8).
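To make that concrete, here is a small sketch (the three byte values are the UTF-8 encoding of U+3042):

```d
import std.range.primitives : front;

void main()
{
    // U+3042 (HIRAGANA LETTER A) encodes to three UTF-8 code units.
    char[] val = ['\xe3', '\x81', '\x82'];
    assert(val.length == 3);   // three chars in storage
    dchar c = val.front;       // front decodes all three bytes at once
    assert(c == '\u3042');     // one full code point, not '\xe3'
}
```

front yields the whole code point, which is why its type is dchar rather than char.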
a dchar is a waste of space.
If you're processing Unicode text, you *need* to use that space.
And because you're using ranges, it is only 3 extra bytes per
element anyway. That isn't going to hurt on modern systems.
Why on Earth would a different type be given for the front
value than the type of the elements themselves?
Unicode. A single char cannot hold an arbitrary Unicode code
point. A single dchar can.
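A quick sketch of that size difference:

```d
void main()
{
    // char is one UTF-8 code unit (1 byte); dchar is one UTF-32
    // code unit (4 bytes), wide enough for any Unicode code point.
    static assert(char.sizeof == 1);
    static assert(dchar.sizeof == 4);

    dchar hiragana = '\u3042'; // U+3042 fits in a single dchar...
    // ...but its UTF-8 encoding needs three chars:
    string s = "\u3042";
    assert(s.length == 3);
}
```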