Re: Using decodeFront with a generalised input range

Dennis via Digitalmars-d-learn Fri, 09 Nov 2018 03:16:05 -0800

On Friday, 9 November 2018 at 10:45:49 UTC, Vinay Sajip wrote:

As I see it, a ubyte 0x20 could be decoded to an ASCII char ' ', and likewise to wchar or dchar. It doesn't (to me) make sense to decode a char to a wchar or dchar. Anyway, you've shown me how decodeFront can be used, so great!

The character ' ' simply is the number 0x20 in char, wchar and dchar. The difficulty arises when you use non-ASCII characters:


if ("€"[0] == '€')

The code point of € is U+20AC, but a char only goes up to 0xFF. To work around that, UTF-8 encodes higher code points as multiple bytes (code units). The € sign is represented as [0xE2, 0x82, 0xAC]. So the code above actually checks 0xE2 == 0x20AC, which evaluates to false. If you call decodeFront on [0xE2, 0x82, 0xAC], it will return 0x20AC and modify the range to be [], since it consumed all three code units. That way you can handle code points properly.
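To make that concrete, here is a minimal sketch of the behaviour described above, using a plain string as the range. The helper name firstCodePoint is only for illustration and is not part of Phobos.

```d
import std.utf : decodeFront;

/// Hypothetical helper: decodes the first code point of `s`
/// and advances `s` past the code units it consumed.
dchar firstCodePoint(ref string s)
{
    return s.decodeFront;
}

void main()
{
    string s = "€";

    // The string holds three UTF-8 code units, not one character.
    assert(s.length == 3);
    assert(s[0] == 0xE2 && s[1] == 0x82 && s[2] == 0xAC);

    // Indexing yields a single code unit, so this comparison fails.
    assert(s[0] != 0x20AC);

    // decodeFront returns the whole code point and empties the range.
    assert(firstCodePoint(s) == 0x20AC);
    assert(s.length == 0);
}
```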

See: https://en.wikipedia.org/wiki/UTF-8#Examples

On Friday, 9 November 2018 at 10:45:49 UTC, Vinay Sajip wrote:

Supplementary question: is an operation like r.map!(x => cast(char) x) effectively a run-time no-op and just to keep the compiler happy, or does it actually result in code being executed? I came across a similar issue with ranges recently where the answer was to map immutable(byte) to byte in the same way.


On dmd without optimization, the map function will compile to:
        push    RBP          //
        mov     RBP,RSP      //
        sub     RSP,010h     // build stack frame
        mov     -8[RBP],EDI  // put argument0 on the stack
        mov     AL,-8[RBP]   // put the stack value in the lower 8 bits of the return register
        leave                // delete stack frame
        ret                  // return

So that will be essentially a run-time no-op. However, if you pass -O -inline to dmd I'm pretty sure it will optimize it away. GDC and LDC with -O1 or higher will certainly eliminate all run-time cost.
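For reference, a small sketch of how that cast combines with decodeFront on a range of ubyte. The function name decodeAll and the sample bytes are made up for illustration. The cast leaves the values untouched and only changes the element type that decodeFront sees.

```d
import std.algorithm.iteration : map;
import std.range.primitives : ElementType, empty;
import std.utf : decodeFront;

/// Hypothetical helper: decodes every code point in a range of UTF-8 bytes.
dchar[] decodeAll(ubyte[] bytes)
{
    // View each ubyte as a char; the byte values themselves do not change.
    auto r = bytes.map!(x => cast(char) x);
    static assert(is(ElementType!(typeof(r)) == char));

    dchar[] result;
    while (!r.empty)
        result ~= r.decodeFront; // consumes one code point per call
    return result;
}

void main()
{
    // The euro sign (three code units) followed by a space (one code unit).
    ubyte[] bytes = [0xE2, 0x82, 0xAC, 0x20];
    auto decoded = decodeAll(bytes);
    assert(decoded.length == 2);
    assert(decoded[0] == 0x20AC);
    assert(decoded[1] == ' ');
}
```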
