Always std.utf.validate, or rely on exceptions?

SimonN via Digitalmars-d-learn Thu, 02 Mar 2017 08:34:05 -0800

Many functions in std.utf throw UTFException when we pass them malformed UTF, and many functions in std.string throw StringException. From this, I developed a habit of reading user files like so, hoping that it traps all malformed UTF:


    try {
        // call D standard lib on string from file
    }
    catch (Exception e) {
        // treat file as bogus
        // log e.msg
    }

But std.string.stripRight!string calls std.utf.codeLength, which never throws on malformed UTF, but asserts false on errors:


    ubyte codeLength(C)(dchar c) @safe pure nothrow @nogc
        if (isSomeChar!C)
    {
        static if (C.sizeof == 1)
        {
            if (c <= 0x7F) return 1;
            if (c <= 0x7FF) return 2;
            if (c <= 0xFFFF) return 3;
            if (c <= 0x10FFFF) return 4;
            assert(false);
        }
        // ...
    }

Apparently, once my code calls stripRight, I should be sure that this string contains only well-formed UTF. Right now, my code doesn't guarantee that.

Should I always validate text from files manually with std.utf.validate?
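For concreteness, here is a minimal sketch of what I mean by validating up front. The helper name tryStrip and its bool/out-parameter shape are my own invention for illustration, not anything from Phobos; only std.utf.validate, UTFException, and std.string.stripRight are real library names. The idea is that validate throws UTFException on malformed input, so stripRight only ever sees well-formed UTF:

```d
import std.string : stripRight;
import std.utf : validate, UTFException;

/// Hypothetical helper (not part of Phobos): treats raw file bytes as
/// UTF-8, validates them once, and strips trailing whitespace only if
/// the validation succeeded. Returns false for malformed input.
bool tryStrip(immutable(ubyte)[] raw, out string result)
{
    auto text = cast(string) raw; // reinterpret the bytes, no copy
    try
    {
        validate(text);           // throws UTFException on malformed UTF
    }
    catch (UTFException e)
    {
        return false;             // treat the file as bogus; log e.msg here
    }
    result = text.stripRight;     // only reached with well-formed UTF
    return true;
}

void main()
{
    string s;
    // "hi " is well-formed, so it is accepted and stripped.
    assert(tryStrip([0x68, 0x69, 0x20], s) && s == "hi");
    // 0xFF can never appear in UTF-8, so this input is rejected.
    assert(!tryStrip([0x68, 0xFF, 0x20], s));
}
```

With this shape, the catch only has to cover the single validate call instead of wrapping every later standard-library call.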

Or should I memorize which functions throw, then validate manually whenever I call the non-throwing UTF functions? What is the pattern behind what throws and what asserts false?


-- Simon
