On Wednesday, February 08, 2012 09:35:28 H. S. Teoh wrote:
> On Wed, Feb 08, 2012 at 08:32:32AM -0800, Jonathan M Davis wrote:
> [...]
>
> > Except that char[] is _not_ an array of characters. It's an array of
> > code units. There is a _big_ difference. Not even dchar[] is an array
> > of characters. It's both an array of code units and an array of code
> > points, but not even that quite gets you characters (though at this
> > point, Phobos pretty much treats a code point as if it were a
> > character). If you want a character, you need a grapheme (which could
> > be multiple code points). _That_ is where the problem comes in.
> >
> > You can definitely do array operations on strings. In fact, it can be
> > very desirable to do so if you want to process strings efficiently.
> > But if you treat them like you would ubyte[], you're in for a heap of
> > trouble thanks to how unicode works.
>
> [...]
>
> Except that the point of my code was to fix byte-order so that they can
> be correctly interpreted. I suppose I really should be using ubyte[] for
> that instead, and perhaps use a union to translate it to char[] when I
> call decode().

You shouldn't normally have to worry about byte order on char[] at all.
So, I don't know what you'd be doing that would result in them being in
the wrong order. But char is a UTF-8 code unit by definition, so if
you're doing something that involves char[] not being a valid array of
UTF-8 code units, you're almost certainly going to want to be using
ubyte[] instead. There's a lot of stuff in Phobos which will throw if
you have invalid code points.

- Jonathan M Davis
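
To make the code unit vs. code point distinction and the ubyte[] advice concrete, here is a minimal sketch. It assumes a reasonably recent Phobos, the variable names and byte values are purely illustrative, and it uses a cast rather than the union mentioned above to reinterpret the bytes. It is one way to do it, not necessarily the best one.

```d
import std.exception : assertThrown;
import std.utf : UTFException, decode, validate;

void main()
{
    // U+00E9 is a single code point, but it takes two UTF-8 code units.
    string s = "\u00E9";
    assert(s.length == 2);      // length counts code units, not characters

    // decode reads one code point and advances the index past its code units.
    size_t i = 0;
    dchar c = decode(s, i);
    assert(c == '\u00E9');
    assert(i == 2);

    // Bytes which may not be valid UTF-8 yet are kept in ubyte[].
    ubyte[] raw = [0xC3, 0xA9];             // happens to be valid UTF-8 for U+00E9
    auto text = cast(const(char)[]) raw;    // reinterpret the bytes as UTF-8 code units
    validate(text);                         // throws UTFException if they are not valid

    // 0xFF can never appear in well-formed UTF-8, so validation throws.
    ubyte[] bad = [0xFF, 0xFE];
    assertThrown!UTFException(validate(cast(const(char)[]) bad));
}
```

Any byte-level fixing up (such as byte order) would happen on the ubyte[] before the cast, so that nothing typed as char[] ever holds invalid UTF-8.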