On Aug 22, 2012, at 8:03 PM, Jonathan M Davis <jmdavisp...@gmx.com> wrote:
> On Wednesday, August 22, 2012 19:52:10 Sean Kelly wrote:
>> I'm clearly missing something. ASCII and UTF-8 are compatible. What's
>> stopping you from just processing these as if they were UTF-8 strings?
>
> Range-based functions will treat arrays of char or wchar as forward ranges of
> dchar. Because of the variable length of their code points, they aren't
> considered to have length, be random access, or have slicing and will not
> generally work with range-based functions which require any of those
> operations (though some range-based functions do specialize on strings and
> use those operations where they can based on proper understanding of Unicode).

Yeah. I understand why the range-based functions use dchar, but for my own use I generally want to work directly with a char string of UTF-8 so I can slice buffers. Typing these as ubyte buffers isn't ideal, but it does work.

> On the other hand, if you have a string that specifically holds ASCII and you
> know that it only holds ASCII, you know that you can safely use length,
> random access, and slicing as if each code unit were a full code point. But the
> range-based functions don't know that your string is guaranteed to be
> ASCII-only, so they continue to treat it as a range of dchar rather than char.
> The solution is to either create a wrapper range whose element type is char or
> to cast the char[] to ubyte[]. And Bearophile wants such a wrapper range to be
> added to Phobos.

Gotcha. Despite it being something I'd use regularly, I wouldn't want this in Phobos because it seems like it could cause maintenance problems. I'd rather explicitly cast to ubyte as a way to flag that I was doing something potentially unsafe.
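
For anyone following along, here is a minimal sketch of the explicit-cast approach. It assumes the std.range traits behave as described above; asAsciiBytes is a hypothetical helper name used only for illustration, not anything in Phobos, and nothing in it verifies that the data really is ASCII.

```d
import std.range : ElementType, hasLength, hasSlicing, isRandomAccessRange;

// Range-based code sees a string as a forward range of dchar.
static assert(is(ElementType!string == dchar));
static assert(!hasLength!string);
static assert(!isRandomAccessRange!string);
static assert(!hasSlicing!string);

// The same memory viewed as ubyte[] is an ordinary random-access range.
static assert(is(ElementType!(immutable(ubyte)[]) == immutable(ubyte)));
static assert(hasLength!(immutable(ubyte)[]));
static assert(isRandomAccessRange!(immutable(ubyte)[]));
static assert(hasSlicing!(immutable(ubyte)[]));

/// Hypothetical helper: view a string the caller knows to be ASCII-only as raw bytes.
immutable(ubyte)[] asAsciiBytes(string s)
{
    // The cast is the explicit flag; it reinterprets the data without copying or checking.
    return cast(immutable(ubyte)[]) s;
}

void main()
{
    string s = "hello world";
    auto bytes = asAsciiBytes(s);

    assert(bytes.length == 11);                      // one byte per ASCII character
    assert(bytes[4] == 'o');                         // random access by code unit
    assert(cast(string) bytes[6 .. $] == "world");   // slice, then cast back
}
```

The cast shows up at every call site, which is the point: it marks the spot where the ASCII-only assumption is being made.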