What if the string converted itself between UTF-8 and UTF-32 as necessary (UTF-8 for storage and UTF-32 for processing):

import std.conv : to;

struct String
{
public:
    // True while the text is held as UTF-8, false while it is held as UTF-32.
    bool encoded() @property const { return _encoded; }

    // Switch representation on demand; returns the resulting state.
    bool encoded(bool should) @property
    {
        if (should && !_encoded)
        {
            _utf8 = to!string(_utf32);   // re-encode UTF-32 as UTF-8
            _encoded = true;
        }
        else if (!should && _encoded)
        {
            _utf32 = to!dstring(_utf8);  // decode UTF-8 into UTF-32
            _encoded = false;
        }
        return _encoded;
    }

    // Here goes the part where you get to use the string

private:
    bool _encoded;
    union
    {
        string _utf8;
        dstring _utf32;
    }
}

This has a lot of drawbacks and is purely a curiosity. The idea is to express the encoding of a string as a property of the string, rather than as a difference between separate string types.

On Thu, Dec 29, 2011 at 1:02 PM, Walter Bright <newshou...@digitalmars.com> wrote:
> On 12/29/2011 12:12 AM, Gor Gyolchanyan wrote:
>>
>> This is a great idea! In this case the default string will be a
>> random-access range, not a bidirectional range. Also, processing
>> dstring is faster than string, because no encoding needs to be done.
>> Processing power is more expensive than memory. utf-8 is valuable
>> only to pass it as an ASCII string (which is not too common) and to
>> store large chunks of it. Both these cases are much less common than
>> all the rest of string processing.
>
>
> dstring consumes 4x the memory, and this can easily cause perf degradations
> due to thrashing and poor cache locality.

--
Bye,
Gor Gyolchanyan.