Andrei Alexandrescu wrote:
> One idea I've had for a while was to have a universal string type:
>
> struct UString {
>     union {
>         char[] utf8;
>         wchar[] utf16;
>         dchar[] utf32;
>     }
>     enum Discriminator { utf8, utf16, utf32 };
>     Discriminator kind;
>     IntervalTree!(size_t) skip;
>     ...
> }
>
> The IntervalTree stores the skip amounts that must be added for a given
> index in the string. For ASCII strings that would be null. Then its size
> grows with the number of multibyte characters. Beyond a threshold,
> representation is transparently switched to utf16 or utf32 as needed and
> the tree becomes smaller or null again.
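For reference, the skip lookup described in the quote might look roughly like the sketch below for the utf8 case. It makes some assumptions: the IntervalTree is swapped for a plain sorted array searched by binary search, and the names Skip, buildSkips and byteOffset are made up for illustration. None of them come from Phobos or from the proposal itself. The only library call used is std.utf.stride.

```d
import std.utf : stride;

/// One entry per multibyte character: the code point index at which it
/// occurs, and the total number of extra bytes up to and including it.
/// (Hypothetical type; stands in for the IntervalTree in the proposal.)
struct Skip
{
    size_t cpIndex;
    size_t extraBytes;
}

/// Build the skip table for a UTF-8 string. Stays null for pure ASCII.
Skip[] buildSkips(string s)
{
    Skip[] skips;
    size_t cp = 0, extra = 0;
    for (size_t i = 0; i < s.length; ++cp)
    {
        immutable n = stride(s, i); // bytes in the code point starting at i
        if (n > 1)
        {
            extra += n - 1;
            skips ~= Skip(cp, extra);
        }
        i += n;
    }
    return skips;
}

/// Map a code point index to a byte offset in O(log m), where m is the
/// number of multibyte characters in the string.
size_t byteOffset(const Skip[] skips, size_t cpIndex)
{
    size_t lo = 0, hi = skips.length;
    while (lo < hi)
    {
        immutable mid = lo + (hi - lo) / 2;
        if (skips[mid].cpIndex < cpIndex)
            lo = mid + 1;
        else
            hi = mid;
    }
    // lo is now the number of multibyte characters strictly before cpIndex.
    return cpIndex + (lo == 0 ? 0 : skips[lo - 1].extraBytes);
}
```

With this, byteOffset(buildSkips(s), i) gives the byte position of the i-th code point in s. The table costs one entry per multibyte character, which is the growth that the proposal's threshold-based switch to utf16 or utf32 is meant to cap.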
Although I see some potential in a universal string type, I don't think this is the right implementation strategy. I'd rather have my short strings in utf-32 (optimized for speed) and my long strings in utf-8/utf-16 (optimized for memory usage).

-- 
Rainer Deyke - rain...@eldwood.com