On Thu, 04 Feb 2010 18:41:48 -0700, Rainer Deyke wrote:
> Andrei Alexandrescu wrote:
>> One idea I've had for a while was to have a universal string type:
>>
>> struct UString {
>>     union {
>>         char[] utf8;
>>         wchar[] utf16;
>>         dchar[] utf32;
>>     }
>>     enum Discriminator { utf8, utf16, utf32 };
>>     Discriminator kind;
>>     IntervalTree!(size_t) skip;
>>     ...
>> }
>>

Firstly, for such "augmented types" in D, such as strings, bignums or any future ideas, it is great to have the facilities for creating them as structs, so that they can be used elsewhere without regard to whether they are built in as compiler specials or live in the library. What is there for struct now is good and getting better in D2, but I still feel a little insecure about understanding how to make a really optimal implementation that is as good as a built-in type that the compiler understands. The DPL has been a help with this.

Programmers will want to use raw char[], wchar[] and dchar[] for whatever reasons, with their "simple" behaviours, so these should not be made unavailable just because more sophisticated types can be created purely for Unicode strings.

I have made a UString implementation, similar to the above, but I played a different trick. I also wanted it to maintain a terminating null char for passing to Windows API functions, in particular the 16-bit W interfaces.

    struct UString_char {
        char[] str_;
        /// ... lots of good D type stuff, constructor and assign conversions, access
        size_t length() {
            return str_.length - 1; // hide terminating null
        }
    }

    struct UString_wchar {
        wchar[] str_;
        /// ditto D type stuff
    }

    struct UString_dchar {
        dchar[] str_;
    }

    // throw in void[] for charity. (although no one will need it)
    struct UString_void {
        void[] ptr_;
    }

    enum UStringType { UC_CHAR, UC_WCHAR, UC_DCHAR }

    struct UString {
        union {
            UString_void vstr;
            UString_char cstr;
            UString_wchar wstr;
            UString_dchar dstr;
        }
        UStringType ztype; // type things to track what we are.
    }

I could then use the individual components by themselves where appropriate, and even got them working. In D2 the str_ array cannot be immutable, because appending or fiddling with the null terminator has to modify it. I did not get UString working as an associative array key, and have not tried since. I also made a class version called VString containing the union. There are a lot of issues.
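
To make the terminating-null trick concrete, here is a minimal sketch of the idea for the UTF-16 case. It is not my actual UString_wchar code: the name ZWString and the members ptr and append are made up for illustration, and the only library call assumed is std.utf.toUTF16.

```d
import std.utf : toUTF16;

/// Hypothetical illustration only: a UTF-16 string that always keeps
/// one trailing '\0' in its buffer but hides it from length().
struct ZWString
{
    private wchar[] str_; // payload followed by exactly one '\0'

    this(const(char)[] s)
    {
        str_ = toUTF16(s).dup; // mutable copy of the converted text
        str_ ~= cast(wchar) 0; // add the hidden terminator
    }

    /// Number of UTF-16 code units, not counting the terminator.
    size_t length() const
    {
        return str_.length ? str_.length - 1 : 0;
    }

    /// Pointer to null-terminated data, e.g. for a W-style API call.
    /// Null if the struct was never initialised.
    const(wchar)* ptr() const
    {
        return str_.ptr;
    }

    /// Append text and move the terminator to the new end.
    void append(const(char)[] s)
    {
        // Copy the payload without the old terminator (if there is one).
        wchar[] buf = (str_.length ? str_[0 .. $ - 1] : str_).dup;
        buf ~= toUTF16(s);
        buf ~= cast(wchar) 0;
        str_ = buf;
    }
}
```

Every mutation has to rewrite the terminator, which is why str_ has to stay a mutable array here. A real version would avoid copying the whole buffer on each append.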

I must also acknowledge the prior art of the mtext code and its MString structure type. I was partly inspired by seeing this, and by how complex it was to do nearly everything. When I last checked, mtext was kind of broken for recent D1 and D2 compilers, and I did not want to fix it. I admit I did not like the complexity of the direct union { char[], wchar[], dchar[] }. Splitting it up into separately usable structs seems to me to give 3 times the potential for the same price.

The advantage of using struct for such types is that it may help bring about the perfection of such a POD-based "type creation" facility. I note from looking at some of the Phobos D2 code, e.g. std.array, that this seems to be attempted in places. Nearly all the more interesting D types (arrays, maps) are equivalent to smallish POD types of at least 2-3 times the machine word size (32/64 bit). Making it all work, keeping it understandable and avoiding WTF bugs is a big challenge.