2011/6/26 Jonathan M Davis <[email protected]>:
> On 2011-06-25 15:15, kenji hara wrote:
>> > 1. Keep toStringz as it is (as well as toUTF16z) and either consider
>> > stringz to be some sort of word unique to the D community or just admit
>> > that we're not going to camelcase it because it would break too much
>> > code to do so.
>>
>> ++vote, but not entirely.
>>
>> Currently, the return type of toStringz is "zero-terminated UTF-8",
>> not "C-string".
>>
>> The word "C-string" can mean many different encodings: ASCII, Latin-1,
>> EUC, Shift-JIS (in Japan), UTF-8 (Linux?), UTF-16 (in Windows) ...
>> It depends on context.
>>
>> But in most cases, "C-string" probably means "zero-terminated UTF-8" or
>> "zero-terminated UTF-16".
>> Other encodings should be supported by another module (std.encoding?
>> Is it still alive?).
>>
>> My proposal:
>> 1. Add three aliased types.
>>    alias immutable(char)* stringz;   // useful on Linux
>>    alias immutable(wchar)* wstringz; // useful on Windows
>>    alias immutable(dchar)* dstringz; //
>> 2. Rename the current toStringz to toUTF8z, and add a deprecated alias
>>    'toStringz' to keep compatibility.
>>    (Adding toUTF32z to the std.string module would increase consistency.
>>    A templated toUTFXXz family would be even better.)
>> 3. Make std.conv.to support conversion from any string type to the
>>    (|wd)stringz types (by using the toUTFXXz family).
>>
>> The main point is that we should make the aliased type names de facto
>> type names, like string, wstring, dstring. (Remember that the three string
>> types are in fact aliases.)
>>
>> We can treat the type name uint as 'unsigned int', because it is just
>> a built-in type name!
>>
>> User-defined type names should usually be camel-cased in D.
>> So let's make them built-in! Then we can remove camel-cased
>> names from our choices.
>>
>> I think this proposal is useful, keeps compatibility, and is consistent.
>
> From this and related discussions, it seems that the current plan is to create
> a toUTFz function which is templated on the pointer type that you want
> returned (char*, const(char)*, immutable(char)*, wchar*, etc.) and which takes
> any string type. Then you can get a zero-terminated string with whatever level
> of constness you want from any string. std.conv.to would then be updated such
> that converting from any string to any character pointer would call toUTFz. We
> may or may not have toStringz, toWstringz, and toDstringz which use toUTFz.
>
> Regardless, I don't see much point in creating the types stringz, wstringz,
> and dstringz. There's nothing which guarantees that they're going to be
> zero-terminated, so they could be complete misnomers, depending on how they're
> used, and they're specifically immutable whereas you often need mutable
> zero-terminated strings. So, ultimately, I don't think that they'd add much. We
> _do_ need better conversion functions though.
>
> - Jonathan M Davis
>

> There's nothing which guarantees that they're going to be
> zero-terminated, so they could be complete misnomers, depending on how they're
> used,

Ah, you are right. I didn't think about that. I agree with you.

> to create
> a toUTFz function which is templated on the pointer type that you want
> returned (char*, const(char)*, immutable(char)*, wchar*, etc.)

I think the templated function toUTFz needs a default type inference
feature, as follows:
----
string s = "...";
auto sz = toUTFz(s);
static assert(is(typeof(sz) == immutable(char)*));
----

Thanks for your explanation.

Kenji
