Re: Relaxing the definition of isSomeString and isNarrowString

via Digitalmars-d Sun, 24 Aug 2014 11:30:50 -0700

On Sunday, 24 August 2014 at 18:19:45 UTC, Andrew Godfrey wrote:

The OP and the question of auto-decoding share the same root problem: Even though D does a lot better with UTF than other languages I've used, it still confuses characters with code points somewhat. "Element type is some character" is an example from OP. So clarify for me: If a programmer makes an array of either 'char' or 'wchar', does that always, unambiguously, mean a UTF8 or UTF16 codepoint?


It has to, because it is required by the specification. But ...

E.g. If interoperating with C code, they will never make the mistake of using these types for a non-string byte/word array?

... of course this cannot be guaranteed. In fact, even the druntime currently just assumes that program arguments and environment variables are UTF8 encoded on Unix, AFAIK. This is true in most cases, but of course not guaranteed. Potentially also problematic are the functions taking filenames. In Unix, filenames are just opaque arrays of bytes, but those functions take `string` parameters, i.e. assuming UTF8 encoding. This could force the user to place non-UTF8 sequences into strings.
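
To illustrate the point, here is a minimal sketch of how a program might check whether a `string` it received (a program argument, say) really holds well-formed UTF8 before relying on that assumption. It uses `std.utf.validate`, which throws `UTFException` on malformed input. The helper name `isValidUtf8` and the sample byte sequence are hypothetical, chosen only for this example.

```d
import std.stdio : writeln;
import std.utf : validate, UTFException;

/// Hypothetical helper: true if `s` is well-formed UTF8, false otherwise.
bool isValidUtf8(const(char)[] s)
{
    try
    {
        validate(s); // throws UTFException on a malformed sequence
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

void main(string[] args)
{
    // Program arguments are typed as `string`, but nothing guarantees
    // that the bytes handed over by the OS are valid UTF8.
    foreach (i, arg; args)
        writeln("argument ", i, " is valid UTF8: ", isValidUtf8(arg));

    // A byte sequence such as a Unix filename can be forced into a
    // `string` with a cast; the type system does not object.
    immutable(ubyte)[] raw = [0xFF]; // 0xFF never occurs in valid UTF8
    string notReallyUtf8 = cast(string) raw;
    writeln("cast bytes are valid UTF8: ", isValidUtf8(notReallyUtf8));
}
```

The cast compiles without complaint, which is exactly how non-UTF8 data can end up inside a `string`; only an explicit check like the one above would catch it.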
