On 21.10.2011 06:06, Jonathan M Davis wrote:
It's this very problem that leads some people to argue that string should be its own type which holds an array of code units (which can be accessed when needed) rather than doing what we do now where we try and treat a string as both an array of chars and a range of dchars. The result is schizophrenic.
Indeed - expressing strings as arrays of characters will always fall short of the Unicode concept in some way. A truly Unicode-compliant language has to handle strings as opaque objects that do not have any inherent encoding. A number of operations can be performed on these objects (concatenation, comparison, searching, etc.); any concrete memory representation can only be obtained by an explicit encoding operation.
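To make that concrete, here is a small sketch in Python 3 (whose str type already behaves roughly like such an opaque object): the string operations above need no encoding at all, and a byte representation only comes into existence through an explicit encode call.

```python
# Opaque string operations: no encoding involved anywhere here.
s = "Grüße" + ", " + "Welt"       # concatenation
assert "Welt" in s                 # searching
assert s == "Grüße, Welt"          # comparison

# A concrete memory representation exists only after explicit encoding.
utf8 = s.encode("utf-8")
utf16 = s.encode("utf-16-le")
assert isinstance(utf8, bytes)
assert len(utf8) != len(utf16)     # the representations differ; the string does not
assert utf8.decode("utf-8") == s   # decoding restores the same abstract string
```

The point is that len(utf8) and len(utf16) are properties of particular encodings, not of the string itself.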
Python 3, for example, took a fundamental step by introducing exactly this distinction. At first it seems silly, having to think about encodings so often when writing trivial code. After a short while, though, the strict conceptual separation between unencoded "strings" and encoded "arrays of something" really helps avoid ugly problems.
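A sketch of how that separation pays off in practice: in Python 3, str and bytes are distinct types, and mixing them is an immediate TypeError instead of silent mojibake further downstream.

```python
# Text and its encoded form are different types with different lengths.
text = "naïve"
data = text.encode("utf-8")
assert len(text) == 5              # five code points
assert len(data) == 6              # six bytes: "ï" encodes to two bytes in UTF-8

# Mixing the two is refused outright rather than producing garbage.
try:
    text + data                    # str + bytes
except TypeError:
    mixed = False
else:
    mixed = True
assert mixed is False

# At I/O boundaries the programmer must decide: decode on input, encode on output.
assert data.decode("utf-8") == text
```

This is the "ugly problems" part: the errors surface at the point where an encoding decision was forgotten, not three modules later.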
Sure, for a performance-critical language the issue becomes a lot trickier. I still think it is possible, and ultimately the only way to solve the tricky problems that will otherwise always crop up somewhere.