On Wednesday, 2 August 2017 at 17:37:09 UTC, Steven Schveighoffer
wrote:
What is expected? What I see on the screen when I run my code
is:
[Ü]
Upper case?
What I see when I run your "working" code is:
[?]
Your terminal is incapable of rendering the Latin-1 encoding. The
program prints one byte of value 0xfc. You may pipe the output
into hexdump -C:
00000000 5b fc 5d 0a |[ü].|
00000004
You are missing the point that your input string is invalid.
It's perfectly okay to put any value an octet can take into an
octet. I did not claim that the data in the string memory is
syntactically valid UTF-8. Read the comment in line 9 of my post
of 15:02:22.
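To make the distinction concrete, here is a minimal sketch. It assumes the string memory holds the three octets from the hexdump above without the trailing newline, and the helper name isValidUtf8 is made up for illustration. Storing the octets in a string is accepted; only an explicit std.utf.validate call objects.

```d
import std.utf : UTFException, validate;

// Hypothetical helper: true if s is well-formed UTF-8, false otherwise.
bool isValidUtf8(string s)
{
    try
    {
        validate(s);          // throws UTFException on malformed input
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

void main()
{
    // '[', the lone Latin-1 octet 0xfc, ']'
    immutable(ubyte)[] octets = [0x5b, 0xfc, 0x5d];

    // Reinterpreting the octets as a string performs no validation.
    string s = cast(string) octets;
    assert(s.length == 3);

    // The memory is a legal sequence of octets, but not valid UTF-8.
    assert(!isValidUtf8(s));

    // The same character encoded as UTF-8 (0xc3 0xbc) passes.
    assert(isValidUtf8("[\u00fc]"));
}
```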
std.algorithm is not validating the entire string,
True, and it should not. That is what I want.
and so it doesn't throw an error like string.stripLeft does.
That is the point. You wrote
| I wouldn't expect good performance from this, as there is
| auto-decoding all over the place.
I erroneously thought that using byCodeUnit disables the whole
UTF-8 processing and enforces operation on (u)bytes. But this is
not the case, at least not for stripLeft and probably not for
other string functions.
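One way to really enforce operation on (u)bytes might look like the sketch below. It assumes the goal is to strip leading ASCII spaces, and the helper name dropLeadingSpaces is made up for illustration. std.string.representation views the string as immutable(ubyte)[], and the element overload of std.algorithm's stripLeft then compares plain numbers, so 0xfc is never decoded.

```d
import std.algorithm.mutation : stripLeft;
import std.string : representation;

// Hypothetical helper: drops leading ASCII spaces by comparing raw octets.
immutable(ubyte)[] dropLeadingSpaces(string s)
{
    // representation() reinterprets the string as immutable(ubyte)[];
    // stripLeft(range, element) removes leading elements equal to 0x20.
    return s.representation.stripLeft(cast(ubyte) 0x20);
}

void main()
{
    // Two spaces followed by the lone Latin-1 octet 0xfc (not valid UTF-8).
    immutable(ubyte)[] octets = [0x20, 0x20, 0xfc];
    string s = cast(string) octets;

    auto rest = dropLeadingSpaces(s);
    assert(rest.length == 1);
    assert(rest[0] == 0xfc);
}
```

The result stays an immutable(ubyte)[]; casting it back to string is possible, but then any function that decodes will see the invalid data again.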
writeln doesn't do any decoding of individual strings. It
avoids the problem and just copies your bad data directly.
That is what I expected.