Dominikus Scherkl wrote:
This is a special, custom form of error handling - why assign a character for it?
My other suggestion (and the main reason to call the proposed character "source failure indicator symbol" (SFIS)) was intended especially for malformed UTF-8 input that has overlong encodings. In this special case a converter knows exactly which character is intended, but needs to signal an error to avoid ambiguities. As things stand, it MUST replace the overlong character with U+FFFD (or even cancel the conversion!). But I think SFIS + intended-char is a far better approach, because it 1) warns the reader AND keeps the text readable, and 2) distinguishes overlong encodings from illegal character sequences.
You could just use an existing character or non-character for this, e.g., U+303E or U+FFFF or U+FDEF or similar.
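
To make the quoted proposal concrete, here is a rough Python sketch of a private, non-conformant decoder that flags overlong UTF-8 sequences with a marker followed by the intended character, and falls back to U+FFFD for everything else that is malformed. The function name decode_flagging_overlong and the constants are made up for illustration, and U+303E is only a stand-in for the proposed SFIS. A conformant converter must not interpret overlong forms this way, so treat this as an illustration of the idea and nothing more.

```python
REPLACEMENT = "\uFFFD"
MARKER = "\u303E"  # assumption: stand-in for the proposed SFIS, not a standard usage

# Smallest code point that legitimately needs a sequence of each length.
MIN_FOR_LENGTH = {2: 0x80, 3: 0x800, 4: 0x10000}


def decode_flagging_overlong(data: bytes, marker: str = MARKER) -> str:
    """Decode UTF-8, emitting marker + intended char for overlong forms."""
    out = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:  # plain ASCII
            out.append(chr(lead))
            i += 1
            continue
        if 0xC0 <= lead <= 0xDF:
            length, cp = 2, lead & 0x1F
        elif 0xE0 <= lead <= 0xEF:
            length, cp = 3, lead & 0x0F
        elif 0xF0 <= lead <= 0xF7:
            length, cp = 4, lead & 0x07
        else:
            # Stray continuation byte or a byte that can never start a sequence.
            out.append(REPLACEMENT)
            i += 1
            continue
        tail = data[i + 1 : i + length]
        if len(tail) != length - 1 or any((b & 0xC0) != 0x80 for b in tail):
            # Truncated sequence or bad continuation byte: an illegal sequence.
            out.append(REPLACEMENT)
            i += 1
            continue
        for b in tail:
            cp = (cp << 6) | (b & 0x3F)
        if cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
            out.append(REPLACEMENT)       # out of range or surrogate
        elif cp < MIN_FOR_LENGTH[length]:
            out.append(marker + chr(cp))  # overlong: flag it, keep it readable
        else:
            out.append(chr(cp))
        i += length
    return "".join(out)
```

With this sketch, the overlong two-byte form C0 AF decodes to the marker followed by "/", while a lone byte such as FF still becomes U+FFFD, so the two kinds of error stay distinguishable in the output.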
markus
--
Opinions expressed here may not reflect my company's positions unless otherwise noted.