Re: UTF-8 Error Handling (was: Re: Unicode 4.0 BETA available for review)

Asmus Freytag Mon, 03 Mar 2003 12:32:44 -0800

At 11:52 AM 3/3/03 -0800, Mark Davis wrote:

Perhaps I wasn't clear; I agree with you on that.

1) It is conformant to skip or substitute text, with just a code at the end
indicating that something of that sort was done.

It's a subtle point, but can be put into your formulation:

What I was after is the case where the "substitution" itself isn't legal Unicode, e.g. an unpaired surrogate in UTF-32. My take is that, formally speaking, as long as there's an indication of an error condition, I'm free to put anything into the output buffer, even malformed Unicode, and still be conformant.

2) Or, if someone wants more flexibility, to stop at possible errors, and
give the client of the API information so that they can do more complex
processing.

Mark
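
For readers following the thread, the two options above might be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the names DecodeResult, DecodeStop, decode_lenient and decode_until_error are hypothetical and do not come from any API discussed here, and U+FFFD is just one possible substitution.

```python
from typing import NamedTuple, Optional


class DecodeResult(NamedTuple):
    text: str         # decoded text, possibly containing substitutions
    had_errors: bool  # the "code at the end" saying something was replaced


class DecodeStop(NamedTuple):
    text: str                    # well-formed prefix decoded so far
    error_offset: Optional[int]  # byte offset of the first bad sequence, or None


def decode_lenient(data: bytes) -> DecodeResult:
    """Option 1: substitute ill-formed sequences and report it afterwards."""
    try:
        return DecodeResult(data.decode("utf-8"), False)
    except UnicodeDecodeError:
        # Each ill-formed sequence becomes U+FFFD; the flag tells the caller.
        return DecodeResult(data.decode("utf-8", errors="replace"), True)


def decode_until_error(data: bytes) -> DecodeStop:
    """Option 2: stop at the first error and let the client decide."""
    try:
        return DecodeStop(data.decode("utf-8"), None)
    except UnicodeDecodeError as exc:
        # exc.start is the offset of the first offending byte.
        return DecodeStop(data[:exc.start].decode("utf-8"), exc.start)


if __name__ == "__main__":
    sample = b"ab\xffcd"  # 0xFF can never appear in well-formed UTF-8
    print(decode_lenient(sample))
    print(decode_until_error(sample))
```

With the second form, the caller can resume after error_offset, substitute its own replacement, or reject the input outright.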
