The question David Hollingsworth raised earlier does not seem to have received any responses on this list, so I've pasted the text of his email below. I would also like clarification on why UTF-8 in Unicode 3.1 only forbids conformant implementations from interpreting non-shortest forms for BMP characters, and does not forbid interpretation of all irregular sequences for all characters.
___
Date: 5 Oct 2001 18:23:58 -0000
From: "David E. Hollingsworth" <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Subject: Handling irregular sequences

The definition of UTF-32 (and the modifications to UTF-8 for Unicode 3.1) make it clear that conformant processes shall not generate irregular sequences. However, they do not (and perhaps they shouldn't) indicate what a process should do when encountering an irregular sequence, and I'm curious what people are doing in practice.

One could apply the traditional Internet aphorism of being liberal in what one accepts, but that didn't pan out so well for non-shortest-form UTF-8, so in addition to wondering what people are doing in practice, I'm also curious about the following theoretical issue:

It doesn't seem very likely to me that someone would write a security check that depends on, say, passing Deseret code points but blocking musical notation code points; however, I wouldn't say it's impossible; moreover, a security check that wants to disallow all non-BMP characters doesn't seem quite so outlandish.

If someone did write such a check, it seems to me that the attack described in UAX #27 would apply, by substituting "irregular sequence" for "non-shortest form":

Process A performs security checks, but does not check for irregular sequences.

Process B accepts the byte sequence from process A, and transforms it into UTF-16 while interpreting irregular sequences.

The UTF-16 text may then contain characters that should have been filtered out by process A.

Even if I'm mistaken about this, is there a specific argument *for* accepting irregular sequences?

--deh!
___

Bernard
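
To make the scenario in the quoted message concrete, here is a minimal Python sketch of the Process A / Process B interaction. It is only an illustration under stated assumptions: the function names are hypothetical, Process A is assumed to block non-BMP characters by looking for 4-byte lead bytes, and Process B is assumed to be a lenient decoder, modelled here with Python's "surrogatepass" error handler. It does not describe any particular real implementation.

```python
# Sketch only: encode_irregular, process_a_allows and process_b_to_utf16
# are hypothetical names invented for this illustration.

def encode_irregular(cp: int) -> bytes:
    """Encode a supplementary code point as two 3-byte sequences, one per
    UTF-16 surrogate (the 6-byte 'irregular' form)."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    v = cp - 0x10000
    high = 0xD800 + (v >> 10)      # high (leading) surrogate
    low = 0xDC00 + (v & 0x3FF)     # low (trailing) surrogate

    def three_bytes(u: int) -> bytes:
        return bytes([0xE0 | (u >> 12),
                      0x80 | ((u >> 6) & 0x3F),
                      0x80 | (u & 0x3F)])

    return three_bytes(high) + three_bytes(low)


def process_a_allows(data: bytes) -> bool:
    """Assumed security check: reject all non-BMP characters by rejecting
    any 4-byte lead byte. It does not check for irregular sequences."""
    return not any(b >= 0xF0 for b in data)


def process_b_to_utf16(data: bytes) -> bytes:
    """Assumed lenient consumer: interprets irregular sequences while
    converting to UTF-16 (big-endian, no BOM)."""
    text = data.decode("utf-8", "surrogatepass")
    return text.encode("utf-16-be", "surrogatepass")


G_CLEF = 0x1D11E                           # a musical notation code point
regular = chr(G_CLEF).encode("utf-8")      # 4-byte form: F0 9D 84 9E
irregular = encode_irregular(G_CLEF)       # 6-byte form: ED A0 B4 ED B4 9E

print(process_a_allows(regular))           # the check catches the regular form
print(process_a_allows(irregular))         # the irregular form gets through
print(process_b_to_utf16(irregular).hex()) # surrogate pair for the same character
```

Under these assumptions the irregular form passes Process A and still comes out of Process B as the surrogate pair for the musical notation character. A consumer that refuses irregular sequences, as Python's default strict UTF-8 decoder does, would raise an error on the same input rather than producing the character.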