Norman Rasmussen wrote:
XML defines the list of valid characters to be:
   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
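
For illustration, the production above can be written as a simple predicate. This is only a sketch in Python; the names is_xml10_char and find_invalid are made up for this example and do not come from any XMPP library.

```python
def is_xml10_char(ch):
    """Return True if the single character ch matches the XML 1.0 Char production."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def find_invalid(text):
    """List (index, code point) pairs for every character XML 1.0 does not allow."""
    return [
        (i, "U+%04X" % ord(ch))
        for i, ch in enumerate(text)
        if not is_xml10_char(ch)
    ]


if __name__ == "__main__":
    # A form feed, as might arrive via clipboard paste, is outside the range.
    print(find_invalid("page one\x0cpage two"))
```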

Which of the following should an IM application perform if the user
(attempts to) enter characters outside of this range?

What might the user enter outside this range? I doubt that a user could accidentally enter characters outside this range.

1) Reject the entry at the UI level - have to check both keypresses,
and clipboard paste
2) UI should filter invalid chars before sending data to xmpp object layer

I'd check for invalid characters in the backend methods that take data from the UI into the application backend. I would not filter, though: I'd reject any function/method call whose data contains invalid characters.

This lets you reuse the check if your UI changes, and it keeps the backend, which will most likely represent the data in an XML-DOM-like manner, in a state where only characters allowed by XML are present.
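
A rough sketch of what I mean by rejecting at the backend boundary, in Python. The class MessageBackend, its method queue_message and the exception InvalidXmlCharError are hypothetical names for this example only; a real backend would look different.

```python
import re

# Matches the first code point that falls outside the XML 1.0 Char production.
_INVALID_XML10 = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


class InvalidXmlCharError(ValueError):
    """Raised when data handed to the backend contains a non-XML character."""


class MessageBackend:
    """Hypothetical backend: the UI calls queue_message() with user input."""

    def __init__(self):
        self.outbox = []

    def queue_message(self, body):
        match = _INVALID_XML10.search(body)
        if match:
            # Reject the whole call instead of silently dropping characters.
            raise InvalidXmlCharError(
                "character U+%04X at index %d is not allowed in XML 1.0"
                % (ord(match.group()), match.start())
            )
        self.outbox.append(body)


if __name__ == "__main__":
    backend = MessageBackend()
    backend.queue_message("hello\n")
    try:
        backend.queue_message("bad\x00input")
    except InvalidXmlCharError as err:
        print("rejected:", err)
```

The UI can catch the exception and decide how to tell the user, while the backend never stores data it could not serialize.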

3) xmpp object layer should filter/reject data
4) xmpp stream layer should filter/reject xmpp object

An alternative way to handle the characters #x0 - #x1F (excluding #x9, #xA and #xD) is to substitute them with the corresponding characters from #x2400 - #x241F (the Unicode Control Pictures block).

... or you could use XML 1.1, where the set of allowed characters is less restrictive:
   Char ::= [#x1-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
   /* any Unicode character, excluding the surrogate blocks, FFFE, and FFFF. */
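
To see what the looser production buys you, here is a small Python comparison of the two Char productions. The function names are made up for this example. It only checks the Char production itself; as far as I know XML 1.1 additionally requires most of the newly allowed control characters to be written as character references, which this sketch does not model.

```python
def is_xml10_char(ch):
    """XML 1.0 Char production."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def is_xml11_char(ch):
    """XML 1.1 Char production."""
    cp = ord(ch)
    return (
        0x1 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


# Control characters below #x20 that only XML 1.1 accepts.
newly_allowed = [
    cp
    for cp in range(0x20)
    if is_xml11_char(chr(cp)) and not is_xml10_char(chr(cp))
]

if __name__ == "__main__":
    print(["#x%X" % cp for cp in newly_allowed])
    # #x0 stays forbidden under both versions.
    print(is_xml11_char("\x00"))
```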

But I am not sure whether XMPP allows the use of XML 1.1. I could not find anything about that on a first look at RFC 3920 / RFC 3920bis. It seems to be undefined.


Matthias
