Re: [Standards] JID Escaping

Robin Redeker Fri, 20 Jul 2007 03:42:06 -0700

Hi!

I've read through the JID escaping XEP once again.  Compared to the last time
I read it, the escaping collisions are gone (now that the \5c exception is
gone).


I think it should be clarified that JID escaping should _only_ be used in
gateways which want to map outside strings to a JID node. I come to this
conclusion because it's not practical for regular IM clients to unescape a
JID, see below.

Besides gateways, it should only be used in an e-mail client that sends and
receives mail via XMPP. All other clients will run into a name-spoofing
problem (see below).

Section '2. Requirements' says:

   It MUST NOT be possible for clients to use this escaping mechanism to avoid
   the goal of stringprep; namely, that JIDs that look alike should have same
   character representation after being processed by stringprep.

The whole purpose of the mechanism described in XEP-0106 is to _avoid_ the goal
of stringprep (to be more exact, it avoids the goal of the nodeprep profile of
stringprep).

I read that paragraph as follows: JID escaping should not be used to let
users enter node parts for the JID which contain invalid characters (like &,
@, ...). It is only intended as a mapping between e.g. e-mail addresses and
JIDs for gateways.

If that is not true, and it actually is meant for using invalid characters in
the node part of a JID, and clients unescape _every_ JID they see, we will run
into problems (see below).

For escaping, the following rule has to be followed:

   * Note: The character sequence \20 MUST NOT be the first or last character
      of an escaped node identifier.

This can only be seen as advice to client authors not to allow spaces
at the beginning and end of names.

But please observe that JIDs like this are still valid:

   [EMAIL PROTECTED]

If a client now unescapes that JID, it will collide (visually) with
'[EMAIL PROTECTED]'.  There is no rule that forbids unescaping such JIDs,
and such a rule would make no sense, as it would break the display of
perfectly fine JIDs.
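
To illustrate the collision, here is a minimal Python sketch of a client that
unescapes every node it sees. The addresses are hypothetical stand-ins (the
real ones are elided above), unescape_node and display_jid are names made up
for this sketch, and it assumes the escape sequences use lowercase hex as in
the XEP's mapping table.

```python
import re

# Escape sequences from the XEP-0106 mapping table (lowercase hex assumed).
_UNESCAPE = {
    "20": " ", "22": '"', "26": "&", "27": "'", "2f": "/",
    "3a": ":", "3c": "<", "3e": ">", "40": "@", "5c": "\\",
}
_SEQ = re.compile(r"\\(20|22|26|27|2f|3a|3c|3e|40|5c)")


def unescape_node(node: str) -> str:
    """Replace every escape sequence in a single left-to-right pass."""
    return _SEQ.sub(lambda m: _UNESCAPE[m.group(1)], node)


def display_jid(node: str, domain: str) -> str:
    """What a client that unescapes *every* JID would render (hypothetical)."""
    return f"{unescape_node(node)}@{domain}"


# Hypothetical addresses: the first node starts with the sequence \20.
shown_a = display_jid("\\20romeo", "example.net")
shown_b = display_jid("romeo", "example.net")
print(repr(shown_a))
print(repr(shown_b))
```

The two JIDs are distinct on the wire, but the rendered strings differ only by
a leading space, which most roster views will not make visible.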

This is the main reason why escaping and unescaping MUST only be done for
gateway applications, and there only on the gateway side and the client side.

Unescaping should NOT apply to regular XMPP messaging in any form. When
receiving a message from a gateway, the client could of course check "is this
an e-mail gateway?" and then unescape the node part and display the source
e-mail address the message came from.

Regular clients, which don't implement any gateway special cases, should NOT
handle escaping at all.
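
A rough Python sketch of that policy, under stated assumptions: the client
already knows which domains are gateways (how it learns that is out of scope
here), and GATEWAYS, display_name and the addresses are hypothetical names
invented for the example. Escape sequences are assumed to use lowercase hex.

```python
import re
from typing import AbstractSet

_UNESCAPE = {
    "20": " ", "22": '"', "26": "&", "27": "'", "2f": "/",
    "3a": ":", "3c": "<", "3e": ">", "40": "@", "5c": "\\",
}
_SEQ = re.compile(r"\\(20|22|26|27|2f|3a|3c|3e|40|5c)")

# Hypothetical set of domains the client has identified as e-mail gateways.
GATEWAYS = frozenset({"smtp.example.net"})


def display_name(jid: str, gateway_domains: AbstractSet[str]) -> str:
    """Unescape the node only for JIDs hosted on a known gateway.

    Every other JID is returned exactly as received, so a regular
    contact's node is never rewritten for display.
    """
    bare, _, _resource = jid.partition("/")   # drop the resource, if any
    node, at, domain = bare.partition("@")
    if not at or domain not in gateway_domains:
        return jid
    # For a gateway contact, show the foreign address the node encodes.
    return _SEQ.sub(lambda m: _UNESCAPE[m.group(1)], node)


print(display_name("juliet\\40example.org@smtp.example.net", GATEWAYS))
print(display_name("\\20romeo@example.net", GATEWAYS))
```

The first call shows the mapped e-mail address; the second leaves the JID
untouched because example.net is not in the gateway set.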

Please also note that the example '5.1 Jabber Identifiers' in the XEP
is misleading:

#  User Input                     Escaped JID        Client Display
1  space [EMAIL PROTECTED]        [EMAIL PROTECTED]  space [EMAIL PROTECTED]
2  call me "ishmael"@example.com  [EMAIL PROTECTED]  call me "ishmael"@example.com

A user CAN'T input a broken JID: the client can't parse broken JIDs like
that, or should at least scream out loud if a user tries to enter one.
For escaping, the user should get a form like this:

   Node:     space cadet
   Domain:   example.com
   Resource:

The example is misleading because client authors could think that they
should implement a heuristic to parse broken JIDs and escape them if they
are not valid JIDs.
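
The form-based input suggested above could be sketched in Python as follows.
escape_node and build_jid are hypothetical helpers; the character mapping is
the one from the XEP's table, while the backslash handling (escape it only
when it would otherwise read as an escape sequence) is an assumption that
should be checked against the XEP version in use. The domain and resource are
not validated here.

```python
import re

_ESCAPE = {
    " ": "\\20", '"': "\\22", "&": "\\26", "'": "\\27", "/": "\\2f",
    ":": "\\3a", "<": "\\3c", ">": "\\3e", "@": "\\40",
}
# Match a disallowed character, or a backslash that starts an escape sequence.
_PATTERN = re.compile(r"""[ "&'/:<>@]|\\(?=20|22|26|27|2f|3a|3c|3e|40|5c)""")


def escape_node(node: str) -> str:
    """Escape the text typed into the Node field of the form."""
    if node != node.strip(" "):
        # \20 must not be the first or last sequence of an escaped node.
        raise ValueError("node must not start or end with a space")
    return _PATTERN.sub(lambda m: _ESCAPE.get(m.group(0), "\\5c"), node)


def build_jid(node: str, domain: str, resource: str = "") -> str:
    """Assemble a JID from the three separate form fields."""
    jid = f"{escape_node(node)}@{domain}" if node else domain
    return f"{jid}/{resource}" if resource else jid


print(build_jid("space cadet", "example.com"))
```

Because the fields arrive separately, the client never has to guess where the
node ends in a string that is not a valid JID.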

I think the whole XEP should be renamed to something like:

   XEP-0106 - JID Mapping for Gateways



Maybe I got everything wrong, but this is the only way I am able to
make sense of that XEP. A true escaping mechanism must also be understood
by the server, and servers would then have to unescape JIDs to compare them.



Robin
