Re: [Standards] summary: allowable characters
Robin Redeker wrote:
> On Fri, Aug 03, 2007 at 04:29:15AM +0530, Mridul Muralidharan wrote:
>> Just mentioning a basic problem which was discussed at jdev.
>> If two 1.0 servers move to 1.1, all the 'older' 1.0 JIDs will become unroutable - which are present in user roster/affiliations/privacy lists/etc.
>
> Yes, this sounds like the death blow for escaping for backward compatibility. It will poison the old 1.0 servers and make whole roster subscriptions unusable once that server upgrades to 1.1. (Not to mention the JIDs in the private XML storage or other places you mentioned.)
>
> Do you see any problem in just not allowing incompatible 1.1 JIDs to communicate with 1.0 JIDs? The old 1.0-compatible JID accounts on a 1.1 server will of course still be able to talk with people on 1.0 servers.

The problem is that 1.1 JIDs can't communicate with 1.1 contact JIDs - if a user has [EMAIL PROTECTED], what will the 1.1 server do? It could be pointing to a 1.1 [EMAIL PROTECTED] (route as-is), it could have been a 1.0 JID that must be converted to cont[EMAIL PROTECTED] (needs transformation), or it could continue to be a 1.0 [EMAIL PROTECTED] (route as-is). All three are different cases, though 1 and 3 look the same.

> The network won't be split the day servers start speaking XMPP 1.1. By preventing people whose JIDs contain incompatible characters from speaking with 1.0 servers, the 1.1 servers can prevent that split.

Existing data will be present - and without JID meta-data, we can't associate encoding info.

One possible option would be to move to a URI scheme for JIDs (and so this could be the differentiator for 1.1 vs 1.0). More importantly, it would help in case of interop with other protocols. Last time I brought this up, it was considered a bit too disruptive, and so dropped :-) Since Peter was considering 1.1 of XMPP, maybe this would be a good time to rethink this idea!

Regards,
Mridul

> The 1.1-1.0 gap will grow with people who want to use the new characters in their JID, and hopefully the server administrators also upgrade their servers at the same speed that these people come.
>
> Clients would also have to take care whether they speak to a 1.0 or 1.1 server. A client error message like "your server doesn't support these characters in the JID, convince the admin to upgrade!" will maybe even raise the pressure on admins a bit to upgrade :-)
>
> The problem I see with forcing admins to upgrade is that they may be forced to upgrade to an unstable version, or a less stable version than they had before.
>
> Robin
[Standards] summary: allowable characters
OK, we have had a long long discussion thread about JID Escaping and nodeprep and allowable characters in JIDs etc. Here I summarize the discussion and draw some conclusions for those of you who have checked out. :)

1. Support for XEP-0106: JID Escaping (i.e., mapping of ' to \27 etc.) is needed only in certain specialized deployment situations -- mainly when an organization wants to reuse existing userids (e.g., email addresses) or gateway to other messaging systems.

2. We needed to define JID escaping because version 1 of nodeprep (see RFC 3920) prohibits including the characters SP " & ' / : < > @ in XMPP node identifiers. (See also XEP-0029.)

3. IIRC we prohibited some of those characters because very early Jabber clients didn't properly escape things like " & ' < > in XMPP 'to' and 'from' addresses.

4. One solution would be to define version 2 of nodeprep in rfc3920bis. As far as I can see, nodeprep2 would allow " & ' < > since those can be escaped in XML (e.g., XMPP 'to' address) as the predefined entities &quot; &amp; &apos; &lt; &gt;. I'm not sure why : was prohibited in the first place so that would be allowed. I suppose / was prohibited because it's used later in a full JID to differentiate the resource identifier, but in a node identifier I don't think it would be confusing so that would be allowed. Clearly we can't allow @ because we use that character as a separator between the node identifier and the domain identifier.

So nodeprep2 would be the same as nodeprep1 except that it would disallow only the at-sign (@). (Naturally we can discuss this further...)

As to how it is discovered that a server supports nodeprep2, I will post a separate message about that.

Peter

--
Peter Saint-Andre
https://stpeter.im/
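For readers who have not looked at XEP-0106 lately, the escaping mentioned in point 1 can be sketched roughly as follows. This is only an illustration: the function name `escape_node` is made up for this example, and the backslash handling is a simplified reading of the spec (a backslash is escaped only when it would otherwise be read as the start of an escape sequence). XEP-0106 itself is the normative reference, including its rules about leading and trailing spaces, which this sketch ignores.

```python
# Character-to-sequence table as listed in XEP-0106 (JID Escaping).
XEP0106_ESCAPES = {
    " ": "\\20",
    '"': "\\22",
    "&": "\\26",
    "'": "\\27",
    "/": "\\2f",
    ":": "\\3a",
    "<": "\\3c",
    ">": "\\3e",
    "@": "\\40",
    "\\": "\\5c",
}

# The set of three-character escape sequences, e.g. "\20" or "\5c".
_SEQUENCES = set(XEP0106_ESCAPES.values())


def escape_node(node: str) -> str:
    """Escape a node identifier (hypothetical helper, simplified rules)."""
    out = []
    for i, ch in enumerate(node):
        if ch == "\\":
            # Assumption of this sketch: only escape a backslash that would
            # otherwise look like the start of an escape sequence.
            if node[i:i + 3].lower() in _SEQUENCES:
                out.append("\\5c")
            else:
                out.append(ch)
        else:
            out.append(XEP0106_ESCAPES.get(ch, ch))
    return "".join(out)
```

With this sketch, a node such as `o'hara` (an invented example) becomes `o\27hara`, which nodeprep version 1 accepts.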
Re: [Standards] summary: allowable characters
Hello

On Thu, Aug 02, 2007 at 11:40:25AM -0600, Peter Saint-Andre wrote:
> Clearly we can't allow @ because we use that character as a separator between the node identifier and the domain identifier.

An email address can contain @ in the username part - the separator is the last @ in the address. But emails don't have resources. We would have to decide which is more valuable to allow in the node part - one of them cannot be allowed.

Just wanted to point out it is theoretically possible to have @ in the node too.

--
Anything is possible, unless it's not.

Michal 'vorner' Vaner
Re: [Standards] summary: allowable characters
Michal 'vorner' Vaner wrote:
> Hello
>
> On Thu, Aug 02, 2007 at 11:40:25AM -0600, Peter Saint-Andre wrote:
>> Clearly we can't allow @ because we use that character as a separator between the node identifier and the domain identifier.
>
> Email address can contain @ in the username part - the identifier is the last @ in the address.

Really? I don't see that in RFC 2822, but I'm not fluent in ABNF. :)

> But Emails don't have resources. We would have to decide which is more valuable to have in the node part allowed - one of them can not be allowed.

I'd vote against @ and say that a node could include /.

/psa
Re: [Standards] summary: allowable characters
On 8/2/07, Peter Saint-Andre [EMAIL PROTECTED] wrote:
> What specifically breaks? Does it depend on which characters would be allowed in nodeprep2? I agree that / and @ are problematic, but the characters ' seem less so. But I may be missing something.

I believe this section was left over from the original pre-RFC specification, which was attempting to fit a JID into standardized URI notation. That notation specifically defines allowable characters, reserved characters, and characters which must be escaped. See RFC 2396. The section which deals with those characters is:

   The angle-bracket "<" and ">" and double-quote (") characters are excluded because they are often used as the delimiters around URI in text documents and protocol fields. The character "#" is excluded because it is used to delimit a URI from a fragment identifier in URI references (Section 4). The percent character "%" is excluded because it is used for the encoding of escaped characters.

   delims = "<" | ">" | "#" | "%" | <">

I'll go jump down in a hole again. :-)

--
-- Thomas
Re: [Standards] summary: allowable characters
Thomas Charron wrote:
> On 8/2/07, Peter Saint-Andre [EMAIL PROTECTED] wrote:
>> What specifically breaks? Does it depend on which characters would be allowed in nodeprep2? I agree that / and @ are problematic, but the characters ' seem less so. But I may be missing something.
>
> I believe this section was a left over from the original pre-rfc specification which was attempting to fit a JID into standardized URI notation, which specifically explains allowable characters, reserved characters, and characters which must be escaped. See rfc 2396. The section which deals with those characters is:
>
>    The angle-bracket "<" and ">" and double-quote (") characters are excluded because they are often used as the delimiters around URI in text documents and protocol fields. The character "#" is excluded because it is used to delimit a URI from a fragment identifier in URI references (Section 4). The percent character "%" is excluded because it is used for the encoding of escaped characters.
>
>    delims = "<" | ">" | "#" | "%" | <">

Yeah, but a JID isn't a URI and never will be, that's why I went through all the trouble of writing RFC 4622. But I agree with you that similar reasoning led to exclusion of those characters (and ' too).

> I'll go jump down in a hole again. :-)

Oh don't, we cherish your occasional visits. :)

Peter

--
Peter Saint-Andre
https://stpeter.im/
Re: [Standards] summary: allowable characters
Hi Peter,

Peter Saint-Andre wrote:
> Mridul Muralidharan wrote:
>> Peter Saint-Andre wrote:
>>> 4. One solution would be to define version 2 of nodeprep in rfc3920bis. As far as I can see, nodeprep2 would allow " & ' < > since those can be escaped in XML (e.g., XMPP 'to' address) as the predefined entities &quot; &amp; &apos; &lt; &gt;. I'm not sure why : was prohibited in the first place so that would be allowed. I suppose / was prohibited because it's used later in a full JID to differentiate the resource identifier, but in a node identifier I don't think it would be confusing so that would be allowed.
>>
>> user/[EMAIL PROTECTED] and domain/[EMAIL PROTECTED] can't be differentiated if / is allowed.
>
> Interesting, I think you're right. Consider foo.com/[EMAIL PROTECTED], it could be the bare JID of a user foo.com/bar at jabber.org or a domain of foo.com with a resource of [EMAIL PROTECTED]. Not good.
>
>> Btw, changing nodeprep now will cause quite a lot of problems with existing deployments - since the contact JIDs are part of the user data - and would pretty much mean we can't adopt the bis spec.
>
> What specifically breaks? Does it depend on which characters would be allowed in nodeprep2? I agree that / and @ are problematic, but the characters ' seem less so. But I may be missing something.

The problem essentially is that any place where we have a JID persisted in the backend (user roster, ACLs, affiliations, privacy lists/block lists, etc.) is affected by an incompatible change. For example, what used to be [EMAIL PROTECTED] will now become contact[EMAIL PROTECTED] - causing incompatibilities.

Regards,
Mridul

>> The number of deployments with these use cases is not as specialized as it might seem.
>
> I agree with that. Which is why I stand by XEP-0106. In part I think that those who are so opposed to XEP-0106 are not familiar with the deployment issues. But I agree that XEP-0106 needs to be clarified in the ways we discussed recently. It's on my list to complete those clarifications and post an interim version.
>
> /psa
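The ambiguity discussed above (a '/' in the node part versus the '/' that introduces a resource) can be made concrete with a small sketch. The function `candidate_parses` is hypothetical and exists only for this illustration. It lists every (node, domain, resource) reading that a parser would have to choose between if '/' were allowed in node identifiers. The address used in the usage note is built from the description in the text ("user foo.com/bar at jabber.org").

```python
def candidate_parses(jid: str):
    """Return all plausible (node, domain, resource) readings of `jid`
    under the assumption that '/' may appear in the node part."""
    parses = []

    # Reading A (today's rule): the first '/' starts the resource,
    # and whatever precedes it is split at '@' into node and domain.
    bare, slash, resource = jid.partition("/")
    node, _, domain = bare.rpartition("@")
    parses.append((node or None, domain, resource if slash else None))

    # Reading B: the first '@' ends the node, so a '/' before it
    # belongs to the node identifier rather than starting a resource.
    if "@" in jid:
        node_b, _, rest = jid.partition("@")
        domain_b, slash_b, resource_b = rest.partition("/")
        if "/" in node_b:
            parses.append((node_b, domain_b, resource_b if slash_b else None))

    return parses
```

For `foo.com/bar@jabber.org` this returns two readings, one with domain `foo.com` and resource `bar@jabber.org`, and one with node `foo.com/bar` and domain `jabber.org`. An address with no '/' in front of the '@' yields a single reading.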
Re: [Standards] summary: allowable characters
Mridul Muralidharan wrote:
> Peter Saint-Andre wrote:
>> Mridul Muralidharan wrote:
>>> Btw, changing nodeprep now will cause quite a lot of problem with existing deployments - since the contact jid's are part of the user data - and would pretty much mean we cant adopt bis spec.
>>
>> What specifically breaks? Does it depend on which characters would be allowed in nodeprep2? I agree that / and @ are problematic, but the characters ' seem less so. But I may be missing something.
>
> The problem essentially is that any place where we have a JID persisted in the backend (user roster, acl's, affiliations, privacy lists/block lists, etc), it will become incompatible change. For example, what used to be [EMAIL PROTECTED] will now become contact[EMAIL PROTECTED] - causing incompatibilities.

Well we're having a long discussion about this in the jdev room right now:

http://www.jabber.org/muc-logs/[EMAIL PROTECTED]/2007-08-02.html

I volunteer elmex for posting a summary once we're done. :)

/psa
Re: [Standards] summary: allowable characters
Peter Saint-Andre wrote:
> Well we're having a long discussion about this in the jdev room right now:
>
> http://www.jabber.org/muc-logs/[EMAIL PROTECTED]/2007-08-02.html

I just read the log. Sounds good and is how I intended/proposed that it would work:

- Escaping JIDs when sending to a server that does not support the eXtended iIDs (are these XIDs then? *g*)
- Not doing unescaping when a JID is displayed.

Matthias
Re: [Standards] summary: allowable characters
(Warning, long mail ahead! Get a coffee and some time first :-)

On Thu, Aug 02, 2007 at 03:34:30PM -0600, Peter Saint-Andre wrote:
> Mridul Muralidharan wrote:
>> The problem essentially is that any place where we have a JID persisted in the backend (user roster, acl's, affiliations, privacy lists/block lists, etc), it will become incompatible change. For example, what used to be [EMAIL PROTECTED] will now become contact[EMAIL PROTECTED] - causing incompatibilities.
>
> Well we're having a long discussion about this in the jdev room right now:
>
> http://www.jabber.org/muc-logs/[EMAIL PROTECTED]/2007-08-02.html
>
> I volunteer elmex for posting a summary once we're done. :)

Yes, basically Mridul is completely right, we can't do much about the already deployed backslashes in JIDs. Especially in 1.0 server rosters. But...

First we have to wonder whether there are actually people with [EMAIL PROTECTED] in their roster, as registering a JID with a \ in the username is a considerable problem with XMPP 1.0 servers with SASL and DIGEST-MD5 (see some older message from me in the JID escaping thread). Of course that should be further investigated, as old-style IQ auth works with [EMAIL PROTECTED] and also some jabberd2 servers allow authentication as [EMAIL PROTECTED] without problems.

But there exists a possibility to migrate our old JIDs to the 1.1 world while staying interoperable with 1.0 servers:

First: A 1.1 server that is going to communicate with a 1.0 server will escape the JIDs from his userbase when he SENDS to a 1.0 entity. Escaping can be performed as described in XEP-0106 (after dropping the silly \20 escaping rule).

That will work great if the 1.1 server has NO old userbase.

Suppose we have, for example, jabber.org with a large userbase, and [EMAIL PROTECTED] is actually a registered user there. If we want to upgrade to a 1.1 server, we will run into the problems Mridul pointed out: 1.0 servers have [EMAIL PROTECTED] in their rosters, and if 'stpeter @jabber.org' now registers a new account he will collide with that, because his JID will be escaped to the [EMAIL PROTECTED] that already exists in the 1.0 servers' rosters. Bang, we got a collision.

There exists no real easy way to prevent that except just not allowing 'stpeter @jabber.org' to register. To detect a case like this, where a new user with a colliding JID registers, the 1.1 server needs to keep track of the old JIDs in his database.

If the 1.1 server knows that [EMAIL PROTECTED] is a JID from the pre-1.1 times, he can assume that [EMAIL PROTECTED] is already in some rosters out there. So he MUST NOT allow anyone who might collide with that to register at jabber.org after the migration to 1.1.

So when upgrading, jabber.org could just mark all JIDs with a \ in their name as pre-1.1 JIDs and disallow anyone to register who might collide with one of the registered JIDs.

This way ' [EMAIL PROTECTED]' can register if no '[EMAIL PROTECTED]' existed before (he knows that from his database with the marks of old JIDs).

If ' [EMAIL PROTECTED]' now wants to talk with '[EMAIL PROTECTED]', it would look like this:

   <message from= [EMAIL PROTECTED] to=[EMAIL PROTECTED] />

As jabber.org (1.1) knows that chrome.pl (1.0) is in fact 1.0, he escapes like XEP-0106 recommends and actually sends:

   <message from=[EMAIL PROTECTED] to=[EMAIL PROTECTED] />

[EMAIL PROTECTED]'s client will now pop up a message from [EMAIL PROTECTED], and except for a somewhat weird JID he can talk with him. Because if he sends a message back:

   <message from=[EMAIL PROTECTED] to=[EMAIL PROTECTED] />

then jabber.org will unescape the to-field and deliver the message to ' [EMAIL PROTECTED]'.

Of course this solution is not a perfect one for the end-users, as I will describe below, but I argue that the incompatibilities will increase the pressure a bit on developers and on administrators to adopt XMPP 1.1. And thus that might speed up the migration while providing a compatibility workaround for maybe 98-99% of the cases, or maybe even 99,% (this needs to be investigated a bit IMO, maybe my assumptions are completely wrong).

So much for the server-to-server interoperability. Now about 1.1 clients and 1.0 clients.

1.0 clients will have no way to reach ' [EMAIL PROTECTED]', which is fine: either the user knows that guy's JID needs to be escaped because he uses an old client, or he has to upgrade to a client with 1.1 capabilities (what this means is described below).

Not being able to send a message to ' [EMAIL PROTECTED]' will increase the pressure on the client developers as stated above.

So 1.0 clients are basically out of luck if the user doesn't know how to escape; however, tell 'em: get a new client. Of course it's blunt to say that, but I guess we can assume that not our WHOLE old userbase without spaces and all those fancy characters in their JID are NOT GOING TO signup a new account. So the users with spaces and or whatever in
Re: [Standards] summary: allowable characters
Matthias Wimmer wrote:
> Peter Saint-Andre wrote:
>> Well we're having a long discussion about this in the jdev room right now:
>>
>> http://www.jabber.org/muc-logs/[EMAIL PROTECTED]/2007-08-02.html
>
> I just read the log. Sounds good and is how I intended/proposed that it would work:
>
> - Escaping JIDs when sending to a server that does not support the eXtended iIDs (are these XIDs then? *g*)

That's good. :)

> - Not doing unescaping, when a JID is displayed.

Right.

Peter

--
Peter Saint-Andre
https://stpeter.im/
Re: [Standards] summary: allowable characters
On Fri, 03-08-2007 at 00:28 +0200, Robin Redeker wrote:
> There exists no real easy way to prevent that except just not allowing 'stpeter @jabber.org' to register. To detect a case like this, that a new user with a colliding JID registers, the 1.1 server needs to keep track of the old JIDs in his database.

There is really no difference whether you're colliding with a 1.0 username or a 1.1 one. From the connecting 1.0 server's perspective it does not matter. So on the 1.1 server you just unescape the requested username and check for a collision. If there is one, you deny the request.

As for the whole idea: the escaping and unescaping is going to happen completely behind the scenes, on both s2s and c2s connections, and always on the 1.1 side. It is up to the 1.1 endpoint (no matter if it is a server or a client) to escape the data it sends and unescape the data it receives when it talks with 1.0.

Well...

- We have unchanged 1.0 servers
- We have unchanged 1.0 clients (its users need to escape manually)
- We have interoperability 1.0 - 1.1
- We have all possible characters in the nodepart of the JID

Looks good to me.

--
Tomasz Sterna
Xiaoka Grp. http://www.xiaoka.com/
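The registration check described above ("unescape the requested username and check for collision") might look something like the sketch below. The helper names `unescape_node` and `may_register` are invented for this example, and the unescaping is a simplified reading of XEP-0106 that treats every listed sequence as escaped. The usage note assumes that the redacted address in the thread was the escaped form of 'stpeter ' (with a trailing space); that is a guess for illustration, not something the thread confirms.

```python
# Sequence-to-character table, the inverse of the XEP-0106 escape list.
XEP0106_UNESCAPES = {
    "\\20": " ",
    "\\22": '"',
    "\\26": "&",
    "\\27": "'",
    "\\2f": "/",
    "\\3a": ":",
    "\\3c": "<",
    "\\3e": ">",
    "\\40": "@",
    "\\5c": "\\",
}


def unescape_node(node: str) -> str:
    """Replace each recognised escape sequence with its character."""
    out = []
    i = 0
    while i < len(node):
        seq = node[i:i + 3].lower()
        if seq in XEP0106_UNESCAPES:
            out.append(XEP0106_UNESCAPES[seq])
            i += 3
        else:
            out.append(node[i])
            i += 1
    return "".join(out)


def may_register(requested: str, existing_nodes) -> bool:
    """Deny registration if the requested node, once unescaped, equals the
    unescaped form of any node already registered on this server."""
    wanted = unescape_node(requested)
    return all(unescape_node(node) != wanted for node in existing_nodes)
```

With an existing account `stpeter\20`, a request for `stpeter ` (trailing space) is refused because both unescape to the same string, while a request for plain `stpeter` is accepted.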
Re: [Standards] summary: allowable characters
Just mentioning a basic problem which was discussed at jdev. If two 1.0 servers move to 1.1, all the 'older' 1.0 JIDs will become unroutable - which are present in user roster/affiliations/privacy lists/etc.

Regards,
Mridul