Re: [Standards] summary: allowable characters

2007-08-04 Thread Mridul Muralidharan

Robin Redeker wrote:

On Fri, Aug 03, 2007 at 04:29:15AM +0530, Mridul Muralidharan wrote:


Just mentioning a basic problem which was discussed at jdev.

If two 1.0 servers move to 1.1, all the 'older' 1.0 JIDs that are present
in user rosters, affiliations, privacy lists, etc. will become unroutable.




Yes, this sounds like the death blow for escaping for backward
compatibility. It will poison the old 1.0 servers and make whole roster
subscriptions unusable once that server upgrades to 1.1. (Not to mention
the JIDs in the private XML storage or other places you mentioned).

Do you see any problem in just disallowing incompatible 1.1 JIDs to be
able to communicate with 1.0 JIDs? The old 1.0-compatible JID accounts
on a 1.1 server will of course still be able to talk with people on 1.0
servers.


The problem is that 1.1 JIDs can't reliably communicate with 1.1 contact
JIDs. If a user has [EMAIL PROTECTED] in the roster, what will the 1.1
server do? The entry could (1) point to a 1.1 [EMAIL PROTECTED] (route
as-is), (2) have been a 1.0 JID that must be converted to
cont[EMAIL PROTECTED] (needs transformation), or (3) still be a 1.0
[EMAIL PROTECTED] (route as-is). These are three different cases, though
1 and 3 look the same.




The network won't be split the day servers start speaking XMPP 1.1.
By preventing people whose JIDs contain incompatible characters from
speaking with 1.0 servers, the 1.1 servers can prevent that split.


Existing data will still be present, and without JID metadata we can't
associate encoding info with it.
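
The metadata point can be made concrete with a small sketch. It assumes a hypothetical per-entry tag recording which protocol version a contact JID was stored under, plus knowledge of whether the contact's server now speaks 1.1; neither exists in deployed servers, and the function name and return strings are invented here. It is only one reading of the three cases above.

```python
from typing import Optional

def routing_decision(stored_under: Optional[str], peer_speaks_1_1: bool) -> str:
    """Decide how to treat a persisted contact JID.

    stored_under is a hypothetical tag: "1.0", "1.1", or None when the
    stored entry carries no metadata (the situation described above).
    """
    if stored_under == "1.1":
        return "route as-is"      # case 1: already a 1.1 JID
    if stored_under == "1.0":
        if peer_speaks_1_1:
            return "transform"    # case 2: contact's server has upgraded
        return "route as-is"      # case 3: contact is still on a 1.0 server
    return "ambiguous"            # no metadata: cases 1-3 cannot be told apart

print(routing_decision("1.0", True))
print(routing_decision(None, True))
```

Without the tag, every existing entry falls into the last branch, which is the core of the objection.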


One possible option would be to move to a URI scheme for JIDs (so the
scheme could be the differentiator between 1.1 and 1.0).

More importantly, it would help in case of interop with other protocols.

Last time I brought this up, it was considered a bit too disruptive, and 
so dropped :-) Since Peter was considering 1.1 of xmpp, maybe this would 
be a good time to rethink this idea !



Regards,
Mridul



The 1.1-1.0 gap will grow with people who want to use the new
characters in their JID, and hopefully the server administrators also
upgrade their servers at the same speed that these people come.

Clients would also have to take care whether they speak to a 1.0 or 1.1
server. A client error message like: your server doesn't support these
characters in the JID, convince the admin to upgrade! will maybe even
raise the pressure for admins a bit to upgrade :-)

The problem I see with forcing admins to upgrade is that they may be
forced to upgrade to an unstable version, or at least to one less stable
than the version they had before.



Robin





[Standards] summary: allowable characters

2007-08-02 Thread Peter Saint-Andre
OK, we have had a long long discussion thread about JID Escaping and
nodeprep and allowable characters in JIDs etc. Here I summarize the
discussion and draw some conclusions for those of you who have checked
out. :)

1. Support for XEP-0106: JID Escaping (i.e., mapping of ' to \27 etc.)
is needed only in certain specialized deployment situations -- mainly
when an organization wants to reuse existing userids (e.g., email
addresses) or gateway to other messaging systems.

2. We needed to define JID escaping because version 1 of nodeprep (see
RFC 3920) prohibits including the characters SP " & ' / : < > @ in XMPP
node identifiers. (See also XEP-0029.)

3. IIRC we prohibited some of those characters because very early Jabber
clients didn't properly escape things like " & ' < > in XMPP 'to' and
'from' addresses.

4. One solution would be to define version 2 of nodeprep in rfc3920bis.
As far as I can see, nodeprep2 would allow " & ' < > since those can be
escaped in XML (e.g., XMPP 'to' address) as the predefined entities
&quot; &amp; &apos; &lt; &gt;. I'm not sure why : was prohibited in the
first place so that would be allowed. I suppose / was prohibited because
it's used later in a full JID to differentiate the resource identifier,
but in a node identifier I don't think it would be confusing so that
would be allowed. Clearly we can't allow @ because we use that character
as a separator between the node identifier and the domain identifier. So
nodeprep2 would be the same as nodeprep1 except that it would disallow
only the at-sign (@). (Naturally we can discuss this further...) As to
how it is discovered that a server supports nodeprep2, I will post a
separate message about that.
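
As an illustration of point 4, the characters " & ' < > survive an XML round trip when the sender escapes the attribute value with the predefined entities. This is a minimal Python sketch using only the standard library; the helper name `message_stanza` and the address at example.org are made up for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

def message_stanza(to_jid: str) -> str:
    # quoteattr() chooses the quote style and replaces & < > (and the
    # chosen quote character, when necessary) with predefined entities.
    return "<message to=%s/>" % quoteattr(to_jid)

# Hypothetical node containing all five characters under discussion.
jid = "a\"b&c'd<e>f@example.org"
stanza = message_stanza(jid)
print(stanza)

# A conforming XML parser hands back the original, unescaped address.
assert ET.fromstring(stanza).get("to") == jid
```

This only shows that XML itself is not the obstacle; it says nothing about clients that fail to escape, which was the original concern in point 3.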

Peter

-- 
Peter Saint-Andre
https://stpeter.im/





Re: [Standards] summary: allowable characters

2007-08-02 Thread Michal 'vorner' Vaner
Hello

On Thu, Aug 02, 2007 at 11:40:25AM -0600, Peter Saint-Andre wrote:
 Clearly we can't allow @ because we use that character
 as a separator between the node identifier and the domain identifier.

An email address can contain @ in the username part; the separator is the
last @ in the address. But emails don't have resources. We would have to
decide which of the two (@ or /) is more valuable to allow in the node
part; one of them cannot be allowed.

Just wanted to point out it is theoretically possible to have @ in the
node too.
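
A small sketch of the "last @ wins" rule, and of why it clashes with resources. The function name and the sample addresses are hypothetical; this is not how any existing server parses JIDs.

```python
def split_last_at(address: str):
    """Split on the LAST @, as described above for email addresses."""
    node, sep, domain = address.rpartition("@")
    if not sep:
        return None, address   # no @ at all: domain-only address
    return node, domain

# Works for a bare address whose node contains @:
print(split_last_at("john@work@example.org"))

# Breaks for a domain JID whose resource contains @: the resource's @
# is taken as the node/domain separator.
print(split_last_at("example.org/home@desk"))
```

The second call treats `example.org/home` as the node, which is the conflict between @ in the node and / as the resource separator.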

-- 
Anything is possible, unless it's not.

Michal 'vorner' Vaner




Re: [Standards] summary: allowable characters

2007-08-02 Thread Peter Saint-Andre
Michal 'vorner' Vaner wrote:
 Hello
 
 On Thu, Aug 02, 2007 at 11:40:25AM -0600, Peter Saint-Andre wrote:
 Clearly we can't allow @ because we use that character
 as a separator between the node identifier and the domain identifier.
 
 Email address can contain @ in the username part - the identifier is the
 last @ in the address. 

Really? I don't see that in RFC 2822, but I'm not fluent in ABNF. :)

 But Emails don't have resources. We would have to
 decide which is more valuable to have in the node part allowed - one of
 them can not be allowed.

I'd vote against @ and say that a node could include /.

/psa





Re: [Standards] summary: allowable characters

2007-08-02 Thread Thomas Charron
On 8/2/07, Peter Saint-Andre [EMAIL PROTECTED] wrote:
 What specifically breaks? Does it depend on which characters would be
 allowed in nodeprep2? I agree that / and @ are problematic, but the
 characters " & ' < > seem less so. But I may be missing something.

  I believe this section was a leftover from the original pre-RFC
specification, which was attempting to fit a JID into standardized URI
notation; that notation spells out allowable characters, reserved
characters, and characters which must be escaped.  See RFC 2396.  The
section which deals with those characters is:


The angle-bracket "<" and ">" and double-quote (") characters are
   excluded because they are often used as the delimiters around URI in
   text documents and protocol fields.  The character "#" is excluded
   because it is used to delimit a URI from a fragment identifier in URI
   references (Section 4). The percent character "%" is excluded because
   it is used for the encoding of escaped characters.

   delims      = "<" | ">" | "#" | "%" | <">


  I'll go jump down in a hole again.  :-)

-- 
-- Thomas


Re: [Standards] summary: allowable characters

2007-08-02 Thread Peter Saint-Andre
Thomas Charron wrote:
 On 8/2/07, Peter Saint-Andre [EMAIL PROTECTED] wrote:
 What specifically breaks? Does it depend on which characters would be
 allowed in nodeprep2? I agree that / and @ are problematic, but the
 characters " & ' < > seem less so. But I may be missing something.
 
   I believe this section was a left over from the original pre-rfc
 specification which was attempting to fit a JID into standardized URI
 notation, which specifically explains allowable characters, reserved
 characters, and characters which must be escaped.  See rfc 2396.  The
 section which deals with those characters is:
 
 
 The angle-bracket "<" and ">" and double-quote (") characters are
 excluded because they are often used as the delimiters around URI in
 text documents and protocol fields.  The character "#" is excluded
 because it is used to delimit a URI from a fragment identifier in URI
 references (Section 4). The percent character "%" is excluded because
 it is used for the encoding of escaped characters.
 
 delims      = "<" | ">" | "#" | "%" | <">
 

Yeah, but a JID isn't a URI and never will be, that's why I went through
all the trouble of writing RFC 4622. But I agree with you that similar
reasoning led to exclusion of those characters (and & ' too).

   I'll go jump down in a hole again.  :-)

Oh don't, we cherish your occasional visits. :)

Peter

-- 
Peter Saint-Andre
https://stpeter.im/





Re: [Standards] summary: allowable characters

2007-08-02 Thread Mridul Muralidharan


Hi Peter,

Peter Saint-Andre wrote:

Mridul Muralidharan wrote:

Peter Saint-Andre wrote:

4. One solution would be to define version 2 of nodeprep in rfc3920bis.
As far as I can see, nodeprep2 would allow " & ' < > since those can be
escaped in XML (e.g., XMPP 'to' address) as the predefined entities
&quot; &amp; &apos; &lt; &gt;. I'm not sure why : was prohibited in the
first place so that would be allowed. I suppose / was prohibited because
it's used later in a full JID to differentiate the resource identifier,
but in a node identifier I don't think it would be confusing so that
would be allowed. 


user/[EMAIL PROTECTED] and domain/[EMAIL PROTECTED] can't be differentiated
if / is allowed.


Interesting, I think you're right. Consider foo.com/[EMAIL PROTECTED], it
could be the bare JID of a user foo.com/bar at jabber.org or a domain
of foo.com with a resource of [EMAIL PROTECTED]. Not good.
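
The ambiguity can be shown with two plausible parsing strategies applied to the same string. The sketch below uses the address spelled out in the prose above (a user foo.com/bar at jabber.org); both function names are hypothetical, and neither is claimed to be what any server actually does.

```python
def parse_slash_first(jid: str):
    """Treat the first / as the start of the resource (today's reading)."""
    bare, _, resource = jid.partition("/")
    node, sep, domain = bare.rpartition("@")
    return (node if sep else None), domain, (resource or None)

def parse_at_first(jid: str):
    """Treat the first @ as the end of the node (needed if / is a node char)."""
    node, sep, rest = jid.partition("@")
    if not sep:
        node, rest = None, jid
    domain, _, resource = rest.partition("/")
    return node, domain, (resource or None)

jid = "foo.com/bar@jabber.org"
print(parse_slash_first(jid))   # domain foo.com with a resource
print(parse_at_first(jid))      # user foo.com/bar at jabber.org
```

Both results are well-formed (node, domain, resource) triples, so nothing in the string itself tells a server which one was meant.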


Btw, changing nodeprep now will cause quite a lot of problems with
existing deployments, since the contact JIDs are part of the user data,
and would pretty much mean we can't adopt the bis spec.


What specifically breaks? Does it depend on which characters would be
allowed in nodeprep2? I agree that / and @ are problematic, but the
characters " & ' < > seem less so. But I may be missing something.



The problem essentially is that any place where we have a JID persisted
in the backend (user roster, ACLs, affiliations, privacy lists/block
lists, etc.) will see an incompatible change.
For example, what used to be [EMAIL PROTECTED] will now become
contact[EMAIL PROTECTED], causing incompatibilities.



Regards,
Mridul





The deployments with these use cases are not as specialized as
it might seem.


I agree with that. Which is why I stand by XEP-0106. In part I think
that those who are so opposed to XEP-0106 are not familiar with the
deployment issues. But I agree that XEP-0106 needs to be clarified in
the ways we discussed recently. It's on my list to complete those
clarifications and post an interim version.

/psa





Re: [Standards] summary: allowable characters

2007-08-02 Thread Peter Saint-Andre
Mridul Muralidharan wrote:
 
 Peter Saint-Andre wrote:
 Mridul Muralidharan wrote:
 Btw, changing nodeprep now will cause quite a lot of problem with
 existing deployments - since the contact jid's are part of the user data
 - and would pretty much mean we cant adopt bis spec.

 What specifically breaks? Does it depend on which characters would be
 allowed in nodeprep2? I agree that / and @ are problematic, but the
 characters " & ' < > seem less so. But I may be missing something.
 
 
 The problem essentially is that any place where we have a JID persisted
 in the backend (user roster, acl's, affiliations, privacy lists/block
 lists, etc), it will become incompatible change.
 For example, what used to be [EMAIL PROTECTED] will now become
 contact[EMAIL PROTECTED] - causing incompatibilities.

Well we're having a long discussion about this in the jdev room
right now:

http://www.jabber.org/muc-logs/[EMAIL PROTECTED]/2007-08-02.html

I volunteer elmex for posting a summary once we're done. :)

/psa




Re: [Standards] summary: allowable characters

2007-08-02 Thread Matthias Wimmer

Peter Saint-Andre wrote:

Well we're having a long discussion about this in the jdev room
right now:

http://www.jabber.org/muc-logs/[EMAIL PROTECTED]/2007-08-02.html


I just read the log. Sounds good and is how I intended/proposed that it 
would work:
- Escaping JIDs when sending to a server that does not support the 
eXtended JIDs (are these XIDs then? *g*)

- Not doing unescaping, when a JID is displayed.


Matthias


Re: [Standards] summary: allowable characters

2007-08-02 Thread Robin Redeker
(Warning, long mail ahead! Get a coffee and some time first :-)

On Thu, Aug 02, 2007 at 03:34:30PM -0600, Peter Saint-Andre wrote:
 Mridul Muralidharan wrote:
  The problem essentially is that any place where we have a JID persisted
  in the backend (user roster, acl's, affiliations, privacy lists/block
  lists, etc), it will become incompatible change.
  For example, what used to be [EMAIL PROTECTED] will now become
  contact[EMAIL PROTECTED] - causing incompatibilities.
 
 Well we're having a long discussion about this in the jdev room
 right now:
 
 http://www.jabber.org/muc-logs/[EMAIL PROTECTED]/2007-08-02.html
 
 I volunteer elmex for posting a summary once we're done. :)

Yes, basically Mridul is completely right, we can't do much about the
already deployed backslashes in JIDs. Especially in 1.0 server rosters.

But...

First we have to wonder whether there are actually people with
[EMAIL PROTECTED] in their roster, as registering a JID with a \ in
the username is a considerable problem with XMPP 1.0 servers with SASL
and DIGEST-MD5 (see some older message from me in the JID escaping
thread).

Of course that should be further investigated, as old-style IQ auth works
with [EMAIL PROTECTED] and also some jabberd2 servers allow
authentication as [EMAIL PROTECTED] without problems.

But there exists a possibility to migrate our old JIDs to the 1.1 world
and staying interoperable with 1.0 servers:

First: A 1.1 server that is going to communicate with a 1.0 server will
escape the JIDs from his userbase when he SENDS to a 1.0 entity.
Escaping can be performed as described in XEP-0106 (after dropping the
silly \20 escaping rule).
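
For reference, here is a minimal Python sketch of the node escaping that a 1.1 server would apply before sending to a 1.0 entity. It follows my reading of the XEP-0106 character table and keeps the \20 mapping in place; the function names are hypothetical, lowercase hex is assumed, and the nodeprep and edge-of-string rules from the XEP are not enforced.

```python
import re

# Character -> escape sequence, per the XEP-0106 table.
_ESCAPES = {" ": "\\20", '"': "\\22", "&": "\\26", "'": "\\27", "/": "\\2f",
            ":": "\\3a", "<": "\\3c", ">": "\\3e", "@": "\\40"}
_CODES = {seq[1:]: ch for ch, seq in _ESCAPES.items()}
_CODES["5c"] = "\\"
_CODE_RE = re.compile(r"\\(20|22|26|27|2f|3a|3c|3e|40|5c)")

def escape_node(node: str) -> str:
    out = []
    for i, ch in enumerate(node):
        if ch == "\\":
            # A backslash is escaped only when the next two characters
            # would otherwise be read as one of the escape codes.
            out.append("\\5c" if node[i + 1:i + 3] in _CODES else "\\")
        else:
            out.append(_ESCAPES.get(ch, ch))
    return "".join(out)

def unescape_node(node: str) -> str:
    # Single left-to-right pass, so an unescaped backslash is never rescanned.
    return _CODE_RE.sub(lambda m: _CODES[m.group(1)], node)

print(escape_node("d'artagnan"))
print(unescape_node(escape_node("c:\\5commas")))
```

The single-pass unescape matters: a two-step replace could turn an escaped backslash followed by digits into a second, unintended unescape.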

That will work great if the 1.1 server has NO old userbase.

Suppose we have, for example, jabber.org with a large userbase in which
[EMAIL PROTECTED] is actually a registered user, and we want to
upgrade to a 1.1 server. Then we will run into the problems Mridul
pointed out:

1.0 servers have [EMAIL PROTECTED] in their rosters, and if
'stpeter @jabber.org' now registers a new account he will collide with
that, because his JID will be escaped to the [EMAIL PROTECTED] that
already exists in the 1.0 servers' rosters. Bang, we've got a collision.

There exists no real easy way to prevent that except just not allowing
'stpeter @jabber.org' to register. To detect a case like this, that a
new user with a colliding JID registers, the 1.1 server needs to keep
track of the old JIDs in his database.

If the 1.1 server knows that [EMAIL PROTECTED] is a JID from the
pre-1.1 times, he can assume that [EMAIL PROTECTED] is already in
some rosters out there. So he MUST NOT allow anyone who might collide
with that to register at jabber.org after the migration to 1.1.

So when upgrading jabber.org could just mark all JIDs with a \ in their
name to be a pre-1.1 JID and disallow anyone to register who might
collide with one of the registered JIDs.

This way ' [EMAIL PROTECTED]' can register if no
'[EMAIL PROTECTED]' existed before (he knows that from his
database with the marks of old JIDs).
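
The bookkeeping described above can be sketched as follows. This is a conservative illustration: the `Registry` class, its method names, and the sample node names are all hypothetical, and it compares unescaped forms, so it may refuse a few registrations that would not strictly collide on the wire.

```python
import re

_CODES = {"20": " ", "22": '"', "26": "&", "27": "'", "2f": "/",
          "3a": ":", "3c": "<", "3e": ">", "40": "@", "5c": "\\"}
_CODE_RE = re.compile(r"\\(" + "|".join(_CODES) + r")")

def unescape_node(node: str) -> str:
    # One left-to-right pass over XEP-0106 style escape sequences.
    return _CODE_RE.sub(lambda m: _CODES[m.group(1)], node)

class Registry:
    """Node names on one server, with pre-1.1 names marked as legacy."""

    def __init__(self, legacy_nodes):
        self.legacy = set(legacy_nodes)   # stored verbatim, e.g. with \20
        self.taken = {unescape_node(n) for n in self.legacy}

    def can_register(self, requested: str) -> bool:
        # Deny anything whose unescaped form matches an existing account,
        # since the two would be indistinguishable to a 1.0 peer.
        return unescape_node(requested) not in self.taken

    def register(self, requested: str) -> bool:
        if not self.can_register(requested):
            return False
        self.taken.add(unescape_node(requested))
        return True

reg = Registry(["old\\20name"])        # legacy account from 1.0 times
print(reg.can_register("old name"))    # would collide with the legacy node
print(reg.register("new name"))
```

Comparing unescaped forms also covers collisions between two 1.1 names, so the legacy mark is only needed if the server wants to report why a name is unavailable.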

If ' [EMAIL PROTECTED]' now wants to talk with '[EMAIL PROTECTED]', it
would look like this:

   <message from= [EMAIL PROTECTED] to=[EMAIL PROTECTED] />

As jabber.org (1.1) knows that chrome.pl (1.0) is in fact 1.0 he escapes
like XEP-0106 recommends and sends actually:

   <message from=[EMAIL PROTECTED] to=[EMAIL PROTECTED] />

A message from [EMAIL PROTECTED] will now pop up in
[EMAIL PROTECTED]'s client, and except for the somewhat weird JID he can
talk with him. Because if he sends a message back:

   <message from=[EMAIL PROTECTED] to=[EMAIL PROTECTED] />

Then jabber.org will unescape the to-field and deliver the message to
' [EMAIL PROTECTED]'.

Of course this solution is not a perfect one for the end-users as I will
describe below, but I argue that the incompatibilities will increase
the pressure on developers a bit and on administrators to adopt XMPP
1.1. And thus that might speed up the migration while providing a
compatibility workaround for maybe 98-99% of the cases, or maybe even
99,% (this needs to be investigated a bit IMO, maybe my assumptions
are completely wrong).

So much for the server-to-server interoperability.



Now about 1.1 clients and 1.0 clients. 1.0 clients will have no way
to reach ' [EMAIL PROTECTED]', which is fine, either the user knows
that guy's JID needs to be escaped because he uses an old client, or he
has to upgrade to a client with 1.1 capabilities (what this means is
described below).

Not being able to send a message to ' [EMAIL PROTECTED]' will increase
the pressure on the client developers as stated above.
So 1.0 clients are basically out of luck if the user doesn't know how to
escape; however, tell 'em: get a new client.

Of course it's blunt to say that, but I guess we can assume that not our
WHOLE old userbase without spaces and all those fancy characters in
their JID are NOT GOING TO signup a new account. So the users with
spaces and  or whatever in 

Re: [Standards] summary: allowable characters

2007-08-02 Thread Peter Saint-Andre
Matthias Wimmer wrote:
 Peter Saint-Andre wrote:
 Well we're having a long discussion about this in the jdev room
 right now:

 http://www.jabber.org/muc-logs/[EMAIL PROTECTED]/2007-08-02.html
 
 I just read the log. Sounds good and is how I intended/proposed that it
 would work:
 - Escaping JIDs when sending to a server that does not support the
 eXtended JIDs (are these XIDs then? *g*)

That's good. :)

 - Not doing unescaping, when a JID is displayed.

Right.

Peter

-- 
Peter Saint-Andre
https://stpeter.im/





Re: [Standards] summary: allowable characters

2007-08-02 Thread Tomasz Sterna
On Fri, 03-08-2007 at 00:28 +0200, Robin Redeker wrote:
 There exists no real easy way to prevent that except just not allowing
 'stpeter @jabber.org' to register. To detect a case like this, that a
 new user with a colliding JID registers, the 1.1 server needs to keep
 track of the old JIDs in his database.

There is really no difference whether you're colliding with a 1.0
username or a 1.1 one.
From the connecting 1.0 server's perspective it does not matter.

So on a 1.1 server you just unescape the requested username and check for
a collision. If there is one, you deny the request.


As for the whole idea:
The escaping and unescaping is going to happen completely behind the
scenes, on both s2s and c2s connections, and always on the 1.1 side.
It's up to the 1.1 endpoint (no matter whether it is a server or a client)
to escape sent data and unescape received data when it talks with 1.0.
Well...
- We have unchanged 1.0 servers
- We have unchanged 1.0 clients (their users need to escape manually)
- We have interoperability 1.0 - 1.1
- We have all possible characters in the nodepart of the JID
Looks good to me.



-- 
Tomasz Sterna
Xiaoka Grp.  http://www.xiaoka.com/



Re: [Standards] summary: allowable characters

2007-08-02 Thread Mridul Muralidharan



Just mentioning a basic problem which was discussed at jdev.

If two 1.0 servers move to 1.1, all the 'older' 1.0 JIDs that are present
in user rosters, affiliations, privacy lists, etc. will become unroutable.



Regards,
Mridul
