Re: [Standards] Security consideration for XEP-0198

2014-05-14 Thread Dave Cridland
On 12 May 2014 08:05, Georg Lukas  wrote:

> * Kevin Smith  [2014-05-08 12:34]:
> > Consider the case of a paused client in a MUC. The MUC sends a
> > message, and gets a bounce because the buffer's full. The client
> > resumes the session, but now it's out of sync - it thinks it's in the
> > MUC and the MUC has removed it due to bounces.
>
> I can see the point, however I encountered a very similar situation
> yesterday - even though in the opposite direction: I sent two messages
> from my client to a MUC, and only the second one arrived. Digging into
> client and server logs revealed that the first one bounced with the
> following error:
>
> 
>xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'/>
> 
>   Server-to-server connection failed: DNS resolution failed
> 
>   
> 
>
> There was apparently a short network outage between the two servers,
> causing one of the messages to be bounced, bringing client and MUC out
> of sync. If the MUC side of this connection had encountered the error,
> the client probably would have been kicked, leading to what you
> describe.
>
>
I'm not sure we discuss what clients should do if they receive an error
"from" the MUC. We should probably give some advice there. I suspect we
should assume the client is no longer in the MUC, but that assumption is
only really safe if the MUC is handling joins properly.


> Furthermore, I am regularly experiencing a MUC-out-of-sync situation when
> a MUC server is restarted, leading my client to believe it is still in
> a room where everybody is silent.
>
>
Then the server isn't M-Link, because I fixed that case. Usually, though,
that's when you join a room that thinks you're in it already, rather than
the other way around - I used to get this a lot when restarting my personal
(and often development) server, which is why I spent a silly amount of time
fixing it.

The problem then is that some servers treat the join (marked specifically
as a join) as updated room presence, and don't hand back the list of
occupants, etc.

If the MUC thinks you're not in the room, and your client thinks it is,
then the MUC treats an updated presence as a groupchat (ie, pre-XEP-0045)
join, and your client just has to notice the presence and history. This is
slightly unpleasant, but the user experience is a lot less confusing.


> My point is: we already have a world where things get out of sync
> because of misbehaving clients, servers, components and networks. Maybe
> it is better to make this problem explicitly acknowledged instead of
> feigning an ideal world (this also somehow reminds me of the "TCP is
> reliable" argument, one layer up).
>
>
The point of 198 is to reduce the scope of this.

TCP *is* reliable, within bounds. 198 makes XMPP reliable within a larger
scope.

On a grander scale, we can look for cases where state is held in common by
multiple entities, and see how to avoid mismatch - the MUC joined state
example above is one case, but there are others (pubsub and presence
subscriptions, etc).

Dave.


Re: [Standards] Security consideration for XEP-0198

2014-05-12 Thread Georg Lukas
* Kevin Smith  [2014-05-08 12:34]:
> Consider the case of a paused client in a MUC. The MUC sends a
> message, and gets a bounce because the buffer's full. The client
> resumes the session, but now it's out of sync - it thinks it's in the
> MUC and the MUC has removed it due to bounces.

I can see the point, however I encountered a very similar situation
yesterday - even though in the opposite direction: I sent two messages
from my client to a MUC, and only the second one arrived. Digging into
client and server logs revealed that the first one bounced with the
following error:


  

  Server-to-server connection failed: DNS resolution failed

  


There was apparently a short network outage between the two servers,
causing one of the messages to be bounced, bringing client and MUC out
of sync. If the MUC side of this connection had encountered the error,
the client probably would have been kicked, leading to what you
describe.

Furthermore, I am regularly experiencing a MUC-out-of-sync situation when
a MUC server is restarted, leading my client to believe it is still in
a room where everybody is silent.

My point is: we already have a world where things get out of sync
because of misbehaving clients, servers, components and networks. Maybe
it is better to make this problem explicitly acknowledged instead of
feigning an ideal world (this also somehow reminds me of the "TCP is
reliable" argument, one layer up).


Georg
-- 
|| http://op-co.de ++  GCS d--(++) s: a C+++ UL+++ !P L+++ !E W+++ N  ++
|| gpg: 0x962FD2DE ||  o? K- w---() O M V? PS+ PE-- Y++ PGP+ t+ 5 R+  ||
|| Ge0rG: euIRCnet ||  X(+++) tv+ b+(++) DI+++ D- G e h- r++ y?   ||
++ IRCnet OFTC OPN ||_||




Re: [Standards] Security consideration for XEP-0198

2014-05-08 Thread Kevin Smith
On Wed, May 7, 2014 at 10:46 PM, Georg Lukas  wrote:
> * Dave Cridland  [2014-05-07 23:05]:
>> It's probably worth noting, yes. The solution is to request an
>> acknowledgement, and if one isn't forthcoming, to ditch the connection, of
>> course.
>
> It is not that easy, unfortunately. If the client is currently
> disconnected, the ultimate purpose of the stanza queue is to cache
> stanzas until the client reconnects. If you ditch the connection, you
> undermine the purpose of the XEP.
>
> It is wise to have a timeout mechanism for the client not responding to
> ack requests. However, the session should be kept for a defined time
> after that, to allow for a reconnection.
>
> IMHO, there should be a stanza limit per session/per JID, however once
> the limit is reached, new stanzas for that client should be rejected
> with an error without terminating the connection.
>
> If you do terminate the connection, you make the process susceptible to
> DoS attacks against clients on slow connections (or currently in the
> process of reconnecting).

No, if you start bouncing stanzas that you would have delivered to the
client if it was connected, you need to kill the resume and bounce
them all.

Consider the case of a paused client in a MUC. The MUC sends a
message, and gets a bounce because the buffer's full. The client
resumes the session, but now it's out of sync - it thinks it's in the
MUC and the MUC has removed it due to bounces.

198 resumption is an all-or-nothing job.

/K


Re: [Standards] Security consideration for XEP-0198

2014-05-08 Thread Dave Cridland
On 8 May 2014 02:47, John Williams (johnwi3)  wrote:

> Although not mentioned, I would expect many server implementations would
> choose to impose limits on how much unacknowledged traffic they will
> buffer. XEP-0198 leaves a lot of freedom to the implementer.
>
>
Good. :-)


> The primary goal of XEP-0198 is to optimize reconnection for clients with
> unreliable connection (or hopping from wired/wireless). A resume may fail,
> but if it succeeds quite often then it is a useful optimization.
>
>
The primary goal is to increase reliability in the face of intermittent
connectivity. Reconnection - that is, allowing an XMPP session to span
multiple sequential TCP sessions - is part of that.


> If you assume you have no evidence that the client connection has failed
> (but your buffer has hit its limit), you could simply discard the oldest
> unacked packet and make space for the newer packet.


Eeeek.


> In many cases the client will catch up and no harm is done. If you are not
> so lucky, and you need a stanza you discarded for a resume request - oh
> well the resume will fail. This simple implementation is sufficient for
> optimizing reconnection most of the time, but occasionally it means someone
> has to take the slower session recreation path.
>
>
That's an interesting case. So you're hoping the client will give you an
<a/> with a sufficiently high value that you would have discarded the
stanzas anyway?


> I don’t believe it was the goal of XEP-0198 to guarantee a stanza is
> either delivered or bounced. I am not sure if you are trying to achieve
> this guarantee by bouncing new stanzas once the buffer overflows.
>
>
Modulo the obvious blurb about whether a stanza needs bouncing on a
delivery failure, that is, kind of, the goal.

The primary goal, as I said, is reliability. That doesn't really mean
"always deliver", but "if delivery fails, we need to know about it". If
every link uses 198, then in principle a message should always be
delivered, or bounced - and moreover, a 184 acknowledgement will itself
reliably make it back.

The Two Generals problem basically says this is impossible, of course, but
what 198 does is try to remove or limit as many corner cases as possible.
The case where the 2G problem kicks in is limited to if we never get an
<a/> and cannot re-establish the link. In this case, the fate of the stanza
is always unknown. The stanza's fate is temporarily unknown while waiting
for an <a/>, too.

Quite what to do when the stanza's fate is unknown is an implementation
dependent thing, but typically we eventually stick messages into offline
storage, bounce iqs, and drop presence, and otherwise treat them as if they
weren't delivered, on the assumption that a duplicate is better than an
omission.
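
As a rough illustration of that policy, here is a minimal Python sketch.
The names Stanza and dispose_unacked are hypothetical and not taken from any
server; the three outcomes are simply the ones listed above (messages to
offline storage, iqs bounced, presence dropped).

```python
from dataclasses import dataclass


@dataclass
class Stanza:
    kind: str     # "message", "iq", or "presence"
    payload: str  # opaque body, for illustration only


def dispose_unacked(stanzas):
    """Sort stanzas of unknown fate into (offline, bounced, dropped)."""
    offline, bounced, dropped = [], [], []
    for stanza in stanzas:
        if stanza.kind == "message":
            # A duplicate is better than an omission: store for later delivery.
            offline.append(stanza)
        elif stanza.kind == "iq":
            # The requester is waiting for a reply, so return an error.
            bounced.append(stanza)
        elif stanza.kind == "presence":
            # Stale presence is not worth redelivering.
            dropped.append(stanza)
        else:
            raise ValueError(f"unknown stanza kind: {stanza.kind!r}")
    return offline, bounced, dropped
```

A real server would act on each stanza rather than return lists; returning
them here just keeps the sketch easy to inspect.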

What concerns me with your proposal above is that there's simply no way to
bounce, or redirect, a stanza once it's been discarded, so it becomes lost,
and the reliability is degraded as a result. Across one link this probably
doesn't matter, but from an end-to-end perspective I think it does.


> I don’t think this will work well.  This implementation is liable to fail
> to deliver traffic to the client, and the client would be totally oblivious.
> Events from pub-sub nodes might have been missed, changes in occupants and
> roles within a chat room, etc
>
>
Right, XMPP has strict ordering requirements, and we don't want to start
bouncing stanzas in the middle of a session, because we lack the protocol
functionality to deal with that.

If a session stops, we have the server emit unavailable presence, which
leaves chatrooms, and so on; and the client, knowing it's now a new session
on a failed resume, will perform resynchronization as needed.


> I believe the only way to make this guarantee is to kill the session upon
> overflow. All the unacked packets are available to bounce, the client will
> have to reconnect and establish a new session (resume would fail), and the
> client would rebuild their state. The price of this guarantee is that it
> could cause a bad experience for users on slow connections, or cause
> problems when there are heavy bursts of traffic.
>
> Dave's suggestion could help detect dead/unresponsive connections sooner,
> but you still have to deal with the problem of buffering stanzas while you
> are waiting. You probably want to send <r/> stanzas anytime your buffer
> creeps up so that you can solicit an <a/> stanza well before you hit your
> overflow point.
>
>
Yes, quite.
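
One way to picture the approach quoted above is a queue with two thresholds:
a soft limit at which the server starts asking for acknowledgements, and a
hard limit at which the session is ended and everything still unacked is
handed back for bouncing. The sketch below is only an assumption about how
that might look; BoundedQueue, soft_limit, hard_limit and the returned
strings are made-up names, not part of XEP-0198.

```python
from collections import deque


class BoundedQueue:
    """Unacked outbound stanzas with a soft and a hard limit (hypothetical)."""

    def __init__(self, soft_limit, hard_limit):
        if not 0 < soft_limit < hard_limit:
            raise ValueError("need 0 < soft_limit < hard_limit")
        self.soft_limit = soft_limit
        self.hard_limit = hard_limit
        self.unacked = deque()

    def enqueue(self, stanza):
        """Queue a stanza and report what the caller should do next."""
        self.unacked.append(stanza)
        if len(self.unacked) > self.hard_limit:
            # Too much unacknowledged traffic: the session has to end.
            return "overflow"
        if len(self.unacked) >= self.soft_limit:
            # Buffer is creeping up: solicit an acknowledgement now.
            return "request_ack"
        return "ok"

    def drain(self):
        """End of session: return every unacked stanza so it can be bounced."""
        pending = list(self.unacked)
        self.unacked.clear()
        return pending
```

On "overflow" the caller would end the session and bounce whatever drain()
returns, including the stanza that triggered the overflow, so nothing is
silently lost.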


> Note:
> XEP-0198 allows clients to send <a/> stanzas even if there was no <r/> from
> the server. When you see an <a/> from the client, it might not correlate to
> the <r/> you sent. I suppose you could keep waiting for the <a/> that has 'h'
> equal to the value you would expect for the <r/> you sent.
>

I'm not sure it matters. The <r/> is merely a requirement to send an <a/>,
but any <a/> can be processed independently.
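
To illustrate why each acknowledgement can be handled on its own, here is a
small Python sketch of the sending side. It assumes the 'h' value counts
handled stanzas and wraps at 2^32, as XEP-0198 specifies; the class and
method names (OutboundQueue, send, on_ack) are hypothetical.

```python
from collections import deque

H_MODULUS = 2 ** 32  # 'h' is a 32-bit counter that wraps around to zero


class OutboundQueue:
    """Tracks sent stanzas until the peer acknowledges them (a sketch)."""

    def __init__(self):
        self.unacked = deque()
        self.acked = 0  # the most recent 'h' value received from the peer

    def send(self, stanza):
        # Keep a copy until an acknowledgement covers it.
        self.unacked.append(stanza)

    def on_ack(self, h):
        """Process an acknowledgement carrying 'h', solicited or not."""
        newly_acked = (h - self.acked) % H_MODULUS
        if newly_acked > len(self.unacked):
            raise ValueError("peer acknowledged more stanzas than were sent")
        for _ in range(newly_acked):
            self.unacked.popleft()
        self.acked = h
        return newly_acked
```

Nothing here needs to know which request, if any, prompted the
acknowledgement: only the difference between the new 'h' and the previous
one matters.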

Dave.


Re: [Standards] Security consideration for XEP-0198

2014-05-07 Thread John Williams (johnwi3)


> -Original Message-
> From: Standards [mailto:standards-boun...@xmpp.org] On Behalf Of Georg
> Lukas
> Sent: Wednesday, May 07, 2014 2:46 PM
> To: standards@xmpp.org
> Subject: Re: [Standards] Security consideration for XEP-0198
> 
> * Dave Cridland  [2014-05-07 23:05]:
> > It's probably worth noting, yes. The solution is to request an
> > acknowledgement, and if one isn't forthcoming, to ditch the
> > connection, of course.
> 
> It is not that easy, unfortunately. If the client is currently disconnected,
> the ultimate purpose of the stanza queue is to cache stanzas until the client
> reconnects. If you ditch the connection, you undermine the purpose of the XEP.
> 
> It is wise to have a timeout mechanism for the client not responding to ack
> requests. However, the session should be kept for a defined time after that,
> to allow for a reconnection.
> 
> IMHO, there should be a stanza limit per session/per JID, however once the
> limit is reached, new stanzas for that client should be rejected with an error
> without terminating the connection.

Although not mentioned, I would expect many server implementations would choose 
to impose limits on how much unacknowledged traffic they will buffer. XEP-0198 
leaves a lot of freedom to the implementer. 

The primary goal of XEP-0198 is to optimize reconnection for clients with 
unreliable connection (or hopping from wired/wireless). A resume may fail, but 
if it succeeds quite often then it is a useful optimization.

If you assume you have no evidence that the client connection has failed (but 
your buffer has hit its limit), you could simply discard the oldest unacked 
packet and make space for the newer packet. In many cases the client will catch 
up and no harm is done. If you are not so lucky, and you need a stanza you 
discarded for a resume request - oh well the resume will fail. This simple 
implementation is sufficient for optimizing reconnection most of the time, but 
occasionally it means someone has to take the slower session recreation path.

I don’t believe it was the goal of XEP-0198 to guarantee a stanza is either 
delivered or bounced. I am not sure if you are trying to achieve this guarantee 
by bouncing new stanzas once the buffer overflows.

I don’t think this will work well.  This implementation is liable to fail to 
deliver traffic to the client, and the client would be totally oblivious. Events 
from pub-sub nodes might have been missed, changes in occupants and roles 
within a chat room, etc

I believe the only way to make this guarantee is to kill the session upon 
overflow. All the unacked packets are available to bounce, the client will have 
to reconnect and establish a new session (resume would fail), and the client 
would rebuild their state. The price of this guarantee is that it could cause a 
bad experience for users on slow connections, or cause problems when there are 
heavy bursts of traffic.

Dave's suggestion could help detect dead/unresponsive connections sooner, but 
you still have to deal with the problem of buffering stanzas while you are 
waiting. You probably want to send <r/> stanzas anytime your buffer creeps up so 
that you can solicit an <a/> stanza well before you hit your overflow point.

Note: 
XEP-0198 allows clients to send <a/> stanzas even if there was no <r/> from the 
server. When you see an <a/> from the client, it might not correlate to the <r/> 
you sent. I suppose you could keep waiting for the <a/> that has 'h' equal to 
the value you would expect for the <r/> you sent.

== Jock Williams ==


Re: [Standards] Security consideration for XEP-0198

2014-05-07 Thread Georg Lukas
* Dave Cridland  [2014-05-07 23:05]:
> It's probably worth noting, yes. The solution is to request an
> acknowledgement, and if one isn't forthcoming, to ditch the connection, of
> course.

It is not that easy, unfortunately. If the client is currently
disconnected, the ultimate purpose of the stanza queue is to cache
stanzas until the client reconnects. If you ditch the connection, you
undermine the purpose of the XEP.

It is wise to have a timeout mechanism for the client not responding to
ack requests. However, the session should be kept for a defined time
after that, to allow for a reconnection.
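
The two timers described here could be sketched roughly as below. This is
only an illustration: session_state, ack_timeout and resume_window are
made-up names, and the times are plain seconds.

```python
def session_state(now, ack_requested_at, ack_timeout, resume_window):
    """Classify a session by how long an ack request has gone unanswered.

    Returns "connected" while the client may still answer, "awaiting_resume"
    once the connection is considered dead but the session is kept for a
    reconnection, and "expired" when the session should be discarded.
    """
    waited = now - ack_requested_at
    if waited < ack_timeout:
        return "connected"
    if waited < ack_timeout + resume_window:
        return "awaiting_resume"
    return "expired"
```

For example, with a 30 second ack timeout and a 300 second resume window, a
client that has been silent for 60 seconds is in "awaiting_resume": its
stanzas are still queued, and a successful resume would deliver them.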

IMHO, there should be a stanza limit per session/per JID, however once
the limit is reached, new stanzas for that client should be rejected
with an error without terminating the connection.

If you do terminate the connection, you make the process susceptible to
DoS attacks against clients on slow connections (or currently in the
process of reconnecting).


Georg
-- 
|| http://op-co.de ++  GCS d--(++) s: a C+++ UL+++ !P L+++ !E W+++ N  ++
|| gpg: 0x962FD2DE ||  o? K- w---() O M V? PS+ PE-- Y++ PGP+ t+ 5 R+  ||
|| Ge0rG: euIRCnet ||  X(+++) tv+ b+(++) DI+++ D- G e h- r++ y?   ||
++ IRCnet OFTC OPN ||_||




Re: [Standards] Security consideration for XEP-0198

2014-05-07 Thread Dave Cridland
It's probably worth noting, yes. The solution is to request an
acknowledgement, and if one isn't forthcoming, to ditch the connection, of
course.


On 7 May 2014 18:54, Holger Weiß  wrote:

> Server implementations of XEP-0198 (Stream Management) will usually want
> to ensure their outgoing stanza queues cannot grow too large if clients
> never acknowledge their stanzas.  Would it make sense to mention this
> potential issue in the Security Considerations section of XEP-0198?
>
> Holger
>


[Standards] Security consideration for XEP-0198

2014-05-07 Thread Holger Weiß
Server implementations of XEP-0198 (Stream Management) will usually want
to ensure their outgoing stanza queues cannot grow too large if clients
never acknowledge their stanzas.  Would it make sense to mention this
potential issue in the Security Considerations section of XEP-0198?

Holger

