Re: [Standards] Message-IDs
On 28 Feb 2018, at 14:47, Denver Gingerich wrote:
> On Wed, Feb 28, 2018 at 08:59:01AM +, Kevin Smith wrote:
>> On 13 Feb 2018, at 16:57, Simon Friedberger wrote:
>>> E3. Simply make the ID: FROM-TIMESTAMP.
>>>     Here FROM needs to be the eventual FROM after possible rewriting.
>>>     Can that be done?
>>>     And TIMESTAMP has to be strictly increasing so should have
>>>     sub-second resolution.
>>>     I assume this is impossible because otherwise it would be too
>>>     easy. But why is it impossible? :)
>>
>> Because timestamps aren’t monotonic? :)
>
> Do you mean because most people use Unix time and/or other UTC-based
> timestamps (that have leap seconds)?
>
> If so, this can be mostly solved by using TAI timestamps. Unfortunately,
> it is tricky in most OSes to obtain a TAI timestamp, but I found some
> code that does this (on many platforms anyway):
>
> https://ossguy.com/tai.c
>
> We've used this code for implementing usage tracking in JMP (to ensure a
> day's length doesn't vary from day to day - it is always exactly 86,400
> seconds long). For details, see
> https://gitlab.com/ossguy/sgx-catapult/commit/31c2cb7c8fbea1ad4cc6753a4343dbfc65552fa5
> As you might suspect, we'd like to port the above TAI code to Ruby, but
> it works ok as-is for now.
>
> I realize that clock skew could still cause the TAI timestamp that your
> OS returns to be non-monotonic (i.e. a machine issue, not an issue with
> TAI time itself); I'm not sure if that's a substantial issue for the
> message IDs being discussed here.

I meant because clock skew is a thing, so relying on the monotonicity
doesn’t work. Seems like it shouldn’t be a thing, but is.

/K
___
Standards mailing list
Info: https://mail.jabber.org/mailman/listinfo/standards
Unsubscribe: standards-unsubscr...@xmpp.org
___
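To make the clock-skew point concrete, here is a minimal Python sketch of what a single sender would have to do to keep its own timestamps strictly increasing when the wall clock is stepped backwards. The class name and the injected clock are made up for illustration; nothing like this is specified anywhere in the thread.

```python
import time
from typing import Callable


class StrictlyIncreasingStamp:
    """Hands out timestamps that never repeat or go backwards locally."""

    def __init__(self, clock: Callable[[], int] = time.time_ns) -> None:
        self._clock = clock  # wall clock in nanoseconds; may be stepped backwards
        self._last = 0       # last value handed out

    def next(self) -> int:
        now = self._clock()
        # If the wall clock stalled or jumped back, step past the last value.
        self._last = max(now, self._last + 1)
        return self._last


# Simulated wall clock that is stepped backwards after the second reading.
readings = iter([1_000, 2_000, 1_500, 1_600])
stamp = StrictlyIncreasingStamp(clock=lambda: next(readings))
values = [stamp.next() for _ in range(4)]
print(values)
```

The clamp only helps inside one process. Two devices, or one device across a restart, still have unrelated clocks, so a receiver cannot rely on FROM-TIMESTAMP being increasing.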
Re: [Standards] Message-IDs
On Wed, Feb 28, 2018 at 08:59:01AM +, Kevin Smith wrote:
> On 13 Feb 2018, at 16:57, Simon Friedberger wrote:
> > E3. Simply make the ID: FROM-TIMESTAMP.
> >     Here FROM needs to be the eventual FROM after possible rewriting.
> >     Can that be done?
> >     And TIMESTAMP has to be strictly increasing so should have
> >     sub-second resolution.
> >     I assume this is impossible because otherwise it would be too
> >     easy. But why is it impossible? :)
>
> Because timestamps aren’t monotonic? :)

Do you mean because most people use Unix time and/or other UTC-based
timestamps (that have leap seconds)?

If so, this can be mostly solved by using TAI timestamps. Unfortunately, it
is tricky in most OSes to obtain a TAI timestamp, but I found some code that
does this (on many platforms anyway):

https://ossguy.com/tai.c

We've used this code for implementing usage tracking in JMP (to ensure a
day's length doesn't vary from day to day - it is always exactly 86,400
seconds long). For details, see
https://gitlab.com/ossguy/sgx-catapult/commit/31c2cb7c8fbea1ad4cc6753a4343dbfc65552fa5
As you might suspect, we'd like to port the above TAI code to Ruby, but it
works ok as-is for now.

I realize that clock skew could still cause the TAI timestamp that your OS
returns to be non-monotonic (i.e. a machine issue, not an issue with TAI time
itself); I'm not sure if that's a substantial issue for the message IDs being
discussed here.

Denver
https://jmp.chat/
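As a rough illustration of the TAI idea (this is not the linked tai.c, and not the JMP code), the Python sketch below converts a Unix timestamp to TAI seconds and derives a day index from it. It assumes the fixed TAI minus UTC offset of 37 s that has applied since 2017-01-01; a real implementation would take the offset from the OS or a leap-second table. The function names are invented for the example.

```python
import time

# Assumption: TAI - UTC has been 37 s since 2017-01-01. Hard-coded here for
# illustration only; it is wrong for earlier instants and for any instant
# after a future leap second.
TAI_MINUS_UTC_SECONDS = 37
SECONDS_PER_DAY = 86_400


def unix_to_tai(unix_seconds: int) -> int:
    """Convert a Unix timestamp (after 2017-01-01) to TAI seconds since the Unix epoch."""
    return unix_seconds + TAI_MINUS_UTC_SECONDS


def tai_day_index(tai_seconds: int) -> int:
    """Every TAI day is exactly 86,400 s long, so integer division is enough."""
    return tai_seconds // SECONDS_PER_DAY


now_tai = unix_to_tai(int(time.time()))
print(now_tai, tai_day_index(now_tai))
```

This removes the leap-second irregularity only. A machine whose clock is wrong or gets stepped still produces non-monotonic values, which is the caveat raised at the end of the message.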
Re: [Standards] Message-IDs
On 28 Feb 2018, at 09:35, Jonas Wielicki wrote:
> On Wednesday, 28 February 2018 10:28:01 CET Kevin Smith wrote:
>> On 26 Feb 2018, at 15:59, Simon Friedberger wrote:
>>> So, lest this discussion just die. Here is a proposal:
>> Thanks for the proposal. Bashing follows.
>>
>>> Client-A generates message-ID based on HASH(connection_counter,
>>> server_salt). The connection_counter needs to be maintained only for
>>> one connection. The server salt is server generated, anew for each
>>> connection and is sent to.
>>>
>>> Server-A checks that this is correct and uses it for MAM. This
>>> should make life easier for clients because they only need to deal
>>> with one ID.
>>
>> I think stopping servers being able to use their own IDs for DB storage
>> is probably disadvantageous. Although I see the appeal of a client
>> knowing its own MAM IDs, I’m not sure that simply knowing it is
>> sufficient - you also need to know where it fits into the order of the
>> archive, if you’re going to use it for archive sync, so I’m not sure
>> this is actually buying anything, at the cost of lack of flexibility in
>> server implementations.
>
> Good point about the order. This essentially means that we need a
> reflection. Self-carbons essentially. At which point we can simply let
> the server generate the ID(s).

I’m not sure that’s true, as you want to know your ID immediately upon
sending - e.g. for following up with LMC you don’t want to wait for
roundtrips before you can do that. So I think you want the client to be
generating at least some ID used for something.

/K
Re: [Standards] Message-IDs
On Wednesday, 28 February 2018 10:28:01 CET Kevin Smith wrote:
> On 26 Feb 2018, at 15:59, Simon Friedberger wrote:
> > So, lest this discussion just die. Here is a proposal:
> Thanks for the proposal. Bashing follows.
>
> > Client-A generates message-ID based on HASH(connection_counter,
> > server_salt). The connection_counter needs to be maintained only for
> > one connection. The server salt is server generated, anew for each
> > connection and is sent to.
> >
> > Server-A checks that this is correct and uses it for MAM. This
> > should make life easier for clients because they only need to deal
> > with one ID.
>
> I think stopping servers being able to use their own IDs for DB storage is
> probably disadvantageous. Although I see the appeal of a client knowing its
> own MAM IDs, I’m not sure that simply knowing it is sufficient - you also
> need to know where it fits into the order of the archive, if you’re going
> to use it for archive sync, so I’m not sure this is actually buying
> anything, at the cost of lack of flexibility in server implementations.

Good point about the order. This essentially means that we need a reflection.
Self-carbons essentially. At which point we can simply let the server
generate the ID(s).

kind regards,
Jonas
Re: [Standards] Message-IDs
On 26 Feb 2018, at 15:59, Simon Friedberger wrote:
> So, lest this discussion just die. Here is a proposal:

Thanks for the proposal. Bashing follows.

> Client-A generates message-ID based on HASH(connection_counter,
> server_salt). The connection_counter needs to be maintained only for
> one connection. The server salt is server generated, anew for each
> connection and is sent to.
>
> Server-A checks that this is correct and uses it for MAM. This
> should make life easier for clients because they only need to deal
> with one ID.

I think stopping servers being able to use their own IDs for DB storage is
probably disadvantageous. Although I see the appeal of a client knowing its
own MAM IDs, I’m not sure that simply knowing it is sufficient - you also
need to know where it fits into the order of the archive, if you’re going to
use it for archive sync, so I’m not sure this is actually buying anything,
at the cost of lack of flexibility in server implementations.

> * Two problems need to be considered here:
>   o The client needs to maintain a counter.

The literal ‘have a counter in memory’ is trivial, although getting the
rules for incrementing it right can be difficult - more so than for SM IDs,
which there’s another thread at the moment about people not being able to
get right.

>     Even though I called it a counter, it does not need to be contiguous.
>     It just needs to be increasing so that the server can easily check
>     that for a given salt value it is unique.

If it’s not contiguous, how is the server going to go about validating the
hash of an unknown value?

>   o The server needs to check the validity of the counter. If the
>     server is actually replicated and consists of multiple machines
>     this is not strictly possible.

I’m not sure I understand this. If the server salt is local to a node, and
the connection counter is local to a connection, which is local to a node,
even in a split cluster this should be fine?

>     However, assuming normal operations the IDs generated by the client
>     will be fine and if the servers have any mechanism for eventual
>     consistency a misbehaving client will be detected.

Will they? If the server can’t check the stanza at submission time, I don’t
think it can ever (reasonably) check it later.

> Server-B gets the message via s2s. It changes the message-ID to a
> new one and stores the original as "origin-ID".

That’s going to break errors and all sorts isn’t it? A stanza’s ID needs to
be stable or things will break.

> Client-B gets a message with only TWO IDs. message-ID is for
> referencing locally for MAM, origin-ID is for referencing when
> talking to the sender i.e. read receipts.

What happens with MUC? That’s an extra entity that may be doing MAM, and
will generate new stanza IDs for the fan-out.

> If a server generates follow-up messages it makes up a new
> sender-ID. It should maybe set a “triggered-by-ID” so the client can
> determine that it triggered this message. Maybe this is unnecessary.
> The server definitely must send the message it inserted back to the
> client to ensure a common view of history.

What does ‘generates follow-up messages’ here mean?

> If a server changes a message it can keep the sender-ID but it MUST
> notify the client who sent the message to make sure that clients
> have the same view of the history.

What does ‘changes a message’ here mean? There are situations where a
message is modified in flight and the sender can’t be told what it’s
modified to.

> In this proposal stanza-IDs are not required. The message-ID is
> authoritative and when rewriting the original message-ID is kept as
> origin-ID.

I’m not sure they’re not required (see comments on MAM).

> From my original mail this solves C1, C2, C3, C4 and C5.

I’m not sure it helps with C1. It only helps with C2 by going through and
changing every XEP that uses a stanza ID and change it to use an origin-ID,
I think? I don’t think it makes a difference to C3 at all, does it? It
doesn’t help C4, as the client still needs a bounce to get ordering right,
and I don’t see how it handles C5.

> Also note, to make this a simpler change the clients could set both
> origin-ID and message-ID. The stanza-ID for MAM would turn out to be the
> same. This would be very similar to what is probably currently the most
> widespread behavior. Except that the origin-ID should be used for
> read-receipts, etc.

I suspect that just saying in message receipts and in LMC etc. “use the
origin-id when present” would achieve much the same thing as this proposal?

/K
Re: [Standards] Message-IDs
On 13 Feb 2018, at 16:57, Simon Friedberger wrote:
> During the discussion on the different ID types at the summit I had an
> idea for a possible solution to the problem but not a sufficient
> understanding of the problem to even discuss it. I tried to find somebody
> to discuss it with in chat afterwards but nobody was available and I
> forgot about it. To get it off my ToDo list, here is my current
> understanding. I hope it can be a basis for further discussion.
>
>
> A) Status-Quo:
>    Currently there are
>    A1. stanza-ID: generated by server
>    A2. origin-ID: generated by client
>        from https://xmpp.org/extensions/xep-0359.html and
>
>    A3. message-ID: this is the ID-attribute on the stanza
>        from https://tools.ietf.org/html/rfc6120#section-8.1.3
>
>    There are also (4.) SM-IDs in stream management but those are
>    per-stream and unrelated.
>
>
> B) Use-cases:
>    B1. MAM https://xmpp.org/extensions/xep-0313.html uses stanza-ID.
>    B2. MUCs require IDs to detect reflections of own messages.
>        And reflection is great because it gives everybody the same view
>        on the MUC in the presence of things like autopastebin or other
>        rewrites.
>    B3. Error responses have the same ID-attribute as the original stanza.
>
>
> C) Problems with current situation:
>    C1. People dislike having so many different IDs.
>        This is not a problem per se but it does mean implementation
>        complexity and confusion.

I think confusion I buy - we need to be careful to define things properly.

>    C2. According to Daniel it is not clear which ID should be used when
>        referencing things. In other words if he gets a delivery receipt
>        for an ID the client might have based that on the origin-ID or the
>        message-ID.
>        I'm not sure if this should be considered relevant. People can
>        always write broken clients which send back crap. Of course if it
>        happens unintentionally because of (C1.) fewer IDs would help

I don’t think this is particularly unclear, (it’s the id of the stanza - all
the other ids are newer inventions with specific contexts), but easy to
clarify.

>    C3. Using origin-ID to detect MUC reflection doesn't always work
>        because MUCs may not reflect it.
>        That's of course unfortunate but should IMHO be considered an
>        error in the MUC implementation (probably a transport) and fixed
>        there.

Maybe. I note that MUCs stripping out non-body payloads is actually a
feature in some servers.

>    C4. Clients require a bounce of their messages to learn the stanza-id
>        which is used for MAM.
>        Why do they need to know? Maybe they want to reference their own
>        message.

They need to know where their stanza sits in the ordering of the archive
(and its id) if they want to be able to do sync later.

>        Do they require this bounce anyway to make sure that there was
>        no rewriting?

Possibly.

>    C5. Some MUCs rewrite the message-id
>        Why is this allowed? It is even suggested here:
>        https://xmpp.org/extensions/xep-0045.html#message

Mostly it’s allowed because the spec didn’t say not to do it, and it got
moved to Draft, and it was implemented, and so the rules of “don’t make
breaking changes unless unavoidable” applied and it couldn’t be sensibly
changed.

>    C6. A global ID to reference messages might be nice.
>    C7. When referencing a message for example by "liking" it a forgeable
>        ID could get you to like things you didn't intend to like.
>        This is a difficult problem because in many cases it requires
>        malicious clients and servers and those have a lot of power
>        anyway.

Not that much power, relatively. They’re not usually able to rewrite history
in a meaningful way, but with this they become able to (look like they) do
that.

> D) Possible root cause:
>    People do not trust the message IDs assigned by others and therefore
>    want to assign their own.

I’m not sure what this is saying - the root cause of *what*?

> E) Suggested solutions, including partial solutions:
>    E1. message-ID and origin-ID should always be the same, as proposed
>        by Georg in
>        https://mail.jabber.org/pipermail/standards/2017-September/033415.html
>        Some concerns were voiced in that thread; the only valid one is
>        that due to bad software we need to deal with the situation that
>        they are different anyway.
>        There was a privacy concern about the "by=" attribute but
>        origin-ID does not actually have that.
>        According to Daniel and Georg things currently break down anyway
>        if this does not hold.
>    E2. Make the ID verifiable: This is what I had in mind at the summit
>        and after some discussion yesterday Jonas and Dave basically
>        immediately came up with the same thing, so it might be re
Re: [Standards] Message-IDs
On Monday, 26 February 2018 16:59:46 CET Simon Friedberger wrote:
> So, lest this discussion just die. Here is a proposal:
>
>   * Client-A generates message-ID based on HASH(connection_counter,
>     server_salt). The connection_counter needs to be maintained only for
>     one connection. The server salt is server generated, anew for each
>     connection and is sent to.
>
>   * Server-A checks that this is correct and uses it for MAM. This
>     should make life easier for clients because they only need to deal
>     with one ID.
>
>   * Two problems need to be considered here:
>       o The client needs to maintain a counter. I don't know if there
>         are cases where the client cannot persist this counter but keeps
>         a connection. In this case a sufficiently fine grained timestamp
>         to make it strictly monotonically increasing is sufficient. Even
>         though I called it a counter, it does not need to be contiguous.
>         It just needs to be increasing so that the server can easily
>         check that for a given salt value it is unique.
>       o The server needs to check the validity of the counter. If the
>         server is actually replicated and consists of multiple machines
>         this is not strictly possible. However, assuming normal
>         operations the IDs generated by the client will be fine and if
>         the servers have any mechanism for eventual consistency a
>         misbehaving client will be detected. I think this fits the XMPP
>         model of "robust cooperation".
>
>   * Server-B gets the message via s2s. It changes the message-ID to a
>     new one and stores the original as "origin-ID".
>
>   * Client-B gets a message with only TWO IDs. message-ID is for
>     referencing locally for MAM, origin-ID is for referencing when
>     talking to the sender i.e. read receipts.
>
>   * If a server generates follow-up messages it makes up a new
>     sender-ID. It should maybe set a “triggered-by-ID” so the client can
>     determine that it triggered this message. Maybe this is unnecessary.
>     The server definitely must send the message it inserted back to the
>     client to ensure a common view of history.
>
>   * If a server changes a message it can keep the sender-ID but it MUST
>     notify the client who sent the message to make sure that clients
>     have the same view of the history.
>
> In this proposal stanza-IDs are not required. The message-ID is
> authoritative and when rewriting the original message-ID is kept as
> origin-ID.
>
> From my original mail this solves C1, C2, C3, C4 and C5. Mostly just by
> defining them. This does not give us a global message-ID (C6) or
> unforgeable message-IDs (C7).
>
>
> Note, that I would prefer to have a globally unique ID. This is possible
> under the assumption that everybody tries to generate unique IDs and
> that non-unique IDs and misbehaving parties can be removed from the
> system. Essentially, it would look just like this except that the
> message-ID would have to include an ID for the originating server. That
> would allow recipients to check that connection_counter is increasing
> and the server_salt is unique for this server. The latter check might be
> hard to perform, though. It can still be solved using timestamps. This
> proposal seems much simpler, and it solves most of the problems.
>
>
> Also note, to make this a simpler change the clients could set both
> origin-ID and message-ID. The stanza-ID for MAM would turn out to be the
> same. This would be very similar to what is probably currently the most
> widespread behavior. Except that the origin-ID should be used for
> read-receipts, etc.
>
>
> Opinions?

I find the overall concept very appealing. Thank you for taking the time to
work this out.

I think you overestimate some complexities there (which is good) with regard
to clustering etc. If a server uses a 128bit random number for the server
salt and we enforce the counter to be continuous and monotonic, I don’t see
any interaction between cluster nodes needed.

Likewise for the state keeping on the client side: If a client can keep a
connection, it should be able to keep an 8 byte counter state along with it.

What needs to be specified is counter overflow. Could be done with a simple
request from the client for a new salt.

I don’t see a good way to integrate the date in the message ID though (cc @
Zash). Even if we let the server define a must-have prefix which they could
incidentally set to the date, a way to handle date changes during a
connection would be needed.

kind regards,
Jonas
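The server-side check described in this reply (128-bit random salt, contiguous 8-byte counter, new salt on overflow) might look roughly like the Python sketch below. SHA-256, the byte layout, and every class and method name are assumptions made for illustration; the thread does not define a hash function or a wire format.

```python
import hashlib
import secrets

COUNTER_BYTES = 8                        # the "8 byte counter" from the mail
COUNTER_LIMIT = 2 ** (8 * COUNTER_BYTES)


def make_id(salt: bytes, counter: int) -> str:
    """HASH(connection_counter, server_salt); SHA-256 is an assumption."""
    return hashlib.sha256(salt + counter.to_bytes(COUNTER_BYTES, "big")).hexdigest()


class ConnectionIdChecker:
    """Per-connection state kept by the node that owns the connection."""

    def __init__(self) -> None:
        self.salt = secrets.token_bytes(16)  # 128-bit random salt
        self.expected = 0                    # next counter value the client must use

    def accept(self, message_id: str) -> bool:
        if self.expected >= COUNTER_LIMIT:
            # Counter exhausted: the client has to request a new salt first.
            return False
        if message_id != make_id(self.salt, self.expected):
            return False
        self.expected += 1
        return True

    def renew_salt(self) -> bytes:
        """Hypothetical 'new salt' request: fresh salt, counter restarts at zero."""
        self.salt = secrets.token_bytes(16)
        self.expected = 0
        return self.salt


checker = ConnectionIdChecker()
first = make_id(checker.salt, 0)
second = make_id(checker.salt, 1)
print(checker.accept(first), checker.accept(first), checker.accept(second))
```

Because the counter is contiguous, the server always knows the single value it has to hash next, so validation is one hash and one comparison. No state is shared between cluster nodes.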
Re: [Standards] Message-IDs
So, lest this discussion just die. Here is a proposal:

  * Client-A generates message-ID based on HASH(connection_counter,
    server_salt). The connection_counter needs to be maintained only for
    one connection. The server salt is server generated, anew for each
    connection and is sent to.

  * Server-A checks that this is correct and uses it for MAM. This
    should make life easier for clients because they only need to deal
    with one ID.

  * Two problems need to be considered here:
      o The client needs to maintain a counter. I don't know if there
        are cases where the client cannot persist this counter but keeps
        a connection. In this case a sufficiently fine grained timestamp
        to make it strictly monotonically increasing is sufficient. Even
        though I called it a counter, it does not need to be contiguous.
        It just needs to be increasing so that the server can easily
        check that for a given salt value it is unique.
      o The server needs to check the validity of the counter. If the
        server is actually replicated and consists of multiple machines
        this is not strictly possible. However, assuming normal
        operations the IDs generated by the client will be fine and if
        the servers have any mechanism for eventual consistency a
        misbehaving client will be detected. I think this fits the XMPP
        model of "robust cooperation".

  * Server-B gets the message via s2s. It changes the message-ID to a
    new one and stores the original as "origin-ID".

  * Client-B gets a message with only TWO IDs. message-ID is for
    referencing locally for MAM, origin-ID is for referencing when
    talking to the sender i.e. read receipts.

  * If a server generates follow-up messages it makes up a new
    sender-ID. It should maybe set a “triggered-by-ID” so the client can
    determine that it triggered this message. Maybe this is unnecessary.
    The server definitely must send the message it inserted back to the
    client to ensure a common view of history.

  * If a server changes a message it can keep the sender-ID but it MUST
    notify the client who sent the message to make sure that clients
    have the same view of the history.

In this proposal stanza-IDs are not required. The message-ID is
authoritative and when rewriting the original message-ID is kept as
origin-ID.

From my original mail this solves C1, C2, C3, C4 and C5. Mostly just by
defining them. This does not give us a global message-ID (C6) or
unforgeable message-IDs (C7).


Note, that I would prefer to have a globally unique ID. This is possible
under the assumption that everybody tries to generate unique IDs and
that non-unique IDs and misbehaving parties can be removed from the
system. Essentially, it would look just like this except that the
message-ID would have to include an ID for the originating server. That
would allow recipients to check that connection_counter is increasing
and the server_salt is unique for this server. The latter check might be
hard to perform, though. It can still be solved using timestamps. This
proposal seems much simpler, and it solves most of the problems.


Also note, to make this a simpler change the clients could set both
origin-ID and message-ID. The stanza-ID for MAM would turn out to be the
same. This would be very similar to what is probably currently the most
widespread behavior. Except that the origin-ID should be used for
read-receipts, etc.


Opinions?
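For the client side of the first bullet, a minimal Python sketch could look like this. It assumes SHA-256 as HASH, an 8-byte big-endian counter, and a counter that starts at zero and increases by one per message; none of these choices are fixed by the proposal, and the class name is invented.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class ClientIdGenerator:
    server_salt: bytes  # sent by the server once per connection
    counter: int = 0    # connection_counter; lives only as long as the connection

    def next_id(self) -> str:
        # HASH(connection_counter, server_salt); SHA-256 and the byte layout
        # are assumptions, not part of the proposal.
        digest = hashlib.sha256(self.server_salt + self.counter.to_bytes(8, "big"))
        self.counter += 1
        return digest.hexdigest()


gen = ClientIdGenerator(server_salt=b"example-salt-from-server")
ids = [gen.next_id() for _ in range(3)]
print(ids)
```

The client knows each ID before the stanza leaves, so it can refer to its own message without waiting for a reflection from the server.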
Re: [Standards] Message-IDs
On Tuesday, 13 February 2018 21:42:56 CET Simon Friedberger wrote:
> >> ...
> > You are mixing multiple problems with multiple solutions, which was
> > probably in an effort to get the whole picture, but also leads to
> > confusion. I personally would like to concentrate on solving C4, where
> > you pointed out a promising candidate for a solution: E2
>
> Indeed. Mostly because I still don't think that I understand the
> complete picture.
> For example, if we are only trying to solve C4, is that really worth the
> effort?
> Does it do anything more than save a round-trip?

Yes. The "round-trip" you’re speaking of may be excessively expensive.
Essentially, if a client wants to know the stanza-id of a message it sent,
it needs to do a MAM query starting with the last known stanza-id and do
some matching. There is no other way (because you don’t get carbons for
messages you sent yourself).

No client is doing this afaik. Clients which do not do this have to resort
to some kind of heuristic when syncing MAM at a later point. So we’re
solving a "round trip or annoying heuristic" situation.

This is worse than it sounds, because it makes clients much more complex (or
I am missing something; that would be great.): If a client wants to refer to
messages internally by some unique ID, it would be natural to use the
stanza-id, because that ID can be used with MAM queries, too. However,
that’s not possible if you don’t know the stanza-id for outbound messages.
So instead, clients need to add a layer of indirection with yet-another
client-internal ID for the message (probably most of the time some type of
auto-increment integer).

kind regards,
Jonas
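The layer of indirection described above can be sketched in Python as follows. The store, its field names, and the `learn_stanza_id` hook are hypothetical; how the client actually learns the stanza-id (MAM query, reflection, or something else) is exactly the open question in this thread.

```python
from dataclasses import dataclass
from itertools import count
from typing import Dict, Optional


@dataclass
class StoredMessage:
    local_id: int                    # client-internal auto-increment ID
    origin_id: str                   # chosen by the client when sending
    body: str
    stanza_id: Optional[str] = None  # assigned by the archive; unknown at send time


class MessageStore:
    def __init__(self) -> None:
        self._next_local = count(1)
        self._by_local: Dict[int, StoredMessage] = {}
        self._by_origin: Dict[str, int] = {}

    def add_outbound(self, origin_id: str, body: str) -> StoredMessage:
        msg = StoredMessage(next(self._next_local), origin_id, body)
        self._by_local[msg.local_id] = msg
        self._by_origin[origin_id] = msg.local_id
        return msg

    def learn_stanza_id(self, origin_id: str, stanza_id: str) -> bool:
        """Call when an archive result or reflection carries both IDs."""
        local_id = self._by_origin.get(origin_id)
        if local_id is None:
            return False
        self._by_local[local_id].stanza_id = stanza_id
        return True


store = MessageStore()
sent = store.add_outbound("origin-1", "hello")
print(sent.stanza_id)  # not known yet
store.learn_stanza_id("origin-1", "archive-42")
print(sent.stanza_id)
```

Until `learn_stanza_id` has been called, the message cannot be used as a starting point for a MAM query, which is why the client needs its own `local_id` in the meantime.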
Re: [Standards] Message-IDs
Hi Michal,

thank you for your comments. I will address them inline.

> I'm really tempted to say that the new message routing (in next gen
> XMPP as discussed during summit)
> must require the message stanza to have "id" attribute. I personally
> think that uuid v4 would be enough here.
> This, to my knowledge, is hard to guess so a malicious user is
> probably not able to guess next ID.
> What it can do, though is to "reuse" the same id in other message,
> which may be a bad thing.

So from the discussion we had in the summit-MUC it seems like abusing a
guessed ID is not possible anyway if senders are properly verified. If
anybody thinks otherwise, please speak up!
Indeed, reusing IDs for different messages is always possible but can be
mitigated by requiring the ID to be a function of the message.

> E2. ...
>
>
> Making the id verifiable (in the most efficient way) would be perfect.
> I think, here we need to remember that not every client will have SM
> enabled, so it may not have the sm-counter.

Good point, thanks for bringing it up. This can probably be solved using
something like the salt based variant of E2.
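One way to read "the ID is a function of the message" is to mix the message body into the hash, sketched below in Python. Using HMAC-SHA-256 keyed with the per-connection salt is an assumption for illustration, as are the function names; the thread does not say which parts of the message would be covered.

```python
import hashlib
import hmac


def content_bound_id(salt: bytes, counter: int, body: str) -> str:
    """ID derived from the per-connection salt, a counter, and the message body."""
    material = counter.to_bytes(8, "big") + body.encode("utf-8")
    return hmac.new(salt, material, hashlib.sha256).hexdigest()


def id_matches(salt: bytes, counter: int, body: str, claimed_id: str) -> bool:
    """Recompute the ID and compare in constant time."""
    return hmac.compare_digest(content_bound_id(salt, counter, body), claimed_id)


salt = b"per-connection-salt"
msg_id = content_bound_id(salt, 7, "original text")
print(id_matches(salt, 7, "original text", msg_id))
print(id_matches(salt, 7, "different text", msg_id))
```

Only a party that knows the salt (here, the sender's server) can run this check, so it detects ID reuse at submission time and does nothing for downstream recipients.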
Re: [Standards] Message-IDs
Hi Simon

Thanks for refreshing the topic. A few things from me below (perspective of
XMPP server developer, MongooseIM).

Best regards
Michal Piotrowski
michal.piotrow...@erlang-solutions.com

On 13 February 2018 at 17:57, Simon Friedberger wrote:
> Hello List!
>
>
> During the discussion on the different ID types at the summit I had an
> idea for a possible solution to the problem but not a sufficient
> understanding of the problem to even discuss it. I tried to find somebody
> to discuss it with in chat afterwards but nobody was available and I
> forgot about it. To get it off my ToDo list, here is my current
> understanding. I hope it can be a basis for further discussion.
>
>
> A) Status-Quo:
>    Currently there are
>    A1. stanza-ID: generated by server
>    A2. origin-ID: generated by client
>        from https://xmpp.org/extensions/xep-0359.html and
>
>    A3. message-ID: this is the ID-attribute on the stanza
>        from https://tools.ietf.org/html/rfc6120#section-8.1.3
>
>    There are also (4.) SM-IDs in stream management but those are
>    per-stream and unrelated.
>
>
> B) Use-cases:
>    B1. MAM https://xmpp.org/extensions/xep-0313.html uses stanza-ID.
>    B2. MUCs require IDs to detect reflections of own messages.
>        And reflection is great because it gives everybody the same view
>        on the MUC in the presence of things like autopastebin or other
>        rewrites.
>    B3. Error responses have the same ID-attribute as the original stanza.
>
>
> C) Problems with current situation:
>    C1. People dislike having so many different IDs.
>        This is not a problem per se but it does mean implementation
>        complexity and confusion.

I'm really tempted to say that the new message routing (in next gen XMPP as
discussed during summit) must require the message stanza to have "id"
attribute. I personally think that uuid v4 would be enough here.
This, to my knowledge, is hard to guess so a malicious user is probably not
able to guess next ID.
What it can do, though is to "reuse" the same id in other message, which may
be a bad thing.

>    C2. According to Daniel it is not clear which ID should be used when
>        referencing things. In other words if he gets a delivery receipt
>        for an ID the client might have based that on the origin-ID or the
>        message-ID.
>        I'm not sure if this should be considered relevant. People can
>        always write broken clients which send back crap. Of course if it
>        happens unintentionally because of (C1.) fewer IDs would help
>    C3. Using origin-ID to detect MUC reflection doesn't always work
>        because MUCs may not reflect it.
>        That's of course unfortunate but should IMHO be considered an
>        error in the MUC implementation (probably a transport) and fixed
>        there. I understand that it might be difficult in some cases
>        ( https://lab.louiz.org/louiz/biboumi/issues/3283 ) but as Daniel
>        already pointed out yesterday it is much easier to fix a
>        transport, since it knows which protocol it is talking to, instead
>        of working around it at the end.
>        In any case the current situation seems to be bad:
>        https://wiki.xmpp.org/web/XEP-Remarks/XEP-0045:_Multi-User_Chat#Matching_Your_Reflected_Message
>    C4. Clients require a bounce of their messages to learn the stanza-id
>        which is used for MAM.
>        Why do they need to know? Maybe they want to reference their own
>        message.

They may need that, for instance, to know from where they can start syncing
the archive after being offline.

>        Do they require this bounce anyway to make sure that there was
>        no rewriting?
>    C5. Some MUCs rewrite the message-id
>        Why is this allowed? It is even suggested here:
>        https://xmpp.org/extensions/xep-0045.html#message
>    C6. A global ID to reference messages might be nice.
>    C7. When referencing a message for example by "liking" it a forgeable
>        ID could get you to like things you didn't intend to like.
>        This is a difficult problem because in many cases it requires
>        malicious clients and servers and those have a lot of power
>        anyway.
>
>
> D) Possible root cause:
>    People do not trust the message IDs assigned by others and therefore
>    want to assign their own.
>
>
> E) Suggested solutions, including partial solutions:
>    E1. message-ID and origin-ID should always be the same, as proposed
>        by Georg in
>        https://mail.jabber.org/pipermail/standards/2017-September/033415.html
>        Some concerns were voiced in that thread; the only valid one is
>        that due to bad software we need to deal with the situation that
>        they are different anyway.
>        There was a privacy concern about the "by=" attribute but
>        origin-ID does not actually have that.
>        According to Daniel and
Re: [Standards] Message-IDs
On 13.02.2018 21:42, Simon Friedberger wrote:
> On 13.02.2018 17:57, Simon Friedberger wrote:
>>> C2. According to Daniel it is not clear which ID should be used when
>>>     referencing things. In other words if he gets a delivery receipt
>>>     for an ID the client might have based that on the origin-ID or the
>>>     message-ID.
>> Delivery receipts predate xep359 so it is safe to say that the intention
>> is that delivery receipts use rfc6120-ids. While it is IMHO obvious from
>> reading xep184 that it is based on rfc6120-ids, it can't hurt to specify
>> this more explicitly.
> But looking at https://xmpp.org/extensions/xep-0045.html#message
> the message-ID seems to be rewritten to different values for different
> recipients.
> How can a client who gets a delivery receipt with such an ID figure out
> which message it is for?

You cannot reliably figure it out with the current specifications. One
possible option is to extend xep184 receipts to (optionally) include xep359
IDs. Maybe that would even be a backwards compatible change, e.g. clients
could check for the xep359 ID in the receipt and fall back to the rfc6120 ID.

- Florian
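The fallback suggested here could be sketched in Python as below. The `urn:xmpp:receipts` and `urn:xmpp:sid:0` namespaces are the ones used by XEP-0184 and XEP-0359, but placing an `<origin-id/>` child inside `<received/>` is a hypothetical extension invented for this example, as is the function name; neither XEP defines it.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional

RECEIPTS_NS = "urn:xmpp:receipts"  # XEP-0184
SID_NS = "urn:xmpp:sid:0"          # XEP-0359


def find_receipted_message(received: ET.Element,
                           sent_by_origin_id: Dict[str, str],
                           sent_by_stanza_attr_id: Dict[str, str]) -> Optional[str]:
    """Prefer a XEP-0359 ID if the receipt carries one, else use the RFC 6120 id."""
    # Hypothetical extension: an <origin-id/> child inside <received/>.
    origin = received.find(f"{{{SID_NS}}}origin-id")
    if origin is not None and origin.get("id") in sent_by_origin_id:
        return sent_by_origin_id[origin.get("id")]
    # Legacy behaviour: the id attribute of <received/> is the RFC 6120 id.
    return sent_by_stanza_attr_id.get(received.get("id"))


extended = ET.fromstring(
    "<received xmlns='urn:xmpp:receipts' id='rewritten-by-muc'>"
    "<origin-id xmlns='urn:xmpp:sid:0' id='o-1'/>"
    "</received>"
)
legacy = ET.fromstring("<received xmlns='urn:xmpp:receipts' id='s-1'/>")

by_origin = {"o-1": "message A"}
by_attr = {"s-1": "message A"}
print(find_receipted_message(extended, by_origin, by_attr))
print(find_receipted_message(legacy, by_origin, by_attr))
```

A client that does not know the extension ignores the extra child and keeps using the `id` attribute, which is what would make the change backwards compatible.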
Re: [Standards] Message-IDs
Hi Florian, thanks for chiming in!

On 13.02.2018 17:57, Simon Friedberger wrote:
>> C2. According to Daniel it is not clear which ID should be used when
>> referencing things. In other words if he gets a delivery receipt for an
>> ID the client might have based that on the origin-ID or the message-ID.
> Delivery receipts predate xep359 so it is safe to say that the intention
> is that delivery receipts use rfc6120-ids. While it is IMHO obvious from
> reading xep184 that it is based on rfc6120-ids, it can't hurt to specify
> this more explicitly.

But looking at https://xmpp.org/extensions/xep-0045.html#message
the message-ID seems to be rewritten to different values for different
recipients.
How can a client who gets a delivery receipt with such an ID figure out
which message it is for?

>> E) Suggested solutions, including partial solutions:
>> E1. message-ID and origin-ID should always be the same,
>> According to Daniel and Georg things currently break down anyway
>> if this does not hold.
> I don't know why things should break down if this does not hold.

I think because it is difficult to match IDs to messages due to the
reasons mentioned above.

>> C3. Using origin-ID to detect MUC reflection doesn't always work
>> because MUCs may not reflect it.
> A short note: If a MUC service announces support for 'urn:xmpp:sid:0'
> then the service is required to reflect the xep359 IDs. So clients are
> at least able to determine if the MUC will reflect the xep359 extension
> elements (but not if the MUC won't).

And client developers should probably refuse to join MUCs that don't.
Mandating it in the standard might still be good motivation for transport
implementers.

>> C5. Some MUCs rewrite the message-id
>> Why is this allowed? It is even suggested here:
>> https://xmpp.org/extensions/xep-0045.html#message
> Hehe, that's an old discussion. Some people argue that the reflected
> message is not the initial message and thus, could get a new ID. I also
> think that the MUC way wants to enforce unique IDs for reflected
> messages, which may not be guaranteed if the MUC service would need to
> use the client provided ID.
>
> No matter what, I doubt that this will change in the future. Although I
> have currently a neutral stance, XEP-0045 is to some degree set in
> stone, it is unlikely to get such a fundamental change.

This is an interesting point. I overlooked that it is exacerbated by the
fact that some MUCs split messages so an ID for some messages is simply
not available. Hm... What is the correct behavior here?

Clearly, having messages with the same ID does not work for referencing,
corrections, whatever. On the other hand if a new ID is generated the
client needs to be told that the server just made it say something and it
can now expect delivery receipts for that.

When hashing the message this is forced. It will change the ID and the
client has to know. I don't see how this can be solved without a "bounce"
since the bounce isn't one because the server generated the message.

>> ...
> Sounds like an interesting approach which we should explore.

But apparently it doesn't work. xD

>> ...
> You are mixing multiple problems with multiple solutions, which was
> probably in an effort to get the whole picture, but also leads to
> confusion. I personally would like to concentrate on solving C4, where
> you pointed out a promising candidate for a solution: E2

Indeed. Mostly because I still don't think that I understand the complete
picture. For example, if we are only trying to solve C4, is that really
worth the effort? Does it do anything more than save a round-trip?
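To make the note on 'urn:xmpp:sid:0' in this message concrete, here is a minimal sketch of how a client might decide whether an incoming groupchat message is a reflection of one of its own. It assumes the client already has the set of features the MUC announced (how that set is obtained is out of scope here) and keeps a set of origin-IDs it is still waiting for; the function name and both parameters are illustrative, not taken from any library.

```python
import xml.etree.ElementTree as ET

SID_NS = "urn:xmpp:sid:0"  # XEP-0359


def is_own_reflection(stanza_xml, pending_origin_ids, muc_features):
    """Return True, False, or None (cannot tell).

    True:  the stanza carries an origin-id we are waiting for.
    False: no match, and the MUC announced 'urn:xmpp:sid:0', so it is
           required to reflect origin-ids and a reflection would match.
    None:  no match, but the MUC made no such promise, so the message
           may still be our own with the origin-id stripped.
    """
    origin = ET.fromstring(stanza_xml).find(f"{{{SID_NS}}}origin-id")
    if origin is not None and origin.get("id") in pending_origin_ids:
        return True
    return False if SID_NS in muc_features else None
```

The None case is exactly the "but not if the MUC won't" situation from the quoted note: without the announced feature the client has to fall back to some other heuristic or, as suggested above, refuse to join.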
Re: [Standards] Message-IDs
(Hm.. sorry for the screwed-up line lengths.)

Note that my suggestion:

On 13.02.2018 17:57, Simon Friedberger wrote:
> E2. Make the ID verifiable: This is what I had in mind at the summit and
> after some discussion yesterday Jonas and Dave basically immediately
> came up with the same thing, so it might be reasonably straightforward.
> Basically, the client calculates the ID based on some information that
> it shares with the server like HASH(stream-id || sm-counter). This would
> allow the server to verify that the client generated a proper ID. Jonas
> suggested HMAC(key=stream-id, msg=sm-counter). If the message is in a
> MUC, the MUC server can provide the user with some salt and then a
> HASH(message-counter || salt) could be used to ensure that proper unique
> IDs are generated.
> This ID is based on there being a party which is in charge of checking
> the IDs. If you connect to a malicious MUC with malicious clients they
> can still send you whatever. I don't think that is a problem, is it?

does not solve the problem that a malicious server can send out messages
with duplicate IDs. The servers or clients receiving them have no way to
check. This could be fixed by including the message body (and whatever
else seems appropriate) in the hash, which would leave us with something
like HASH(message-body || HASH(stream-id || sm-counter)), and
HASH(stream-id || sm-counter) would have to be transmitted to remote
servers.

This does prevent (C7.)

> C7. When referencing a message for example by "liking" it a forgeable ID
> could get you to like things you didn't intend to like.
> This is a difficult problem because in many cases it requires malicious
> clients and servers and those have a lot of power anyway.

But I'm not sure it is necessary given that messages are not authenticated
anyway. They aren't even for OMEMO. They could theoretically be but the
attack still seems a bit academic. Anyway, hashes are generally cheap and
it might not hurt to include the entire message in the hash.
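As a rough illustration of the construction discussed in this message, the sketch below derives a per-message token with HMAC(key=stream-id, msg=sm-counter), as Jonas suggested, and binds the message body to it with an outer hash. Everything concrete here is an assumption made for the example and not part of any proposal in the thread: SHA-256 as the hash, UTF-8 encoding, the counter rendered as a decimal string, and all function names.

```python
import hashlib
import hmac


def stream_token(stream_id: str, sm_counter: int) -> bytes:
    """Inner value: HMAC(key=stream-id, msg=sm-counter)."""
    return hmac.new(stream_id.encode("utf-8"),
                    str(sm_counter).encode("ascii"),
                    hashlib.sha256).digest()


def message_id(body: str, token: bytes) -> str:
    """Outer value: HASH(message-body || token), hex encoded."""
    return hashlib.sha256(body.encode("utf-8") + token).hexdigest()


def server_accepts(claimed_id: str, body: str,
                   stream_id: str, sm_counter: int) -> bool:
    """The sender's own server knows stream-id and counter and can
    recompute the whole ID."""
    expected = message_id(body, stream_token(stream_id, sm_counter))
    return hmac.compare_digest(claimed_id.encode("utf-8"),
                               expected.encode("ascii"))


def remote_accepts(claimed_id: str, body: str, token: bytes) -> bool:
    """A remote party only sees the transmitted token, not the stream
    state, but can still check that the ID is bound to this body."""
    expected = message_id(body, token)
    return hmac.compare_digest(claimed_id.encode("utf-8"),
                               expected.encode("ascii"))
```

Because the token has a fixed length and comes last, the concatenation is unambiguous. A remote party can only confirm that ID and body belong together; whether the token itself is honest still depends on the sending server, which matches the caveat about malicious servers above.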
[Standards] Message-IDs
Hello List!

During the discussion on the different ID types at the summit I had an
idea for a possible solution to the problem but not a sufficient
understanding of the problem to even discuss it. I tried to find somebody
to discuss it with in chat afterwards but nobody was available and I
forgot about it.

To get it off my ToDo list, here is my current understanding. I hope it
can be a basis for further discussion.

A) Status-Quo:
Currently there are
A1. stanza-ID: generated by server
A2. origin-ID: generated by client
    from https://xmpp.org/extensions/xep-0359.html
and
A3. message-ID: this is the ID-attribute on the stanza
    from https://tools.ietf.org/html/rfc6120#section-8.1.3
There are also (4.) SM-IDs in stream management but those are per-stream
and unrelated.

B) Use-cases:
B1. MAM https://xmpp.org/extensions/xep-0313.html uses stanza-ID.
B2. MUCs require IDs to detect reflections of own messages. And
    reflection is great because it gives everybody the same view on the
    MUC in the presence of things like autopastebin or other rewrites.
B3. Error responses have the same ID-attribute as the original stanza.

C) Problems with current situation:
C1. People dislike having so many different IDs. This is not a problem
    per se but it does mean implementation complexity and confusion.
C2. According to Daniel it is not clear which ID should be used when
    referencing things. In other words if he gets a delivery receipt for
    an ID the client might have based that on the origin-ID or the
    message-ID.
    I'm not sure if this should be considered relevant. People can always
    write broken clients which send back crap. Of course if it happens
    unintentionally because of (C1.) fewer IDs would help.
C3. Using origin-ID to detect MUC reflection doesn't always work because
    MUCs may not reflect it.
    That's of course unfortunate but should IMHO be considered an error in
    the MUC implementation (probably a transport) and fixed there. I
    understand that it might be difficult in some cases
    ( https://lab.louiz.org/louiz/biboumi/issues/3283 ) but as Daniel
    already pointed out yesterday it is much easier to fix a transport,
    since it knows which protocol it is talking to, instead of working
    around it at the end.
    In any case the current situation seems to be bad:
    https://wiki.xmpp.org/web/XEP-Remarks/XEP-0045:_Multi-User_Chat#Matching_Your_Reflected_Message
C4. Clients require a bounce of their messages to learn the stanza-id
    which is used for MAM.
    Why do they need to know? Maybe they want to reference their own
    message. Do they require this bounce anyway to make sure that there
    was no rewriting?
C5. Some MUCs rewrite the message-id
    Why is this allowed? It is even suggested here:
    https://xmpp.org/extensions/xep-0045.html#message
C6. A global ID to reference messages might be nice.
C7. When referencing a message for example by "liking" it a forgeable ID
    could get you to like things you didn't intend to like.
    This is a difficult problem because in many cases it requires
    malicious clients and servers and those have a lot of power anyway.

D) Possible root cause:
People do not trust the message IDs assigned by others and therefore want
to assign their own.

E) Suggested solutions, including partial solutions:
E1. message-ID and origin-ID should always be the same, as proposed by
    Georg in
    https://mail.jabber.org/pipermail/standards/2017-September/033415.html
    Some concerns were voiced in that thread; the only valid one is that
    due to bad software we need to deal with the situation that they are
    different anyway.
    There was a privacy concern about the "by=" attribute but origin-ID
    does not actually have that.
    According to Daniel and Georg things currently break down anyway if
    this does not hold.
E2. Make the ID verifiable: This is what I had in mind at the summit and
    after some discussion yesterday Jonas and Dave basically immediately
    came up with the same thing, so it might be reasonably
    straightforward.
    Basically, the client calculates the ID based on some information
    that it shares with the server like HASH(stream-id || sm-counter).
    This would allow the server to verify that the client generated a
    proper ID. Jonas suggested HMAC(key=stream-id, msg=sm-counter). If
    the message is in a MUC, the MUC server can provide the user with
    some salt and then a HASH(message-counter || salt) could be used to
    ensure that proper unique IDs are generated.
    This ID is based on there being a party which is in charge of
    checking the IDs. If you connect to a malicious MUC with malicious client