Re: [Standards] NEW: XEP-0313 (Message Archive Management)
On 25 May 2012, at 13:52, Kevin Smith ke...@kismith.co.uk wrote:

> On Fri, May 25, 2012 at 12:42 PM, Thijs Alkemade th...@xnyhps.nl wrote:
>> I've started implementing 0313 in libpurple/Adium, and I think Matthew explained my concerns quite well. Your suggestion assumes that once a client receives an incoming message from the server, everything the client sent before that moment was received by the server successfully (it makes sense to require Carbons to do MAM, but let's assume that Stream Management is not enabled).
>>
>> Suppose the last session ended with these two messages, on a high-latency connection which got interrupted:
>>
>> <snip/>
>>
>> If the client thinks message 12345 came before 9876, while the server thinks it's the other way around, then requesting the archive from abcde will duplicate message 12345.
>
> Yes. Always requesting based on the uid of the last message that you received will result in receiving from the server duplicates of any messages you have sent since then, and you'll have to not double-store them. XEP-0198 means that you know which of your sent stanzas have been processed by the server and does, I think, guarantee your history is complete and you're likely to end up, on average, with ~1 duplicated stanza to deal with on each login.
>
> The simple implementation is that you don't store in the cache anything that happened after the last message received from the server - and you know the ordering of your own stanzas vs the stanzas you received based on the ordering of the acks/messages you received from the server.
>
> /K

[Reviving a pretty old thread.]

I've been thinking about this a bit more recently. To summarize, the scenario I mostly consider tricky is:

* A user has a conversation on a wifi^H^H^H^Hbad connection.
* At some point, the connection is lost. The client doesn't immediately notice, so the user sends n more messages before the client notices it's not connected anymore.
* The client logs in again some time later.

How should it query the archive?

It can query based on the UID of the last <archived/> on an incoming message, but then it will get its outgoing messages again. It can ignore the first n of those outgoing messages, but not all might have arrived on the server. The only comparison it can do is based on their contents or timestamps, neither of which is very unique or consistent.

It can, as Kev suggested, not store the outgoing messages until an incoming message is received, but I don't think users will appreciate their archive being incomplete, even when we can't guarantee those messages were actually received.

I propose this: outgoing messages don't only get a UID, but also some session identifier. This SID stays the same for all outgoing messages during one login and the client can obtain it from the server (using an iq at login, for instance). For a client it becomes easy to see which of its messages from the last session made it to the server (it can even flag those that never arrived) and it can just request all those since the last known UID, ignoring all those with the previous SID. An additional benefit is that it becomes easier to group MAM messages by conversation.

This could even be done without support on the server: the client just adds a tag to each message with a SID it generated itself. However, it can't verify the SID is unique within the archive, it increases the size of every message and it has no meaning for the recipient of the message.

I know the goal of XEP-0313 is to not get as complicated as XEP-0136, but in my opinion the extra complexity makes it much easier to synchronize history consistently. Clients can opt to ignore it, and for servers it's just a little extra logic to generate another identifier.

Regards,
Thijs
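The SID proposal above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Outgoing` and `Archived` records, the `reconcile` helper and the idea that archive results expose a `sid` field are all hypothetical, and the sketch additionally assumes that messages which reached the server form an in-order prefix of what the client sent.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Outgoing:
    """A locally logged outgoing message whose fate is still unknown."""
    body: str
    uid: Optional[str] = None         # archive UID, filled in once confirmed
    delivered: Optional[bool] = None  # None means "not yet known"


@dataclass
class Archived:
    """One archive result; sid is None for messages this client did not send."""
    uid: str
    body: str
    sid: Optional[str] = None


def reconcile(pending: List[Outgoing], results: List[Archived],
              previous_sid: str) -> List[Archived]:
    """Match archive results against the previous session's outgoing messages.

    Results carrying previous_sid are our own messages that reached the
    server. Assumption: they arrived in order, so they confirm the first
    len(own) pending messages; the remainder is flagged as never delivered.
    Returns the results that are genuinely new to the local cache.
    """
    own = [m for m in results if m.sid == previous_sid]
    fresh = [m for m in results if m.sid != previous_sid]
    for index, message in enumerate(pending):
        if index < len(own):
            message.uid = own[index].uid
            message.delivered = True
        else:
            message.delivered = False
    return fresh
```

With three pending messages and two results carrying the previous SID, the first two pending messages get their UIDs and the third is flagged as undelivered, without comparing bodies or timestamps.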
Re: [Standards] NEW: XEP-0313 (Message Archive Management)
On 20 Apr 2012, at 10:32, Kevin Smith wrote:

> On Thu, Apr 19, 2012 at 6:01 PM, Matthew Wild mwi...@gmail.com wrote:
>> One solution I came up with was for an entity that relays and archives messages to stamp the message with: <archived by="capulet.lit" id="1234-5678"/> or <archived by="conference.jabber.org" id="8765-4321"/>. I'd be interested in feedback on this idea.
>
> Yes, we need (archiving, rather than stanza) ids stamped on the archived stanzas.
>
>> However even <archived/> doesn't cover the case of the client knowing the id of its *outgoing* messages. The server could echo them back with <archived/>... but then things start to get a bit muddy. The alternative is to not solve this, and clients should treat the MAM archive as the canonical source of history - (therefore fetching messages from the archive that have already been sent/received by it). A waste of bandwidth if nothing else.
>
> You will only need to request (assuming you have carbons) on average less than a single message that's a duplicate, though - as IM is typically send a message/receive a message [yes, there are exceptions where this is potentially very untrue], and you will have the id of the message you received.

I've started implementing 0313 in libpurple/Adium, and I think Matthew explained my concerns quite well. Your suggestion assumes that once a client receives an incoming message from the server, everything the client sent before that moment was received by the server successfully (it makes sense to require Carbons to do MAM, but let's assume that Stream Management is not enabled).

Suppose the last session ended with these two messages, on a high-latency connection which got interrupted:

C: <message id='12345' to='example.com'>
     <body>Hello</body>
   </message>

S: <message id='9876' from='example.com'>
     <body>Hey</body>
     <archived id='abcde' by='example.com' />
   </message>

If the client thinks message 12345 came before 9876, while the server thinks it's the other way around, then requesting the archive from abcde will duplicate message 12345.

On the other hand, if the client requests the archive starting from abcde and does not receive message 12345, it cannot know for sure whether 12345 was even received by the server (the spec never mentions it, but in my opinion being able to mark a message as "we thought this message was sent, but the server never got it" is a necessary part of synchronizing your logs).

Not a typical case, sure, but also not something that is very unlikely to ever occur, and I think it's important to keep the client's logs as consistent as possible.

I don't really have a good solution to propose, though. Replying to every outgoing message with something that includes the UID it was logged with could work, but it might add quite a bit of overhead. Stream Management could help with the latter problem, but not the former.

Regards,
Thijs
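The ordering mismatch described above can be reproduced with a small simulation. It is a sketch only: the dictionary layout, the `query_after` helper and the UID "fghij" for the archived copy of "Hello" are made up for illustration, and nothing here talks to a real server.

```python
def query_after(archive, uid):
    """Return the archive entries stored strictly after the entry with this UID."""
    uids = [entry["uid"] for entry in archive]
    return archive[uids.index(uid) + 1:]


# Server-side order: the server processed 'Hey' first, then 'Hello'.
server_archive = [
    {"uid": "abcde", "body": "Hey"},
    {"uid": "fghij", "body": "Hello"},  # hypothetical UID, unknown to the client
]

# Client-side order: the client logged its own 'Hello' before it saw 'Hey'.
# It never learned the archive UID of its outgoing message.
client_cache = [
    {"uid": None, "body": "Hello"},
    {"uid": "abcde", "body": "Hey"},
]

# On the next login the client asks for everything after the last UID it knows.
fetched = query_after(server_archive, "abcde")
merged = client_cache + fetched
bodies = [message["body"] for message in merged]
print(bodies)
```

The merged log contains "Hello" twice, and because the cached copy has no UID the client has nothing reliable to match the two copies on.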
Re: [Standards] NEW: XEP-0313 (Message Archive Management)
On Fri, May 25, 2012 at 12:42 PM, Thijs Alkemade th...@xnyhps.nl wrote:
> On 20 Apr 2012, at 10:32, Kevin Smith wrote:
>> On Thu, Apr 19, 2012 at 6:01 PM, Matthew Wild mwi...@gmail.com wrote:
>>> One solution I came up with was for an entity that relays and archives messages to stamp the message with: <archived by="capulet.lit" id="1234-5678"/> or <archived by="conference.jabber.org" id="8765-4321"/>. I'd be interested in feedback on this idea.
>>
>> Yes, we need (archiving, rather than stanza) ids stamped on the archived stanzas.
>>
>>> However even <archived/> doesn't cover the case of the client knowing the id of its *outgoing* messages. The server could echo them back with <archived/>... but then things start to get a bit muddy. The alternative is to not solve this, and clients should treat the MAM archive as the canonical source of history - (therefore fetching messages from the archive that have already been sent/received by it). A waste of bandwidth if nothing else.
>>
>> You will only need to request (assuming you have carbons) on average less than a single message that's a duplicate, though - as IM is typically send a message/receive a message [yes, there are exceptions where this is potentially very untrue], and you will have the id of the message you received.
>
> I've started implementing 0313 in libpurple/Adium, and I think Matthew explained my concerns quite well. Your suggestion assumes that once a client receives an incoming message from the server, everything the client sent before that moment was received by the server successfully (it makes sense to require Carbons to do MAM, but let's assume that Stream Management is not enabled).
>
> Suppose the last session ended with these two messages, on a high-latency connection which got interrupted:
>
> <snip/>
>
> If the client thinks message 12345 came before 9876, while the server thinks it's the other way around, then requesting the archive from abcde will duplicate message 12345.

Yes. Always requesting based on the uid of the last message that you received will result in receiving from the server duplicates of any messages you have sent since then, and you'll have to not double-store them. XEP-0198 means that you know which of your sent stanzas have been processed by the server and does, I think, guarantee your history is complete and you're likely to end up, on average, with ~1 duplicated stanza to deal with on each login.

The simple implementation is that you don't store in the cache anything that happened after the last message received from the server - and you know the ordering of your own stanzas vs the stanzas you received based on the ordering of the acks/messages you received from the server.

/K
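A rough sketch of the "simple implementation" described above, assuming the local cache is an ordered list of dictionaries in which only messages received from the server carry an archive `uid`. The `trim_cache` and `resume_point` helpers are hypothetical names, not part of any client or of the XEP.

```python
def trim_cache(cache):
    """Drop every entry logged after the last message that carries an archive UID."""
    last = max((i for i, entry in enumerate(cache) if entry.get("uid")), default=-1)
    return cache[: last + 1]


def resume_point(cache):
    """Return the UID to query the archive after, or None to fetch everything."""
    trimmed = trim_cache(cache)
    return trimmed[-1]["uid"] if trimmed else None


cache = [
    {"uid": "a1", "body": "hi"},           # received, stamped by the server
    {"uid": None, "body": "hello"},        # sent before the next received message
    {"uid": "a2", "body": "how are you"},  # last message received from the server
    {"uid": None, "body": "fine"},         # sent afterwards, fate unknown
    {"uid": None, "body": "you there?"},   # sent afterwards, fate unknown
]

kept = trim_cache(cache)
after_uid = resume_point(cache)
```

Here the two trailing outgoing messages are dropped from the cache and the next query starts after "a2"; whichever of them reached the server come back from the archive, so nothing is stored twice. The cost, as raised elsewhere in the thread, is that the local log is incomplete until that query has run.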
Re: [Standards] NEW: XEP-0313 (Message Archive Management)
On Thu, Apr 19, 2012 at 6:01 PM, Matthew Wild mwi...@gmail.com wrote:
> One solution I came up with was for an entity that relays and archives messages to stamp the message with: <archived by="capulet.lit" id="1234-5678"/> or <archived by="conference.jabber.org" id="8765-4321"/>. I'd be interested in feedback on this idea.

Yes, we need (archiving, rather than stanza) ids stamped on the archived stanzas.

> However even <archived/> doesn't cover the case of the client knowing the id of its *outgoing* messages. The server could echo them back with <archived/>... but then things start to get a bit muddy. The alternative is to not solve this, and clients should treat the MAM archive as the canonical source of history - (therefore fetching messages from the archive that have already been sent/received by it). A waste of bandwidth if nothing else.

You will only need to request (assuming you have carbons) on average less than a single message that's a duplicate, though - as IM is typically send a message/receive a message [yes, there are exceptions where this is potentially very untrue], and you will have the id of the message you received.

> I'll also mention here that in my mind archiving and carbons are very related. They are both about replicating history across clients, only that Carbons just works while online. Originally MAM was to allow 'subscribing' to an archive, as a way to receive messages sent/received by other resources while online, and even allow following a MUC room in realtime without joining it. This would be a separate XEP if I submitted it, but now that we have Carbons there would be more than a little overlap there. Thoughts welcomed.

I had thoughts on the overlaps and how to deal with them that I started writing up at http://doomsong.co.uk/extensions/render/multiple-clients.html - although my opinions have likely changed in the last two years on the best way to do it.

/K
Re: [Standards] NEW: XEP-0313 (Message Archive Management)
On Thu, 2012-04-19 at 18:01 +0100, Matthew Wild wrote:
> However even <archived/> doesn't cover the case of the client knowing the id of its *outgoing* messages. The server could echo them back with <archived/>... but then things start to get a bit muddy.

Thoughts.

Say that Carbons shall echo back your outgoing messages, with the <archived/> stamp.

Or, some cross between that and Delivery Receipts, which just contain the <archived/> with the UID. Message Archive Receipts?

--
Kim Alvefur z...@zash.se
Re: [Standards] NEW: XEP-0313 (Message Archive Management)
On Thu, 2012-04-19 at 01:12 +, XMPP Extensions Editor wrote:
> Version 0.1 of XEP-0313 (Message Archive Management) has been released.
>
> Abstract: This document defines a protocol to query and control an archive of messages stored on a server.
>
> Changelog: Initial version, to much rejoicing. (mw)

Finally! Much rejoicing indeed!

--
Kim Alvefur z...@zash.se
Re: [Standards] NEW: XEP-0313 (Message Archive Management)
On 19 April 2012 02:12, XMPP Extensions Editor edi...@xmpp.org wrote:
> Version 0.1 of XEP-0313 (Message Archive Management) has been released.
>
> Abstract: This document defines a protocol to query and control an archive of messages stored on a server.
>
> Changelog: Initial version, to much rejoicing. (mw)
>
> Diff: N/A
>
> URL: http://xmpp.org/extensions/xep-0313.html

There are some sections still remaining, and some things that need specifying further, which I have begun on. I should be able to submit an updated version by [REDACTED].

One of the substantial changes would be better specifying the use of Result Set Management. Currently only limit is required, but I think full RSM support should be a MUST to allow for accurate paging and queries based on message UIDs.

I also have an open question, that perhaps warrants some discussion here... (warning: brain dump ahead)

Lots of clients already store local history - and it is expected they will continue to use that, as a cache. MAM allows these clients to fetch history from the archive that happened while they were offline, or messages from other resources (though these can be caught while online with Carbons).

The difficult part is how to identify the exact messages that the client doesn't yet have cached. Timestamps are not unique identifiers, as we all know. The problem here is that the client doesn't know the ID of the last message it has in its history, otherwise it could ask MAM for all messages since that ID. Using the timestamp could end up with duplicates, even with accurate clocks (which don't exist).

One solution I came up with was for an entity that relays and archives messages to stamp the message with: <archived by="capulet.lit" id="1234-5678"/> or <archived by="conference.jabber.org" id="8765-4321"/>. I'd be interested in feedback on this idea.

However even <archived/> doesn't cover the case of the client knowing the id of its *outgoing* messages. The server could echo them back with <archived/>... but then things start to get a bit muddy. The alternative is to not solve this, and clients should treat the MAM archive as the canonical source of history - (therefore fetching messages from the archive that have already been sent/received by it). A waste of bandwidth if nothing else.

I'll also mention here that in my mind archiving and carbons are very related. They are both about replicating history across clients, only that Carbons just works while online. Originally MAM was to allow 'subscribing' to an archive, as a way to receive messages sent/received by other resources while online, and even allow following a MUC room in realtime without joining it. This would be a separate XEP if I submitted it, but now that we have Carbons there would be more than a little overlap there. Thoughts welcomed.

Regards,
Matthew
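To make "ask MAM for all messages since that ID" concrete, here is a hedged sketch of how a client might build such a request with Result Set Management, using only Python's standard library. The RSM namespace and its `max`/`after` elements come from XEP-0059; the MAM namespace `urn:xmpp:mam:tmp` and the `get` iq type are assumptions about the early drafts and should be checked against whichever version of the XEP is being implemented. `build_query` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

MAM_NS = "urn:xmpp:mam:tmp"                # assumption: namespace of the early drafts
RSM_NS = "http://jabber.org/protocol/rsm"  # Result Set Management (XEP-0059)


def build_query(last_uid, page_size=50, iq_id="q1"):
    """Build an archive query for at most page_size messages after last_uid.

    Passing last_uid=None omits <after/>, i.e. paging starts at the beginning.
    """
    iq = ET.Element("iq", {"type": "get", "id": iq_id})  # iq type is an assumption
    query = ET.SubElement(iq, f"{{{MAM_NS}}}query")
    rsm = ET.SubElement(query, f"{{{RSM_NS}}}set")
    ET.SubElement(rsm, f"{{{RSM_NS}}}max").text = str(page_size)
    if last_uid is not None:
        ET.SubElement(rsm, f"{{{RSM_NS}}}after").text = last_uid
    return iq


iq = build_query("1234-5678", page_size=20)
print(ET.tostring(iq, encoding="unicode"))
```

Paging on the archive UID rather than on a timestamp sidesteps the duplicate problem for received messages, since the UID is unique within the archive; it does nothing for outgoing messages whose UID the client never learned.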
[Standards] NEW: XEP-0313 (Message Archive Management)
Version 0.1 of XEP-0313 (Message Archive Management) has been released.

Abstract: This document defines a protocol to query and control an archive of messages stored on a server.

Changelog: Initial version, to much rejoicing. (mw)

Diff: N/A

URL: http://xmpp.org/extensions/xep-0313.html