Re: [Standards] NEW: XEP-0313 (Message Archive Management)

Thijs Alkemade Wed, 12 Jun 2013 12:17:06 -0700

On 25 May 2012, at 13:52, Kevin Smith <ke...@kismith.co.uk> wrote:

> On Fri, May 25, 2012 at 12:42 PM, Thijs Alkemade <th...@xnyhps.nl> wrote:
>> 
>> 
>> I've started implementing 0313 in libpurple/Adium, and I think
>> Matthew explained my concerns quite well.
>> 
>> Your suggestion assumes that once a client receives an incoming
>> message from the server, everything the client sent before that
>> moment was received by the server successfully (it makes sense to
>> require Carbons to do MAM, but let's assume that Stream Management is
>> not enabled). Suppose the last session ended with these two
>> messages, on a high-latency connection which got interrupted:
>> <snip/>
>> 
>> If the client thinks message 12345 came before 9876, while the
>> server thinks it's the other way around, then requesting the archive
>> from abcde will duplicate message 12345.
> 
> Yes. Always requesting based on the uid of the last message that you
> received will result in receiving from the server duplicates of any
> messages you have sent since then, and you'll have to not double-store
> them. 198 means that you know which of your sent stanzas have been
> processed by the server and does, I think, guarantee your history is
> complete and you're likely to end up, on average, with ~1 duplicated
> stanza to deal with on each login. The simple implementation is that
> you don't store in the cache anything that happened after the last
> message received from the server - and you know the ordering of your
> own stanzas vs the stanzas you received based on the ordering of the
> acks/messages you received from the server.
> 
> /K


[Reviving a pretty old thread.]

I've been thinking about this a bit more recently. To summarize, the scenario I
mostly consider tricky is:

 * A user has a conversation on a wifi^H^H^H^Hbad connection.
 * At some point, the connection is lost. The client doesn't immediately
   notice, so the user sends n more messages before the client notices it's
   not connected anymore.
 * The client logs in again some time later.

How should it query the archive?

It can query based on the UID of the last <archived /> on an incoming message,
but then it will get its outgoing messages again. It can ignore the first n of
those outgoing messages, but not all of them might have arrived at the server.
The only comparison it can make is on their contents or timestamps, neither of
which is reliably unique or consistent. It can, as Kev suggested, not store the
outgoing messages until an incoming message is received, but I don't think
users will appreciate their archive being incomplete, even when we can't
guarantee those messages were actually received.

I propose this: outgoing messages get not only a UID, but also a session
identifier (SID). The SID stays the same for all outgoing messages during one
login and the client can obtain it from the server (using an iq at login, for
instance). For a client it becomes easy to see which of its messages from the
last session made it to the server (it can even flag those that never arrived)
and it can just request all those since the last known UID, ignoring all those
with the previous SID. An additional benefit is that it becomes easier to
group MAM messages by conversation.
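
To make the idea concrete, here is a rough sketch of what the client could do
after logging in again. Everything in it is hypothetical: the ArchivedMessage
type, the reconcile() function and the sid field are names I made up for
illustration, not anything defined in XEP-0313. It also assumes the old stream
delivered stanzas in order, so that the messages that reached the server form
a prefix of what the client sent.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class ArchivedMessage:
    uid: str            # archive UID assigned by the server
    sid: Optional[str]  # hypothetical session identifier; None on incoming messages
    body: str


def reconcile(
    page: List[ArchivedMessage],
    previous_sid: str,
    unconfirmed: List[str],
) -> Tuple[List[ArchivedMessage], List[str], List[str]]:
    """Split an archive page fetched since the last known UID.

    page         -- archive results after the last known UID, in server order
    previous_sid -- SID used by this client during its previous login
    unconfirmed  -- bodies the client sent after that UID, in send order

    Returns (to_store, arrived, lost).
    """
    # Entries carrying the previous SID are our own messages, which the
    # client already has locally, so they are not stored a second time.
    own = [m for m in page if m.sid == previous_sid]
    to_store = [m for m in page if m.sid != previous_sid]

    # Assumption: in-order delivery, so the first len(own) unconfirmed
    # messages are the ones that made it; the rest can be flagged.
    arrived = unconfirmed[:len(own)]
    lost = unconfirmed[len(own):]
    return to_store, arrived, lost
```

With two unconfirmed local messages and one archived entry carrying the
previous SID, the first local message counts as delivered and the second can
be flagged to the user as never having arrived.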

This could even be done without support on the server: the client just adds a
tag to each message with a SID it generated itself. However, it can't verify
the SID is unique within the archive, it increases the size of every message
and it has no meaning for the recipient of the message.
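
A client-only variant might look roughly like the sketch below. The namespace
urn:example:session-id and the <session/> element are placeholders I invented
for the example; no such extension is defined anywhere. A random UUID makes a
collision within the archive unlikely, but as said above the client still
cannot verify uniqueness.

```python
import uuid
import xml.etree.ElementTree as ET

# Hypothetical namespace, used only for this illustration.
SESSION_NS = "urn:example:session-id"


def new_session_id() -> str:
    """Generate one SID per login; random, so uniqueness is only probable."""
    return uuid.uuid4().hex


def tag_message(to: str, body: str, sid: str) -> ET.Element:
    """Build a chat message stanza carrying the client-generated SID."""
    msg = ET.Element("message", {"to": to, "type": "chat"})
    ET.SubElement(msg, "body").text = body
    # The extra child is what increases the size of every message.
    ET.SubElement(msg, "{%s}session" % SESSION_NS, {"id": sid})
    return msg
```

The client would call new_session_id() once at login and pass the result to
tag_message() for every outgoing message in that session.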

I know the goal of XEP-0313 is to not get as complicated as XEP-0136, but in
my opinion the extra complexity makes it much easier to synchronize history
consistently. Clients can opt to ignore it, and for servers it's just a little
extra logic to generate another identifier.

Regards,
Thijs
