On Tuesday, 6 March 2018 13:23:21 CET Lazar Otasevic wrote:
> Conclusion:
>
> 1. it is not possible to iterate efficiently backwards by including both
> <before>&<after> in the query because once <after> is included in the query
> then it iterates forwards. that means when iterating backwards client has
> to omit <after> and fetch entire pages and then the last page will mostly
> overlap with some of the local messages, which is a waste.
> 2. it is not possible to determine "holes" in the archive reliably, because
> client can not know what is the last message archive-id, because our own
> sent messages have no feedback from the server once the message is archived
> what is its archive-id ... that means that client has to fetch ALL messages
> from MAM once again just to be sure that holes are filled, even though many
> message-bodies are already received/sent during live communication.
>
> Basically In the current state all our own messages are "holes" in the
> local archive, not to mention all kind of "bad network" scenarios,
> multi-device and longer offline periods.
>
> Making separate requests, one for archive ids and one for content would
> make:
> - much less waste in the sync because only ids would be wasted, and not the
> content
> - possible to fill multiple holes in one request by fetching content that
> is really needed
> - make possible for push payloads to contain only message ids (when clients
> want to handle encrypted messages locally by fetching them and only them)
> currently it is doable by giving to the push one id before the wanted
> message

So I might be wrong, but for the sake of it, I think there is a sane and not
too complex way to do archive sync with MAM:

1. On startup, you put a "hole marker" at the end of your local archive. A
   hole marker is essentially just the stanza-id of the last message at the
   time the marker is created.

2. Iterate over all hole markers, from newest to oldest. Download everything
   between the last message *before* and the first message *after* the hole
   marker. During download, move the hole marker accordingly to deal with
   disconnects while downloading. When finished, remove the hole marker; move
   on to the next hole marker if there’s any.

This should work, and it should also work with the current semantics.

I appreciate that there *might* be some overlap between the last page of a
hole and already received messages. This is unfortunate, but can trivially be
solved by comparing stanza-id (if available locally), origin-id or message id,
in that order. (N.B.: once we get self-carbons, we’ll always have the
stanza-id.)

I also wrote that down in more detail here:
https://github.com/jabbercat/jabbercat/issues/26#issuecomment-370333729

I think it would indeed be great to have a way to limit the MAM query to an
end-ID. Matt, Kevin, any chance we get that in?

kind regards,
Jonas
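
The two steps above can be sketched roughly as follows. This is only a minimal illustration, not code from any client: `FakeArchive`, `query_after`, `fill_hole` and `sync` are hypothetical names, the archive is an in-memory list standing in for a real MAM query that pages forward from an `<after>` id, and overlap is detected by stanza-id only.

```python
from typing import List, Optional, Tuple

Message = Tuple[str, str]  # (stanza-id, body)


class FakeArchive:
    """Hypothetical in-memory stand-in for a server-side MAM archive."""

    def __init__(self, messages: List[Message]) -> None:
        self.messages = list(messages)

    def query_after(self, after: Optional[str], max_items: int) -> List[Message]:
        """Return up to max_items messages strictly after the given stanza-id."""
        start = 0
        if after is not None:
            ids = [sid for sid, _ in self.messages]
            start = ids.index(after) + 1  # raises ValueError for an unknown id
        return self.messages[start:start + max_items]


def fill_hole(archive: FakeArchive, local: List[Message],
              marker: Optional[str], page_size: int = 2) -> None:
    """Page forward from the hole marker until a locally known message
    (or the end of the archive) is reached."""
    known = {sid for sid, _ in local}
    # Insert position in the local archive: right after the marker.
    pos = 0 if marker is None else [sid for sid, _ in local].index(marker) + 1
    while True:
        page = archive.query_after(marker, page_size)
        if not page:
            return  # end of the archive, hole is closed
        for sid, body in page:
            if sid in known:
                return  # overlap with local messages, hole is closed
            local.insert(pos, (sid, body))
            known.add(sid)
            pos += 1
            # Move the marker forward; a real client would persist it here
            # so that a disconnect resumes from this point.
            marker = sid


def sync(archive: FakeArchive, local: List[Message],
         markers: List[str], page_size: int = 2) -> None:
    """Process hole markers from newest to oldest, removing each when done.
    The markers list is assumed to be ordered oldest to newest."""
    while markers:
        fill_hole(archive, local, markers[-1], page_size)
        markers.pop()


server = [(c, c.upper()) for c in "abcdefg"]
local = [("a", "A"), ("c", "C"), ("d", "D"), ("g", "G")]
markers = ["a", "d"]  # holes follow "a" and "d"
sync(FakeArchive(server), local, markers)
print([sid for sid, _ in local])
```

The last page of each hole may still contain messages that are already stored locally; the sketch simply stops at the first known stanza-id, which is the overlap mentioned above.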
_______________________________________________
Standards mailing list
Info: https://mail.jabber.org/mailman/listinfo/standards
Unsubscribe: standards-unsubscr...@xmpp.org
_______________________________________________