On Thu, 2013-05-30 at 18:16 +0100, Matthew Wild wrote:
> A more general issue is whether this XEP (or rather the specific
> protocol it defines) is necessary at all. I'm not saying it definitely
> isn't, but need a little more persuasion. For example it seems that
> the primary issue it is working around is that XEP-0136 and XEP-0313
> might not save messages with no body. Might it not be easier to solve
> this problem instead?

Sorry I'm late to the party. I have actually been discussing this with Spencer in the office over the last couple of weeks, so maybe I can give some more motivation for why we feel that an XEP is required. I'm not too attached

There are actually a bunch of things that Chat Markers is trying to solve:

1) Atomic "Read Receipt" messages (or "Seen Receipts" or whatever [1]).

1.1.1) Currently, our pre-alpha implementation of "Seen by $user" markers requires each client to keep track of the state machine (XEP-0085) for *each* remote resource. Any incoming <active/> notification, or any incoming <received/> notification from a resource that is in the 'active' state, currently updates the "Seen by $user" marker.

1.2.1) XEP-0022 basically solves this problem, but it is marked as obsolete. I didn't feel like digging it out of the grave, but maybe we should reconsider it?

1.3.1) I suggested that we could simply include our state (if active) as a sister element to <received/>, but Spencer pointed out that XEP-0184 section 7 states: "When the recipient sends an ack message, it SHOULD ensure that the message stanza contains only one child element". What would break if we did this?

2) State recovery for disconnected clients that come online.

2.1.1) Currently, this is impossible, so our "Seen by $user" marker stays where it is, and messages appear as "Not delivered" when they are retrieved from MAM until a reply is received from the remote party (at which point we assume that their client has done state recovery from MAM, and mark all messages as received).

2.1.2) XEP-0184 section 5.5 "Archived Messages" states: "An entity MUST NOT send an ack message when a user views messages that have been archived or stored on the client or the server (e.g., via Message Archiving [8]), only when first receiving the message." This is annoying, but quite understandable (e.g.
what should a client do if it gets <received id='1234'/> when it doesn't have any knowledge of <message id='1234'/> or when it might have been sent?)

2.3.1) We could allow MAM to store *all messages*, but then a query for "how many unread messages do I have since $time" returns a hugely inflated answer. The only way to get an accurate count would then be to retrieve *all messages* and classify them.

2.3.2) We could create a clone of the MAM XEP (let's call it Message State Recovery: MSR) that stores everything without a body, and let the clients query that in order to do state recovery. A little benchmarking of our client's <message/> datastore and a simple thought experiment suggest that this will be many times as large as the MAM database in a naive implementation (when Kate sends a message to Pete, her client will send <active/>; <composing/>; (<paused/>; <composing/>)* <body/>... and each of Pete's clients will send <received/> and 0 or more will send <active/>). Note that we would need to bend the rules for XEP-0184 (see 2.1.2) for this to be useful. Specifically (after retrieving all messages from MSR and MAM), for each incoming message in MAM that doesn't have a corresponding outgoing entry in MSR, send a receipt anyway.

2.3.3) We could get the server to store markers for "delivered" and "seen" etc. This is what Chat Markers attempts to do.

3) Efficiency

3.1.1) 2.3.2 and 1.1.1 cover a couple of the obvious problems with what we have now.

3.3.1) If we create an MSR XEP, what is the minimum amount of information that we can store? If we have solved problem 1) then we can make a lot of optimizations. For example, if we used my 1.3.1 proposal, then could we simply store the last message of each type? Concretely, could we have a database with the following uniqueness constraint: (sender_barejid, receiver_barejid, sibling_ns, sibling_name), where sibling_ns is the namespace of the element after <received/> in the <message/> stanza, and sibling_name is its name (e.g.
'active' or NULL)?

And in reply to Matthew Wild's specific comments:

> For XEP-0136, it appears to be configurable already in the archiving
> preferences (surprise surprise!). For XEP-0313, I'm open to discussion
> about what it recommends.

We have gone for a Message Archive Management + Message Carbons approach so far, which means that clients only need to know how to unpack <forwarded/>. I don't fancy forcing 3 teams to implement XEP-0136 if I can avoid it.

> XEP-0313 intentionally remains silent on most policy decisions like
> that. However it seemed sensible at the time that nobody would want to
> archive messages without a body, which on the network today are
> primarily chat states and notifications of various sorts. The XEP is
> still experimental, perhaps we can come up with better rules? I don't
> know, that's a discussion for another thread.

See 2.3.1 for why I think that MAM's rules are probably correct, and if anything, we should have a parallel store for messages without bodies.

> Forgetting archiving completely for the moment, offline messages might
> do enough already, no? XEP-0160 doesn't actually have any
> recommendations about what to store or what not to store. It seems
> that servers are expected to identify things like chat states already
> (XEP-0085 says that servers "SHOULD NOT store them offline"). This
> doesn't seem like a good model, but it's what we currently have.

XEP-0160 breaks in any use-case that involves multiple mobile devices per account. I am actually thinking of disabling support for it on our server completely, since all of our supported clients understand Message Archive Management.

David.

[1] The jdev thread was repeatedly derailed by people querying the semantics of "read", so if I say "read" and it annoys you, translate it to "seen" in your head. There are also use-cases for states like "notified about" and "sent/delivered out-of-band" (e.g. via Apple Push Notifications or SMS) and "acknowledged".
I would prefer to avoid going down that particular rabbit-hole yet, but any protocol should be extensible in that direction (the benchmark for extensibility here is the set of SIP status codes: 100 Trying (= reached first server), 180 Ringing (= notified about), 200 OK (= acknowledged/accepted)).

--
Section numbers are of the form x.y.z) where x = topic, y = status (1 = where are we now, 2 = where did we come from, 3 = where could we be), z = incrementing integer. Sometimes I have nothing to say about x.2.z.
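
P.S. To make the 3.3.1 question a bit more concrete, here is a rough sketch of the uniqueness constraint using SQLite from Python. It is only an illustration, not part of any XEP: the table name msr_state, the message_id column and the helper functions are all hypothetical, and only the four key columns come from 3.3.1. One deliberate deviation: SQLite treats NULLs as distinct in a uniqueness constraint, so the sketch uses an empty string where 3.3.1 says NULL.

```python
import sqlite3

# Hypothetical schema: only the four key columns come from 3.3.1.
# The table name and the message_id column are assumptions.
SCHEMA = """
CREATE TABLE IF NOT EXISTS msr_state (
    sender_barejid   TEXT NOT NULL,
    receiver_barejid TEXT NOT NULL,
    sibling_ns       TEXT NOT NULL DEFAULT '',  -- '' stands in for NULL
    sibling_name     TEXT NOT NULL DEFAULT '',  -- '' stands in for NULL
    message_id       TEXT NOT NULL,             -- id of the last acked message
    PRIMARY KEY (sender_barejid, receiver_barejid, sibling_ns, sibling_name)
)
"""


def open_store(path=":memory:"):
    """Open (or create) the hypothetical MSR store."""
    db = sqlite3.connect(path)
    db.execute(SCHEMA)
    return db


def record_receipt(db, sender, receiver, message_id,
                   sibling_ns="", sibling_name=""):
    """Keep only the most recent receipt for each (pair, sibling) key."""
    db.execute(
        "INSERT OR REPLACE INTO msr_state VALUES (?, ?, ?, ?, ?)",
        (sender, receiver, sibling_ns, sibling_name, message_id),
    )


def last_receipts(db, sender, receiver):
    """Return {sibling_name: message_id} for one sender/receiver pair."""
    rows = db.execute(
        "SELECT sibling_name, message_id FROM msr_state "
        "WHERE sender_barejid = ? AND receiver_barejid = ?",
        (sender, receiver),
    )
    return dict(rows.fetchall())
```

With this layout, a bare <received/> and a <received/> accompanied by <active/> each occupy at most one row per sender/receiver pair, so the store stays bounded however many receipts are exchanged.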