On Thu, 2013-05-30 at 18:16 +0100, Matthew Wild wrote:
> A more general issue is whether this XEP (or rather the specific
> protocol it defines) is necessary at all. I'm not saying it definitely
> isn't, but need a little more persuasion. For example it seems that
> the primary issue it is working around is that XEP-0136 and XEP-0313
> might not save messages with no body. Might it not be easier to solve
> this problem instead?

Sorry I'm late to the party. I have actually been discussing this with
Spencer in the office over the last couple of weeks, so maybe I can give
some more motivation for why we feel that an XEP is required. I'm not
too attached

There are actually a bunch of things that Chat Markers is trying to
solve:

1) Atomic "Read Receipt" messages (Or "Seen Receipts" or whatever[1]).
1.1.1) Currently, our pre-alpha implementation of "Seen by $user"
markers requires each client to keep track of the state machine
(xep-0085) for *each* remote resource. Any incoming <active/>
notification, or any incoming <received/> notification from a resource
that is in the 'active' state currently updates the "Seen by $user"
marker.
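
A minimal sketch of the per-resource bookkeeping described in 1.1.1, in
Python. The class and method names are hypothetical, and moving the
marker to the last receipted message when <active/> arrives is my
assumption about what "updates the marker" means; it is not our actual
client code.

```python
class SeenTracker:
    """Hypothetical helper: one XEP-0085 chat state per remote resource."""

    def __init__(self):
        self.chat_state = {}     # full JID -> last chat state seen
        self.last_received = {}  # full JID -> last message id it receipted
        self.seen_marker = None  # message id the "Seen by $user" marker sits on

    def on_chat_state(self, full_jid, state):
        self.chat_state[full_jid] = state
        # Assumption: an incoming <active/> moves the marker to the last
        # message that this resource has acknowledged, if any.
        if state == "active" and full_jid in self.last_received:
            self.seen_marker = self.last_received[full_jid]

    def on_received(self, full_jid, message_id):
        self.last_received[full_jid] = message_id
        # A <received/> only counts as "seen" if the resource is active.
        if self.chat_state.get(full_jid) == "active":
            self.seen_marker = message_id
```

Every client has to carry two dictionaries keyed by full JID just to
place one marker, which is the cost I am complaining about.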

1.2.1) xep-0022 basically solves this problem, but it is marked as
obsolete. I didn't feel like digging it out of the grave, but maybe we
should re-consider it?

1.3.1) I suggested that we could simply include our state (if active) as
a sister element to <received/>, but Spencer pointed out that xep-0184
section 7 states: "When the recipient sends an ack message, it SHOULD
ensure that the message stanza contains only one child element". What
would break if we did this?


2) State recovery for disconnected clients that come online.
2.1.1) Currently, this is impossible, so our "Seen by $user" marker
stays where it is, and messages appear as "Not delivered" when they are
retrieved from MAM until a reply is received from the remote party (at
which point, we assume that their client has done state recovery from
MAM, and mark all messages as received).

2.1.2) xep-0184 section 5.5 Archived Messages states "An entity MUST NOT
send an ack message when a user views messages that have been archived
or stored on the client or the server (e.g., via Message Archiving [8]),
only when first receiving the message."

This is annoying, but quite understandable (e.g. what should a client do
if it gets <received id=1234/> when it doesn't have any knowledge of
<message id=1234/> or when it might have been sent?)

2.3.1) We could allow MAM to store *all messages*, but then a query
for "how many unread messages do I have since $time" returns a hugely
inflated answer. The only way to get an accurate count would then be to
retrieve *all messages* and classify them.

2.3.2) We could create a clone of the MAM XEP (let's call it Message
State Recovery: MSR) that stores everything without a body, and let the
clients query that in order to do state recovery. 

A little benchmarking of our client's <message/> datastore and a simple
thought experiment suggests that this will be many times as large as the
MAM database in a naive implementation (when Kate sends a message to
Pete, her client will send <active/>; <composing/>; (<paused/>;
<composing/>)* <body/>... and each of Pete's clients will send
<received/> and 0 or more will send <active/>).
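
The thought experiment can be written down as a rough count, in Python.
This is only a sketch: the function name and parameters are
hypothetical, and it counts stanzas rather than bytes.

```python
def stanza_counts(pauses, recipient_clients, active_clients):
    """Count stanzas per chat message for the naive MSR store.

    pauses: how many (<paused/>, <composing/>) pairs the sender emits
    recipient_clients: Pete's clients, each sending one <received/>
    active_clients: how many of those also send <active/>
    """
    # Sender side: <active/>, <composing/>, then the paused/composing pairs.
    sender_bodiless = 2 + 2 * pauses
    # Recipient side: one receipt per client, plus any <active/> notifications.
    recipient_bodiless = recipient_clients + active_clients
    # Only the stanza carrying <body/> ends up in MAM.
    return {"mam": 1, "msr": sender_bodiless + recipient_bodiless}
```

For one pause, two recipient clients and one of them active, that is
seven MSR entries for a single MAM entry.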

Note that we would need to bend the rules for xep-0184 (see 2.1.2) for
this to be useful. Specifically (after retrieving all messages from MSR
and MAM) for each incoming message in MAM that doesn't have a
corresponding outgoing entry in MSR, send a receipt anyway.
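
As a sketch of that rule in Python (the function and argument names are
hypothetical, and I am assuming both archives can be reduced to lists
of message ids):

```python
def missing_receipts(mam_incoming_ids, msr_outgoing_receipt_ids):
    """Return ids of incoming MAM messages that still need a receipt.

    mam_incoming_ids: ids of incoming messages retrieved from MAM, in order
    msr_outgoing_receipt_ids: ids already acknowledged according to MSR
    """
    already_acked = set(msr_outgoing_receipt_ids)
    return [mid for mid in mam_incoming_ids if mid not in already_acked]
```

The client would then send a <received/> for each id returned, which is
exactly the behaviour that xep-0184 section 5.5 currently forbids.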

2.3.3) We could get the server to store markers for "delivered" and
"seen" etc. This is what Chat Markers attempts to do.



3) Efficiency
3.1.1) 2.3.2 and 1.1.1 cover a couple of the obvious problems with what
we have now.

3.3.1) If we create a MSR XEP, what is the minimum amount of information
that we can store? If we have solved problem 1) then we can make a lot
of optimizations. For example, if we used my 1.3.1 proposal, could we
simply store the last message of each type?

Concretely, could we have a database with the following uniqueness
constraint:
(sender_barejid, receiver_barejid, sibling_ns, sibling_name)

where sibling_ns is the namespace of the element after <received/> in
the <message/> stanza, and sibling_name is its name (e.g. 'active' or
NULL)?
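
To make the question concrete, here is a minimal sketch using Python's
sqlite3 module. The table name, the message_id column and the helper
names are hypothetical. One caveat I am sure of: SQL treats NULLs as
distinct inside a UNIQUE constraint, so the sketch stores an empty
string instead of NULL when <received/> has no sibling.

```python
import sqlite3


def open_store():
    """Create an in-memory store with the proposed uniqueness constraint."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE msr (
            sender_barejid   TEXT NOT NULL,
            receiver_barejid TEXT NOT NULL,
            sibling_ns       TEXT NOT NULL DEFAULT '',
            sibling_name     TEXT NOT NULL DEFAULT '',
            message_id       TEXT NOT NULL,
            UNIQUE (sender_barejid, receiver_barejid, sibling_ns, sibling_name)
        )
    """)
    return conn


def record(conn, sender, receiver, sibling_ns, sibling_name, message_id):
    """Keep only the latest message id for each constraint key."""
    # INSERT OR REPLACE is SQLite-specific: a conflicting row is deleted
    # before the new one is inserted. '' stands in for "no sibling".
    conn.execute(
        "INSERT OR REPLACE INTO msr VALUES (?, ?, ?, ?, ?)",
        (sender, receiver, sibling_ns or "", sibling_name or "", message_id),
    )
```

With this layout the store holds at most one row per (sender, receiver,
sibling type), however many receipts are exchanged.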


And in reply to Matthew Wild's specific comments:
> 
> For XEP-0136, it appears to be configurable already in the archiving
> preferences (surprise surprise!). For XEP-0313, I'm open to discussion
> about what it recommends.
> 
We have gone for a Message Archive Management + Message Carbons approach
so far, which means that clients only need to know how to unpack
<forwarded/>. I don't fancy forcing 3 teams to implement XEP-0136 if I
can avoid it.

> XEP-0313 intentionally remains silent on most policy decisions like
> that. However it seemed sensible at the time that nobody would want to
> archive messages without a body, which on the network today are
> primarily chat states and notifications of various sorts. The XEP is
> still experimental, perhaps we can come up with better rules? I don't
> know, that's a discussion for another thread.
> 
See 2.3.1 for why I think that MAM's rules are probably correct, and if
anything, we should have a parallel store for messages without bodies.

> Forgetting archiving completely for the moment, offline messages might
> do enough already, no? XEP-0160 doesn't actually have any
> recommendations about what to store or what not to store. It seems
> that servers are expected to identify things like chat states already
> (XEP-0085 says that servers "SHOULD NOT store them offline"). This
> doesn't seem like a good model, but it's what we currently have.
> 
XEP-0160 breaks in any use-case that involves multiple mobile devices
per account. I am actually thinking of disabling support for it on our
server completely, since all of our supported clients understand Message
Archive Management.


David.


[1] The jdev thread was repeatedly derailed by people querying the
semantics of "read", so if I say "read" and it annoys you, translate it
to "seen" in your head. There are also use-cases for states like
"notified about" and "sent/delivered out-of-band" (e.g. via Apple Push
Notifications or SMS) and "acknowledged". I would prefer to avoid going
down that particular rabbit-hole yet, but any protocol should be
extensible in that direction (the benchmark for extensibility here is
the set of SIP status codes 100 Trying (= reached first server), 180
Ringing (= notified about), 200 OK (= acknowledged/accepted)).

-- 

Section numbers are of the form x.y.z) where x = topic, y = status: (1=
where are we now, 2= where did we come from, 3= where could we be), z =
incrementing integer. Sometimes I have nothing to say about x.2.z.
