Re: [Standards] Stanza ID of outgoing message

Dave Cridland Tue, 29 Sep 2020 08:37:15 -0700

On Tue, 29 Sep 2020 at 14:53, Marvin W <x...@larma.de> wrote:

> On 29.09.20 12:23, Dave Cridland wrote:
> > * UUIDv4 becomes an "advised to use"; the requirement is uniqueness. I
> > think UUIDv4 is a good thing, and people are unlikely to get anything
> > wrong if they use this method, but I don't think there's any
> > interoperability issue with using something different.
> > * Unconvinced we want to (or even can) rely on global uniqueness from
> > foreign entities; so I'm keen on the first bullet-point in (b) but less
> > sold on the second - having servers allocate a MAM id for inbound
> > foreign messages seems uncomplicated and safer.
>
> The reason to enforce UUIDv4 is that this can be verified server-side.
> While it can't be verified that something that looks like a UUIDv4 is
> created from actual randomness, it's probably the easiest in any
> programming language to actually generate a random UUIDv4 instead of
> generating a non-random string that the server-side verification
> considers a valid UUIDv4. If the string can just be anything,
> implementations may resort to things like random-connection-id + counter
> or timestamp. We've seen those in the past. To get new developers to do
> things right, we should make sure that doing the right thing is easier
> than doing wrong things that work "good enough" - even if we reach this
> by artificially increasing the complexity of doing wrong things.
>
>
Note that I sympathise with this argument. However, I would also note there
are plenty of clients and client libraries which do a perfectly good job of
generating random ids already, and forcing them into using UUIDv4 might be
more painful.
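
For illustration, the kind of server-side check being discussed might look
like the sketch below. It is only a sketch: the function name is hypothetical,
and insisting on the canonical hyphenated form is an assumption rather than
anything the thread specifies. As noted above, it can only confirm that an id
has the shape of a UUIDv4, not that it was generated from real randomness.

```python
import uuid


def looks_like_uuid4(candidate: str) -> bool:
    """Return True if candidate is a canonical hyphenated UUID, version 4."""
    try:
        parsed = uuid.UUID(candidate)
    except ValueError:
        return False
    return (
        parsed.version == 4                    # version nibble is 4
        and parsed.variant == uuid.RFC_4122    # standard variant bits
        and str(parsed) == candidate.lower()   # reject braces, urn:, no hyphens
    )
```

A counter-style id such as "conn42-0001" fails this check, while a hand-built
string like "00000000-0000-4000-8000-000000000000" passes it, which is exactly
the limitation described in the quoted text.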

> I am not aiming for global uniqueness, just uniqueness within each
> "conversation" (a groupchat or a direct chat) and within each entity.
> Also I am aiming for reducing the IDs a message has, given how much
> confusion we already had because of "which ID to take for this XEP".
>
>
OK, and I fully agree that minimizing the number of ids is very useful
here, but I suspect that servers will always generate their own MAM id for
(at least) foreign messages.

The problem with this approach, as Holger notes, is that:

a) Servers may find checking the uniqueness of the archive id is difficult
to do synchronously. My servers tend to archive synchronously, and a
failure to archive is a failure to send (and the stanza is bounced back),
but others do not.

b) Servers may find that a client-generated id causes problems with their
implementation entirely. At least one server I've worked on eschews
pseudo-random ids in favour of a node id and timestamp combination (I mean, it
sorts cleanly, I suppose).

c) Where servers cannot enforce (or choose not to enforce) the uniqueness,
there is likely to be a security issue I've yet to think of.

I'd like to examine ways of mitigating these.

You're correct in suggesting that UUIDv4 is something of a mitigation to
(a) in a well-intended client. I would think that a lookup table might work
for handling (b) and even mitigating (c), if we simply allow the
client-selected id to be an alternate to the server-imposed MAM id.
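
A minimal sketch of that lookup-table idea, under stated assumptions: the
class and method names are hypothetical, the server id layout (zero-padded
timestamp, node id, sequence number) is just one way to get ids that sort, and
scoping the alias by sender is my guess at how to limit the damage from
duplicate or hostile client ids.

```python
import itertools
import time
from typing import Dict, Optional, Tuple


class Archive:
    """Toy in-memory archive: the server id is authoritative, the client id an alias."""

    def __init__(self, node_id: str) -> None:
        self._node_id = node_id
        self._seq = itertools.count()
        self._messages: Dict[str, str] = {}             # server id -> payload
        self._aliases: Dict[Tuple[str, str], str] = {}  # (sender, client id) -> server id

    def _new_server_id(self) -> str:
        # Fixed-width fields, so string order follows the clock, then the sequence.
        return f"{time.time_ns():020d}-{self._node_id}-{next(self._seq):06d}"

    def store(self, sender: str, client_id: Optional[str], payload: str) -> str:
        server_id = self._new_server_id()
        self._messages[server_id] = payload
        if client_id is not None:
            # First use of a client id wins; a duplicate is still archived,
            # but stays reachable only through its own server id.
            self._aliases.setdefault((sender, client_id), server_id)
        return server_id

    def resolve(self, sender: str, some_id: str) -> Optional[str]:
        """Map either kind of id to the server id, or None if unknown."""
        if some_id in self._messages:
            return some_id
        return self._aliases.get((sender, some_id))
```

Nothing here needs a synchronous uniqueness check on the client id: archiving
never fails because of it, and a clash only means the alias keeps pointing at
the first message.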

> On 29.09.20 13:34, Holger Weiß wrote:
> > BTW, I see how we designed things this way so I do understand how this
> > might break existing API layers, but I don't see how this is the wrong
> > layer per se.  Stream management is about acknowledging stanza delivery,
> > and referring to the delivered stanza by ID rather than count seems fine
> > to me.  It's done similarly in other protocols.
>
> I agree with this but from here would conclude that stream management
> *is* the wrong layer to announce the MAM server's stanza-id:
> Stream management works on the stanzas (iq, message, presence) which
> comfortably all have an id attribute on the top element, which is the
> real "stanza id". Thus acking a stanza with an ID instead of the count
> is perfectly fine, but then IMO it should be the id attribute of such
> stanza and not different things for different kinds of stanzas.

A short digression...

198 acks by counting for two reasons:

* It enforced that acking was done in a strict order, so acking of one
stanza implicitly acked all those before it, which simplified resumption.
* Lots of stanzas didn't have ids at the time - really they were only
routinely used for IQ. Hey, it was 2006, we were young and crazy, etc.

Most counting errors are either failure to initialize the count at the
right moment, or else failure to put stanzas through the right parts of the
code to count them - and therefore ack them at all.

The actual counting part turns out to be quite easy.
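
To illustrate, here is a rough sketch of the sending side of counter-based
acking. The class and method names are hypothetical, and it assumes the tracker
is created at the moment stream management is enabled, which is precisely the
initialization step that tends to go wrong. The wrap at 2^32 follows the 'h'
counter definition in XEP-0198.

```python
from collections import deque
from typing import Deque, List


class OutboundAckTracker:
    """Tracks sent stanzas until the peer acknowledges them by count."""

    def __init__(self) -> None:
        self.acked = 0                      # last 'h' value seen from the peer
        self.unacked: Deque[str] = deque()  # sent but not yet acknowledged

    def sent(self, stanza: str) -> None:
        # Every outgoing stanza has to pass through here, or the count drifts.
        self.unacked.append(stanza)

    def handle_ack(self, h: int) -> List[str]:
        """Process an ack carrying h; return the stanzas it newly confirms."""
        newly_acked = (h - self.acked) % 2**32
        if newly_acked > len(self.unacked):
            raise ValueError("peer acked more stanzas than were sent")
        # Acking by count implicitly acks everything before it, in order.
        confirmed = [self.unacked.popleft() for _ in range(newly_acked)]
        self.acked = h
        return confirmed
```

On resumption, whatever is left in the unacked queue after processing the
peer's last 'h' is what needs to be resent.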

I've implemented this quite a few times now both on old codebases and newer
ones, and I think if we acked by stanza ids - or even by stanza-ids - we'd
simply have different, rather than fewer, problems in implementation. I'm
not even sure that they'd be easier to detect.

If I were willing to introduce a compatibility breakage to make '198 easier
to implement, I'd have the <r/> carry an expected counter as well. But the
risk there is that poor implementations would simply echo it back... And I
might include the current received counter in an <r/>, thus allowing <r/>
to act as an unsolicited (but never solicited) ack itself.

Dave.

_______________________________________________
Standards mailing list
Info: https://mail.jabber.org/mailman/listinfo/standards
Unsubscribe: standards-unsubscr...@xmpp.org
_______________________________________________
