Hi,

The purpose of this post is to state a problem we ran into, describe how we solved it, and elicit feedback on our solution. We would also like it to be considered in any future versioning scheme that may be applied to the PubSub specification.
I'm an engineer co-founding a startup for mission-critical realtime location systems using mobile devices such as smartphones, and we have chosen XMPP as our realtime message bus. The PubSub and Common Alerting Protocol XEPs fit our need for alert aggregation between individuals in distress and the caretakers of a given geographical area.

To set the scene: we have an "alerts" node where CAP alerts are published by someone in distress. These alerts are subsequently updated by caretakers in the field, either with additional information or to transition the alert state (such as acknowledgement). These updates come from occasionally connected mobile clients, and we assume that these client connections are both fragile and expensive.

It is crucial for us that the collection of items a client holds for a node is synchronized with the server; that is, that each node observer will have precisely the same collection after receiving the same event sequence sent from the server. This is challenging when nodes have many publishers and sometimes individual items have many publishers. With regard to possible race conditions between publishers updating the same item, we are satisfied with "latest wins" logic, so that is not under consideration in this discussion.

During development we have discovered cases where the collection of items stored at a node can become desynchronized, and which the XMPP specifications do not appear to address.
For example:

C1:
<iq id="id1" type="get" to="pubsub.criticalarc.com" from="jah...@criticalarc.com/client1">
  <pubsub xmlns="http://jabber.org/protocol/pubsub">
    <items node="my_node" />
  </pubsub>
</iq>

C2:
<iq id="id2" type="set" to="pubsub.criticalarc.com" from="jah...@criticalarc.com/client2">
  <pubsub xmlns="http://jabber.org/protocol/pubsub">
    <publish node="my_node">
      <item id="item1" />
    </publish>
  </pubsub>
</iq>

S:
<iq id="id2" type="result" to="jah...@criticalarc.com/client2" from="pubsub.criticalarc.com" />

S:
<message id="id3" to="jah...@criticalarc.com/client1">
  <event xmlns="http://jabber.org/protocol/pubsub#event">
    <items node="my_node">
      <item id="item1" />
    </items>
  </event>
</message>

S:
<iq id="id1" type="result" to="jah...@criticalarc.com/client1" from="pubsub.criticalarc.com">
  <pubsub xmlns="http://jabber.org/protocol/pubsub">
    <items node="my_node">
      <item id="item1" />
    </items>
  </pubsub>
</iq>

In this case it is not clear to client1 whether the latest version of item1 is the one in the event or the one in the result. The problem is exacerbated when Result Set Management is used to receive multiple pages of items and those pages are interleaved with events updating or removing the same items.

I have read previous discussions regarding timestamping and versioning for pubsub. We considered using timestamps but dismissed them because of the very real possibility that this whole sequence could occur within the same time unit; if a timestamp needs to be 'massaged' to make it unique and ordered, it is not the best choice. We considered opaque versioning à la roster management, but this does not solve the above scenario: opaque strings have no natural ordering, so determining which version is the most recent is still a problem.

As such, we have currently solved the problem by putting a naturally ordered "ver" attribute on each item, which the client can use to compare the copies of an item it receives.
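To make the comparison concrete, here is a minimal Python sketch of how a client might merge items that arrive through both paths (the iq result and the event notification). It is only an illustration under stated assumptions: "ver" is taken to be an integer that the server increases on every change to an item, a retraction is assumed to carry a "ver" as well, and the names Item and NodeCache are hypothetical and not part of any XEP or of our server.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Item:
    item_id: str
    ver: int                        # assumed: integer, increased by the server on every change
    payload: Optional[str] = None   # None marks a retracted item (kept as a tombstone)


class NodeCache:
    """Client-side copy of one node's items, keyed by item id (hypothetical helper)."""

    def __init__(self) -> None:
        self._items: Dict[str, Item] = {}

    def apply(self, incoming: Item) -> bool:
        """Merge an item received in either an iq result or an event.

        Returns True if the cache changed, False if the incoming copy was
        stale (its ver is not greater than the one already held).
        """
        current = self._items.get(incoming.item_id)
        if current is not None and current.ver >= incoming.ver:
            return False
        self._items[incoming.item_id] = incoming
        return True

    def snapshot(self) -> Dict[str, Optional[str]]:
        """Live items only; tombstones stay in the cache but are hidden here."""
        return {i.item_id: i.payload
                for i in self._items.values() if i.payload is not None}


# The race from the example: the event (newer) arrives before the iq result (older).
cache = NodeCache()
cache.apply(Item("item1", ver=2, payload="acknowledged"))  # from the event
cache.apply(Item("item1", ver=1, payload="raised"))        # from the late iq result
print(cache.snapshot())  # {'item1': 'acknowledged'}
```

Because the merge depends only on "ver" and not on arrival order, every observer that receives the same set of stanzas ends up with the same collection. Keeping a tombstone for retracted items is a design choice in this sketch, so that a late page of results cannot resurrect an item that an earlier event removed.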
We have also added the ability to request historical events (also versioned), so that when a mobile client connects it can resume its event stream at minimal cost. We did this because an event stream contains more than just item publications, and we would like to know about the rest as well. (I can expand on this on request.)

Thoughts and criticisms welcome.

Regards,
Jahmai.