Re: [Standards] LAST CALL: XEP-0353 (Jingle Message Initiation)

Ralph Meijer Tue, 03 Sep 2019 07:15:55 -0700

On 03/09/2019 15.02, Andrew Nenakhov wrote:

Tue, 3 Sep 2019 at 17:14, Philipp Hancke <fi...@goodadvice.pages.de>:
    0353 was explicitly designed for push (by not including the full
    payload
    due to size constraints) in conjunction with 0357 and should not go to
    MAM (hence no body).

    This has some sad consequences like the lack of a message acting as a
    call data record in the users history.
We're using the Processing Hints <store/> element to make the archive store such messages: https://xmpp.org/extensions/xep-0313.html#hints
    The way 0353 is supposed to work is:
    1) you are offline but have a push-enabled client
    (there is the more interesting scenario where some clients of yours are
    online but none does jingle... and you would need to send a push
    notification to your offline client that does... that is a generic
    issue
    however)
    2) you get a push notification with the <propose/> element and know the
    sender's full JID, the session id and (FYI) the media types involved

    3) your client requests that session at the sender. If that session
    doesn't exist anymore the sender will respond with a message stanza
    with
    type=error and <item-not-found/> (and potentially the jingle
    <unknown-session/>)
No, I think push notifications do not work the way you describe.
XEP-0357 says that a published <notification/> MAY contain additional
custom information; however, all our implementations of push
notifications assume that NO additional information is relayed through
third-party services (ejabberd, as I recall, doesn't even support
publishing such additional information). Thus, we get NO <propose/> in
push notifications, NO message text, nothing. Just a notification to an
app that it should wake up and update its data. Consequently, the only
ways we can get this information are MAM and offline messages. Since
offline messages perform poorly when there are multiple user devices, it
leaves us with just MAM.
I strongly oppose any suggestions to make push notifications work
differently. If we start sending payload about calls via FCM/APNS, why
stop at calls? We can just send full message text via push notifications
as Telegram does. And at that point, why mess with XMPP at all? An
FCM-only messenger can be coded in a week, it'll send and receive full
message text via FCM, store messages in an FCM cloud database and all
will work admirably well.


Hi,

Let me start out by saying that, like Philipp, my reading of the current
version of XEP-0353 is also that there is additional information shared
with the client over FCM/APNS. However, I must also say that neither
XEP-0353 nor XEP-0357 makes it clear how this should work exactly. This
is a problem, because it results in proprietary solutions for doing the
same thing. At minimum this should get another look and more concrete
examples of how existing push services would *actually* be used.
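
To make that reading concrete, here is a minimal sketch of how the three
pieces of information Philipp lists (the sender's full JID, the session
id and the media types) could be pulled out of a <propose/> message
before being handed to a push service. The function name, the JIDs and
the session id are illustrative assumptions; neither XEP defines this
step, which is exactly the gap described above.

```python
import xml.etree.ElementTree as ET

JMI_NS = "urn:xmpp:jingle-message:0"   # XEP-0353 namespace
RTP_NS = "urn:xmpp:jingle:apps:rtp:1"  # XEP-0167 (Jingle RTP) namespace

# Illustrative stanza; the JIDs and the session id are made up.
STANZA = (
    "<message xmlns='jabber:client' "
    "from='romeo@montague.example/orchard' "
    "to='juliet@capulet.example'>"
    f"<propose xmlns='{JMI_NS}' id='a73sjjvkla37jfea'>"
    f"<description xmlns='{RTP_NS}' media='audio'/>"
    "</propose></message>"
)


def propose_summary(stanza_xml):
    """Return sender, session id and media types, or None if no <propose/>."""
    message = ET.fromstring(stanza_xml)
    propose = message.find(f"{{{JMI_NS}}}propose")
    if propose is None:
        return None
    media = [
        description.get("media")
        for description in propose.findall(f"{{{RTP_NS}}}description")
    ]
    return {
        "from": message.get("from"),
        "sid": propose.get("id"),
        "media": media,
    }
```

Whether a summary like this may travel through FCM/APNS at all is the
open question in this thread; the sketch only shows how small the
required data is.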

Andrew's description of their use of XEP-0353 introduces new elements in
the existing namespace, and adds a new iq-based exchange. Assuming this
is an experiment, I do want to point out that before it would go into
actual use, those would need to be moved to their own private namespace,
or be included in a next version of this specification (which might
require a new namespace).

With that all out of the way, let me describe what we did at VEON, as I
briefly alluded to in my presentation at the Summit [1]. Our approach
made calls server-mediated. I.e. the Jingle session-initiate was sent to
the callee's bare JID (their 'account'). The server then could send the
IQ on to online resources or via FCM/APNS. The latter notifications
actually did carry payload, including the media descriptions and
transport information (candidates).
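
A rough sketch of that routing decision, not our actual server code:
the function name, the device table and the rule that a device gets a
push only when its resource is not online are all assumptions made for
illustration.

```python
def route_session_initiate(bare_jid, devices, online):
    """Decide how to deliver one session-initiate sent to a bare JID.

    devices maps a resource name to its push token (None if the device
    has no push registration); online is the set of connected resources.
    Both structures are hypothetical.
    """
    deliveries = []
    for resource, push_token in devices.items():
        if resource in online:
            # Connected resource: forward the IQ over XMPP to the full JID.
            deliveries.append(("xmpp", f"{bare_jid}/{resource}"))
        elif push_token is not None:
            # Offline but push-enabled: send the payload via FCM/APNS.
            deliveries.append(("push", push_token))
    return deliveries
```

For example, with a connected laptop and an offline phone that has a
push token, the laptop gets the IQ directly and the phone gets the
notification with the payload.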

The primary reason for doing it like this is speeding up the
negotiation. A receiving client can:

 * start evaluating the payload
 * configure the media library (in our case libwebrtc) to start media
streams on the device
 * re-establish the XMPP connection to the server
 * possibly request credentials for TURN
 * set up a TURN connection
 * retrieve vCards / avatars for the caller
 * etc.

In the meantime it can present the screen for allowing the user to
accept the call. As soon as the slow human user then presses 'accept'
you're immediately good to go and can fully establish the call. This
saves several round-trips, and thus many centiseconds compared to
XEP-0353 in its current form, and even more if you have a model that
relies on re-establishing the XMPP connection before starting all of the
above.
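
The overlap between preparation and the user's decision can be sketched
with asyncio. This is an illustration only: the step names mirror the
list above, the sleeps stand in for real work, and none of it is taken
from our client code.

```python
import asyncio


async def step(name, seconds, log):
    """Stand-in for one unit of work; the sleep simulates its latency."""
    await asyncio.sleep(seconds)
    log.append(name)


async def handle_incoming_call(log):
    # Start all preparation at once, as soon as the push payload arrives.
    preparation = asyncio.gather(
        step("configure media", 0.02, log),
        step("reconnect xmpp", 0.03, log),
        step("turn credentials", 0.01, log),
    )
    # Meanwhile the call screen is shown; the user is the slowest part.
    await step("user accepts", 0.05, log)
    # Make sure preparation has finished, then complete the call setup.
    await preparation
    log.append("establish call")
    return log


def run_demo():
    return asyncio.run(handle_incoming_call([]))
```

The total time is roughly that of the slowest path (usually the human)
rather than the sum of all steps.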

I want to note that our use cases include ones where the XMPP connection
might have high latency but the actual media flows are local (even
within office networks).

Obviously, we also ran into the issue that notification payloads have
size limits. Thiago Camargo wrote a specialized compression library,
Shogun, to tackle this. It relies on a pre-shared dictionary for better
compression.
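
To show the general idea of a pre-shared dictionary, here is a sketch
using zlib's preset dictionary support from the Python standard library.
This is not Shogun and says nothing about its actual algorithm or
format; the dictionary contents and function names are assumptions
chosen for illustration.

```python
import zlib

# Hypothetical dictionary: strings that both sides expect to see often.
SHARED_DICT = (
    b"<transport xmlns='urn:xmpp:jingle:transports:ice-udp:1'/>"
    b"<description xmlns='urn:xmpp:jingle:apps:rtp:1' media='audio'/>"
)


def compress(payload, zdict=SHARED_DICT):
    """Deflate payload, priming the compressor with the shared dictionary."""
    compressor = zlib.compressobj(level=9, zdict=zdict)
    return compressor.compress(payload) + compressor.flush()


def decompress(blob, zdict=SHARED_DICT):
    """Inflate blob; the receiver must hold the very same dictionary."""
    decompressor = zlib.decompressobj(zdict=zdict)
    return decompressor.decompress(blob) + decompressor.flush()


def compress_plain(payload):
    """Same settings without a dictionary, for size comparison."""
    compressor = zlib.compressobj(level=9)
    return compressor.compress(payload) + compressor.flush()
```

Short payloads benefit most, because without a dictionary the
compressor has almost no repetition to exploit within a single
notification.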

[1] <https://test.ralphm.net/publications/xmpp_chat_voip/>

--
Cheers,

ralphm
_______________________________________________
Standards mailing list
Info: https://mail.jabber.org/mailman/listinfo/standards
Unsubscribe: standards-unsubscr...@xmpp.org
_______________________________________________