On Wednesday 22 May 2024 at 16:46:47 UTC+2, Marvin W wrote:
> Hi Goffi,

Hi Marvin,

Seeing the proposal rejected is definitely disappointing, and I would like a 
clear statement of the reasons why the Council considers this work 
"unacceptable".

For now, the biggest criticism I've seen is that this protocol specification 
is… specifying a protocol (which, again, is what XEP-0166 requires for Jingle 
applications). That criterion seems quite arbitrary to me, hence my request 
for a clear statement on why this specification is "unacceptable" to the 
Council.

Especially since I've clearly stated several times that I'm open to changing 
the payload format and to using an existing protocol if it proves easy to 
implement, flexible, and efficient. The Experimental state exists precisely 
for that.

Thanks in advance.
 
> The use case I'm thinking of has low throughput and only short usage
> time. I might be sending 10 or 20 key events within a short time and
> then nothing for several hours.
> 
> Technically, this can all be done with Jingle, but for just a few keys,
> the overhead of starting a Jingle session just for those keys probably
> adds way more latency than sending those keys via <message> through an
> XMPP server. And using Jingle would require way more complex software
> on both sides.

Alright, I understand better now. That's true: because I already have advanced 
features like Jingle and WebRTC implemented in my software, I'm willing to 
build on them, but they are not available everywhere. For ease of 
implementation, it could indeed be interesting to have a <message>-based way, 
usable either via a server or via Jingle.

On the other hand, this introduces complexity of its own (the payload can now 
travel through two different paths), and I'm not sure that, for a niche 
feature that most clients won't implement anyway, we should settle for an 
inferior solution just because, hypothetically, in a niche use case within an 
already niche feature, a client may not have Jingle implemented.

Jingle is a major part of the XMPP stack, and according to the IM compliance 
suites it should be implemented in any advanced client.

WebRTC is already optional; you can use any streaming transport.

 
> Without a proper specification to send keys, I would do this via non
> standardized body messages. Works, but isn't particularly nice.
> 
> I also noticed that in cases where XEP-0174 Serverless Messaging is
> used, an additional Jingle connection probably doesn't add a lot of
> benefit either.

That's another niche within the niche, and really highly hypothetical; even if 
this could be done directly, it doesn't hurt to add an additional Jingle 
connection.

That said, I'm not firmly opposed to moving to a <message>-based payload if 
that can unblock the situation.

> Well, XMPP clients that also speak a ton of other protocols, including
> the one you are just creating.
> 
> My point is that not only does this need to be specified in the
> business rules, but also a ton of other things. There are probably a
> lot of side cases that you don't cover and where I can't reasonably
> expect Council to think about them.

Of course, a proto-XEP is not meant to be perfect in its first edition; that's 
exactly what the Experimental status is for. And it's not the job of the 
Council to think about side cases - that's what standards@ and feedback from 
the whole community are for.

Maybe I got it wrong, but to me the job of the Council is to keep technical 
work on track by ensuring that advancements in XEP status are done in order 
(i.e., X independent implementations, Y pieces of feedback, etc., as stated in 
the relevant XEPs), and by vetoing things that are really unacceptable (e.g., 
copyright issues, something totally irrelevant, offensive content, etc.). It's 
then the role of the larger community on standards@ to work on the technical 
details, side cases, ease of implementation, and optimization.

I realize that there isn't a real definition of what an "acceptable" proto-XEP 
should be; maybe this should be specified? I've seen proto-XEPs refused by 
some Councils and then accepted by others, and this seems quite arbitrary to 
me.

> [SNIP]
> 
> Both Jingle and especially WebRTC come with huge complexity. Your
> WebRTC library and your existing code for working with it might take
> away most of this from you, but that doesn't mean it's not there. By
> using Jingle and WebRTC you're effectively excluding clients, devices
> and platforms that can't easily run libwebrtc or any other popular
> WebRTC implementation.

Again, WebRTC is not mandatory in my specification. Any streaming transport 
can be used, as defined by XEP-0166, including in-band via XEP-0261.

So we're really just talking about Jingle, which can be implemented on any 
platform and which the current compliance suites require for advanced IM 
clients.

> I was already guessing it's not arbitrarily, but probably what made
> sense in your setup and for your usecase. However, not knowing any of
> that it *seems* arbitrary.

The use cases are already explained in the specification. In my current 
implementation, I have a controlling device in a browser and a basic one in a 
CLI (sending only keyboard events for now).

I have also implemented a controlled device in a CLI, which works with Wayland 
and the desktop portal. The implementation should not be a problem on the 
other platforms I target in the long run (Windows, macOS, Android, iOS, BSD, 
etc.); actually, it should not be a problem on any platform.

> The RemoteDesktop portal was clearly designed for remote desktop use
> cases, not other remote control cases.

That's incorrect. Despite its name, you can use the RemoteDesktop portal 
purely to send input; the screen-sharing part is entirely optional (and must 
be explicitly requested).

> However, as you already mention
> that you designed the data sent around what is needed for the
> RemoteDesktop portal, why not send the information directly in a format
> that matches the design of RemoteDesktop portal, instead of a mix of
> Web API interfaces and RemoteDesktop portal?

The data matches, except for keyboard events, which are represented using 
evdev codes on Linux, whereas I was looking for a more platform-independent 
solution. The Web API turned out to be the easiest option I found, but I'm 
open to considering an alternative if needed.
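
To make the kind of translation concrete, here is a minimal sketch in Python; 
the few entries are only examples I'm adding for illustration, not the table 
from the proposal:

# Translating between Linux evdev keycodes (what the RemoteDesktop
# portal works with on Linux) and W3C KeyboardEvent.code strings
# (what the Web API provides). Only a handful of keys are listed
# as an example; a real table would cover the whole keyboard.
EVDEV_TO_W3C = {
    1: "Escape",      # KEY_ESC
    28: "Enter",      # KEY_ENTER
    30: "KeyA",       # KEY_A
    42: "ShiftLeft",  # KEY_LEFTSHIFT
    57: "Space",      # KEY_SPACE
}
W3C_TO_EVDEV = {code: keycode for keycode, code in EVDEV_TO_W3C.items()}

def w3c_code_to_evdev(code: str) -> int:
    """Return the Linux evdev keycode for a W3C KeyboardEvent.code value."""
    return W3C_TO_EVDEV[code]

print(w3c_code_to_evdev("KeyA"))  # -> 30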

> 
> Also I noticed that the RemoteDesktop portal does not have a notion of
> an independent wheel, the mouse wheel is tied to the pointing device,
> why did you choose to not do it the same way?

No. Despite its method names (`NotifyPointerAxis` and 
`NotifyPointerAxisDiscrete`), the wheel device is independent of the pointer: 
no pointer coordinates are actually sent with wheel events.

And that makes sense: it's not the pointer coordinate that's important, but 
rather where the focus is. You can change focus with a keyboard, for instance.

As I've said in my previous message, the wheel device, while often associated 
with mice, can also be independent.
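
For reference, here is roughly what sending wheel events through the portal 
looks like (a sketch using the pydbus library; `session_handle` is a 
placeholder standing in for an already created and started RemoteDesktop 
session). The point is that both calls carry only scroll deltas, never pointer 
coordinates:

from pydbus import SessionBus

# Placeholder: in a real client this object path comes from the
# CreateSession/SelectDevices/Start handshake of the portal.
session_handle = "/org/freedesktop/portal/desktop/session/example"

desktop = SessionBus().get(
    "org.freedesktop.portal.Desktop", "/org/freedesktop/portal/desktop"
)
remote_desktop = desktop["org.freedesktop.portal.RemoteDesktop"]

# Smooth scrolling: only dx/dy deltas are sent.
remote_desktop.NotifyPointerAxis(session_handle, {}, 0.0, -15.0)

# Discrete ("click") scrolling: axis 0 is vertical, 1 is horizontal.
remote_desktop.NotifyPointerAxisDiscrete(session_handle, {}, 0, 1)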

> [SNIP]
> 
> 
> The precision on a double (64 bit floating point) remain the same, no
> matter if you scale [0,1] or [0,<screen-width>]. The precision is about
> 15 decimal digits which should be more than enough (you barely see
> screen coordinates with more than 4 decimal digits), even if you do
> calculations on them (which may result in a few bits of precision
> loss).

The issue is not the number of digits, but the fact that some numbers cannot 
be represented exactly by doubles. The first case I'm thinking of is 1/3, 
which can lead to a rounding error and to the wrong pixel being selected in 
the end. Whether or not this is a problem depends on the use cases we want to 
handle, but using pixels directly avoids the issue.
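
To make the concern concrete, here is a small sketch (the width and target 
pixel are arbitrary, and I'm using 29/100 instead of 1/3 just to keep the 
numbers short; whether the off-by-one actually happens depends on how the 
receiver converts back to pixels):

width = 100
target_pixel = 29

normalized = target_pixel / width  # 0.29 is not exactly representable as a double
restored = normalized * width      # 28.999999999999996

print(int(restored))    # 28: truncation selects the neighbouring pixel
print(round(restored))  # 29: rounding recovers the intended pixel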

Anyway, using [0,1] is not a bad idea, as it avoids the need to transmit the 
screen size and screen size updates; it may indeed be the better solution.

> Assuming you refer the FPS games, those "lock" the cursor position to
> the screen center, so they never have that issue. To correctly
> reproduce this behavior you need a back channel to the controlling app
> so it can know the cursor position and/or lock if it is changed on the
> controlled device.
> 
> (Above might not be correct on all platforms.)
> 
> Also I did not intend to say that you shouldn't support movement
> vectors (like touchpads), I was just saying that absolute pointing
> could be relative to screen size, so that you don't need to know the
> absolute screen size.

Indeed, it may be a better option. I can change that. I'll check how other 
protocols deal with this issue and may use one of them directly.

> The advantage of going down this rabbit hole is:
> a) We improve XMPP for other usecases
> b) You can specify this protocol using XML and use Jingle XML streams.
> As the CBOR<>XML translation will take care of creating the CBOR for
> you, you still get the CBOR for this protocol, but without the need to
> make it explicit. And in cases where people prefer to not use CBOR,
> they can still use this protocol, just with XML. It's a win-win for
> everyone (except that you as the specification author have more work).

We already have EXI (XEP-0322) for that (though I don't know how it compares 
to CBOR).

Again, I'm not against getting rid of CBOR if it is a show stopper for people.
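
To give an idea of what such a translation is about, here is a small sketch 
using the cbor2 library; the event fields are made up for the example and are 
not the ones from the proposal:

import cbor2

# A made-up pointer-motion event, encoded both ways, just to show
# that the same data can live as CBOR or as XML.
event = {"type": "pointer_motion", "x": 0.25, "y": 0.75}

cbor_payload = cbor2.dumps(event)
xml_payload = (
    f'<event type="{event["type"]}" x="{event["x"]}" y="{event["y"]}"/>'
).encode()

print(len(cbor_payload), len(xml_payload))  # compare the encoded sizes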


> If going forward, you still want to specify your own
> payload/application protocol (that is, the CBOR thing that is
> transferred with the Jingle streaming transport), I'd like to ask you:
> - To evaluate if a XEP is the right place to specify such a protocol,
> of if it is more a generic thing that could well be used outside XMPP
> and maybe should also be specified elsewhere.

I'll evaluate other specifications.

But yes, a XEP is, in my opinion, definitely the right place to specify a 
protocol. The fact that part of it is a Jingle application doesn't change the 
fact that, overall, it's an XMPP Extension Protocol. XEP-0166 states that the 
application payload protocol must be specified.

And even if we use XML extensively, XMPP is not only about XML; we already use 
many non-XML data formats.

> - If you consider a XEP to be the right place and want to stick with
> your CBOR protocol, I'd like to ask you to split it into two parts: 1.
> the payload protocol (sections 8 and 9 of the proposal) and 2. The
> Jingle signaling protocol (sections 5 to 7 of your proposal). This way
> the protocol can be used and referenced easily for use outside of
> Jingle context.

I'm willing to strike a balance between efficiency, ease of implementation, 
and flexibility; I don't care whether it's CBOR or something else. I've heard 
your arguments and will consider using <message> or another existing protocol.

It will take time, though; I'm busy with other things at the moment, and my 
current implementation is working well. If anybody is interested in 
implementing this specification anytime soon, please contact me - I can try to 
re-order my priorities.

> If you feel it's possible to transition to a <message> based approach,
> this can of course be a single XEP (that will barely have anything to
> do with Jingle except for anecdotal mentioning that it can be used with
> Jingle XML stream or serverless messaging for lower latency).

Got it. I'll evaluate the various options we've discussed.

Thank you for your time and detailed feedback - it's much appreciated.


> Best,
> Marvin

Best,
Goffi
