Re: [Standards] Entity Capabilities 2.0
Hi Christian,

Thanks again for your input. Comments inline.

On Tuesday, 13 February 2018 23:20:54 CET Christian Schudt wrote:
> Hi Jonas,
>
> > You are referring to the processing entity’s side? The entity is free to
> > choose from the set as it desires. The order of elements inside the hash
> > set is undefined. It could for example iterate a list of hash functions
> > in descending order of preference and look for hashes in the hash set. It
> > could also first check if any of the hashes is in the cache and prefer
> > that.
> >
> > I see this could be clarified, but I’m hesitant to make a normative
> > statement on the behaviour. Some suggestions can be put into the XEP
> > though. Do you see a reason to make this normative?
>
> See, you made a good point here: First check if any of the hashes is in the
> cache. I forgot about it in my implementation and that’s why I think it
> could be beneficial to have something defined in the specification.
>
> Think about the same capabilities sent by two different entities, but with a
> different hash order each.
>
> abc
> def
>
> def
> abc
>
> If an implementation would just pick the first hash each time it processes a
> presence with Caps, it will likely end up with two service discovery
> requests and two hashes in the cache, although one would be enough.

I honestly didn’t think about that, since my implementation ignores the order of the hash elements and simply tries them in the order of the local hash function preference. I’ll include a few implementation notes on this topic in the XEP, thank you very much.

> The spec should recommend to first iterate over all hashes and check for
> each hash, if it’s already known (cached).
>
> > I *think* I outlined the integration with XEP-0115 in the XEP already. Can
> > you be more specific on where you would like guidance?
>
> You write (or I understand): If there are also 115 Caps, you may use them
> to get the disco#info from the cache. In the next sentence: you must not
> use data from the 115 cache, if there are also 390 Caps in the presence.
> *confusing*?

I assume you’re referring to §7.2, Upgrading from XEP-0115. I think you missed the "without verification" qualifier in the last sentence, which should make things clearer: *if* an XEP-0115 cache is available and *no* XEP-0390 caps hashes are received, an entity MAY choose to still optimize the query by falling back to the XEP-0115 behaviour. However, if XEP-0390 caps hashes ARE available, the entity MUST use them to verify the data obtained from an XEP-0115 cache (if they have no XEP-0390 match and went down the fallback route). I think a rewording will make this much clearer.

> Generally I thought about some guidance about what I’ve worked out during my
> implementation: Some ideas about a common interface and some business rules
> for processing both Caps Extensions in the same presence.

I’m not sure if the XEP is the right place for that, honestly. I’d like to restrict it to describing the protocol, not implementation details which are solely in the realm of the Software Engineering side of things (in contrast to the things we discussed above about the cache implementation).

> In the same sense
> how XEP0191 describes the relation to XEP0016 and recommends to use the
> same backend storage and defines a clear mapping [1].

But this is behaviour which is visible to other entities. A client might implement XEP-0016 *and* XEP-0191, or one client may implement XEP-0191 and another may implement XEP-0016, and to gain some meaningful interoperation between the two approaches it is kinda needed.

It is not visible to other entities whether you use a common interface for your XEP-0115 and XEP-0390 implementations or not. We can of course make a huge section of implementation suggestions, but I don’t think this is the right place to do that.

> Maybe also some words about if XEP0115 cache can be mixed with XEP0390
> cache. In my implementation I use the same cache for both.
> Maybe you also have some recommendations what to use as cache keys.
> In my implementation I currently use something like: "sha1(XEP# + algo +
> base64(hash))". Not sure if it’s good.

I’m using the disco#info @node values (Capability Hash Node in XEP-0390 terminology) as cache keys for the (caps -> disco#info) cache. They are different for the two XEPs and already include the hash function and hash, so there’s no need to run SHA1 on top of that.

thanks and kind regards,
Jonas

[1]: https://github.com/xnyhps/capsdb/

___
Standards mailing list
Info: https://mail.jabber.org/mailman/listinfo/standards
Unsubscribe: standards-unsubscr...@xmpp.org
___
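The cache handling discussed in this message (look in the cache before settling on a hash, ignore unknown hash functions, and key the cache by the node string) could be sketched roughly as below. This is only an illustration under stated assumptions, not text from XEP-0390: the function names, the local preference list and the `urn:xmpp:caps#<algo>.<hash>` node layout are assumptions and should be checked against the XEP.

```python
# Hypothetical sketch; names and the node layout are assumptions, not XEP text.
LOCAL_PREFERENCE = ["sha3-256", "sha-256"]  # assumed local order, best first


def node_key(algo, b64_hash):
    """Build a cache key shaped like a Capability Hash Node (assumed layout)."""
    return f"urn:xmpp:caps#{algo}.{b64_hash}"


def pick_hash(hash_set, cache):
    """Choose one (algo, hash) pair from a received hash set.

    hash_set: dict mapping algorithm name -> base64 hash from the presence.
    cache:    dict mapping node key -> cached disco#info.
    Returns (algo, hash, cached_info_or_None), or None if nothing is usable.
    """
    # Hash functions we do not know are ignored rather than treated as errors.
    known = [algo for algo in LOCAL_PREFERENCE if algo in hash_set]

    # Pass 1: any cache hit wins, regardless of the order on the wire.
    for algo in known:
        key = node_key(algo, hash_set[algo])
        if key in cache:
            return algo, hash_set[algo], cache[key]

    # Pass 2: no hit, so fall back to the most preferred known function.
    if known:
        return known[0], hash_set[known[0]], None
    return None
```

Because the received hash set is modelled as a mapping, two presences carrying the same hashes in a different order lead to the same cache lookup, which avoids the duplicate service discovery request described above.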
Re: [Standards] Entity Capabilities 2.0
On Monday, 12 February 2018 09:10:47 CET Jonas Wielicki wrote:
> On Monday, 12 February 2018 00:41:54 CET Christian Schudt wrote:
> > - Generally I am unsure if using the "xml:lang" and "name" from the
> > identities is a good idea at all, because these two attributes should not
> > change the capabilities of an entity. Name and language is just for
> > humans.
> > I.e. if a server sends german identities for one user and english
> > identities for the next user (depending on their client settings/stream
> > header), the server still has the same identities, which should result in
> > the same verification string, shouldn’t it?
>
> First of all, I think previously, an entity answering a disco#info request
> always sent all translated identities, so that would not have been an issue.
>
> You’re touching on a more general thing though which I’d like to discuss. We
> could separate the hash into three hashes, one for identities, one for
> features and one for forms (or maybe two: identities and forms+features).
>
> This has the upside that human readable identifiers don’t interfere with
> protocol data (features/forms) in many cases (I think the identities are
> more rarely used in protocols, but I might be wrong). The obvious downside
> is that we need to transfer more data in the presence (twice or thrice the
> amount for ecaps2).
>
> I’d like to know what you people think of it. Since this is still
> Experimental, I’d be fine with bumping the namespace and getting this done.
> But I’m afraid that the bandwidth costs will outweigh the advantages. We
> have ~100 bytes for a 256 bit hashsum (including wrapper XML). We would end
> up with more than half a kilobyte (~0.6 kB) for ecaps2 if we split the
> hashes and assume that each entity uses two hash functions with 256 bits
> each (which I think is a reasonable assumption). If we have caps
> optimization, the impact would probably be negligible, but I’m not sure if
> we can assume that.
>
> I’d like to get input from you folks on that.

I had some off-list input on this.

First, Evgeny pointed out that the work in progress on MUC bare-presence [1] has uncovered that caps don’t really work well for the MUC case. A MUC’s disco#info contains, for example, the number of occupants currently in the room, which may fluctuate a lot (thus causing lots of traffic if caps are used completely) [2].

Second, Florian Schmaus questioned my approach of splitting the hashes and asked for use-cases where this makes sense. I can come up with two use-cases off the top of my head, both with varying relevance depending on which metric you want to optimize.

- The MUC use-case from above. Granted, this isn’t in any spec yet, but it would be great to have. Daniel noted that having the disco#info form of MUCs is useful to detect (a part of) the configuration which is relevant to (IMO reasonable) UX choices in Conversations. However, if the occupant count is in there, the use of a caps hash is obviously rather defeated in this case.

- Clients sometimes include XEP-0232 (Software Information) and other forms in their disco#info. This might be high-cardinality information which may thrash (overload/fill) entity caches. I used the (somewhat dated) capsdb [3] and ran the numbers:

  Total items in capsdb: 1602
  Distinct hashes: 1558 (i.e. XEP-0115/XEP-0390 as-is)
  Distinct identity+features: 1140
  Distinct forms: 450

  This is less of a saving than I expected; however, the capsdb is rather dated. I wonder whether the saving is larger nowadays if there are more clients which implement XEP-0232 or other similar things.

Splitting the hashes could also allow entities to explicitly opt out of one of the two hashes; an entity with a disco#info form which changes in real-time could opt out of sending the form hash altogether (instead of sending a hash equivalent to "no form"), thus signalling to peers that if disco#info form data is desired, it needs to be queried freshly.

All in all, I’m not sure if those two use-cases warrant the increase of bandwidth use by a factor of approximately two for caps2. I’m still hoping for more feedback on this, thanks!

kind regards,
Jonas

[1]: The idea is to let MUCs emit a presence from the bare JID after the client joined to send them caps and avatar info etc.
[2]: They work around that currently by not including the form in the caps and omitting the form data from disco#info queries against caps disco#info nodes.
[3]: https://github.com/xnyhps/capsdb/
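The bandwidth estimate quoted in this message can be reproduced with a few lines of arithmetic. The sketch below uses only the figures given in the thread (about 100 bytes per 256-bit hash including wrapper XML, two hash functions per entity); the function name is made up for illustration.

```python
# Figures taken from the thread; the helper name is hypothetical.
BYTES_PER_HASH = 100  # ~100 bytes per 256-bit hash, including wrapper XML
HASH_FUNCTIONS = 2    # assumed number of hash functions used per entity


def caps_bytes(parts):
    """Approximate caps payload when disco#info is hashed in `parts` pieces."""
    return parts * HASH_FUNCTIONS * BYTES_PER_HASH


# 1 = single hash as today, 2 = identities+features / forms, 3 = full split
for parts in (1, 2, 3):
    print(parts, caps_bytes(parts))
# prints:
# 1 200
# 2 400
# 3 600
```

The three-way split gives the ~0.6 kB mentioned in the quoted message, and the two-way split doubles the payload, which matches the "factor of approximately two" in the conclusion.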
Re: [Standards] Entity Capabilities 2.0
Hi Jonas,

> You are referring to the processing entity’s side? The entity is free to
> choose from the set as it desires. The order of elements inside the hash set
> is undefined. It could for example iterate a list of hash functions in
> descending order of preference and look for hashes in the hash set. It could
> also first check if any of the hashes is in the cache and prefer that.
>
> I see this could be clarified, but I’m hesitant to make a normative statement
> on the behaviour. Some suggestions can be put into the XEP though. Do you see
> a reason to make this normative?

See, you made a good point here: First check if any of the hashes is in the cache. I forgot about it in my implementation and that’s why I think it could be beneficial to have something defined in the specification.

Think about the same capabilities sent by two different entities, but with a different hash order each.

abc
def

def
abc

If an implementation would just pick the first hash each time it processes a presence with Caps, it will likely end up with two service discovery requests and two hashes in the cache, although one would be enough.

The spec should recommend to first iterate over all hashes and check for each hash whether it’s already known (cached).

> I *think* I outlined the integration with XEP-0115 in the XEP already. Can
> you be more specific on where you would like guidance?

You write (or I understand): If there are also 115 Caps, you may use them to get the disco#info from the cache. In the next sentence: you must not use data from the 115 cache, if there are also 390 Caps in the presence. *confusing*?

Generally I thought about some guidance about what I’ve worked out during my implementation: Some ideas about a common interface and some business rules for processing both Caps Extensions in the same presence. In the same sense how XEP0191 describes the relation to XEP0016 and recommends to use the same backend storage and defines a clear mapping [1].

Maybe also some words about whether the XEP0115 cache can be mixed with the XEP0390 cache. In my implementation I use the same cache for both.
Maybe you also have some recommendations what to use as cache keys.
In my implementation I currently use something like: "sha1(XEP# + algo + base64(hash))". Not sure if it’s good.

[1]: https://xmpp.org/extensions/xep-0191.html#privacy

> Any element in XML can have an xml:lang attribute. It is specified in the XML
> standard on how the value of xml:lang propagates.
>
> Yes, as it propagates down the tree, unless another element (e.g. the
> presence) overrides it. For example:
>
> […]
>
> […]
>
> In this case, the identity element would be assumed to have the language
> "en".
> If the xml:lang on the query was missing, it would be "de" and so on.

Oh wow, understood. Let’s see what will happen about the language...

-- Christian
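The xml:lang inheritance described in the quoted reply can be illustrated with a short sketch. The stanza below is made up for illustration (the original example did not survive in the archive), and `effective_lang` is a hypothetical helper; only the inheritance rule itself comes from the XML standard.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical stanza, invented for this sketch (namespaces omitted).
DOC = """
<presence xml:lang="de">
  <query xml:lang="en">
    <identity category="client" type="pc" name="Example"/>
  </query>
</presence>
"""


def effective_lang(root, target):
    """Return the xml:lang in effect for `target`, or None if none is set."""
    # ElementTree has no parent pointers, so build a child -> parent map.
    parents = {child: parent for parent in root.iter() for child in parent}
    node = target
    while node is not None:
        lang = node.get(XML_LANG)
        if lang is not None:
            return lang  # the nearest declaration wins
        node = parents.get(node)
    return None


root = ET.fromstring(DOC)
identity = root.find("./query/identity")
print(effective_lang(root, identity))  # prints "en"

# Without xml:lang on the query, the outer value applies instead.
del root.find("query").attrib[XML_LANG]
print(effective_lang(root, identity))  # prints "de"
```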
Re: [Standards] Entity Capabilities 2.0
On Monday, 12 February 2018 13:53:40 CET Evgeny Khramtsov wrote:
> Mon, 12 Feb 2018 09:12:02 +0100
> Jonas Wielicki wrote:
> > Could you please be specific which cache you’d like to see properly
> > invalidated? Do you mean the (hash -> disco#info) cache or the
> > (entity JID -> hash / disco#info) cache?
>
> A server can change configuration in runtime at any time, potentially
> changing its disco#info. How to notify local clients about that? How to
> notify clients from remote servers? How to notify connected servers after
> all?

Thanks, I was thinking about client caps mostly. This is valuable input.

So obviously we need a way to push updates to the peers. One way would be to use a nonza, another would be to use a message with caps payload. Irrespective of the transport used, I think the following business rule would work:

* When a server has changed its disco#info and thus the Capability Hash Set, the server MUST send a push update to all of its clients and s2s connections.

This doesn’t keep remote clients up-to-date. I’m not sure whether we need this at all. Are there compelling use-cases?

As for the transport, I generally see three options:

1. A nonza seems most reasonable from the scope point of view, because the push update should not be propagated to another stream. However, nonzas need to be negotiated. I am not sure if we want to go down that route of complexity here, especially for clients (where we’re keen on saving round-trips anyway, and where specifying the negotiation state after SM resumption etc. adds complexity). Opinions?

2. Otherwise, I’d say we simply use a <message/> addressed to the entity which is to receive the update. In case of s2s links, that would be the domain of the peer server. In case of c2s links, that would be the full JID of the client. The message would contain a single element. Type headline.

   This could be sent unsolicited, I think; it doesn’t need to be stored in the archive or carbon-copied either (full JID addressing makes this conformant with New Message Routing Rules; since the stanza is generated on the server, the server can take whatever measures are needed to avoid having it end up in the archives or something like that).

3. Using a full pub-sub service on the domain feels excessive, but then again, we’re already going down that route for accounts. It would allow peers to opt in to the notifications, and it would allow remote clients to stay up-to-date.

I’d like to hear input from all of you.

kind regards,
Jonas
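The business rule proposed in this message, combined with the addressing from option 2, might look like the following sketch. It is purely illustrative: the function name and data shapes are hypothetical, and the thread leaves the transport and the payload open.

```python
# Hypothetical helper illustrating the proposed rule; not part of any XEP.
def push_targets(old_hash_set, new_hash_set, c2s_full_jids, s2s_domains):
    """Return the addresses to notify after a server-side disco#info change.

    old_hash_set / new_hash_set: dicts mapping algorithm -> hash.
    c2s_full_jids: full JIDs of locally connected clients.
    s2s_domains:   domains of currently connected peer servers.
    """
    if old_hash_set == new_hash_set:
        return []  # Capability Hash Set unchanged, so nothing to push
    # c2s links are addressed by full JID, s2s links by the peer's domain.
    return sorted(c2s_full_jids) + sorted(s2s_domains)
```

Remote clients are deliberately absent from the result, which mirrors the open question above about whether they need to be kept up-to-date at all.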
Re: [Standards] Entity Capabilities 2.0
On Mon, Feb 12, 2018 at 10:53 AM, Evgeny Khramtsov wrote:
> A server can change configuration in runtime at any time, potentially
> changing its disco#info. How to notify local clients about that? How to
> notify clients from remote servers? How to notify connected servers after all?

Sounds like a directed presence is in order. Or a pubsub node on the main domain (though that makes it too tempting to pull in all of pubsub). I wouldn't mind seeing the XEP explicitly choosing an approach.
Re: [Standards] Entity Capabilities 2.0
Mon, 12 Feb 2018 09:12:02 +0100
Jonas Wielicki wrote:
> Could you please be specific which cache you’d like to see properly
> invalidated? Do you mean the (hash -> disco#info) cache or the
> (entity JID -> hash / disco#info) cache?

A server can change configuration in runtime at any time, potentially changing its disco#info. How to notify local clients about that? How to notify clients from remote servers? How to notify connected servers after all?
Re: [Standards] Entity Capabilities 2.0
On Monday, 12 February 2018 08:45:23 CET Evgeny Khramtsov wrote:
> Mon, 12 Feb 2018 00:41:54 +0100
> Christian Schudt wrote:
> > - I am also missing a cache which maps entities to capabilities, i.e.
> > JIDs to disco#info objects. This is the whole point of the XEP (to be
> > able to know an entity’s abilities without service discovery). This
> > cache should probably be non-persistent. The "Capability Hash
> > Cache" (hash -> disco#info) is actually only the
> > intermediate/auxiliary cache.
>
> I think the XEP doesn't solve the main problem of cache invalidation
> and that's why I think it's pointless and I'm not going to implement it
> until the invalidating rules are described.

Could you please be specific which cache you’d like to see properly invalidated? Do you mean the (hash -> disco#info) cache or the (entity JID -> hash / disco#info) cache?

kind regards,
Jonas
Re: [Standards] Entity Capabilities 2.0
On Monday, 12 February 2018 00:41:54 CET Christian Schudt wrote:
> - Generally I am unsure if using the "xml:lang" and "name" from the
> identities is a good idea at all, because these two attributes should not
> change the capabilities of an entity. Name and language is just for humans.
> I.e. if a server sends german identities for one user and english
> identities for the next user (depending on their client settings/stream
> header), the server still has the same identities, which should result in
> the same verification string, shouldn’t it?

First of all, I think previously, an entity answering a disco#info request always sent all translated identities, so that would not have been an issue.

You’re touching on a more general thing though which I’d like to discuss. We could separate the hash into three hashes, one for identities, one for features and one for forms (or maybe two: identities and forms+features).

This has the upside that human readable identifiers don’t interfere with protocol data (features/forms) in many cases (I think the identities are more rarely used in protocols, but I might be wrong). The obvious downside is that we need to transfer more data in the presence (twice or thrice the amount for ecaps2).

I’d like to know what you people think of it. Since this is still Experimental, I’d be fine with bumping the namespace and getting this done. But I’m afraid that the bandwidth costs will outweigh the advantages. We have ~100 bytes for a 256 bit hashsum (including wrapper XML). We would end up with more than half a kilobyte (~0.6 kB) for ecaps2 if we split the hashes and assume that each entity uses two hash functions with 256 bits each (which I think is a reasonable assumption). If we have caps optimization, the impact would probably be negligible, but I’m not sure if we can assume that.

I’d like to get input from you folks on that.

kind regards,
Jonas
Re: [Standards] Entity Capabilities 2.0
Hi Christian,

First of all, thank you for your thorough feedback. My comments are inline below.

On Monday, 12 February 2018 00:41:54 CET Christian Schudt wrote:
> Hi,
>
> I’ve implemented Entity Capabilities 2.0 (XEP-0390) and like to share some
> thoughts about it here and in the following link.
>
> I think it could be interesting for library developers as well as the
> author(s) of XEP-0390:
>
> http://babbler-xmpp.blogspot.de/2018/02/experimenting-with-entity-capabilities.html
>
> Generally, XEP-0390 is really well and comprehensively written, but I’ve
> found some issues during development, which I’d like to address:
>
> - How to pick the first Capability Hash from a Capability Hash Set. Is it
> random? Is there a preferred hash algorithm order?

You are referring to the processing entity’s side? The entity is free to choose from the set as it desires. The order of elements inside the hash set is undefined. It could for example iterate a list of hash functions in descending order of preference and look for hashes in the hash set. It could also first check if any of the hashes is in the cache and prefer that.

I see this could be clarified, but I’m hesitant to make a normative statement on the behaviour. Some suggestions can be put into the XEP though. Do you see a reason to make this normative?

(I’m still unsure I understood you correctly.)

> - It’s not specified what to do if a hash algorithm is not understood by a
> processing entity. I’ve implemented it in the way that it’s ignored and the
> next hash is tried.

Yes, ignoring is the way to go. I can make that clearer.

> - § 6.2 Rules for Processing Entities is relatively short. As outlined in my
> blogpost this section could be more verbose: Integration with XEP-0115.
> Picking the first hash. When to do a disco#info query.

I *think* I outlined the integration with XEP-0115 in the XEP already. Can you be more specific on where you would like guidance?

When to do the disco#info query is essentially up to the processing entity. In my implementation, I do the query only when the application first asks for the information, to save bandwidth. Again, I’m not sure if there should be normative language on this.

> - I am also missing a cache which maps entities to capabilities, i.e. JIDs to
> disco#info objects. This is the whole point of the XEP (to be able to know
> an entity’s abilities without service discovery). This cache should
> probably be non-persistent. The "Capability Hash Cache" (hash ->
> disco#info) is actually only the intermediate/auxiliary cache.

I see. It makes sense to specify this in the XEP, indeed.

> - It is said, that the disco#info element can have an xml:lang element, but
> it’s not specified in XEP-0030. What about it?

Any element in XML can have an xml:lang attribute. It is specified in the XML standard on how the value of xml:lang propagates.

> - What about the xml:lang element in the stream header, i.e. the sessions
> default language? If it is set, is it also an "implicit language" used
> during construction of the Caps String?

Yes, as it propagates down the tree, unless another element (e.g. the presence) overrides it. For example:

[…]

[…]

In this case, the identity element would be assumed to have the language "en". If the xml:lang on the query was missing, it would be "de" and so on.

> - Generally I am unsure if using the "xml:lang" and "name" from the
> identities is a good idea at all, because these two attributes should not
> change the capabilities of an entity. Name and language is just for humans.
> I.e. if a server sends german identities for one user and english
> identities for the next user (depending on their client settings/stream
> header), the server still has the same identities, which should result in
> the same verification string, shouldn’t it?

I will send a separate email for this, thank you for bringing it up.

> - Decomposing a Capability Hash Node is not needed afaik. Having it
> specified is slightly distracting because you think you need it.

It may be needed when a client processes disco#info queries for those nodes, depending on the implementation. Of course, an implementation may simply use them as opaque strings, but it may also handle them separately. Since it’s trivial to get wrong, but also trivial to write down correctly, I thought it’d make sense to have it there.

> - The 'var' element will be before the '<value/>' element in the
> verification string. Therefore it would be clearer if the description first
> mentioned the 'var' attribute and then the '<value/>'.

Will do, thanks.

> - Typos found: "sevre", "Cabability"

Thanks.

kind regards,
Jonas
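The two caches mentioned in this exchange, together with the "query only when the application first asks" behaviour, could be arranged roughly as in the sketch below. All names are hypothetical, the `query` callable stands in for a real disco#info request, and verification of the returned disco#info against the advertised hash is deliberately left out.

```python
# Hypothetical two-level cache; a sketch only, not an API of any library.
class CapsCache:
    def __init__(self, query):
        self._query = query  # callable(jid, key) -> disco#info (assumed)
        self.by_hash = {}    # key -> disco#info; could be persisted
        self.by_jid = {}     # full JID -> key; meant to be non-persistent

    def on_presence(self, jid, key):
        """Remember which caps key an entity announced; no query is sent yet."""
        self.by_jid[jid] = key

    def on_unavailable(self, jid):
        """Forget the entity's mapping; its disco#info stays cached by key."""
        self.by_jid.pop(jid, None)

    def info_for(self, jid):
        """Return disco#info for `jid`, querying lazily on the first request."""
        key = self.by_jid.get(jid)
        if key is None:
            return None  # no caps seen for this entity
        if key not in self.by_hash:
            # NOTE: a real implementation must verify the result against the
            # announced hash before caching it; that step is omitted here.
            self.by_hash[key] = self._query(jid, key)
        return self.by_hash[key]
```

With this layout, two entities announcing the same key trigger at most one query, and the per-JID mapping can be dropped at the end of a session without losing the cached disco#info.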
Re: [Standards] Entity Capabilities 2.0
Mon, 12 Feb 2018 00:41:54 +0100
Christian Schudt wrote:
> - I am also missing a cache which maps entities to capabilities, i.e.
> JIDs to disco#info objects. This is the whole point of the XEP (to be
> able to know an entity’s abilities without service discovery). This
> cache should probably be non-persistent. The "Capability Hash
> Cache" (hash -> disco#info) is actually only the
> intermediate/auxiliary cache.

I think the XEP doesn't solve the main problem of cache invalidation and that's why I think it's pointless and I'm not going to implement it until the invalidating rules are described. And those who think cache pollution is a problem can request/store features per JID: there is absolutely no need to introduce yet another incompatibility just for a very tiny problem which can be solved in another way.
[Standards] Entity Capabilities 2.0
Hi,

I’ve implemented Entity Capabilities 2.0 (XEP-0390) and would like to share some thoughts about it here and in the following link.

I think it could be interesting for library developers as well as the author(s) of XEP-0390:

http://babbler-xmpp.blogspot.de/2018/02/experimenting-with-entity-capabilities.html

Generally, XEP-0390 is really well and comprehensively written, but I’ve found some issues during development, which I’d like to address:

- How to pick the first Capability Hash from a Capability Hash Set. Is it random? Is there a preferred hash algorithm order?

- It’s not specified what to do if a hash algorithm is not understood by a processing entity. I’ve implemented it in the way that it’s ignored and the next hash is tried.

- § 6.2 Rules for Processing Entities is relatively short. As outlined in my blogpost this section could be more verbose: Integration with XEP-0115. Picking the first hash. When to do a disco#info query.

- I am also missing a cache which maps entities to capabilities, i.e. JIDs to disco#info objects. This is the whole point of the XEP (to be able to know an entity’s abilities without service discovery). This cache should probably be non-persistent. The "Capability Hash Cache" (hash -> disco#info) is actually only the intermediate/auxiliary cache.

- It is said that the disco#info element can have an xml:lang element, but it’s not specified in XEP-0030. What about it?

- What about the xml:lang element in the stream header, i.e. the session’s default language? If it is set, is it also an "implicit language" used during construction of the Caps String?

- Generally I am unsure if using the "xml:lang" and "name" from the identities is a good idea at all, because these two attributes should not change the capabilities of an entity. Name and language are just for humans. I.e. if a server sends German identities for one user and English identities for the next user (depending on their client settings/stream header), the server still has the same identities, which should result in the same verification string, shouldn’t it?

- Decomposing a Capability Hash Node is not needed afaik. Having it specified is slightly distracting because you think you need it.

- The 'var' element will be before the '<value/>' element in the verification string. Therefore it would be clearer if the description first mentioned the 'var' attribute and then the '<value/>'.

- Typos found: "sevre", "Cabability"

Kind regards,
-- Christian