Re: [PubSub] Collection Nodes (XEP-0248)

Brian Cully Tue, 24 Aug 2010 14:21:15 -0700

On 24-Aug-2010, at 15:28, Dave Cridland wrote:

> On Tue Aug 24 19:26:11 2010, Brian Cully wrote:
>>      I've finally been able to do some work on this, and am close to sending 
>> out a new version. Before I do, however, I'd like to know if there are any 
>> outstanding issues people have had with it. I've gone through my old mail 
>> and have a start on some things:
>>      * Usefulness of notification depth choices
>>      * Access control
>>      * SHIM header issues with both SubID and Collection headers
>>      * Item retrieval on collection nodes
> My open issues, having implemented most of this now, include:
> 
> - Retrieve subscriptions - should we return all subscriptions affecting the 
> node (i.e., include any which would cause a notification), or just the 
> "local" subscriptions?


        I think this should only return subscriptions on the collection itself, 
since it's the least ambiguous and other mechanisms exist for finding 
subscriptions.

> - Access control requirements for adding/removing nodes. (I picked owner 
> required on collection).

        The version of 0248 up now has the following additional slots on the 
node config form, which I intend to keep:

                - pubsub#children_association_policy
                - pubsub#children_association_whitelist
                - pubsub#children_max

        They seem reasonable. I think they were added shortly before it went 
deferred, though, since I don't remember them when I put collection support in 
ejabberd. I ended up going with owners-only, too, but the association policy is 
a better idea.
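
        As a rough sketch of how a service might evaluate those three slots 
when someone tries to associate a child (not taken from any implementation): 
the Collection class and may_associate() are hypothetical names, the policy 
values "all", "owners", and "whitelist" are what I recall from the spec, and 
letting owners through under "whitelist" is my own assumption.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Collection:
    """Hypothetical in-memory view of a collection node's config form."""
    owners: Set[str]
    # pubsub#children_association_policy (assumed values: all/owners/whitelist)
    policy: str = "owners"
    # pubsub#children_association_whitelist
    whitelist: Set[str] = field(default_factory=set)
    # pubsub#children_max (None means no limit)
    children_max: Optional[int] = None
    children: Set[str] = field(default_factory=set)


def may_associate(coll: Collection, jid: str) -> bool:
    """Return True if `jid` may associate another child with `coll`."""
    if coll.children_max is not None and len(coll.children) >= coll.children_max:
        return False
    if coll.policy == "all":
        return True
    if coll.policy == "owners":
        return jid in coll.owners
    if coll.policy == "whitelist":
        # Assumption: owners are always allowed, even if not whitelisted.
        return jid in coll.owners or jid in coll.whitelist
    return False  # unknown policy value: refuse


z = Collection(owners={"owner@example.com"}, policy="whitelist",
               whitelist={"friend@example.com"}, children_max=2)
print(may_associate(z, "friend@example.com"))    # whitelisted
print(may_associate(z, "stranger@example.com"))  # not whitelisted
```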

> - What does publishing an item to a collection do?

        It should fail with <bad-request/>, and a pubsub-specific error 
condition should probably be added. The spec is pretty clear that collections 
MUST NOT contain published items, which I agree with.

>>      I've covered access control by using the collection's access model 
>> and adding a note to "Security Considerations" saying it could be a bad 
>> thing to allow, for instance, open access on a collection node which has 
>> closed or authorize children (but this can also be a useful thing, too).
> I did this by validating each node distinctly, so that if you have a 
> whitelist node inside a collection, an event on that node won't be seen by 
> subscribers to the collection unless they are able to subscribe directly. It 
> seemed logical, and easy enough.

        There are pros and cons to both approaches. I think, though maybe I'm 
overthinking it, that letting notifications on a closed node be routed through 
an open one has configuration benefits. For instance, a service can control 
access to a set of nodes via a collection choke-point. Imagine nodes A, B, and 
C connected to a collection node Z. A, B, and C are closed, Z is open. Clients 
cannot directly subscribe to A, B, or C, but can to Z. At some point the 
service determines that node C should be completely private and simply 
dissociates it from Z. If the service had to deal with subscriptions to C 
itself, that would mean more complicated logic on the owner's end. Personally, 
I think if you want to make sure a node is closed you don't allow it to 
associate to an open node (but this is at the owner, not service, level). It 
seems straightforward and less likely to induce headdesking.
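
        To make the difference between the two approaches concrete, here is a 
minimal sketch of the A/B/C/Z example. Everything in it is illustrative: 
"closed" is shorthand for "nobody may subscribe directly", and may_subscribe() 
and sees_event() are hypothetical helpers, not anything from the spec.

```python
# Access models for the example: Z is an open collection, its children are closed.
access = {"A": "closed", "B": "closed", "C": "closed", "Z": "open"}
children = {"Z": {"A", "B", "C"}}


def may_subscribe(node):
    """Simplified check: only 'open' nodes accept arbitrary subscribers."""
    return access[node] == "open"


def sees_event(collection, leaf, per_node_check):
    """Would a subscriber to `collection` be notified of an event on `leaf`?

    per_node_check=False: only the collection's access model is consulted.
    per_node_check=True:  the leaf's own access model is validated as well.
    """
    if leaf not in children[collection]:
        return False
    if not may_subscribe(collection):
        return False
    if per_node_check:
        return may_subscribe(leaf)
    return True


print(sees_event("Z", "C", per_node_check=False))  # routed through the open collection
print(sees_event("Z", "C", per_node_check=True))   # blocked by C's own access model

# Making C private again is a single dissociation under the first approach.
children["Z"].discard("C")
print(sees_event("Z", "C", per_node_check=False))
```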

>>      I'm on the fence about SHIM headers. I think we need a new one because 
>> of the limitations of the schema, or perhaps we should omit the "Collection" 
>> header when SubIDs are extant because it's redundant in that case.
> I'm not sure it actually is. Are SubIDs mandated to be unique across a 
> service?

        Ah, true. IIRC, they only need to be unique to a node. Argh. I hate 
this problem.

> In any case, I hate SHIM, so I'd be open to an alternative.

        Likewise, but all I've heard that seems viable so far is a new SHIM 
header that combines collection and subid attributes.
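
        A small sketch of why the SubID alone isn't enough, assuming SubIDs 
are only unique per node. The subscription records and both lookup functions 
are made up for illustration; no actual header name or format is implied.

```python
# One subscriber holding subscriptions on two nodes that happen to share a SubID.
subscriptions = [
    {"jid": "user@example.com", "node": "Z", "subid": "abc123"},
    {"jid": "user@example.com", "node": "A", "subid": "abc123"},
]


def match_by_subid(subid):
    """Lookup using only the SubID, as a lone SubID header would allow."""
    return [s for s in subscriptions if s["subid"] == subid]


def match_by_node_and_subid(node, subid):
    """Lookup using the (node, SubID) pair, as a combined header would allow."""
    return [s for s in subscriptions
            if s["node"] == node and s["subid"] == subid]


print(len(match_by_subid("abc123")))                # ambiguous: two candidates
print(len(match_by_node_and_subid("Z", "abc123")))  # unambiguous: one candidate
```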

>>      Item retrieval is tricky. I think it's a highly valuable thing for both 
>> client simplicity and access control. It's good to be able to say "if you 
>> can get a notification about it, you can retrieve it in the same way you 
>> subscribed to it." However, the existing schema for item retrieval allows 
>> only one <items/> element in the query response. If we could allow more than 
>> one then it becomes fairly simple.
> 
> I'm not convinced of the need, it seems astonishingly complex - I'd have to 
> involve recursion, which I'd much prefer to avoid.

        It's no more recursive than delivering a publish, retract, etc., 
notification - you have to traverse a tree either way (at least you do in the 
full DAG "multi-collections" case). I do actually find this very useful, and 
use it in our product. We can have hundreds of leaf nodes associated to a 
collection and iterating over each of them in the client to get their items on 
startup is painful, complex, and hideously slow. Doing it once on the 
collection fixed all this at the cost of breaking strict schema compatibility.
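
        For what it's worth, the traversal doesn't need literal recursion. 
Here is a minimal sketch, not our product code: collect_items(), the children 
map, and the items map are all hypothetical, and each key in the result would 
correspond to one <items/> element in the (schema-breaking) response.

```python
def collect_items(root, children, items):
    """Gather items from every leaf reachable from `root`.

    `children` maps a collection name to its child node names; any node
    that is not a key in `children` is treated as a leaf. A node reachable
    through several collections (the DAG case) is visited only once.
    """
    seen = set()
    stack = [root]
    result = {}
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        kids = children.get(node)
        if kids is None:
            # Leaf node: copy out its published items.
            result[node] = list(items.get(node, []))
        else:
            # Collection: holds no items itself, only descend into it.
            stack.extend(kids)
    return result


# Leaf A is associated with both Z and Y; Y is itself a child of Z.
children = {"Z": ["A", "Y"], "Y": ["A", "B"]}
items = {"A": ["a1"], "B": ["b1", "b2"]}
print(collect_items("Z", children, items))
```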

-bjc
