On 24-Aug-2010, at 15:28, Dave Cridland wrote:

> On Tue Aug 24 19:26:11 2010, Brian Cully wrote:
>> I've finally been able to do some work on this, and am close to sending
>> out a new version. Before I do, however, I'd like to know if there are any
>> outstanding issues people have had with it. I've gone through my old mail
>> and have a start on some things:
>> * Usefulness of notification depth choices
>> * Access control
>> * SHIM header issues with both SubID and Collection headers
>> * Item retrieval on collection nodes
>
> My open issues, having implemented most of this now, include:
>
> - Retrieve subscriptions - should we return all subscriptions affecting the
> node (ie, include any which would cause a notification), or just the "local"
> subscriptions.

I think this should only return subscriptions on the collection itself,
since it's the least ambiguous and other mechanisms exist for finding
subscriptions.

> - Access control requirements for adding/removing nodes. (I picked owner
> required on collection).

The version of 0248 up now has the following additional slots on the node
config form, which I intend to keep:

- pubsub#children_association_policy
- pubsub#children_association_whitelist
- pubsub#children_max

They seem reasonable. I think they were added shortly before it went
deferred, though, since I don't remember them when I put collection support
in ejabberd. I ended up going with owners-only, too, but the association
policy is a better idea.

> - What does publishing an item to a collection do?

Fails with <bad-request/>, and some pubsub condition should probably be
added. The spec is pretty clear that collections MUST NOT contain published
items, which I agree with.

>> I've covered access control by using the collection's access model
>> and adding a note to "Security Considerations" saying it could be a bad
>> thing to allow, for instance, open access on a collection node which has
>> closed or authorize children (but this can also be a useful thing, too).
>
> I did this by validating each node distinctly, so that if you have a
> whitelist node inside a collection, an event on that node won't be seen by
> subscribers to the collection unless they are able to subscribe directly. It
> seemed logical, and easy enough.

There are pros and cons to both approaches. I think, but maybe I'm
overthinking, that letting notifications on a closed node be routed through
an open one has configuration benefits. For instance, a service can control
access to a set of nodes via a collection choke-point. Imagine nodes A, B,
and C connected to a collection node Z. A, B, and C are closed, Z is open.
Clients cannot directly subscribe to A, B, or C, but can to Z.
At some point the service determines that node C should be completely
private and simply dissociates it from Z. If the service had to deal with
subscriptions to C, it would lead to more complicated logic on the owner's
end.

Personally, I think if you want to make sure a node is closed you don't
allow it to associate to an open node (but this is at the owner, not
service, level). It seems straightforward and less likely to induce
headdesking.

>> I'm on the fence about SHIM headers. I think we need a new one because
>> of the limitations of the schema, or perhaps we should omit the "Collection"
>> header when SubIDs are extant because it's redundant in that case.
>
> I'm not sure it actually is. Are SubID's mandated to be unique across a
> service?

Ah, true. IIRC, they only need to be unique to a node. Argh. I hate this
problem.

> In any case, I hate SHIM, so I'd be open to an alternative.

Likewise, but all I've heard that seems viable so far is a new SHIM header
that combines collection and subid attributes.

>> Item retrieval is tricky. I think it's a highly valuable thing for both
>> client simplicity and access control. It's good to be able to say "if you
>> can get a notification about it, you can retrieve it in the same way you
>> subscribed to it." However, the existing schema for item retrieval allows
>> only one <items/> element in the query response. If we could allow more than
>> one then it becomes fairly simple.
>
> I'm not convinced of the need, it seems astonishingly complex - I'd have to
> involve recursion, which I'd much prefer to avoid.

It's no more recursive than delivering a publish, retract, etc.,
notification - you have to traverse a tree either way (at least you do in
the full DAG "multi-collections" case).

I do actually find this very useful, and use it in our product. We can have
hundreds of leaf nodes associated to a collection and iterating over each of
them in the client to get their items on startup is painful, complex, and
hideously slow.
Doing it once on the collection fixed all this at the cost of breaking
strict schema compatibility.

-bjc
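
For illustration only, the collection-wide retrieval and the A/B/C/Z
choke-point described above can be sketched in Python. This is a minimal
sketch under stated assumptions: the Node class, its fields, and
collect_items are hypothetical names invented here, not part of 0248 or of
any server's implementation. It walks the collection graph iteratively with
a visited set, so no recursion is involved and a leaf associated to more
than one collection is reported once.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical in-memory pubsub node (illustration only)."""
    name: str
    is_collection: bool = False
    items: list = field(default_factory=list)     # leaf nodes only
    children: list = field(default_factory=list)  # collection nodes only


def collect_items(root):
    """Return {leaf name: items} for every leaf reachable from root.

    Uses an explicit stack instead of recursion, and a visited set so
    that the DAG ("multi-collections") case does not visit a node twice.
    """
    result = {}
    seen = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if node.name in seen:
            continue
        seen.add(node.name)
        if node.is_collection:
            stack.extend(node.children)
        else:
            result[node.name] = list(node.items)
    return result


# The A, B, C, Z example, plus an extra collection Y to make it a DAG.
a = Node("A", items=["a1"])
b = Node("B", items=["b1", "b2"])
c = Node("C", items=["c1"])
y = Node("Y", is_collection=True, children=[a, b])
z = Node("Z", is_collection=True, children=[y, a, c])  # A reachable twice

print(sorted(collect_items(z)))  # ['A', 'B', 'C']

# Making C private: the owner dissociates it from Z, nothing else changes.
z.children.remove(c)
print(sorted(collect_items(z)))  # ['A', 'B']
```

The same walk is what a notification fan-out would need in the other
direction, which is the sense in which retrieval is no more complex than
delivery. Access checks are deliberately left out of the sketch.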