As the dust settles on the recent Collections and Sets work, I decided to write up a short description of what every Chandler developer should know about Collections. The idea of a query that automatically updates a list of items, and notifies subscribers of changes, has been central to Chandler from the beginning. Our design and implementation have evolved many times, influenced by what we have learned through experience. Although some of what I describe here might change slightly, I think the basic ideas will remain unchanged.

The new Collections are a replacement for repository.query.Query, which was used by ItemCollections. In the old ItemCollection world that most of you are probably familiar with, an ItemCollection was made up of a query that specified a set of items, modified by adding in a list of inclusion items and removing a list of exclusion items. The final results were cached in a ref collection that was usually accessed like an array. We ran into a number of problems using ItemCollections. For example, when one ItemCollection (e.g. the "All" item collection) fed its results into a new filtered ItemCollection (e.g. the subset of calendar events), there were problems propagating changes and notifications. We also learned that the majority of ItemCollections in Chandler were simply ordered lists of items, and that the notion of order in ItemCollections was not always maintained.

In the new Collections world we have a number of different types of Collections:

KindCollection: all the items of a particular kind.

ListCollection: an explicit list of items.

FilteredCollection: all items in another source Collection that match a Python expression. You must manually specify a list of attributes which Items must have to be considered for filtering by the expression. In the future we may limit what Python code FilteredCollections may use.

UnionCollection: the union of two or more source Collections.

IntersectionCollection: the intersection of two or more source Collections.

DifferenceCollection: the difference between two source Collections.

InclusionExclusionCollection: a collection similar to our old ItemCollection that implements some convenience methods to access the inclusions, the exclusions, and the source Collection, plus methods to add and remove items. The InclusionExclusionCollection is made up of a union collection, a difference collection, two list collections and a source collection as follows:

InclusionExclusionCollection  = ((source - exclusions) + inclusions).
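As a rough illustration of the formula above, here is a minimal sketch using plain Python sets rather than the real Collection classes. The function name and the sample item names are made up for the example, and "+" is read as set union.

```python
# Toy model with plain Python sets; not the Chandler Collection classes.
def inclusion_exclusion(source, inclusions, exclusions):
    """Return ((source - exclusions) + inclusions), reading + as set union."""
    return (source - exclusions) | inclusions


# Hypothetical item names, used only for illustration.
source = {"memo", "meeting", "draft"}
exclusions = {"draft", "pinned"}
inclusions = {"pinned"}

result = inclusion_exclusion(source, inclusions, exclusions)
```

Because the union is applied last, an item that appears in both the inclusions and the exclusions ("pinned" here) ends up in the result.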

To illustrate the power of Collections consider the new "All" Collection:

allCollection = ((((Notes - (Events filtered by (isGenerated = True))) - Trash) - allExclusions) + allInclusions)

allCollection is an InclusionExclusionCollection. Notes and Events are KindCollections. allInclusions, allExclusions and Trash are ListCollections.

There isn't any code necessary to exclude generated events or items in the Trash from the "All" Collection, which simplifies the design. It's also easy to update the rules for what is contained in the "All" Collection without having to update a bunch of code. So if you find yourself writing a bunch of code to make sure items end up in the right Collections in the sidebar or elsewhere, you could probably avoid it completely by setting up the right Collections to start with.
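To make the "All" example concrete, here is a small sketch of a collection tree. These are toy classes that only borrow the names from this note; the constructors, the `universe` list, the `kinds` tuple and the sample items are all assumptions made for illustration, not the repository implementation. Every node recomputes its members on iteration, so a change to one leaf shows up at the root without any extra code.

```python
class Item:
    def __init__(self, name, kinds, **attrs):
        self.name = name
        self.kinds = kinds
        self.__dict__.update(attrs)


class ListCollection:
    """An explicit list of items."""
    def __init__(self):
        self.items = []

    def __iter__(self):
        return iter(self.items)


class KindCollection:
    """All items in `universe` (a stand-in for the repository) of a kind."""
    def __init__(self, universe, kind):
        self.universe, self.kind = universe, kind

    def __iter__(self):
        return (i for i in self.universe if self.kind in i.kinds)


class FilteredCollection:
    """Items of `source` that have every listed attribute and pass `test`."""
    def __init__(self, source, test, attributes):
        self.source, self.test, self.attributes = source, test, attributes

    def __iter__(self):
        for item in self.source:
            if all(hasattr(item, a) for a in self.attributes) and self.test(item):
                yield item


class DifferenceCollection:
    def __init__(self, left, right):
        self.left, self.right = left, right

    def __iter__(self):
        excluded = set(self.right)
        return (i for i in self.left if i not in excluded)


class UnionCollection:
    def __init__(self, *sources):
        self.sources = sources

    def __iter__(self):
        seen = set()
        for source in self.sources:
            for item in source:
                if item not in seen:
                    seen.add(item)
                    yield item


universe = []
notes = KindCollection(universe, "Note")
events = KindCollection(universe, "Event")
generated = FilteredCollection(events, lambda i: i.isGenerated, ["isGenerated"])
trash = ListCollection()
all_exclusions, all_inclusions = ListCollection(), ListCollection()

# ((((Notes - generated events) - Trash) - allExclusions) + allInclusions)
all_collection = UnionCollection(
    DifferenceCollection(
        DifferenceCollection(DifferenceCollection(notes, generated), trash),
        all_exclusions),
    all_inclusions)

memo = Item("memo", ("Note",))
meeting = Item("meeting", ("Note", "Event"), isGenerated=False)
occurrence = Item("occurrence", ("Note", "Event"), isGenerated=True)
universe.extend([memo, meeting, occurrence])

names_before = [i.name for i in all_collection]
trash.items.append(memo)          # moving an item to the Trash...
names_after = [i.name for i in all_collection]  # ...removes it from "All"
```

The only place that knows about the Trash or generated events is the expression that builds `all_collection`; changing the rules means changing that one expression.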

You can subscribe to a collection by adding the item to be notified to the collection's subscribers attribute. By default, the method "onCollectionEvent" is called on subscribed items. However, you can specify a different method name in the collectionEventHandler attribute of the item that is notified.
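The dispatch rule described above can be sketched roughly as follows. Only the names `subscribers`, `onCollectionEvent` and `collectionEventHandler` come from this note; the toy `Collection` class, its `add` method, the handler arguments `(op, collection, item)` and the `onSidebarChanged` method are assumptions made for the example.

```python
class Collection:
    """Toy collection that notifies subscribers when an item is added."""
    def __init__(self):
        self.items = []
        self.subscribers = []

    def add(self, item):
        self.items.append(item)
        self._notify("add", item)

    def _notify(self, op, item):
        for subscriber in self.subscribers:
            # Fall back to the default method name when the subscriber
            # does not name its own handler.
            name = getattr(subscriber, "collectionEventHandler",
                           "onCollectionEvent")
            getattr(subscriber, name)(op, self, item)


class DefaultSubscriber:
    def __init__(self):
        self.events = []

    def onCollectionEvent(self, op, collection, item):
        self.events.append((op, item))


class CustomSubscriber:
    collectionEventHandler = "onSidebarChanged"  # hypothetical handler name

    def __init__(self):
        self.events = []

    def onSidebarChanged(self, op, collection, item):
        self.events.append((op, item))


sidebar = Collection()
default_subscriber = DefaultSubscriber()
custom_subscriber = CustomSubscriber()
sidebar.subscribers.extend([default_subscriber, custom_subscriber])
sidebar.add("memo")
```

Both subscribers receive the same event; they differ only in which method is looked up by name.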

Collections are not dependent on Blocks, but Blocks are the main user of Collections.

That finishes the overview. For those who want more detail about the implementation, read on.

Collections are Items that provide a thin wrapper on repository Set attribute values, where most of the work actually takes place. We need this wrapper for a few reasons. First, it's difficult to manage lots of references to an attribute, which is why Blocks, ContentItems, etc. are not attributes. Second, the Item implements the support for notifications. Finally, Set attributes require arguments that refer to other Sets in order to create them. These arguments aren't known when the Collection Item is created, which creates an awkward need to delay creation of the Set attribute. The Item provides Python magic to handle this delayed creation. A further limitation of Sets is that they are immutable, which means that changing a node in a Collection tree is not supported. It may be possible to add more Python magic to the Item that destroys and re-creates the correct Sets when one node changes.
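One way to picture the delayed creation is a property that builds the immutable Set value only once its arguments are assigned. This is a guess at the general shape, not the actual "Python magic" in the Item; `UnionSet`, the `sources` property and the rebuild-on-reassignment behaviour are all assumptions made for the sketch.

```python
class UnionSet:
    """Immutable stand-in for a repository Set value.

    Its sources are fixed at creation time, like the Set attributes
    described above.
    """
    def __init__(self, sources):
        self._sources = tuple(sources)

    def __iter__(self):
        seen = set()
        for source in self._sources:
            for member in source:
                if member not in seen:
                    seen.add(member)
                    yield member


class UnionCollection:
    """Wrapper item: exists before its Set value can be built."""
    def __init__(self):
        self._sources = None
        self._set = None  # cannot be created yet: sources are unknown

    @property
    def sources(self):
        return self._sources

    @sources.setter
    def sources(self, value):
        self._sources = tuple(value)
        # Build the Set now that its arguments are known. Assigning again
        # throws the old Set away and creates a new one.
        self._set = UnionSet(self._sources)

    def __iter__(self):
        if self._set is None:
            raise ValueError("sources have not been assigned yet")
        return iter(self._set)


collection = UnionCollection()      # created first, with no sources
collection.sources = [[1, 2], [2, 3]]
first = list(collection)
collection.sources = [[4], [5]]     # "changing a node" by rebuilding
second = list(collection)
```

In this sketch the wrapper hides both the delay and the immutability: callers only ever assign `sources` and iterate.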

These disadvantages imposed by making Sets an attribute made some of us think that making Sets an Item would have been a better choice. The counter argument was that we would face the same limitations even if Sets were Items. There might also be situations where using Sets as attributes would have an advantage, even though they are used that way today.

Collections have the same kind of index that ItemCollections had. If you never index into a Collection, it won't have an index. If you index into it, you'll get an index. The index you get is determined by an attribute on the Collection. By default you'll get an ordered index, where the order is the same as the iteration order of the Collection. If the index attribute is the name of an attribute, you'll get an index sorted by that attribute.
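The indexing behaviour described above might be modelled like this. The class name, the `index_attribute` argument and the sample items are hypothetical, and the toy index is a one-time snapshot, whereas the real indexes are maintained as items come and go.

```python
from operator import attrgetter
from types import SimpleNamespace


class IndexedCollection:
    """Toy collection whose index is only built on first positional access."""
    def __init__(self, items, index_attribute=None):
        self._items = list(items)
        self.index_attribute = index_attribute  # None means iteration order
        self._index = None                      # no index until one is needed

    def __iter__(self):
        return iter(self._items)

    def __getitem__(self, position):
        if self._index is None:
            if self.index_attribute is None:
                self._index = list(self)
            else:
                self._index = sorted(self, key=attrgetter(self.index_attribute))
        return self._index[position]


items = [SimpleNamespace(title="b"), SimpleNamespace(title="a")]

ordered = IndexedCollection(items)
has_index_before = ordered._index is not None
first_ordered = ordered[0].title

by_title = IndexedCollection(items, index_attribute="title")
first_sorted = by_title[0].title
```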

Unlike ItemCollections, Collections other than ListCollections don't cache their results.

Most Collections are used as contents for Blocks. As in the past, when the Block is rendered it subscribes to notifications, and when it's unrendered it unsubscribes from them. This is a simple optimization to minimize the number of notifications, since only blocks that are visible on the screen need to be notified to update themselves.

KindSets and FilteredSets maintain their indexes by using repository monitors. We use that same mechanism to notify subscribers. Notifications for Items coming and going to Collections are synchronous. This doesn't work for changes to attributes on Items in other views, so instead we use an asynchronous notification. In order to get these notifications it's necessary to poll for them. Each time OnIdle is called we do a repository update and poll for these notifications. Each time a notification is received, the block that gets the notification is added to a list of dirty blocks. At the end of OnIdle, each dirty block is updated on the screen and removed from the list. This has the benefit of accumulating all of the changes to data fairly quickly, and only redrawing the affected part of the screen when there's nothing left to do.
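The polling and dirty-block bookkeeping described above can be sketched as a small loop. Everything here is a stand-in: the `pending` queue plays the role of whatever the repository update delivers, and `Block.redraw`, `post` and `on_idle` are hypothetical names. The sketch also assumes a block appears in the dirty list at most once, which the description implies but does not state.

```python
from collections import deque


class Block:
    def __init__(self, name):
        self.name = name
        self.redraw_count = 0

    def redraw(self):
        self.redraw_count += 1


class IdleLoop:
    def __init__(self):
        self.pending = deque()   # notifications waiting to be polled
        self.dirty_blocks = []   # blocks that need a redraw

    def post(self, block):
        """Queue an asynchronous notification for `block`."""
        self.pending.append(block)

    def on_idle(self):
        # Poll: drain every queued notification and mark its block dirty.
        while self.pending:
            block = self.pending.popleft()
            if block not in self.dirty_blocks:
                self.dirty_blocks.append(block)
        # Only after all changes are accumulated, redraw each dirty block
        # once and empty the list.
        for block in self.dirty_blocks:
            block.redraw()
        self.dirty_blocks.clear()


loop = IdleLoop()
summary, calendar = Block("summary"), Block("calendar")
for _ in range(3):
    loop.post(summary)           # three changes affecting the same block
loop.post(calendar)
loop.on_idle()
```

Three notifications for the same block cost a single redraw, which is the point of deferring the screen update to the end of the idle handler.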

Finally, we plan to implement nestable "Freeze/Thaw" methods to temporarily ignore and re-enable notifications, which will further improve performance.
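Since Freeze/Thaw is only planned, the following is purely a sketch of what "nestable" could mean: a depth counter, so that notifications resume only when every freeze has been matched by a thaw. The class and method names are hypothetical, and the sketch drops notifications while frozen; whether the real design would drop or queue them is not decided here.

```python
class Notifier:
    """Toy notifier with a nestable freeze/thaw counter."""
    def __init__(self):
        self._freeze_depth = 0
        self.delivered = []

    def freeze(self):
        self._freeze_depth += 1

    def thaw(self):
        if self._freeze_depth == 0:
            raise RuntimeError("thaw() called without a matching freeze()")
        self._freeze_depth -= 1

    def notify(self, event):
        # Notifications are ignored while any freeze is outstanding.
        if self._freeze_depth == 0:
            self.delivered.append(event)


notifier = Notifier()
notifier.notify("a")
notifier.freeze()
notifier.freeze()        # nested freeze
notifier.notify("b")
notifier.thaw()
notifier.notify("c")     # still frozen: one freeze remains
notifier.thaw()
notifier.notify("d")
```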

_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

Open Source Applications Foundation "Dev" mailing list
http://lists.osafoundation.org/mailman/listinfo/dev
