On Jun 16, 1:37 am, Armin Ronacher <[EMAIL PROTECTED]> wrote: > Abstract > ======== > > This PEP proposes an ordered dictionary as a new data structure for > the ``collections`` module, called "odict" in this PEP for short. The > proposed API incorporates the experiences gained from working with > similar implementations that exist in various real-world applications > and other programming languages. > > Rationale > ========= > > In current Python versions, the widely used built-in dict type does > not specify an order for the key/value pairs stored. This makes it > hard to use dictionaries as data storage for some specific use cases. > > Some dynamic programming languages like PHP and Ruby 1.9 guarantee a > certain order on iteration. In those languages, and existing Python > ordered-dict implementations, the ordering of items is defined by the > time of insertion of the key. New keys are appended at the end, keys > that are overwritten and not moved. > > The following example shows the behavior for simple assignments: > > >>> d = odict() > >>> d['parrot'] = 'dead' > >>> d['penguin'] = 'exploded' > >>> d.items() > > [('parrot', 'dead'), ('penguin', 'exploded')] > > That the ordering is preserved makes an odict useful for a couple of > situations: > > - XML/HTML processing libraries currently drop the ordering of > attributes, use a list instead of a dict which makes filtering > cumbersome, or implement their own ordered dictionary. This affects > ElementTree, html5lib, Genshi and many more libraries. > > - There are many ordererd dict implementations in various libraries > and applications, most of them subtly incompatible with each other. > Furthermore, subclassing dict is a non-trivial task and many > implementations don't override all the methods properly which can > lead to unexpected results. > > Additionally, many ordered dicts are implemented in an inefficient > way, making many operations more complex then they have to be. > > - PEP 3115 allows metaclasses to change the mapping object used for > the class body. An ordered dict could be used to create ordered > member declarations similar to C structs. This could be useful, for > example, for future ``ctypes`` releases as well as ORMs that define > database tables as classes, like the one the Django framework ships. > Django currently uses an ugly hack to restore the ordering of > members in database models. > > - Code ported from other programming languages such as PHP often > depends on a ordered dict. Having an implementation of an > ordering-preserving dictionary in the standard library could ease > the transition and improve the compatibility of different libraries. > > Ordered Dict API > ================ > > The ordered dict API would be mostly compatible with dict and existing > ordered dicts. (Note: this PEP refers to the Python 2.x dictionary > API; the transfer to the 3.x API is trivial.) > > The constructor and ``update()`` both accept iterables of tuples as > well as mappings like a dict does. The ordering however is preserved > for the first case: > > >>> d = odict([('a', 'b'), ('c', 'd')]) > >>> d.update({'foo': 'bar'}) > >>> d > > collections.odict([('a', 'b'), ('c', 'd'), ('foo', 'bar')]) > > If ordered dicts are updated from regular dicts, the ordering of new > keys is of course undefined again unless ``sort()`` is called. > > All iteration methods as well as ``keys()``, ``values()`` and > ``items()`` return the values ordered by the the time the key-value > pair was inserted: > > >>> d['spam'] = 'eggs' > >>> d.keys() > > ['a', 'c', 'foo', 'spam']>>> d.values() > > ['b', 'd', 'bar', 'eggs']>>> d.items() > > [('a', 'b'), ('c', 'd'), ('foo', 'bar'), ('spam', 'eggs')] > > New methods not available on dict: > > ``odict.byindex(index)`` > > Index-based lookup is supported by ``byindex()`` which returns > the key/value pair for an index, that is, the "position" of a > key in the ordered dict. 0 is the first key/value pair, -1 > the last. > > >>> d.byindex(2) > ('foo', 'bar') > > ``odict.sort(cmp=None, key=None, reverse=False)`` > > Sorts the odict in place by cmp or key. This works exactly > like ``list.sort()``, but the comparison functions are passed > a key/value tuple, not only the value. > > >>> d = odict([(42, 1), (1, 4), (23, 7)]) > >>> d.sort() > >>> d > collections.odict([(1, 4), (23, 7), (42, 1)]) > > ``odict.reverse()`` > > Reverses the odict in place. > > ``odict.__reverse__()`` > > Supports reverse iteration by key. > > Questions and Answers > ===================== > > What happens if an existing key is reassigned? > > The key is not moved but assigned a new value in place. This is > consistent with existing implementations and allows subclasses to > change the behavior easily:: > > class movingcollections.odict): > def __setitem__(self, key, value): > self.pop(key, None) > odict.__setitem__(self, key, value) > > What happens if keys appear multiple times in the list passed to the > constructor? > > The same as for regular dicts: The latter item overrides the > former. This has the side-effect that the position of the first > key is used because the key is actually overwritten: > > >>> odict([('a', 1), ('b', 2), ('a', 3)]) > collections.odict([('a', 3), ('b', 2)]) > > This behavior is consistent with existing implementations in > Python, the PHP array and the hashmap in Ruby 1.9. > > Why is there no ``odict.insert()``? > > There are few situations where you really want to insert a key at > an specified index. To avoid API complication, the proposed > solution for this situation is creating a list of items, > manipulating that and converting it back into an odict: > > >>> d = odict([('a', 42), ('b', 23), ('c', 19)]) > >>> l = d.items() > >>> l.insert(1, ('x', 0)) > >>> odict(l) > collections.odict([('a', 42), ('x', 0), ('b', 23), ('c', 19)]) > > Example Implementation > ====================== > > A poorly performing example implementation of the odict written in > Python is available: > > `odict.py <http://dev.pocoo.org/hg/sandbox/raw-file/tip/ > odict.py>`_ > > The version for ``collections`` should be implemented in C and use a > linked list internally. > > Other implementations of ordered dicts in various Python projects or > standalone libraries, that inspired the API proposed here, are: > > - `odict in Babel`_ > - `OrderedDict in Django`_ > - `The odict module`_ > - `ordereddict`_ (a C implementation of the odict module) > - `StableDict`_ > - `Armin Rigo's OrderedDict`_ > > .. _odict in > Babel:http://babel.edgewall.org/browser/trunk/babel/util.py?rev=374#L178 > .. _OrderedDict in Django: > http://code.djangoproject.com/browser/django/trunk/django/utils/datas... > .. _The odict module:http://www.voidspace.org.uk/python/odict.html > .. _ordereddict:http://www.xs4all.nl/~anthon/Python/ordereddict/ > .. _StableDict:http://pypi.python.org/pypi/StableDict/0.2 > .. _Armin Rigo's > OrderedDict:http://codespeak.net/svn/user/arigo/hack/pyfuse/OrderedDict.py > > Future Directions > ================= > > With the availability of an ordered dict in the standard library, > other libraries may take advantage of that. For example, ElementTree > could return odicts in the future that retain the attribute ordering > of the source file. > > Copyright > ========= > > This document has been placed in the public domain.
In Python 3.0 dict.items(), dict.keys() and dict.values() return set- like views of the dictionary. http://www.python.org/dev/peps/pep-3106/. You should consider doing the same thing for odict. Of course, the views would be list-like, not set-like. At the very least, I would like to see a discussion about it. Matt -- http://mail.python.org/mailman/listinfo/python-list