On 07/02/2010 06:11 AM, Steven D'Aprano wrote:
> I would like to better understand some of the design choices made in
> collections.defaultdict.
>
> Firstly, to initialise a defaultdict, you do this:
>
>     from collections import defaultdict
>     d = defaultdict(callable, *args)
>
> which sets an attribute of d "default_factory" which is called on key
> lookups when the key is missing. If callable is None, defaultdicts are
> *exactly* equivalent to built-in dicts, so I wonder why the API wasn't
> added on to dict rather than a separate class that needed to be imported.
> That is:
>
>     d = dict(*args)
>     d.default_factory = callable

That's just not what dicts, a very simple and elementary data type, do.
I know this isn't really a good reason. In addition to what Chris said,
I expect this would complicate the dict code a great deal.

>
> If you failed to explicitly set the dict's default_factory, it would
> behave precisely as dicts do now. So why create a new class that needs to
> be imported, rather than just add the functionality to dict?
>
> Is it just an aesthetic choice to support passing the factory function as
> the first argument? I would think that the advantage of having it
> built-in would far outweigh the cost of an explicit attribute assignment.
>

The cost of this feature would be over-complication of the built-in dict
type when a subclass would do just as well.

>
>
> Second, why is the factory function not called with key? There are three
> obvious kinds of "default values" a dict might want, in order of
> more-to-less general:
>
> (1) The default value depends on the key in some way: return factory(key)

I agree, this is a strange choice. However, nothing's stopping you from
being a bit verbose about what you want and just doing it:

    class mydict(defaultdict):
        def __missing__(self, key):
            # ...

The __missing__ method is really the more useful bit the defaultdict
class adds, by the looks of it.

-- Thomas

> (2) The default value doesn't depend on the key: return factory()
> (3) The default value is a constant: return C
>
> defaultdict supports (2) and (3):
>
>     defaultdict(factory, *args)
>     defaultdict(lambda: C, *args)
>
> but it doesn't support (1). If key were passed to the factory function,
> it would be easy to support all three use-cases, at the cost of a
> slightly more complex factory function. E.g. the current idiom:
>
>     defaultdict(factory, *args)
>
> would become:
>
>     defaultdict(lambda key: factory(), *args)
>
>
> (There is a zeroth case as well, where the default value depends on the
> key and what else is in the dict: factory(d, key). But I suspect that's
> well and truly YAGNI territory.)
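
P.S. In case it helps, here is a minimal sketch of the subclassing approach mentioned above, next to the two cases defaultdict already covers. The class name KeyedDefaultDict and the example factories are made up for illustration; the only behaviour relied on is that a failed lookup on a dict subclass calls __missing__(key).

```python
from collections import defaultdict


class KeyedDefaultDict(defaultdict):
    """Hypothetical subclass: passes the missing key to default_factory."""

    def __missing__(self, key):
        if self.default_factory is None:
            # No factory set: behave like a plain dict.
            raise KeyError(key)
        # Case (1): the default value depends on the key.
        value = self[key] = self.default_factory(key)
        return value


squares = KeyedDefaultDict(lambda key: key * key)  # case (1)
counts = defaultdict(int)                          # case (2): factory()
flags = defaultdict(lambda: "C")                   # case (3): constant

print(squares[4])    # 16
counts["a"] += 1
print(counts["a"])   # 1
print(flags["x"])    # C
```

Note that __missing__ is only consulted by d[key] lookups, not by d.get(key), and that the hook works for direct subclasses of dict as well, so inheriting from defaultdict is only needed if you want the default_factory attribute.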