Hi everyone, I know that the subject of mutable objects as dictionary keys has been discussed a number of times in this forum (see for instance "freezing" of classes), but I would love to hear the thoughts of the experts on the approach below.
The use case that I encounter frequently is the classification of objects according to certain rules or properties: Say, I have objects A, B, C, ... (e.g. instances of a class, dict, etc), I can write d = {} d.setdefault(A,[]).append(A) d.setdefault(B,[]).append(B) ... so objects that map to the same hash key will end up in the same bucket. (A "real world" example I had to deal with recently was for instance the construction of a directed acyclic graph from many directed trees where identical subtrees needed to be identified.) The easiest way is of course to define custom __hash__() and __eq__() methods, but this breaks if objects are mutated (accidentally) after having been inserted into the dictionary. The "best" working approach I came up with so far is to generate an "immutable view" V of a mutable object O according to my classification rule, delegate O.__hash__ and O.__eq__ to V, and make sure that the V is memoized and cannot (easily) be altered later, even when O is mutated: def hashable_mutable_factory(mutable,rule): class _mutable(mutable): def __init__(self,*args,**kw): self._view_cache = {} super(_mutable,self).__init__(*args,**kw) def _view(self): id_ = id(self) if not self._view_cache.has_key(id_): self._view_cache[id_] = rule(self) return self._view_cache[id_] def __hash__(self): return hash(self._view()) def __eq__(self,other): return self._view() == other._view() return _mutable E.g.: >>> hashable_dict = hashable_mutable_factory(dict,lambda obj: >>> frozenset(obj.iteritems())) >>> h = hashable_dict(a=1,b=2) >>> d = {} >>> d[h] = 'foo' >>> d {{'a': 1, 'b': 2}: 'foo'} >>> h['c'] = 'bar' >>> d {{'a': 1, 'c': 'bar', 'b': 2}: 'foo'} >>> g = hashable_dict(a=1,b=2) >>> h {'a': 1, 'c': 'bar', 'b': 2} >>> g {'a': 1, 'b': 2} >>> id(g) == id(h) False >>> g == h True I slightly favor the factory function idiom above over defining the rule in a super class (this would have to be done for each mutable type and rule function separately), especially since I read that future versions of python (2.6 ?, 3.0 ?) will contain class decorators and allow syntax like class A(*bases): pass Is there a better approach? Any comments are appreciated. I have been seriously using Python for one year know, mostly in the context of graph algorithms etc., and it has always been a delightful coding experience! Best regards, Andreas
-- http://mail.python.org/mailman/listinfo/python-list