"""Hash consing is a technique for implementing immutable conses (or whatever compound data structure) where you store all the living conses in a weak hash table, indexed by their children. The cons routine looks in this table whenever it is called; if it finds it is being asked to make another cell that is the same as an existing cell, it simply returns the existing cell.
It's basically memoizing of 'cons', but its primary purpose is to save space, not time. It's kind of funny as memoizing goes, because of the weak hash table. (It also has the sort of interesting side effect that 'is' and '==' are the same thing, or in Lisp language, eq and equal are the same.) So here's hash consing for Python hashable objects, such as tuples and strings, using the new weak references available in Python 2.1. It's unfortunately very inconvenient, because it has to return objects of the 'wrapper' class, which have a .contents attribute containing the original object. If it doesn't do this, the wrapper object goes away soon, defeating the point of hash consing. If the wrapper objects were nice happy proxy objects, this would be much less inconvenient. I don't understand why the deletion code works and my previous version didn't. This lets you do something like this: >>> from hashcons import hashcons >>> a = hashcons('hi there') >>> a.contents 'hi there' >>> b = hashcons('hi there') >>> a is b 1 >>> c = hashcons((a, a)) >>> d = hashcons((a, b)) >>> c is d 1 """ import weakref __all__ = ['hashconsclass', 'hashcons'] class wrapper: "Tuples and strings aren't weakly referenceable, but class instances are." def __init__(self, contents): self.contents = contents def __hash__(self): return hash(self.contents) def __cmp__(self, other): return cmp(self.contents, other) class hashconsclass: "Hey, you might want one table per thread." def __init__(self): self.table = {} def __call__(self, contents): rv = wrapper(contents) tmpwr = weakref.ref(rv) if not self.table.has_key(tmpwr): def removeref(o, table=self.table): del table[o] mywr = weakref.ref(rv, removeref) self.table[mywr] = mywr return self.table[tmpwr]() # instantiate one hashcons table for convenience hashcons = hashconsclass()