"""Hash consing is a technique for implementing immutable conses (or
whatever compound data structure) where you store all the living
conses in a weak hash table, indexed by their children.  The cons
routine looks in this table whenever it is called; if it finds it is
being asked to make another cell that is the same as an existing
cell, it simply returns the existing cell.

It's basically memoizing of 'cons', but its primary purpose is to save
space, not time.  It's kind of funny as memoizing goes, because of the
weak hash table.  (It also has the sort of interesting side effect
that 'is' and '==' are the same thing, or in Lisp language, eq and
equal are the same.)

So here's hash consing for Python hashable objects, such as tuples
and strings, using the new weak references available in Python 2.1.

It's unfortunately very inconvenient, because it has to return
objects of the 'wrapper' class, which have a .contents attribute
containing the original object.  If it doesn't do this, the wrapper
object goes away soon, defeating the point of hash consing.

If the wrapper objects were nice happy proxy objects, this would
be much less inconvenient.

I don't understand why the deletion code works and my previous version
didn't.

This lets you do something like this:
>>> from hashcons import hashcons
>>> a = hashcons('hi there')
>>> a.contents
'hi there'
>>> b = hashcons('hi there')
>>> a is b
1
>>> c = hashcons((a, a))
>>> d = hashcons((a, b))
>>> c is d
1

"""

import weakref

__all__ = ['hashconsclass', 'hashcons']

class wrapper:
    "Tuples and strings aren't weakly referenceable, but class instances are."
    def __init__(self, contents): self.contents = contents
    def __hash__(self):           return hash(self.contents)
    def __cmp__(self, other):     return cmp(self.contents, other)

class hashconsclass:
    "Hey, you might want one table per thread."
    def __init__(self):
        self.table = {}
    def __call__(self, contents):
        rv = wrapper(contents)
        tmpwr = weakref.ref(rv)
        if not self.table.has_key(tmpwr):
            def removeref(o, table=self.table):
                del table[o]
            mywr = weakref.ref(rv, removeref)
            self.table[mywr] = mywr
        return self.table[tmpwr]()
        

# instantiate one hashcons table for convenience
hashcons = hashconsclass()


Reply via email to