I'm trying to make a thread-safe object cache without locking. Objects are cached by the id of the data dict given to __new__, and removed from the cache as soon as they are no longer referenced. The data must be a Python dict (it comes from an external source).
Here's what I've got so far:

import time
from weakref import ref

cache = {}  # {id(data): ref_to_Foo(data), ...}

class Foo(object):
    __slots__ = ('__weakref__', '_data')

    def __new__(cls, data):
        new_self = object.__new__(cls)
        def remove(wr, _cache=cache, _data=data):
            try:
                del _cache[id(_data)]
            except KeyError:
                pass  # should never happen
        while 1:
            self = cache.setdefault(id(data), ref(new_self, remove))()
            try:
                self._data = data
                return self  # this is new_self or a cached instance
            except AttributeError:
                # happens rarely: got a 'dead' cached instance
                time.sleep(0)  # small delay before retrying

# Usage example:
data = {'a': 1, 'b': 2}
a = Foo(data)
b = Foo(data)
assert a is b

I've got a couple of assumptions on which I think the code's correctness depends:

a) Python's dict is thread-safe, i.e. setdefault can't replace an existing value until the key has been removed, no matter how many threads call it simultaneously. Additionally, only one thread succeeds in setting the new value.

b) Unreferenced ref(new_self, remove) objects are garbage collected without their callbacks being called.

c) remove() keeps a reference to the data dict, which should prevent Python from reusing the object's id too soon.

Are these assumptions correct? Do you have ideas for improvements (especially regarding speed)? I'm restricted to Py2.3+ compatibility.

Thanks,
Timo
--
http://mail.python.org/mailman/listinfo/python-list
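Regarding assumption b), the documented weakref behavior agrees: if the weak reference object itself is collected before its referent, the callback is never invoked. A small demo of both cases (note: it relies on CPython's immediate refcount-based collection; on interpreters with deferred GC the timing differs):

```python
import weakref

class Obj(object):
    pass

calls = []

# Case 1: referent dies while the weakref is alive -> callback fires,
# receiving the (now dead) weakref object.
o = Obj()
wr = weakref.ref(o, calls.append)
del o                       # CPython: refcount hits zero immediately
assert len(calls) == 1 and calls[0] is wr

# Case 2: the weakref itself dies first -> callback never fires.
o = Obj()
wr = weakref.ref(o, calls.append)
del wr                      # dropping the ref discards its callback
del o
assert len(calls) == 1      # no new callback was recorded
```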
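On the improvement question, one alternative worth comparing is weakref.WeakValueDictionary (available in 2.3), which does the ref-and-callback bookkeeping for you. This is only a hedged sketch, not a drop-in replacement: WeakValueDictionary.setdefault is implemented in Python, so it is not the single atomic C-level step that a plain dict's setdefault is, and the lock-free guarantee would need re-checking:

```python
from weakref import WeakValueDictionary

_cache = WeakValueDictionary()   # id(data) -> live Foo instance

class Foo(object):
    __slots__ = ('__weakref__', '_data')

    def __new__(cls, data):
        self = object.__new__(cls)
        self._data = data        # holding data keeps its id from being reused
        # Dead entries are purged automatically; no manual remove() needed.
        # Caveat: this setdefault is Python-level code, hence not atomic.
        return _cache.setdefault(id(data), self)

data = {'a': 1, 'b': 2}
a = Foo(data)
b = Foo(data)
assert a is b
```

The losing instance in a race is simply discarded, and entries vanish from _cache as soon as the last strong reference to the cached Foo goes away.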