A few months back, the default content types for Plone (ATContentTypes) switched from AttributeStorage to AnnotationStorage for some of their attributes. Formerly, properties on a persistent Archetypes object were stored as normal object attributes. Now they are stored in an OOBTree referenced by an attribute named '__annotation__'.
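To make the difference concrete, here is a toy sketch in plain Python. It is not Archetypes code: the two class names and the `body` field are made up for illustration, a plain dict stands in for the OOBTree, and the attribute name simply follows the description above.

```python
class AttributeStored:
    """Old layout: the field value lives directly on the content object."""

    def __init__(self, body):
        self.body = body


class AnnotationStored:
    """New layout: the field value lives in a separate mapping."""

    def __init__(self, body):
        # A plain dict stands in for the OOBTree (assumption of this sketch).
        self.__annotation__ = {"body": body}


old_style = AttributeStored("hello")
new_style = AnnotationStored("hello")

print(sorted(vars(old_style)))  # ['body']
print(sorted(vars(new_style)))  # ['__annotation__']
```

In the ZODB the OOBTree is itself a persistent object, so it is saved as its own database record and the content object's record only holds a reference to it.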
To make a long story short, the current implementation of historicalRevision in Zope's OFS/History.py calls oldstate() in ZODB/Connection.py. oldstate() in turn calls getState() in ZODB/serialize.py (class ObjectReader), which sets up an unpickler that resolves persistent references through _persistent_load().

Unfortunately, when _persistent_load comes across a persistent reference, it either loads the CURRENT referenced object from the ZODB (using the oid and get() in ZODB/Connection.py) or returns the CURRENT referenced object from the cache. It does not take 'tid' into account when it loads persistent references.

For Zope's "history" tab to work for anything other than a very simple object (one with no persistent references), it needs to copy objects out of the ZODB "deeply". In other words, the persistent references we pull back for a historical revision of an object should be from the same 'tid' as the original object.

My initial thinking was to use _txn_time in Connection.py to set an upper bound on which revisions could be pulled back, then simply use the connection and deserializer as we normally would to pull back the appropriate revisions of everything. Upon further inspection, though, that looks pretty complex. _setstate_noncurrent is only called by _load_before_or_conflict, which is called by _setstate based on _invalidated, which I wouldn't want to touch. There could also be issues with caching (although we could probably use a fresh cache for the "history" connection so it wouldn't conflict with the normal cache).

My second attempt to resolve this problem is aimed at the serializer instead. I wrote a working proof of concept that basically overrides the normal (de)serializer to:

1. use a "tid" when pulling back historical persistent references
2. get rid of the cache -- we don't care about caching when pulling back historical revisions of objects

My proof of concept is implemented as TimeTraveler.py, attached below.
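The difference between resolving references against the current state and resolving them as of a 'tid' can be modelled without ZODB at all. The following is only a toy sketch built on the standard pickle module's persistent_id / persistent_load hooks. Node, ToyStorage, load and LATEST are hypothetical names, tids are plain integers, and looking up "the newest revision at or before the tid" is an assumption of the sketch, made so that sub-objects untouched by a given transaction still resolve.

```python
import io
import pickle

LATEST = float("inf")  # sentinel meaning "whatever is current"


class Node:
    """Stand-in for a persistent object: an oid plus a dict of state."""

    def __init__(self, oid, **state):
        self.oid = oid
        self.state = state


class RefPickler(pickle.Pickler):
    def persistent_id(self, obj):
        # Nested Nodes are written as references (their oid), not inline.
        return obj.oid if isinstance(obj, Node) else None


class ToyStorage:
    """Keeps every revision of every record, keyed by oid."""

    def __init__(self):
        self.revisions = {}  # oid -> [(tid, pickled state), ...], oldest first

    def store(self, node, tid):
        buf = io.BytesIO()
        RefPickler(buf).dump(node.state)
        self.revisions.setdefault(node.oid, []).append((tid, buf.getvalue()))

    def load_as_of(self, oid, tid):
        # Newest revision written at or before tid.
        for rev_tid, data in reversed(self.revisions[oid]):
            if rev_tid <= tid:
                return data
        raise KeyError((oid, tid))


def load(storage, oid, tid, deep):
    """Rebuild a Node as of tid; deep decides how references are resolved."""

    class RefUnpickler(pickle.Unpickler):
        def persistent_load(self, ref_oid):
            # deep=False mimics the behaviour described above: references
            # always resolve to the current revision.
            return load(storage, ref_oid, tid if deep else LATEST, deep)

    data = storage.load_as_of(oid, tid)
    state = RefUnpickler(io.BytesIO(data)).load()
    return Node(oid, **state)


storage = ToyStorage()
notes = Node("notes", text="first draft")
page = Node("page", title="Home", notes=notes)
storage.store(notes, tid=1)
storage.store(page, tid=1)

# Transaction 2 changes only the sub-object.
notes.state["text"] = "second draft"
storage.store(notes, tid=2)

shallow = load(storage, "page", tid=1, deep=False)
deep = load(storage, "page", tid=1, deep=True)
print(shallow.state["notes"].state["text"])  # second draft
print(deep.state["notes"].state["text"])     # first draft
```

The shallow load returns the old page with the new sub-object, which is the mismatch the "history" tab runs into. The deep load passes the same tid down to every reference it follows.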
Although it is written as an independent class, TimeTraveler could presumably subclass ObjectReader in ZODB/serialize.py, since that's where most of its code comes from.

Anyway, I have a few questions:

1. Does it make more sense to patch the serializer (as I've done) or to try to override the max tid on a connection object in order to pull back "deep" historical revisions? Or are there better ways?
2. What issues might we face using the proof of concept attached below?
3. Would this be better implemented as a patch to Zope, or as a separate standalone class for pulling back historical revisions of persistent objects?
4. Does the Zope community see this as a critical issue that needs to be resolved?
5. Looking forward, what is the best way to "deep copy" a historical revision to the current revision (or better yet, create a new transaction that changes the current revision to contain data from the historical revision)? Could I somehow use _p_changed=1 on an old object so that it's automagically copied to the current revision?

The code attached below was tested against a clean Zope 2.8.5 w/ Plone 2.1.2 (debian 'unstable' packages). I tested things using "zopectl debug".

Thanks in advance for any feedback you can provide...

Matt Hahnfeld
[EMAIL PROTECTED]

---

import cPickle
import cStringIO

import OFS.History
import ZODB.broken
from persistent.wref import WeakRef

# TimeTraveler
# ZODB (deep) historical objects -- proof of concept
# Matt Hahnfeld 3/10/06
#
# Most of this code was stolen from ZODB/Connection.py
# or ZODB/serialize.py.  This should probably subclass
# ZODB.serialize.ObjectReader.
#
# Usage:
#
# get tid from p._p_jar._storage.history(p._p_oid, size=10)
#
# from TimeTraveler import TimeTraveler
# p = app.my_plone_site.my_page
# tt = TimeTraveler(p,'\x03c\xff\xcd\x15X\x80\xcc')
# p_old = tt.get()
#
# p_old will be a deep copy of p for the tid specified.

class TimeTraveler:

    def __init__(self, obj, tid):
        self._obj = obj
        self._tid = tid
        self._conn = self._obj._p_jar
        self._storage = self._conn._storage
        self._factory = self._conn._db.classFactory

    def get(self):
        # Rebuild the object as of self._tid and wrap it in the
        # acquisition context of the current object.
        obj = self._get_object(self._obj._p_oid)
        return obj.__of__(self._obj.aq_parent)

    def _get_object(self, oid):
        # NOTE: loadSerial wants the record written with exactly this tid,
        # so a referenced object that was not modified in that transaction
        # may need a different lookup.
        pickle = self._storage.loadSerial(oid, self._tid)
        unpickler = self._get_unpickler(pickle)

        # The first pickle in the record describes the class.
        klass = unpickler.load()
        if isinstance(klass, tuple):
            # Here we have a separate class and args.
            # This could be an old record, so the class may be a named
            # reference
            klass, args = klass
            if isinstance(klass, tuple):
                # Old module_name, class_name tuple
                klass = self._get_class(*klass)
            if args is None:
                args = ()
        else:
            # Definitely new style direct class reference
            args = ()

        if issubclass(klass, ZODB.broken.Broken):
            # We got a broken class.  We might need to make it
            # PersistentBroken
            if not issubclass(klass, ZODB.broken.PersistentBroken):
                klass = ZODB.broken.persistentBroken(klass)

        obj = klass.__new__(klass, *args)

        # The second pickle in the record holds the state.
        state = unpickler.load()
        obj.__setstate__(state)

        # set other attributes
        obj._p_jar = OFS.History.HystoryJar(self._conn)
        obj._p_oid = oid
        obj._p_serial = self._tid
        obj._p_changed = 0
        return obj

    def _get_unpickler(self, pickle):
        file = cStringIO.StringIO(pickle)
        unpickler = cPickle.Unpickler(file)
        unpickler.persistent_load = self._persistent_load
        factory = self._factory
        conn = self._conn

        def find_global(modulename, name):
            return factory(conn, modulename, name)

        unpickler.find_global = find_global
        return unpickler

    def _get_class(self, module, name):
        return self._factory(self._conn, module, name)

    def _persistent_load(self, oid):
        if isinstance(oid, list):
            # weakref
            [oid] = oid
            obj = WeakRef.__new__(WeakRef)
            obj.oid = oid
            obj.dm = self._conn
            return obj
        elif isinstance(oid, tuple):
            # (oid, class) reference; only the oid is needed here
            oid = oid[0]
        # No cache lookup: always rebuild the referenced object at self._tid.
        return self._get_object(oid)