On Nov 6, 2008, at 7:24 PM, Shane Hathaway wrote:

> Jim Fulton wrote:
>> I've posted a new proposal:
>>
>>   http://wiki.zope.org/ZODB/ExternalGC
>>
>> That addresses multi-database garbage collection and can also be
>> useful in other situations.
>>
>> Comments are welcome.  Absent objections, I may start working on this
>> fairly soon.
>
> I see where you're going with this.  The "Sample (naive)" implementation
> would be very expensive with large databases; do you have ideas on how
> it might be done more efficiently?

Sure.  First, you don't need a good set.  You can just remove good oids
from the starting set, which then becomes the bad set.  I'd store the
oids on disk as an oid->flag mapping, or maybe even as a set.

An advantage of making this external is that we can innovate on the
external GC independently of ZODB releases, although, eventually, we'd
include a built-in GC tool.

Another bonus is that, in the presence of replication, the analysis
phase can be performed against a secondary storage, keeping load off
the primary until the final deletion step.

Jim

--
Jim Fulton
Zope Corporation

_______________________________________________
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev
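
The idea in the reply (start from the full set of oids and remove the good ones, so that whatever is left is the bad set) can be sketched roughly as follows. This is only an illustration, not code from the proposal: find_garbage, the references callable and the toy graph are hypothetical names, and a plain in-memory set stands in for the on-disk oid->flag mapping mentioned in the message.

```python
from collections import deque


def find_garbage(all_oids, roots, references):
    """Return the oids that are not reachable from the roots.

    all_oids   -- iterable of every oid in the storage (hypothetical input)
    roots      -- iterable of root oids to start the traversal from
    references -- callable mapping an oid to the oids it refers to
                  (hypothetical; a real tool would read object records)
    """
    # Start with every oid as a candidate; no separate "good" set is kept.
    bad = set(all_oids)
    queue = deque()

    for oid in roots:
        if oid in bad:
            bad.discard(oid)   # reachable, so no longer a candidate
            queue.append(oid)

    while queue:
        oid = queue.popleft()
        for ref in references(oid):
            # Membership in "bad" doubles as the "not visited yet" test,
            # so cycles terminate and unknown oids are simply skipped.
            if ref in bad:
                bad.discard(ref)
                queue.append(ref)

    return bad                 # what remains was never reached


# Toy object graph: oid -> list of referenced oids (assumed data).
graph = {0: [1, 2], 1: [2], 2: [0], 3: [4], 4: [3], 5: []}
garbage = find_garbage(graph, [0], graph.__getitem__)
```

Here oids 3 and 4 refer only to each other and 5 is referenced by nothing, so none of them is reached from root 0 and all three end up in the result. For a large database the set could be swapped for a disk-backed mapping without changing the traversal.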