On Apr 27, 2006, at 10:13 AM, Chimezie Ogbuji wrote:
This came about as a result of the recent modifications to the MySQL store to use a different SQL schema to (amongst other things) intern identifiers and values. The problem now is that identifiers / values that aren't referred to by any triples need to be 'pruned' by a garbage collection process. Currently, the store does this explicitly within the remove function (which is the only operation that would require it). I was going to move the garbage collection process to the commit function, but since it doesn't have any effect on the integrity of the data (just the efficient use of space), it seems to make more sense for an application to determine when this should happen instead of the store making that decision. Of course, this problem is limited to this store only, but I think it suggests that there should be an additional method in the Store API (called upkeep, perhaps) that is defined to do nothing by default (same as the other database management interfaces: commit and rollback). Stores *could* override this to do any specific 'housekeeping' they need to. In this particular case, the removal of interned identifiers / values that aren't referenced by any triples would be triggered in this function. Any thoughts?
Sounds good. The Sleepycat store implementation is currently leaking space due to unused interned identifiers as well -- so it too could make use of such a method. It seems any store implementation that does interning (including IOMemory-based store implementations) could make use of the new method.
Any other ideas on a name for the method? pack? gc? Is there already a fairly standard term for this type of operation in the database world? Else I'm fine with upkeep.
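To make the proposal concrete, here is a minimal sketch of how such a hook might look. It is not the actual rdflib Store API: the class names, the in-memory interning tables, and the interned_count helper are all hypothetical, and only the method name upkeep comes from the suggestion above.

```python
class Store:
    """Hypothetical base class standing in for the Store API under discussion."""

    def commit(self):
        pass  # no-op by default

    def rollback(self):
        pass  # no-op by default

    def upkeep(self):
        pass  # proposed housekeeping hook: does nothing by default


class InterningStore(Store):
    """Toy in-memory store that interns terms as integer ids (illustrative only)."""

    def __init__(self):
        self._ids = {}         # term -> interned id
        self._terms = {}       # interned id -> term
        self._triples = set()  # triples stored as tuples of interned ids
        self._next_id = 0

    def _intern(self, term):
        if term not in self._ids:
            self._ids[term] = self._next_id
            self._terms[self._next_id] = term
            self._next_id += 1
        return self._ids[term]

    def add(self, triple):
        self._triples.add(tuple(self._intern(t) for t in triple))

    def remove(self, triple):
        # Only the triple is dropped; interned terms are left for upkeep().
        self._triples.discard(tuple(self._ids.get(t) for t in triple))

    def interned_count(self):
        return len(self._ids)

    def upkeep(self):
        # Prune interned terms that no remaining triple refers to.
        referenced = {i for triple in self._triples for i in triple}
        for i in [i for i in self._terms if i not in referenced]:
            del self._ids[self._terms.pop(i)]


store = InterningStore()
store.add(("s", "p", "o1"))
store.add(("s", "p", "o2"))
store.remove(("s", "p", "o2"))
before = store.interned_count()  # "o2" is still interned after remove()
store.upkeep()                   # the application decides when to prune
after = store.interned_count()
```

The point of the design is that remove stays cheap and data integrity never depends on pruning; the application calls upkeep whenever reclaiming space is worth the cost.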
Chimezie _______________________________________________ Dev mailing list [email protected] http://rdflib.net/mailman/listinfo/dev
Daniel Krech, http://eikeon.com/
