On May 9, 2011, at 7:57 AM, Israel Ben Guilherme Fonseca wrote: > I didn't get why issuing selects for the children objects when > passive_deletes=False. Wouldn't be better just issue direct deletes, and > maybe using subselects in the where clause of these deletes (for nested > associations) when approriate? It would solve the overhead problem of the > selecting large collections, and it would mimic the ON DELETE CASCADE that is > expected to exist when using passive_delete=True for databases that don't > support this feature. > > Thanks in advance for the explanation,
ON DELETE CASCADE is provided by all databases that SQLAlchemy supports - see "referential integrity" in this chart: http://en.wikipedia.org/wiki/Comparison_of_relational_database_management_systems#Fundamental_features the only exception being, MySQL MyISAM. SQLite added foreign keys some releases ago (they are optional but can be enabled). Given that ON DELETE CASCADE is already provided by all databases and should be used when lots of cascading deletes are needed, the ratio of usefulness to effort, which would be significant in that it involves a significantly more complex approach within the unit of work internals as well as a lot of new tests, doesn't place a feature like this in high priority. It would not be possible for this behavior to be used in all cases, it would only be an optimizing case when its possible. Consider the case where cycles exist - parent->child->subchild->subsubchild, and suppose some rows in "child" reference "subsubchild". The UOW detects the potential for cycles based on the graph of mappings, and when it has access to all the individual rows (like the database does when ON DELETE CASCADE works) breaks cycles into individual groups so that rows are deleted in the proper order. A query like "DELETE FROM subsubchild WHERE parent_id in (SELECT id from subchild where parent_id in (SELECT id from child where parent_id=x))" otherwise would fail. The current behavior also has the advantage that objects presently in the Session, but without their collection-based relationships loaded and linking them together in memory, are appropriately updated state-wise, as their collection membership is determined before being marked cascaded members as "deleted" after a flush. While passive_deletes=True turns this off, some applications with passive_deletes=False may be relying upon this. Changing the cascade behavior to not be aware of individual rows when cycles don't exist mean that the state management of individual objects in a session will change based on mappings. An application someday removes a relationship that was linking "subsubchild" to "child", and suddenly the Session begins to not mark "subsubchild" objects as deleted during a cascade, instead waiting until commit() is called and all attributes are expired. This is a subtle side effect arising from seemingly unrelated mapping adjustments - this makes it tougher for us to make this new optimization a default behavior. Whereas the difference in behavior between passive_deletes=True|False is much easier to understand and anticipate. So there's potential for surprises, new complexity, many more tests needed, feature is only an optimization, and will probably have to remain optional in all cases, all of which is redundant versus pretty much every database's own ability to do so more efficiently and predictably via ON DELETE CASCADE. And you can even use query.delete() if you really need to delete lots of things quickly and you don't have CASCADE immediately available. This also might be a good addition for the FAQ which is currently being cleaned up. -- You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To post to this group, send email to sqlalchemy@googlegroups.com. To unsubscribe from this group, send email to sqlalchemy+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en.