[sqlalchemy] Queries issued with 'passive_deletes'

2011-05-09 Thread Israel Ben Guilherme Fonseca
I didn't get why issuing selects for the children objects when
passive_deletes=False. Wouldn't be better just issue direct deletes, and
maybe using subselects in the where clause of these deletes (for nested
associations) when approriate? It would solve the overhead problem of the
selecting large collections, and it would mimic the ON DELETE CASCADE that
is expected to exist when using passive_delete=True for databases that don't
support this feature.

Thanks in advance for the explanation,

Israel

-- 
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com.
To unsubscribe from this group, send email to 
sqlalchemy+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en.



Re: [sqlalchemy] Queries issued with 'passive_deletes'

2011-05-09 Thread Michael Bayer

On May 9, 2011, at 7:57 AM, Israel Ben Guilherme Fonseca wrote:

 I didn't get why issuing selects for the children objects when 
 passive_deletes=False. Wouldn't be better just issue direct deletes, and 
 maybe using subselects in the where clause of these deletes (for nested 
 associations) when approriate? It would solve the overhead problem of the 
 selecting large collections, and it would mimic the ON DELETE CASCADE that is 
 expected to exist when using passive_delete=True for databases that don't 
 support this feature.
 
 Thanks in advance for the explanation,

ON DELETE CASCADE is provided by all databases that SQLAlchemy supports - see 
referential integrity in this chart:

http://en.wikipedia.org/wiki/Comparison_of_relational_database_management_systems#Fundamental_features

the only exception being, MySQL MyISAM.   SQLite added foreign keys some 
releases ago (they are optional but can be enabled).

Given that ON DELETE CASCADE is already provided by all databases and should be 
used when lots of cascading deletes are needed, the ratio of usefulness to 
effort, which would be significant in that it involves a significantly more 
complex approach within the unit of work internals as well as a lot of new 
tests, doesn't place a feature like this in high priority.

It would not be possible for this behavior to be used in all cases, it would 
only be an optimizing case when its possible.Consider the case where cycles 
exist - parent-child-subchild-subsubchild, and suppose some rows in child 
reference subsubchild.   The UOW detects the potential for cycles based on 
the graph of mappings, and when it has access to all the individual rows (like 
the database does when ON DELETE CASCADE works)  breaks cycles into individual 
groups so that rows are deleted in the proper order.   A query like DELETE 
FROM subsubchild WHERE parent_id in (SELECT id from subchild where parent_id in 
(SELECT id from child where parent_id=x)) otherwise would fail.   

The current behavior also has the advantage that objects presently in the 
Session, but without their collection-based relationships loaded and linking 
them together in memory, are appropriately updated state-wise, as their 
collection membership is determined before being marked cascaded members as 
deleted after a flush.While passive_deletes=True turns this off, some 
applications with passive_deletes=False may be relying upon this.  Changing the 
cascade behavior to not be aware of individual rows when cycles don't exist 
mean that the state management of individual objects in a session will change 
based on mappings.  An application someday removes a relationship that was 
linking subsubchild to child, and suddenly the Session begins to not mark 
subsubchild objects as deleted during a cascade, instead waiting until 
commit() is called and all attributes are expired.This is a subtle side 
effect arising from seemingly unrelated mapping adjustments - this makes it 
tougher for us to make this new optimization a default behavior.Whereas the 
difference in behavior between passive_deletes=True|False is much easier to 
understand and anticipate.

So there's potential for surprises, new complexity, many more tests needed, 
feature is only an optimization, and will probably have to remain optional in 
all cases, all of which is redundant versus pretty much every database's own 
ability to do so more efficiently and predictably via ON DELETE CASCADE.And 
you can even use query.delete() if you really need to delete lots of things 
quickly and you don't have CASCADE immediately available.   

This also might be a good addition for the FAQ which is currently being cleaned 
up.

-- 
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com.
To unsubscribe from this group, send email to 
sqlalchemy+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en.



Re: [sqlalchemy] Queries issued with 'passive_deletes'

2011-05-09 Thread Israel Ben Guilherme Fonseca
Well, now it does make sense. :)

Thanks for the explanation.

2011/5/9 Michael Bayer mike...@zzzcomputing.com


 On May 9, 2011, at 7:57 AM, Israel Ben Guilherme Fonseca wrote:

  I didn't get why issuing selects for the children objects when
 passive_deletes=False. Wouldn't be better just issue direct deletes, and
 maybe using subselects in the where clause of these deletes (for nested
 associations) when approriate? It would solve the overhead problem of the
 selecting large collections, and it would mimic the ON DELETE CASCADE that
 is expected to exist when using passive_delete=True for databases that don't
 support this feature.
 
  Thanks in advance for the explanation,

 ON DELETE CASCADE is provided by all databases that SQLAlchemy supports -
 see referential integrity in this chart:


 http://en.wikipedia.org/wiki/Comparison_of_relational_database_management_systems#Fundamental_features

 the only exception being, MySQL MyISAM.   SQLite added foreign keys some
 releases ago (they are optional but can be enabled).

 Given that ON DELETE CASCADE is already provided by all databases and
 should be used when lots of cascading deletes are needed, the ratio of
 usefulness to effort, which would be significant in that it involves a
 significantly more complex approach within the unit of work internals as
 well as a lot of new tests, doesn't place a feature like this in high
 priority.

 It would not be possible for this behavior to be used in all cases, it
 would only be an optimizing case when its possible.Consider the case
 where cycles exist - parent-child-subchild-subsubchild, and suppose some
 rows in child reference subsubchild.   The UOW detects the potential for
 cycles based on the graph of mappings, and when it has access to all the
 individual rows (like the database does when ON DELETE CASCADE works)
  breaks cycles into individual groups so that rows are deleted in the proper
 order.   A query like DELETE FROM subsubchild WHERE parent_id in (SELECT id
 from subchild where parent_id in (SELECT id from child where parent_id=x))
 otherwise would fail.

 The current behavior also has the advantage that objects presently in the
 Session, but without their collection-based relationships loaded and linking
 them together in memory, are appropriately updated state-wise, as their
 collection membership is determined before being marked cascaded members as
 deleted after a flush.While passive_deletes=True turns this off, some
 applications with passive_deletes=False may be relying upon this.  Changing
 the cascade behavior to not be aware of individual rows when cycles don't
 exist mean that the state management of individual objects in a session will
 change based on mappings.  An application someday removes a relationship
 that was linking subsubchild to child, and suddenly the Session begins
 to not mark subsubchild objects as deleted during a cascade, instead
 waiting until commit() is called and all attributes are expired.This is
 a subtle side effect arising from seemingly unrelated mapping adjustments -
 this makes it tougher for us to make this new optimization a default
 behavior.Whereas the difference in behavior between
 passive_deletes=True|False is much easier to understand and anticipate.

 So there's potential for surprises, new complexity, many more tests needed,
 feature is only an optimization, and will probably have to remain optional
 in all cases, all of which is redundant versus pretty much every database's
 own ability to do so more efficiently and predictably via ON DELETE CASCADE.
And you can even use query.delete() if you really need to delete lots of
 things quickly and you don't have CASCADE immediately available.

 This also might be a good addition for the FAQ which is currently being
 cleaned up.

 --
 You received this message because you are subscribed to the Google Groups
 sqlalchemy group.
 To post to this group, send email to sqlalchemy@googlegroups.com.
 To unsubscribe from this group, send email to
 sqlalchemy+unsubscr...@googlegroups.com.
 For more options, visit this group at
 http://groups.google.com/group/sqlalchemy?hl=en.



-- 
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com.
To unsubscribe from this group, send email to 
sqlalchemy+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en.