Patch for #6870: sending pre_delete before caching of related models

Carl Gieringer Mon, 17 May 2010 14:01:51 -0700

Hi Djangonauts,

I'd like to call attention to a patch[0] I submitted last week for
#6870[1].  From one perspective, this ticket is a version of the "My
models with foreign keys are deleted when their related model deletes
because of Django's ON CASCADE DELETE behavior."  But from another
perspective this ticket is a (the only?) ticket that correctly
identifies a bug with the usage of the pre_delete signal, and other
tickets are distinguished as feature requests: e.g., adding ON DELETE
behaviors, or "Django should set nullable foreign key fields to NULL".


The Bug

django.db.models.Model.delete first gathers self._collect_sub_objects
to cache all related objects that will be deleted by this model's
deletion.  (This cache includes the model itself.)  Model.delete then
calls django.db.models.query.delete_objects, which calls the
pre_delete signal for each of the objects before deleting them.
Performing the caching before sending the signal breaks the signal's
promise with receivers that they will learn about the deletion "pre"
any steps taken towards deletion.

Describing this as a bug depends upon the characterization of
pre_delete, because hypothetically the signal could offer either a
weak or a strong contract with receivers.  A strong promise to
receivers would imply that they will get first crack at a model when
Django learns that the model will be deleted.  This is what I assumed
pre_delete was for when I learned about Django's signals.  This strong
version of the promise would require that Django take no steps towards
deleting the object before calling the signal (especially not
permanent steps like caching related object for deletion.)  This
strong promise would result in the most versatile framework, because
it allows for inversion of control.  Inversion of control is
particularly important in a plug-and-play framework like Django where
many different applications come together and must coexist in a single
project.

On the contrary, pre_delete could also offer a weak promise of just
letting receivers know that deletion is occurring, but restricting
what receivers can do to modify the results of the deletion.  This is
what the signal currently does with respect to related models.
(Although I don't think this restriction is intentional...maybe just
an artifact of how it was originally coded up.)  I hope that consensus
among the developers will be that the strong version of the promise is
the more robust development choice.

This Bug is Distinct from Feature Requests

The bug I've described is a design issue.  When you combine it with
Django's documented behavior of ON DELETE CASCADE, though, you get the
common complaint that related models are deleted and there's nothing a
pre_delete receiver can do about it.  At least two features have been
requested to address this issue.  One is that Django should
automatically set nullable foreign key fields to NULL whenever the
keyed object is deleted.  (#9308[2].  It says this is fixed but I'm
not experiencing it.)  Another is to implement several ON DELETE/
UPDATE behaviors that are set at field creation (#7539[3]).  Even if
both of these features were implemented, it would not change the fact
that right now the pre_delete signal as-used only offers a weak
promise to receivers, and that there may arise in the future another
usage where this weak promise causes frustration to programmers.

My Patch Modifies _collect_sub_objects to Notify Signal Receivers

In order to notify receivers of the pre_delete signal (or any signal
for that matter), we need to 1) be at a point where we have identified
the model that needs to send the signal, and 2) be at a point in the
code before we have taken steps to complete the deletion (or whatever
action the signal is relevant to.)  The best place for this is at the
beginning of Model's _collect_sub_objects method.  At the beginning of
the method, self's related models have not been inspected to add them
to seen_objs, but we have identified which model (namely self) will be
the target of the action.  Since _collect_sub_objects recurses along
relations, it will have the opportunity to send the signal for each
related model as soon as it recurses upon the related model.

The only thing I don't like about my solution is that it piggybacks
sending signals along with collecting related models rather than
sending the signals just before the action is applied.  Another
alternative might be to traverse all of the related models first to
send the signals, and then traverse them all again (those that remain
related, that is.)  But it would be inefficient to perform the same
traversal twice.  So I think sending the signal from within
_collect_sub_objects is a good compromise.

Another consideration is what alternatives there are to this
solution.  I found two.  One alternative (#8168[4]) suggests adding
another signal, e.g. pre_pre_delete.  Personally I think one
pre_delete signal is enough if only it were sent in the right place.
If a pre_pre_delete signal were implemented, I don't know why anyone
would use the current pre_delete signal.  Another alternative
(#6108[5]) suggests sending a list of the objects to-be-deleted along
with the pre_delete signal and allowing the signal receiver to modify
the list.  There are some problems with that alternative, though. 1)
one model may send the pre_delete signal to its receivers, but then
another model in the same batch of to-be-deleted models sends the
pre_delete signal to another receiver which removes the first model
from the list and now even the weak promise of notification of
deletion has been broken for the first model's receivers.  This
doesn't happen in my solution because the signal is sent before any
related models are added to the cache of seen_objs, and so if a
receiver removes related models, they won't send the signal.  2)
Django seems to use the less-is-more approach to signals by just
sending the basic information needed with a signal.  #6108 seems to
depart from that norm.  Besides, it's possible for the receiver to
reconstruct any information that hypothetically might be passed with
the signal, so I say keep the protocol simple and let the receivers
figure out what they need.

Specifics on the Patch

The patch adds an argument called "pre_signal" to
_collect_sub_objects. The argument should be one of Django's signals
from django.db.models.signals, and it will be called at the beginning
of the method (if the model was not autocreated.)  The argument is
called pre_signal because although _collect_sub_objects is presently
only used for finding related objects for deletion, it could possibly
be used for some other purpose for which a different pre_xx signal
would be sent.

An alternative to sending a signal as an argument could be sending a
callback.  This could easily be used to send the signal
(callback=lambda model: pre_delete.send(model.__class__, model)), but
it might be overkill because maybe the only thing it would ever be
used for would be to send a pre_delete signal.

Anyhoo, I love using Django, I'm deeply indebted to all of you who
have worked on it.  Thank you.  I'm sorry my introduction to the
mailing list is a short novel, but I hope that this patch results in a
useful addition to the framework.

-Carl

[0] 
http://code.djangoproject.com/attachment/ticket/6870/6870.pre_delete.r13248.diff
[1] http://code.djangoproject.com/ticket/6870
[2] http://code.djangoproject.com/ticket/9308
[3] http://code.djangoproject.com/ticket/7539
[4] http://code.djangoproject.com/ticket/8168
[5] http://code.djangoproject.com/ticket/6108

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-develop...@googlegroups.com.
To unsubscribe from this group, send email to 
django-developers+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en.

Patch for #6870: sending pre_delete before caching of related models

Reply via email to