Re: Related manager remove() and clear() methods - backwards incompatible changes

Anssi Kääriäinen Thu, 21 Nov 2013 03:38:27 -0800

On Wednesday, November 20, 2013 5:17:11 PM UTC+2, Shai Berger wrote:
>
> Hi, 
>
> On Saturday 16 November 2013 21:02:00 Anssi Kääriäinen wrote: 
> > Any feedback for pre/post_update idea? 
> > 
> As Loic said, the signals sound like they can be useful in a variety of 
> situations. A couple of notes, though: 
>
> > pre_update listeners get a queryset that isn't executed. 
>
> The "query" part, only, I assume; then they should also somehow get the 
> intended update parameters. Perhaps the actual keyword arguments to 
> update(). 
>
> > Accessing that queryset might be costly 
>
> Accessing objects from that queryset may be a footgun -- if they somehow 
> stay 
> in memory after the update is executed, they will become stale. There 
> should 
> be stern warnings about this in the documentation, or even -- and this is 
> just 
> thinking outloud, I haven't considered all the evil effects -- holding 
> weakrefs 
> to all these objects and using the new model reload mechanism to refresh 
> them 
> after the update. 
>

The current idea is to only add post_update with pk_set and kwargs to
update as data.

Unfortunately such a signal doesn't cover all usage cases - for example if
you want to do database change tracking using signals, you can't do it with
the above approach. We (as in I and Loic Bistuer) have tried a lot of
different approaches to the update signals problem. It seems no matter what
idea is tried it always fails in some cases, or adds a lot of API/code
complexity.

Why not post_update with a queryset? If you need a queryset it is easy to
construct inside the signal from the pk_set. But if you need the pk_set
only you can't get that back from the queryset in any cheap way. In
addition the queryset must be constructed with pk__in=pk_set, and such
querysets can be expensive or impossible to execute (sqlite more than 1000
parameters exception for example). Finally the pk_set calculation will be
cheap on databases which implement RETURNING.

Why not pre_update? It turns out this one will be expensive to implement in
a way that is race-free. We must execute .select_for_update() query before
pre_update, and after that we need to use
pk__in=pk_set_from_select_from_update for the update query. Add in the
pk__in=large_set problem and this doesn't feel like a good addition.

The real problem here is that no matter how we design our signals framework
it will be either leaky, or horrendously expensive. Just imagine what it
means to update a field for a large table if you will need to fetch both
pre/post images for each row. A signals framework that actually works for
database auditing would need to come with a disclaimer: "Not usable in most
projects due to performance reasons".

So, now the question is, is the suggested post_update signal a good
addition to Django?

- Anssi

--
You received this message because you are subscribed to the Google Groups
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers.
To view this discussion on the web visit
https://groups.google.com/d/msgid/django-developers/ada90f81-d526-48cc-871a-6c227ca66b1c%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Re: Related manager remove() and clear() methods - backwards incompatible changes

Reply via email to