#32244: ORM inefficiency: ModelFormSet executes a single-object SELECT query per
formset instance when saving/validating
-------------------------------------+-------------------------------------
     Reporter:  Lushen Wu            |                    Owner:  nobody
         Type:                       |                   Status:  closed
  Cleanup/optimization               |
    Component:  Database layer       |                  Version:  3.1
  (models, ORM)                      |
     Severity:  Normal               |               Resolution:  wontfix
     Keywords:  formsets             |             Triage Stage:
                                     |  Unreviewed
    Has patch:  0                    |      Needs documentation:  0
  Needs tests:  0                    |  Patch needs improvement:  0
Easy pickings:  0                    |                    UI/UX:  0
-------------------------------------+-------------------------------------
Comment (by rowleyd):

 I had the same questions as powderflask and Lushen Wu, so I was about to
 try to resurrect this conversation with a code example, but I found some
 answers in the process. So I'm leaving a comment that will hopefully
 clarify the questions being asked, and save others the trouble I went
 through to find some clarity on this topic.

 The questions in this ticket are probably coming from this kind of
 context: we have a ModelFormSet (let's say a list of Books), and we create
 a queryset that we pass in to retrieve a list of 100+ Books from a remote
 database and initialize the forms and their corresponding instances.
 Everything is peachy until you POST the form, and notice that right after
 it runs that query, it then runs a set of queries (one for each form in
 the formset) re-retrieving all of the same instances that we already have
 from the first query. It's confusing where all these extra, redundant
 queries are coming from, and after some investigation, it turns out that
 it's not something that I did wrong - they are the result of
 formset.is_valid() attempting to "validate" a hidden field in each form
 that represents the PK for each form's instance. So the questions in this
 ticket are basically asking 1) why is this "validation" of the PK field
 necessary, 2) why not just use the queryset I provided, and 3) how to
 skip/avoid the latter set of seemingly unnecessary and redundant queries?

 I think the misunderstanding at the heart of these questions is a simple
 oversight: when the form is POSTed, the formset is initialized from the
 POST data, **NOT** the queryset. Therefore, the hidden PK field validation
 is necessary because the POST data might contain stale PK values (e.g.,
 PKs of forms whose model instance no longer exists in the database).
 So skipping the validation (e.g., Lushen Wu's solution above) should be
 avoided since it opens the door to potentially worse issues. And the
 answer to resolving the redundant queries is to skip the first query
 if/when the form is POSTed (i.e., don't evaluate and pass a queryset into
 BaseModelFormSet because it won't be used to initialize the formset forms
 anyway).

 So now we're back to where we started, with the original question of
 whether there is a more efficient way to do the validation on the hidden
 PK field, instead of executing a separate query for every form in the
 formset. I'll just throw out a few thoughts since I have to move on after
 spending too much time in this rabbit hole: Carlton suggested a workaround
 (see the video he mentioned), but it didn't work for me. Looking back, I
 think it was probably because the suggestion was to set 'choices' on the
 ModelChoiceField in the ModelForm constructor; but in this case, the
 hidden PK ModelChoiceField doesn't yet exist at that point (I think it's
 added later in BaseModelFormSet.add_fields()?). So maybe this would work
 if we iterate over the forms and set 'choices' **after** add_fields() is
 finished. But even if so, I think this is taking it a bit too far (i.e.,
 using somewhat obscure hooks to tweak a hidden, internal ModelChoiceField
 that the user didn't create and which they have little control over).
 Since this affects practically every user of modelformset_factory(), it
 might be worth making BaseModelFormSet implement this general workaround
 under the hood by default (i.e., reuse the same queryset/choices for all
 of the hidden PK fields that it creates). Alternatively, if the user
 passes a queryset to
 BaseModelFormSet, maybe it would make sense to use that queryset to
 validate the hidden PK fields (i.e., limit the choices of the hidden PK
 field to this given queryset)? If not, it could be helpful to generate an
 error/warning that the user is passing in a queryset argument that will be
 ignored.
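
 To make the cost difference concrete, here is a rough plain-Python sketch
 of the idea of checking all hidden PK values with one shared lookup
 instead of one lookup per form. It is only an illustration and not Django
 code: FakeTable, validate_per_form() and validate_in_bulk() are
 hypothetical names, and the query counter merely stands in for real
 database round trips.

```python
# Plain-Python sketch (no Django involved). FakeTable is a hypothetical
# stand-in for the Book table that counts how many lookups it receives.
class FakeTable:
    def __init__(self, pks):
        self._pks = set(pks)
        self.query_count = 0

    def get(self, pk):
        """One single-row lookup, like SELECT ... WHERE id = %s."""
        self.query_count += 1
        return pk in self._pks

    def filter_in(self, pks):
        """One bulk lookup, like SELECT ... WHERE id IN (...)."""
        self.query_count += 1
        return self._pks & set(pks)


def validate_per_form(table, posted_pks):
    """Behaviour described in this ticket: one lookup per form."""
    return [pk for pk in posted_pks if not table.get(pk)]


def validate_in_bulk(table, posted_pks):
    """Suggested idea: fetch the existing PKs once, then check in memory."""
    existing = table.filter_in(posted_pks)
    return [pk for pk in posted_pks if pk not in existing]


# 100 PKs that still exist plus one stale PK that was deleted meanwhile.
posted = list(range(1, 101)) + [999]

per_form_table = FakeTable(range(1, 101))
stale_per_form = validate_per_form(per_form_table, posted)

bulk_table = FakeTable(range(1, 101))
stale_bulk = validate_in_bulk(bulk_table, posted)

print(stale_per_form, per_form_table.query_count)
print(stale_bulk, bulk_table.query_count)
```

 With 101 posted forms, the per-form version issues 101 lookups and the
 bulk version issues one, and both flag the stale PK 999. Whether
 BaseModelFormSet could safely do something similar is exactly the open
 question above.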
-- 
Ticket URL: <https://code.djangoproject.com/ticket/32244#comment:17>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
