ModelForm validation of foreign keys - extra database queries and performance bottleneck

2017-07-24 Thread johan de taeye

I have a model that has foreign key relations to a number of other objects.
When saving an instance of this model from the admin (or a ModelForm), I
see plenty of extra and redundant database calls.
For a single record it wouldn't make much of a difference, but when using
the same ModelForm for a batch upload these calls become the bottleneck in
the process.

Has anyone bumped into the same performance bottleneck? 
Has anyone developed a solution for this?

By logging all database queries and some digging in the code, here's my 
analysis of what is happening:

   1. Open the admin editing screen for a single record.
      I leave all fields at their original values, except for one field
      (not one of the foreign key fields).
   2. When saving the record, the first query reads the existing record:
        select field1, field2, field3, ... from mytable;
   3. During the form/model validation, I get an extra database query for
      each of the foreign key fields.
      It is generated from the to_python method
      of django.forms.models.ModelChoiceField:
        select field_a, field_b, field_c, field, ... from related_table
        where pk = 'id_from_first_query';
   4. During the form/model validation, I get another database query for
      each of the foreign key fields.
      It verifies that the value actually exists in the database:
        select (1) from related_table where pk = 'value from form';

The queries in steps 3 and 4 are redundant if the field hasn't changed. The
first query gives enough data to allow us to verify that the new form value
and the foreign key field on the existing instance are equal. I am using
Django 1.11.

The same queries, except the one in step 2, are executed when I create a new
record. The queries in step 4 are redundant then, since step 3 has just
retrieved the values from the database.

Looking forward to any insights and hints...


Johan

-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-users+unsubscr...@googlegroups.com.
To post to this group, send email to django-users@googlegroups.com.
Visit this group at https://groups.google.com/group/django-users.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-users/48bfca91-8750-4ad0-9c50-1ffc3c2fdb5f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: ModelForm validation of foreign keys - extra database queries and performance bottleneck

2017-07-25 Thread James Schneider

You should look at modifying the queryset that your view uses to pull
the main object so that it includes select_related() calls. I don't know if
you're using function-based views or class-based views, so I can't comment
further on implementation.

https://docs.djangoproject.com/en/1.11/ref/models/querysets/#select-related

The extra calls are likely occurring when the related fields are being
accessed during validation, etc.

Also keep in mind that Django (or any other framework) has no idea whether
or not fields have "changed" in a form submission without pulling the
original set of values to compare against, so expect the object to be
pulled at least once on every request.

-James



Re: ModelForm validation of foreign keys - extra database queries and performance bottleneck

2017-07-25 Thread johan de taeye

I've already tried select_related in my queryset. No change at all.

> Also keep in mind that Django (or any other framework) has no idea
> whether or not fields have "changed" in a form submission without pulling
> the original set of values to compare against, so expect the object to be
> pulled at least once on every request.

The first query already retrieves the primary key of the original object,
or the complete object if select_related is added. Using select_related
can keep things to a single query, and we should be good. It is the number
of database queries executed that drastically reduces the performance.


I believe I'll need a custom version of the ModelChoiceField class.

