This may be a very basic thing to ask, but there is no harm in double-checking: are you using select_related instead of all for those two specific Models?

Check out https://django-debug-toolbar.readthedocs.io/en/stable/index.html if you aren't using it yet; it will help you check that very easily (I'm assuming you have a view that lists all the instances of those Models).


On 11/06/2017 13:15, Miika Huusko wrote:
Thanks for the response!

Yep, timing with "time" is not the best way to compare SQL query times. The reason I added that "time" test is that on Stack Overflow I was asked to confirm that data transfer is not a factor here. I timed the SQL queries with "EXPLAIN ANALYZE" and "\timing" first. Those don't take data transfer into account, so I used "time" as a quick test that data transfer is not a problem.

About timing: as an example, I can reduce the problem to serializing only Items and related Photos. That results in only two queries. For example, for a dataset of about 40k items, the logged queries are:

django.db.backends: (1.252) SELECT "todo_item"."version", *... all item properties ...* FROM "todo_item" WHERE ("todo_item"."project_id" = '...' AND "todo_item"."deleted_at" IS NULL);

django.db.backends: (0.883) SELECT "photos_photo"."version", *... all item properties ...* FROM "photos_photo" WHERE "photos_photo"."note_id" IN (349527, 349528, 349529, *... and rest of the 40k IDs ...* );


Quite simple queries. The timing shown in the django.db.backends logs matches what those queries take when executed manually with psql (1252 ms + 883 ms). The same request gives this simple profiling info:

Database lookup               | 20.4447s
Serialization                 | 3.3821s
Django request/response       | 0.3419s
API view                      | 0.1988s
Response rendering            | 0.4591s

That's only from a single request, and query times vary of course. Still, the difference between how long it takes to query the data from the DB and how long Django takes to process it is just that huge. The part I don't understand is that it takes about 20 seconds to run list(self.get_queryset()) while those two queries take about 2 seconds in SQL. Django is putting some serious effort and time in there. According to the django.db.backends logs, those two queries are the only queries run during list(self.get_queryset()). The "list" is there to force query execution, so that DB lookup time and serialization time are measured separately.
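For readers who want to reproduce that kind of per-phase measurement, here is a minimal sketch of how the two phases could be timed separately. It is not the profiling code used above: fake_queryset and fake_serialize are hypothetical stand-ins for self.get_queryset() and the serializer, and the timed helper is made up for illustration. Note that with a real queryset, the time spent inside list(...) covers not only the SQL round trip but also building the model instances and any prefetching, so it is expected to be larger than the times in the django.db.backends log.

```python
import time
from contextlib import contextmanager

# Accumulated wall-clock seconds per phase label.
timings = {}

@contextmanager
def timed(label):
    """Add the wall-clock time spent inside the block to timings[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        timings[label] = timings.get(label, 0.0) + elapsed

def fake_queryset():
    # Hypothetical stand-in for self.get_queryset(): lazy, like a QuerySet,
    # so nothing is produced until something iterates over it.
    for i in range(1000):
        yield {"id": i, "version": 1}

def fake_serialize(rows):
    # Hypothetical stand-in for the serializer.
    return [{"id": row["id"]} for row in rows]

# list() forces evaluation, so all "lookup" work happens inside this block.
with timed("Database lookup"):
    rows = list(fake_queryset())

with timed("Serialization"):
    data = fake_serialize(rows)

for label, seconds in timings.items():
    print(f"{label:<30}| {seconds:.4f}s")
```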

Adding a new Model that retrieves the information from a view or stored procedure in the database is a good idea. Of course, I would first like to understand what might be wrong in the current models, so that I don't make the same mistakes again. Is there something one should consider when relating items like Photos and Events to Items? That use case is quite simple, and it still results in Django using 18 seconds to process the SQL query response. There is a lot of data, of course, but I had thought that returning 50k objects should not be a problem for Python / Django, even though the ORM always adds some overhead.
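The "18 seconds" figure follows from the numbers quoted earlier in this message. A small worked check, using only those logged values (the variable names are made up for illustration):

```python
# Values taken from the django.db.backends log lines and the profiling table.
sql_seconds = 1.252 + 0.883        # the two logged queries
db_lookup_seconds = 20.4447        # "Database lookup" phase, list(self.get_queryset())

# Time spent in the lookup phase that is not accounted for by the SQL itself.
overhead_seconds = db_lookup_seconds - sql_seconds

print(f"SQL time:        {sql_seconds:.3f}s")
print(f"Lookup phase:    {db_lookup_seconds:.3f}s")
print(f"Non-SQL portion: {overhead_seconds:.1f}s")
```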




        
        
