#17276: Slow anti-join query against Postgres
-------------------------------------+-------------------------------------
     Reporter:  dmitry@…             |                    Owner:  nobody
         Type:                       |                   Status:  new
  Cleanup/optimization               |                  Version:  1.3
    Component:  Database layer       |               Resolution:
  (models, ORM)                      |             Triage Stage:  Accepted
     Severity:  Normal               |      Needs documentation:  0
     Keywords:  orm, postgres,       |  Patch needs improvement:  0
  join, anti-join                    |                    UI/UX:  0
    Has patch:  0                    |
  Needs tests:  0                    |
Easy pickings:  0                    |
-------------------------------------+-------------------------------------
Description changed by ramiro:

Old description:

> For background on this ticket please see the following two discussions:
>
>  * http://stackoverflow.com/questions/8190474/postgres-slow-outer-query-
> when-using-a-primary-key
>  * https://groups.google.com/group/django-
> users/browse_thread/thread/7d13f2d8748b4f9f
>
> Basically, in a many-to-many mappings between models Student and Course,
> if I want to find all instances of Students that aren't registered for
> classes, I would issue the following Django query:
>
> Student.objects.filter(course__isnull=True)
>
> Django translates this into the following query:
>
> SELECT "student"."id", "student"."name"
>    FROM "student"
>       LEFT OUTER JOIN "course_students"
>          ON ("student"."id" = "course_students"."student_id")
>       LEFT OUTER JOIN "course"
>          ON ("course_students"."course_id" = "course"."id")
>    WHERE "course"."id" IS NULL
>
> The problem is that the way the WHERE clause is generated is very
> inefficient (at least when used with Postgres). Changing WHERE to
> "course_students"."student_id" IS NOT NULL yields orders of magnitude
> improved query plan.  Here's the difference I'm seeing on real data:
>
> Django way: (cost=1072479.36..6256437.79 rows=1 width=145)
> Hand-crafted: (cost=1518.71..1533.35 rows=1 width=145)
>
> I'm attaching a sample project with the model already set up.  To see the
> generated SQL query, simply run "python manage.py anti-join."

New description:

 For background on this ticket please see the following two discussions:

  * http://stackoverflow.com/questions/8190474/postgres-slow-outer-query-
 when-using-a-primary-key
  * https://groups.google.com/group/django-
 users/browse_thread/thread/7d13f2d8748b4f9f

 Basically, in a many-to-many mappings between models Student and Course,
 if I want to find all instances of Students that aren't registered for
 classes, I would issue the following Django query:

 {{{
 Student.objects.filter(course__isnull=True)
 }}}

 Django translates this into the following query:
 {{{
 SELECT "student"."id", "student"."name"
    FROM "student"
       LEFT OUTER JOIN "course_students"
          ON ("student"."id" = "course_students"."student_id")
       LEFT OUTER JOIN "course"
          ON ("course_students"."course_id" = "course"."id")
    WHERE "course"."id" IS NULL
 }}}
 The problem is that the way the WHERE clause is generated is very
 inefficient (at least when used with Postgres). Changing WHERE to
 "course_students"."student_id" IS NOT NULL yields orders of magnitude
 improved query plan.  Here's the difference I'm seeing on real data:

 * Django way: (cost=1072479.36..6256437.79 rows=1 width=145)
 * Hand-crafted: (cost=1518.71..1533.35 rows=1 width=145)

 I'm attaching a sample project with the model already set up.  To see the
 generated SQL query, simply run "python manage.py anti-join."

--

-- 
Ticket URL: <https://code.djangoproject.com/ticket/17276#comment:2>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

-- 
You received this message because you are subscribed to the Google Groups 
"Django updates" group.
To post to this group, send email to django-updates@googlegroups.com.
To unsubscribe from this group, send email to 
django-updates+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-updates?hl=en.

Reply via email to