Is there a recommended approach to anti-join queries? Here's the
query I'm having an issue with:
Branch.objects.filter(branchgroup__isnull=True)[:1]
where branchgroup is a ManytoMany relationship to another model. I
want to get a set of Branch objects that aren't mapped to any
BranchGroups. This query gets translated to the following (a
simplified equivalent):
SELECT "scm_branch"."id"
FROM "scm_branch"
LEFT OUTER JOIN "scm_branchgroup_branches"
ON ("scm_branch"."id" =
"scm_branchgroup_branches"."branch_id")
LEFT OUTER JOIN "scm_branchgroup"
ON ("scm_branchgroup_branches"."branchgroup_id" =
"scm_branchgroup"."id")
WHERE "scm_branchgroup"."id" IS NULL
LIMIT 1
This query is very slow: Limit (cost=1072479.36..6256437.79 rows=1
width=145)
However, a slightly modified, but functionally equivalent query:
SELECT "scm_branch"."id"
FROM "scm_branch"
LEFT OUTER JOIN "scm_branchgroup_branches"
ON ("scm_branch"."id" =
"scm_branchgroup_branches"."branch_id")
LEFT OUTER JOIN "scm_branchgroup"
ON ("scm_branchgroup_branches"."branchgroup_id" =
"scm_branchgroup"."id")
WHERE "scm_branchgroup_branches"."branch_id" IS NULL
LIMIT 1
Is orders of magnitude faster: Limit (cost=1518.71..1533.35 rows=1
width=145)
The difference is with the WHERE clause. Django generates WHERE
"scm_branchgroup"."id" IS NULL, but a properly optimized query should
use WHERE "scm_branchgroup_branches"."branch_id" IS NULL. This is
because Postgres recognizes the second query as a anti-join query and
can do a lot of optimization.
The basic question is: How do I make Django generate a faster query?
Thanks
Dmitry
--
You received this message because you are subscribed to the Google Groups
"Django users" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to
[email protected].
For more options, visit this group at
http://groups.google.com/group/django-users?hl=en.