#31883: exclude(pk__in=[None]) returns no objects
-------------------------------------+-------------------------------------
               Reporter:  Adam       |          Owner:  nobody
  Sołtysik                           |
                   Type:  Bug        |         Status:  new
              Component:  Database   |        Version:  3.1
  layer (models, ORM)                |
               Severity:  Normal     |       Keywords:  exclude in nulls
           Triage Stage:             |      Has patch:  0
  Unreviewed                         |
    Needs documentation:  0          |    Needs tests:  0
Patch needs improvement:  0          |  Easy pickings:  0
                  UI/UX:  0          |
-------------------------------------+-------------------------------------
 The problem: `Model.objects.exclude(pk__in=[None])` always returns 0
 objects.

 This behaviour is very surprising (especially in more complicated cases)
 and is essentially useless, so it should be considered a bug.
 `Model.objects.filter(pk__in=[None])` and `Model.objects.filter(pk=None)`
 also (correctly) return no objects, and `Model.objects.exclude(pk=None)`
 always (correctly) returns all objects.

 The issue has already been raised on the internet, see e.g.
 https://stackoverflow.com/questions/47058497/django-queryset-exclude-why-
 are-all-excluded, but I haven't found a bug report in Django. I know this
 is caused by how `NULL`s are handled in SQL, but broken SQL is certainly
 not a reason for Django's ORM to be broken the same way.

 It could be fixed by filtering out the `None`s and splitting the query, so
 that the first example would be equivalent to
 `Model.objects.exclude(pk__in=[]).exclude(pk=None)`, which does correctly
 return all objects. The same should be done in the `filter` case, so that
 `Model.objects.filter(field__in=[something, None]` would also find objects
 with `field=None`, which it currently ignores (this is quite unexpected as
 well, and one needs to use two `Q` objects to get around this).
 Alternatively, to avoid a breaking change, `None`s could be just ignored
 in `exclude` too.

 There is also a case with passing a queryset instead of a list, e.g.
 `Model.objects.exclude(pk__in=Model2.objects.values('field'))`. Even if
 there are no `NULL`s and it works correctly, there is another problem: it
 is terribly slow when the tables are large (at least in PostgreSQL). Both
 problems could be solved by always using a `JOIN` instead of `field [NOT]
 IN (SELECT ...)`, like if `related_name` were used:
 `Model.objects.filter(model2=None)`. Or, if this is too complicated, the
 inner query could receive `WHERE field IS NOT NULL` and `NULL`s should be
 handled separately (or ignored). Or the queryset could be simply converted
 to a list right away (which would of course make things even slower, but
 at least it would be clear what really happens).

-- 
Ticket URL: <https://code.djangoproject.com/ticket/31883>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

-- 
You received this message because you are subscribed to the Google Groups 
"Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-updates+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-updates/050.7751d92617df5cd7ad994a6468f3a705%40djangoproject.com.

Reply via email to