#22014: `prefetch_related` recursion protection does not cover all cases
-------------------------------+----------------------
     Reporter:  StillNewb      |      Owner:  nobody
         Type:  Bug            |     Status:  new
    Component:  Uncategorized  |    Version:  master
     Severity:  Normal         |   Keywords:  prefetch
 Triage Stage:  Unreviewed     |  Has patch:  0
Easy pickings:  0              |      UI/UX:  0
-------------------------------+----------------------
 This issue related to #17014

 Commit (see line 1655):
 
https://github.com/django/django/commit/64da8eec3077eaaba5cfa62775523da772f4de44
 #diff-5b0dda5eb9a242c15879dc9cd2121379R1655

 == Summary:  ==
 `prefetch_related` collects additional prefetch lookups made by querysets
 that are created during prefetching, and handle them to in addition to
 lookups defined by user.
 Sometimes it can casue infinite recursion, so for preventing that, there
 needed  some mechanism that ensures that prefetches that was already done,
 would not be performed again.

 == Problems in the code: ==
 Now that mechanism stores descriptors of relational fields and checks
 against them every iteration.
 Python descriptor is shared against instances of classes. So descriptor
 represents relational field of model, not field of instance.
 For same relation and different sets of instances there are different
 prefetch queries, so descriptors cant be used as identifiers for facts
 that lookup was already prefetched.
 And I also would add that passing descriptors around is very much
 unpythonic and not healthy practice in general. Descriptors are advanced
 properties, they must serve for same purpose as properties - hiding some
 logic in attribute assigment/deletion/retrieving.

 Reason here - is to identify lookups while traversing data relationships
 and never repeat them.
 In code this reason is not well exporessed:

 * There is `done_queries` and `followed_descriptors` - two places for same
 thing.
 * Check is against descriptors, but in `done_queries` lookup paths is
 being added.
 * Check against lookup type (auto lookup or normal) is redundant, there is
 no difference between auto-lookups and normal lookups in matter of which
 allowed to spam, they must be checked for which was already done
 uniformly.

 **Specific problem** is in the check condition for adding in
 `done_lookups` - `not (lookup in auto_lookups and descriptor in
 followed_descriptors)` - intention here was "if this is not autolookup,
 descriptor for which is already seen, and presumably lookup path for it
 was added to done_queries - not adding it to done_queries. So autolookup-
 spam will be stopped".
 But what if descriptor for field, throug lookup-spam assumed to flow, was
 already added while performing different lookup with different
 data/parameters? If that lookup will be autolookup - it will never be
 added to done_queries and would be allowed to be performed infinity number
 of times.

 There is too much unneeded complexity in that code.

 ps.
 Ive made the patch with two tests.

-- 
Ticket URL: <https://code.djangoproject.com/ticket/22014>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

-- 
You received this message because you are subscribed to the Google Groups 
"Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-updates+unsubscr...@googlegroups.com.
To post to this group, send email to django-updates@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-updates/052.53261a1b3e0a3efc393b76323404d280%40djangoproject.com.
For more options, visit https://groups.google.com/groups/opt_out.

Reply via email to