#10847: `QuerySet.values` doesn't remove extra selections.
---------------------------------------------------+------------------------
          Reporter:  mrmachine                     |         Owner:  nobody     
                     
            Status:  new                           |     Milestone:  1.1        
                     
         Component:  Database layer (models, ORM)  |       Version:  SVN        
                     
        Resolution:                                |      Keywords:  queryset 
extra select values sql
             Stage:  Accepted                      |     Has_patch:  1          
                     
        Needs_docs:  0                             |   Needs_tests:  0          
                     
Needs_better_patch:  1                             |  
---------------------------------------------------+------------------------
Changes (by russellm):

  * needs_better_patch:  0 => 1

Comment:

 Alex - this patch isn't quite right, but the reason requires some
 explanation.

 Here's a more complete test case that reveals the flaw:
 {{{
 #1
 >>> TestObject.objects.extra(select={'extra': 1}).values('pk')
 [{'pk': 1}]

 #2
 >>> TestObject.objects.extra(select={'extra':
 1}).values('pk').query.as_sql()
 ('SELECT "extra_regress_testobject"."id" FROM "extra_regress_testobject"',
 ())

 #3
 >>>
 TestObject.objects.filter(pk__in=TestObject.objects.extra(select={'extra':
 1}).values('pk'))
 [<TestObject: TestObject: first,second,third>]

 #4
 >>> TestObject.objects.values('pk').extra(select={'extra': 1})
 [{'pk': 1}]

 #5
 >>> TestObject.objects.values('pk').extra(select={'extra':
 1}).query.as_sql()
 ('SELECT "extra_regress_testobject"."id" FROM "extra_regress_testobject"',
 ())

 #6
 >>>
 
TestObject.objects.filter(pk__in=TestObject.objects.values('pk').extra(select={'extra':
 1}))
 [<TestObject: TestObject: first,second,third>]
 }}}

 Look closely - in 1-3, the values() comes after the extra(). These three
 queries pass fine.

 However in 4-6, the extra() comes after the pk(). This is where the
 strangeness starts: 4 passes (as a result of the changes from #10256), but
 5 and 6 don't.

 This is due to a problem that I originally thought was just annoying and
 cosmetic, but this ticket reveals that it is actually significant. The
 problem is this - the as_sql() call for any query with an extra clause
 doesn't produce correct SQL until it is executed. This is because the
 final list of extra columns isn't evaluated until the iterator is
 evaluated. This has two consequences:
  1. If you try to debug print an SQL query with an extra clause, the
 output will be misleading
  2. If you use an SQL query with an extra clause as an inner query, you
 will get errors, because the inner SQL will be wrong.

 There is hope, however. The fix lies in changing the way the list of
 active extra columns are evaluated. Currently, there is a single 'extra'
 dictionary; when the ValueQuerySet is evaluated, this extra dictionary is
 trimmed to the right length. This means that if the query isn't evaluated,
 the list of active extra columns is wrong. Your patch implements the trim
 in the setup phase, which works fine when the values() comes after the
 extra(), but not if the extra() comes after the values().

 I had a similar problem with aggregates, but I was able to fix that
 problem by taking a different approach. Rather than having a single
 dictionary of aggregates, there is a definition dictionary, and a mask of
 currently active annotations. When you call values(), you set the mask;
 when you call annotate(), you add fields to the dictionary. This way, at
 any time you can evaluate a property that contains the current active
 annotation list by applying the mask to the field definition list.

 The masking behaviour for extra() is slightly different to that for
 annotate() - if an annotate follows the values clause, it is added to the
 mask; this isn't the case for extra. However, this basic masking approach
 should make it possible to get an accurate query at any time, without the
 need for evaluation.

 If you want to take a swing at this, it should be a fairly simple change
 (albiet a fairly widespread one). There is also the potential for some
 clashes with contrib.gis. If you haven't got the time (or inclination) to
 work on this, I should be able to take a look at it next week some time.

-- 
Ticket URL: <http://code.djangoproject.com/ticket/10847#comment:3>
Django <http://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Django updates" group.
To post to this group, send email to django-updates@googlegroups.com
To unsubscribe from this group, send email to 
django-updates+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/django-updates?hl=en
-~----------~----~----~----~------~----~------~--~---

Reply via email to