Re: [Django] #9342: query optimization issue

2008-10-14 Thread Django
#9342: query optimization issue
--+-
  Reporter:  [EMAIL PROTECTED]  | Owner:  nobody
Status:  new  | Milestone:
 Component:  Uncategorized|   Version:  1.0   
Resolution:   |  Keywords:
 Stage:  Unreviewed   | Has_patch:  0 
Needs_docs:  0|   Needs_tests:  0 
Needs_better_patch:  0|  
--+-
Comment (by [EMAIL PROTECTED]):

 I run this hack against the Django unit tests
 (...\tests\regressiontests\queries\models.py) and it fails against
 ManyToMany type tests.
 {{{
 --
 File "C:\Python25\Lib\site-
 packages\tests\regressiontests\queries\models.py", line ?, in
 regressiontests.queries.models.__test__.API_TESTS
 Failed example:
 Item.objects.exclude(tags__name='t4').order_by('name').distinct()
 Expected:
 [, , ]
 Got:
 [, ]
 --
 File "C:\Python25\Lib\site-
 packages\tests\regressiontests\queries\models.py", line ?, in
 regressiontests.queries.models.__test__.API_TESTS
 Failed example:
 Item.objects.exclude(tags__name='t4').order_by('name').distinct().reverse()
 Expected:
 [, , ]
 Got:
 [, ]
 --
 File "C:\Python25\Lib\site-
 packages\tests\regressiontests\queries\models.py", line ?, in
 regressiontests.queries.models.__test__.API_TESTS
 Failed example:
 Item.objects.exclude(tags__name='t1').order_by('name')
 Expected:
 [, ]
 Got:
 []
 --
 File "C:\Python25\Lib\site-
 packages\tests\regressiontests\queries\models.py", line ?, in
 regressiontests.queries.models.__test__.API_TESTS
 Failed example:
 Item.objects.exclude(tags__name='t1').exclude(tags__name='t4')
 Expected:
 []
 Got:
 []
 --
 File "C:\Python25\Lib\site-
 packages\tests\regressiontests\queries\models.py", line ?, in
 regressiontests.queries.models.__test__.API_TESTS
 Failed example:
 Item.objects.exclude(tags__name='t1',
 name='one').order_by('name').distinct()
 Expected:
 [, , ]
 Got:
 [, ]
 --
 File "C:\Python25\Lib\site-
 packages\tests\regressiontests\queries\models.py", line ?, in
 regressiontests.queries.models.__test__.API_TESTS
 Failed example:
 Item.objects.filter(name__in=['three',
 'four']).exclude(tags__name='t1').order_by('name')
 Expected:
 [, ]
 Got:
 []
 --
 File "C:\Python25\Lib\site-
 packages\tests\regressiontests\queries\models.py", line ?, in
 regressiontests.queries.models.__test__.API_TESTS
 Failed example:
 Item.objects.exclude(~Q(tags__name='t1', name='one'))
 Expected:
 []
 Got:
 [, ]
 --
 File "C:\Python25\Lib\site-
 packages\tests\regressiontests\queries\models.py", line ?, in
 regressiontests.queries.models.__test__.API_TESTS
 Failed example:
 Item.objects.filter(~Q(tags__name='t1', name='one'), name='two')
 Expected:
 []
 Got:
 [, ]
 --
 File "C:\Python25\Lib\site-
 packages\tests\regressiontests\queries\models.py", line ?, in
 regressiontests.queries.models.__test__.API_TESTS
 Failed example:
 Item.objects.exclude(~Q(tags__name='t1', name='one'), name='two')
 Expected:
 [, , ]
 Got:
 [, , ]

 }}}

-- 
Ticket URL: 
Django 
The Web framework for perfectionists with deadlines.
--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"Django updates" group.
To post to this group, send email to django-updates@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/django-updates?hl=en
-~--~~~~--~~--~--~---



Re: [Django] #9342: query optimization issue

2008-10-10 Thread Django
#9342: query optimization issue
--+-
  Reporter:  [EMAIL PROTECTED]  | Owner:  nobody
Status:  new  | Milestone:
 Component:  Uncategorized|   Version:  1.0   
Resolution:   |  Keywords:
 Stage:  Unreviewed   | Has_patch:  0 
Needs_docs:  0|   Needs_tests:  0 
Needs_better_patch:  0|  
--+-
Changes (by anonymous):

  * needs_better_patch:  => 0
  * needs_tests:  => 0
  * needs_docs:  => 0

Comment:

 After looking at this more, I think the fault possibly lays in
 Query::get_from_clause (...\django\db\models\sql\query.py).

 The code iterates over an array of joins for known alias fields; checking
 whether its internal counter indicates that the alias is referenced or
 not. If the number of references is zero, it just skips that join. The
 problem is that that alias is possibly referenced in another join; but
 that is not being checked.

 I added a check and if such a case exists, the internal reference counter
 is increased for such an alias. I'm not sure if this is the  correct patch
 but it is a start:
 {{{
 def get_from_clause(self):
 """
 Returns a list of strings that are joined together to go after the
 "FROM" part of the query, as well as a list any extra parameters
 that
 need to be included. Sub-classes, can override this to create a
 from-clause via a "select", for example (e.g. CountQuery).

 This should only be called after any SQL construction methods that
 might change the tables we need. This means the select columns and
 ordering must be done first.
 """
 result = []
 qn = self.quote_name_unless_alias
 qn2 = self.connection.ops.quote_name
 first = True
 for alias in self.tables:
 if not self.alias_refcount[alias]:
 #START HACK: No sure if it is a problem with
 self.alias_refcount somewhere else or this code.
 #  Need to iterate over self.alias_map to ensure that
 another alias map does not
 #  reference this alias.
 for alias_check in self.tables:
 if alias != alias_check:
 try:
 name, rhs, join_type, lhs, lhs_col, col,
 nullable = self.alias_map[alias_check]
 except KeyError:
 continue
 if alias in [rhs,lhs]:
 self.ref_alias(alias)
 #END HACK

 if not self.alias_refcount[alias]:
 continue
 try:
 name, alias, join_type, lhs, lhs_col, col, nullable =
 self.alias_map[alias]
 except KeyError:
 # Extra tables can end up in self.tables, but not in the
 # alias_map if they aren't in a join. That's OK. We skip
 them.
 continue
 alias_str = (alias != name and ' %s' % alias or '')
 if join_type and not first:
 result.append('%s %s%s ON (%s.%s = %s.%s)'
 % (join_type, qn(name), alias_str, qn(lhs),
qn2(lhs_col), qn(alias), qn2(col)))
 else:
 connector = not first and ', ' or ''
 result.append('%s%s%s' % (connector, qn(name), alias_str))
 first = False
 for t in self.extra_tables:
 alias, unused = self.table_alias(t)
 # Only add the alias if it's not already present (the
 table_alias()
 # calls increments the refcount, so an alias refcount of one
 means
 # this is the only reference.
 if alias not in self.alias_map or self.alias_refcount[alias]
 == 1:
 connector = not first and ', ' or ''
 result.append('%s%s' % (connector, qn(alias)))
 first = False
 return result, []

 }}}

 One thing to note about this initial patch attempt; the SQL is *correct*
 but joins the missing table twice, once in the where clause and once in
 the sub-select. Here is what the SQL would look like:
 {{{
 SELECT "tableB"."key_id"
 FROM "tableB" INNER JOIN "tableA" ON ("tableB"."key_id" =
 "tableA"."key_id")
  WHERE NOT ("tableB"."key_id" IN (SELECT U2."key_id" FROM "tableB" U0
   INNER JOIN
 "tableA" U1 ON (U1."key_id" = U0."key_id")
   INNER JOIN
 "tableC" U2 ON (U1."key_id" = U2."key_id") WHERE UPPER(U2."data"::text) =
 UPPER('test') ))

 }}}

-- 
Ticket URL: 
D

[Django] #9342: query optimization issue

2008-10-10 Thread Django
#9342: query optimization issue
-+--
 Reporter:  [EMAIL PROTECTED]  |   Owner:  nobody
   Status:  new  |   Milestone:
Component:  Uncategorized| Version:  1.0   
 Keywords:   |   Stage:  Unreviewed
Has_patch:  0|  
-+--
 The generated SQL for the following example is incorrect. It appears that
 an optimization to reduce the number of table joins is at fault.

 Example:
 {{{
 class tableA(models.Model):
 key = models.AutoField(primary_key=True)

 class tableB(models.Model):
 key = models.ForeignKey(tableA)

 class tableC(models.Model):
 key = models.OneToOneField(tableA, primary_key=True)
 data = models.CharField(max_length=40, null=True)
 }}}
 The following filter will produce incorrect SQL:
 {{{
 qs = tableB.objects.exclude(key__tablec__data__iexact='test')
 qs.count()
 }}}
 The generated SQL will be something like:
 {{{
 SELECT "tableB"."key_id"
 FROM "tableB"
  WHERE NOT ("tableB"."key_id" IN (SELECT U2."key_id" FROM "tableB" U0
   INNER JOIN
 "tableC" U2 ON (U1."key_id" = U2."key_id") WHERE UPPER(U2."data"::text) =
 UPPER('test') ))
 }}}

 In the sub-select, tableA is being referenced as alias "U1" but is not
 defined. The non-optimized SQL would have been:

 {{{
 SELECT "tableB"."key_id"
 FROM "tableB"
  WHERE NOT ("tableB"."key_id" IN (SELECT U2."key_id" FROM "tableB" U0
   INNER JOIN
 "tableA" U1 ON (U1."key_id" = U0."key_id")
   INNER JOIN
 "tableC" U2 ON (U1."key_id" = U2."key_id") WHERE UPPER(U2."data"::text) =
 UPPER('test') ))
 }}}

 Tracing through the source, it seems that code in functions:
 {{{
 Query::add_filter(self, filter_expr, connector=AND, negate=False,
 trim=False,
 can_reuse=None, process_extras=True)

 and

 Query::setup_joins(self, names, opts, alias, dupe_multis, allow_many=True,
 allow_explicit_fk=False, can_reuse=None, negate=False,
 process_extras=True)
 }}}

 possibly are not agreeing.

-- 
Ticket URL: 
Django 
The Web framework for perfectionists with deadlines.
--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"Django updates" group.
To post to this group, send email to django-updates@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/django-updates?hl=en
-~--~~~~--~~--~--~---