Re: Strange behavior with defer and select_related

2014-08-14 Thread Zach Snow
Upon reviewing the code in `django/db/models/sql/compiler.py` it seems 
clear that the list of deferred fields is per-table instead of per-table 
alias  -- see `deferred_column_names`, 
https://github.com/django/django/blob/f0b358880a6825d667c037757caac470bc526a1f/django/db/models/sql/compiler.py#L683
 
which for the above example will generate something like the following:

{ 'defer_company': set([u'creator_id', 'name', u'id']) }

Whereas I would expect:

{ 'T3': set([u'creator_id', 'name', u'id']) }

(As the `only_load` dictionary created by this method only lists tables for 
which the entire set of columns should not be loaded, we wouldn't see an 
entry for 'defer_company' at all). It seems that impossible to generate the 
correct SQL without tracking deferred fields per-table alias, but it wasn't 
clear to me that the information necessary to determine the table alias to 
which each deferred field "path" (of the form 
'field1__field2__...__fieldN') should be applied is available in the 
relevant place in the code.

-Zach

On Thursday, August 14, 2014 11:59:08 AM UTC-5, Zach Snow wrote:
>
> Hi there. I'm running into some strange behavior when using `defer` with 
> `select_related` and I wanted to find out if I'm simply misunderstanding 
> what should be going on, or if indeed I have stumbled onto a bug.
>
> Summary: Using `defer` on a table that has been re-included in a query via 
> `select_related` affects the fields returned by "original" table.
>
> Consider in the following example that each `Company` might have many 
> users, and each company might have been created by a user owned by the 
> company itself or a different company. Now we'd like to query the database 
> for a list of companies, along with enough information about the `creator` 
> to display, say, their name and company name.
>
> (For the sake of the example the number of fields is very small, but in 
> practice I ran into this bug because doing multiple `select_related`s on a 
> wide table actually caused me to hit the 1664 column limit in Postgres).
>
> # This is in an app "defer".
> from django.db import models
>
> class Company(models.Model):
> creator = models.ForeignKey('defer.User', null=True)
> name = models.CharField(max_length=32)
> description = models.TextField()
>
> class User(models.Model):
> company = models.ForeignKey(Company)
> username = models.CharField(max_length=32)
> name = models.CharField(max_length=32)
>
> Now we query `Company` in such a way that it is `select_related` again 
> into the query, and defer fields on the instance of the table that is 
> `select_related`, but not on the "main" instance of the table.
>
> >>> cs = Company.objects.all().select_related(
> 'creator', 
> 'creator__company').defer('creator__company__description')
> >>>cs.first().description # Causes a query.
>
> Printing the SQL shows what's going wrong:
>
> >>> print cs.query
> SELECT "defer_company"."id", "defer_company"."creator_id",
> "defer_company"."name", "defer_user"."id",
> "defer_user"."company_id", "defer_user"."username", 
> "defer_user"."name",
> T3."id", T3."creator_id", T3."name"
> FROM "defer_company"
> LEFT OUTER JOIN "defer_user" ON ( "defer_company"."creator_id" = 
> "defer_user"."id" )
> LEFT OUTER JOIN "defer_company" T3 ON ( "defer_user"."company_id" = 
> T3."id" )
>
> Note that we are *not* retrieving the field `defer_company.description`. 
> It seems that the `defer` called on the instance of the `defer_company` 
> table selected through the `creator` field (`T3` above) is being applied to 
> the "first" instance of the table as well.
>
> Updating the SQL to include that column (in the way I expected `defer` and 
> `select_related` to behave) succeeds in returning the correct data -- in 
> sqlite3, at least.
>
> I encountered this behavior in version 1.6.5 and also reproduced it in 
> 1.7c2.
>
> Thanks!
>  
> -Zach
>

-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-users+unsubscr...@googlegroups.com.
To post to this group, send email to django-users@googlegroups.com.
Visit this group at http://groups.google.com/group/django-users.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-users/d66f5898-6dd5-422f-8d7f-782183625b39%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Strange behavior with defer and select_related

2014-08-14 Thread Zach Snow
Hi there. I'm running into some strange behavior when using `defer` with
`select_related` and I wanted to find out if I'm simply misunderstanding
what should be going on, or if indeed I have stumbled onto a bug.

Summary: Using `defer` on a table that has been re-included in a query via
`select_related` affects the fields returned by "original" table.

Consider in the following example that each `Company` might have many
users, and each company might have been created by a user owned by the
company itself or a different company. Now we'd like to query the database
for a list of companies, along with enough information about the `creator`
to display, say, their name and company name.

(For the sake of the example the number of fields is very small, but in
practice I ran into this bug because doing multiple `select_related`s on a
wide table actually caused me to hit the 1664 column limit in Postgres).

# This is in an app "defer".
from django.db import models

class Company(models.Model):
creator = models.ForeignKey('defer.User', null=True)
name = models.CharField(max_length=32)
description = models.TextField()

class User(models.Model):
company = models.ForeignKey(Company)
username = models.CharField(max_length=32)
name = models.CharField(max_length=32)

Now we query `Company` in such a way that it is `select_related` again into
the query, and defer fields on the instance of the table that is
`select_related`, but not on the "main" instance of the table.

>>> cs = Company.objects.all().select_related(
'creator',
'creator__company').defer('creator__company__description')
>>>cs.first().description # Causes a query.

Printing the SQL shows what's going wrong:

>>> print cs.query
SELECT "defer_company"."id", "defer_company"."creator_id",
"defer_company"."name", "defer_user"."id",
"defer_user"."company_id", "defer_user"."username",
"defer_user"."name",
T3."id", T3."creator_id", T3."name"
FROM "defer_company"
LEFT OUTER JOIN "defer_user" ON ( "defer_company"."creator_id" =
"defer_user"."id" )
LEFT OUTER JOIN "defer_company" T3 ON ( "defer_user"."company_id" =
T3."id" )

Note that we are *not* retrieving the field `defer_company.description`. It
seems that the `defer` called on the instance of the `defer_company` table
selected through the `creator` field (`T3` above) is being applied to the
"first" instance of the table as well.

Updating the SQL to include that column (in the way I expected `defer` and
`select_related` to behave) succeeds in returning the correct data -- in
sqlite3, at least.

I encountered this behavior in version 1.6.5 and also reproduced it in
1.7c2.

Thanks!

-Zach

-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-users+unsubscr...@googlegroups.com.
To post to this group, send email to django-users@googlegroups.com.
Visit this group at http://groups.google.com/group/django-users.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-users/CAE1SZhCmoN7ThUb295Zm3x_DZiayN2Wnn1GJQMTsvoG5GwHxEA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.