Re: On aggregates

[EMAIL PROTECTED] Sat, 24 May 2008 14:01:02 -0700

All of this looks pretty good.  The 2 things I have to say are, a) I
think the F() way of dealing with that, in place of list comprehension
is the right way to go, there has been a lot of good work on that
patch, and I think it works well, conceptually, with this. b) Is there
any plan to make selecting on the primary key the default, so one can
do Count('authors') in place of Count('authors__id')?


Alex

On May 24, 8:18 am, "Nicolas E. Lara G." <[EMAIL PROTECTED]>
wrote:
> Hi Russell,
> I have been adding your test cases to the test suit and ran into
> troubles with some of them so I would like to know a bit more on what
> you were expecting from the queries:
>
> > * Does Author X have any friends?
> > (aggregate on a related object)
>
> Author.objects.filter(pk=1).aggregate(Count('friends__id'))
>
>
>
> > * Give me a list of all Books with more than 2 authors
> > (Query over the count of objects in a m2m relationship)
>
> >>>Book.objects.all().annotate(num_authors=Count('authors__name')).filter(num_authors__gt=2)
> > * Give me a list of all Authors that have no friends
> > (Query that counts when rows that don't exist in the m2m table)
>
> >>> Author.objects.all().annotate(num_friends=Count('friends__id', 
> >>> distinct=True), allow_nulls=True).filter(num_friends=0)
> > * Give me a list of all publishers that have published more than 10 books
> > (Query over the reverse direction of a FK)
>
> >>> Publisher.objects.all().annotate(num_books=Count('book__id')).filter(num_books__gt=10)
> > * Give me a list of all publishers that have published more than 10
> > books this year
> > (Query over the reverse direction of a FK, with a filter on the related 
> > object)
>
> >>> Publisher.objects.all().annotate(num_books=Count('book__id')).filter(num_books__gt=10,
> >>>  book__price__lt=50)
>
> So far so good. Though filter() is not yet implemented on aggregates
> the previous examples would work as soon as fileter starts working on
> aggregates. The generated SQL and what is planned for flter seems to
> match very well the problems in question.
>
> Now the hard ones:
>
> > * Give me a list of all Books that were written by X and one other author.
> > (Query that checks a count of m2m objects, plus membership of the m2m set)
>
> We have two possibilities here that would make sense.
> The first intuitive way to express this would be:
>
> >>> Book.objects.all().filter(authors__name__contains='Norvig').annotate(num_authors=Count('authors__id')).filter(num_authors__gt=1)
>
> This has problems because, given the nature of the SQL that is being
> generated, when the filtering occurs (It is transformed into a WHERE
> clause) we are restricting the rows to only those that have the author
> X. This way, counting them wont refer to the actual number of authors
> for each book but the number of authors currently selected (just 1).
> To work around this, the way of doing the query is this:
>
> >>> Book.objects.all().annotate(num_authors=Count('authors__id')).filter(authors__name__contains='Norvig',
> >>>  num_authors__gt=1)
>
> which would do the filtering after the aggregate so the count is
> executed on all the books.
>
> Should we consider this something that should be solved with
> documentation or should I implement the logic to restructure the query
> if an aggregate is executed on a QuerySet that has been filtered?
>
> > * Give me a list of all Authors that have published a book with at
> > least one other person
> > (Filters over a count generated on a related object)
>
> In this one I cheated doing this:>>> [a for a in 
> Author.objects.all().annotate(num_coleagues=Count('book__authors__id'), 
> num_books=Count('book__id', distinct=True)) if a.num_coleagues - a.num_books 
> > 0]
>
> but I am sure the intention was to get it done by just using the
> aggregation syntax. Do you have any suggestion on how to express this
> (with either existing or non-existant syntax)?
>
> An option for this would probably be the previously suggested F-
> syntax. So we could do:
>
> >>> Author.objects.all().annotate(num_coleagues=Count('book__authors__id'), 
> >>> num_books=Count('book__id', 
> >>> distinct=True)).filter(num_coleagues__gt=F(num_books))
> > * Give me the average price of all Books that were written by X and
> > one other author.
> > (Aggregate over objects discovered using membership of the m2m set)
>
> Also for this problem, how would you express it in aggregation syntax?
>
> An option that looks very good from the syntax point of view would be:
>
> >>> Book.objects.all().annotate(num_authors=Count('authors__id')).filter(num_authors__gt=1).aggregate(Avg('price'))
>
> but for this to work we would need to introduce the notion of
> subqueries which, AFAIK, have only been dealt with in extra().
>
> I believe it would be quite clean to introduce subqueries in common
> cases like this. Especially when dealing with aggregate since
> aggregate is a terminal clause.
>
> Cheers,
>
> Nicolas Lara
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en
-~----------~----~----~----~------~----~------~--~---

Re: On aggregates

Reply via email to