We have room for some improvement here. It should be possible to add
subqueries to the current query. This would be especially useful for
adding aggregates.

For aggregates we need this feature because the current API doesn't
work for anything but simple cases. For example
Book.objects.annotate(authors_cnt=Count('authors'),
suppliers_cnt=Count('suppliers')) will currently produce two joins,
one to authors and one to suppliers. Thus, if a book has three authors
and two suppliers, it will have both counts as six. Solving this
problem (taking in account .filter() calls, too) is extremely complex.

If the user could instead tell the ORM to add the counts in
subqueries, the ORM would have much easier job of generating a correct
query.

For the "add single aggregate case" it might be sufficient to add a
strategy flag to aggregates. So,
Book.objects.annotate(authors_cnt=Count('authors',
strategy=models.SUBQUERY)) is a possible API.

If you want multiple aggregates in a single subquery we likely need
something more explicit. Maybe something like
    Book.objects.add_related(
        authors_subq=models.AggregateSubquery(
            Author.objects.annotate(cnt=Count('pk'), avg_age=Avg('age')))
This would add authors_subq__cnt and authors_subq__avg_age
automatically to the outer query as annotations.

The add_related() method would come handy in other cases, too.
Consider joining a translation for currently active language:
    Book.objects.add_related(
        current_translation=models.ModelRelation('translations',
only=Q(lang=get_current_language()))

You could also attach raw SQL in the same way:
    Room.objects.add_related(
        current_reservation=models.RawSQLRelation(
            "select * from reservations where start_date <=
now()::date and end_date >= now()::date",
            on_condition=Q(pk=F('current_reservation__room_id')),
            model=Reservation)
    ))

We could later on add support for more esoteric concepts like lateral
joins or materializing the subquery into a temporary table.

I believe the current API for aggregates and multivalued relations has
come to a dead end. Simpler queries work well enough with the current
API. For more complex queries the API isn't easy to understand, and it
is extremely hard to translate complex ORM queries to SQL. Consider
    Book.objects.filter(
        Q(authors__height__gte=180), ~Q(authors__age__gte=50)
    ).annotate(author_cnt=Count('authors'))

This query is *extremely* hard to translate to SQL. Just the first
filter case is something nasty, the Q-object doesn't mean "keep all
rows that has author's age less than 50 and height greater than 180".
It means "a book must have at least one author taller than 180. *All*
of such authors for the book must have age less than 50 or the age
must be null". Still, we need to generate a count over such authors
for the book. And this example doesn't even reveal the full complexity
of the problem.

In summary, I vote +1 for the ability to compose querysets. The first
issue to tackle is to come up with a good extensible API for this.

 - Anssi

On Fri, Jan 22, 2016 at 6:44 PM, charettes <charett...@gmail.com> wrote:
> Hi Patryk,
>
>> Currently we have Prefetch objects that kind of solve this problem for M2M
>> relations but perhaps it would be nice to also be able to use QuerySets in
>> select_related() or even in filter(). I don't think Prefetch objects are
>> best
>> suited for that and I personally find having to instantiate them
>> explicitly
>> kind of ugly.
>
> From what I understand you're not a fan of Prefetch objects but I think most
> of your problems can be solved by using them and custom managers as the
> formers
> also support foreign keys[1].
>
> class AuthorDisplayManager(models.Manager):
>     def get_queryset(self):
>         queryset = super().get_queryset()
>         return queryset.filter(
>             visible=True,
>         ).annotate(
>             num_books=Count('books'),
>             num_awards=Count('awards'),
>         ).prefetch_related(
>             'foo',
>         )
>
> class Author(models.Model):
>     ...
>     for_display = AuthorDisplayManager()
>
>
> class BookDisplayManager(models.Manager):
>     def get_queryset(self):
>         queryset = super().get_queryset()
>         return queryset.filter(
>             visible=True,
>         ).prefetch_related(
>             Prefetch(
>                 'author',
>                 Author.for_display.all(),
>             )
>         )
>
>
> class Book(models.Model):
>     ...
>     for_display = BookDisplayManager()
>
> Simon
>
> [1] https://code.djangoproject.com/ticket/17003
>
> Le vendredi 22 janvier 2016 05:25:24 UTC-5, Patryk Zawadzki a écrit :
>>
>> Hi,
>>
>> Currently the way QuerySets work makes it hard to reuse code. Consider an
>> example where I have template tags to render an author and a book (that in
>> turn displays author). I don't want my template code to execute any
>> additional queries per row displayed so I am forced to prefetch everything
>> upfront. Now if I want to display a list of authors and a list of books my
>> code is vastly different:
>>
>> ```
>> authors = Author.objects.filter(visible=True)
>> authors_for_display = authors.annotate(num_books=Count('books'),
>> num_awards=Count('awards')).prefetch_related('foo')
>> ```
>>
>> vs.
>>
>> ```
>> books = Book.objects.filter(visible=True, author__visible=True)
>> books_for_display =
>> books.annotate(author_num_books=Count('author__books'),
>> author_num_awards=Count('author__awards')).select_related('author').prefetch_related('author__foo')
>> ```
>>
>> Both cases need to do the exact same filtering, annotations and prefetches
>> for the Author object but notation is so different that no code reuse can
>> happen ("baz" vs. "bar__baz" vs. "foo__bar__baz"). The behaviour of
>> annotations is also different as now they end up on the Book object which
>> forces me to write some glue code. If my structure gains another level of
>> depth I will be forced to go to even greater extents.
>>
>> Now consider that I have many other objects that reference the Author and
>> thus I have many places in code that implement the same logic but each time
>> with a slight syntactic twist. What if a front-end engineer asks me to
>> change the information fetched about the Author? Or perhaps I discover that
>> half of the query time is spent fetching a megabyte of Author's extended
>> biography in HTML that the front-end never uses outside of that Author's
>> page. I am forced to hunt down and update all of these.
>>
>> Currently we have Prefetch objects that kind of solve this problem for M2M
>> relations but perhaps it would be nice to also be able to use QuerySets in
>> select_related() or even in filter(). I don't think Prefetch objects are
>> best suited for that and I personally find having to instantiate them
>> explicitly kind of ugly. To me having another SelectRelated object is
>> probably a no-go.
>>
>> Ideas?
>
> --
> You received this message because you are subscribed to the Google Groups
> "Django developers (Contributions to Django itself)" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to django-developers+unsubscr...@googlegroups.com.
> To post to this group, send email to django-developers@googlegroups.com.
> Visit this group at https://groups.google.com/group/django-developers.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/django-developers/ebad6483-f90c-41f9-8710-65b328320cdd%40googlegroups.com.
>
> For more options, visit https://groups.google.com/d/optout.

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/CALMtK1Hfu4JEuNBhB_pi59TFqfQmJ0vLyC13jxmiRopvcB27dA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Reply via email to