On 11/30/06, Rob Hudson <[EMAIL PROTECTED]> wrote:
>
>
> I think for those who need aggregate data these would cover a lot of
> ground.  I'd be willing to work on a patch if this is considered
> generally useful and we can work out what the API should look like.
>
>
1 - I agree on the need for easier access to aggregates. Truth be told,
aggregates are the reason I got involved with Django in the first place.
However, other priorities have arisen in the meantime, so I haven't got
around to doing anything about them.

2 - Keep in mind that Malcolm has been working on refactoring
django.db.models.query. Until this refactor is committed, we are trying to
minimize the number of large changes to query.py.

3 - Also keep in mind that one of the goals of the SQLAlchemy branch is to
make complex aggregates (such as those requiring group_by and having) easier
to represent. That said, there doesn't appear to have been a lot of progress
on this branch (at least, not in public commits).

4 - If you search the archives (user and developer), you will find several
discussions on aggregate functions. group_by() and having() (or
pre-magic-removal analogs thereof) have been rejected previously on the
grounds that the Django ORM is not intended to be 'SQL with a different
syntax'. Any proposal for group_by/having will have to be logical from a
Django ORM point of view, and not presuppose/require knowledge of how SQL
formulates queries.

5 - The aggregates you suggest are the quick and obvious method for getting
aggregates into the query language. However, here are some issues to
consider:

Article.objects.count() returns an integer that is the count of all Article
objects. This makes sense, and syntactically parses the same way that it
operates.

However, what does Article.objects.max('pagecount') return? The integer that
is the largest page count, or the Article that has the largest pagecount?

If it is the former, how do you use the maximum value to get the Article
with that maximum value in a single query?
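
To make the two readings concrete, here is a rough sketch that uses Python's
sqlite3 module rather than the Django ORM. The table, column names and data
are made up for illustration; it only shows what each reading would have to
do underneath, not what a Django API should look like.

```python
import sqlite3

# Hypothetical schema standing in for an Article model with a pagecount field.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT, pagecount INTEGER)"
)
conn.executemany(
    "INSERT INTO article (title, pagecount) VALUES (?, ?)",
    [("Intro", 4), ("Deep dive", 12), ("Appendix", 7)],
)

# Reading 1: max('pagecount') returns the largest value itself.
largest = conn.execute("SELECT MAX(pagecount) FROM article").fetchone()[0]

# Reading 2: max('pagecount') returns the row that holds the largest value.
# A subquery keeps this to a single query instead of two round trips.
row = conn.execute(
    "SELECT id, title, pagecount FROM article "
    "WHERE pagecount = (SELECT MAX(pagecount) FROM article)"
).fetchone()
```

Under the first reading, getting from the value to the Article costs a second
query unless the API has some way to express the subquery.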

If it is the latter, does it return a single object, or a queryset that
evaluates to an object?

What happens if there are two objects with the same maximum pagecount?
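
The tie case is easy to reproduce. A small sketch, again with sqlite3 and a
made-up table rather than the Django ORM: once two rows share the maximum,
"the Article with the largest pagecount" is a set of rows, not a single
object.

```python
import sqlite3

# Hypothetical data: two articles share the maximum pagecount.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT, pagecount INTEGER)"
)
conn.executemany(
    "INSERT INTO article (title, pagecount) VALUES (?, ?)",
    [("Intro", 12), ("Deep dive", 12), ("Appendix", 7)],
)

# Every row matching the maximum comes back, so the result has two entries.
tied = conn.execute(
    "SELECT title FROM article "
    "WHERE pagecount = (SELECT MAX(pagecount) FROM article) "
    "ORDER BY id"
).fetchall()
```

An API that promises a single object has to either pick one of these
arbitrarily or raise an error; one that returns a queryset does not.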

How do you get multiple aggregates for a value in a single query (efficiency
matters)?
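
For reference, this is what "multiple aggregates in a single query" looks
like at the database level. It is a sketch with sqlite3 and a made-up table,
not a proposal for Django syntax: one SELECT can carry several aggregates,
whereas separate min(), max() and avg() calls would each issue their own
query.

```python
import sqlite3

# Hypothetical table standing in for an Article model.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT, pagecount INTEGER)"
)
conn.executemany(
    "INSERT INTO article (title, pagecount) VALUES (?, ?)",
    [("Intro", 4), ("Deep dive", 12), ("Appendix", 8)],
)

# Four aggregates, one query, one row of results.
smallest, largest, average, total = conn.execute(
    "SELECT MIN(pagecount), MAX(pagecount), AVG(pagecount), COUNT(*) FROM article"
).fetchone()
```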

How does the simple case fit into the big picture? Ideally, the simple min()
would be a degenerate case of min() with group_by and having. Prove to me
that adding min(), max(), etc. isn't going to become a wart that we have to
support into the future when 'aggregate clauses 3000' is added to Django's
query syntax.
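
One way to picture the "degenerate case" idea, in plain Python with no
database involved. The grouped_min() helper, its parameters and the author
field are all hypothetical and exist only for this illustration: when no
grouping is given, every row lands in one group, so the plain min() falls out
of the grouped version rather than being a separate feature.

```python
from collections import defaultdict


def grouped_min(rows, field, group_by=None, having=None):
    """Hypothetical helper: minimum of `field` for each group of rows.

    With group_by=None every row is placed in a single group keyed by None,
    which makes the simple min() the one-group case of the grouped form.
    """
    groups = defaultdict(list)
    for row in rows:
        key = row[group_by] if group_by is not None else None
        groups[key].append(row[field])
    result = {key: min(values) for key, values in groups.items()}
    if having is not None:
        # Filter on the aggregated value, as a HAVING clause would.
        result = {key: value for key, value in result.items() if having(value)}
    return result


articles = [
    {"author": "ann", "pagecount": 4},
    {"author": "ann", "pagecount": 12},
    {"author": "bob", "pagecount": 7},
]

overall = grouped_min(articles, "pagecount")
per_author = grouped_min(articles, "pagecount", group_by="author")
filtered = grouped_min(
    articles, "pagecount", group_by="author", having=lambda value: value > 5
)
```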

So, as you can see - it's not as simple as 'just add a min() where count()
is already'.

Like I said at the beginning, I'm keen to see aggregates implemented - I
just want to see them done right. There are many things that _could_ be done
to implement aggregates; whether they are the _right_ thing to do is another
matter entirely. I'm open to any discussion on this issue, and would be
happy to help shepherd any patches resulting from the discussion into the
trunk.

Yours,
Russ Magee %-)


--~--~---------~--~----~------------~-------~--~----~
 You received this message because you are subscribed to the Google Groups 
"Django users" group.
