[sqlalchemy] Re: select where field=max(field)

King Simon-NFHD78 Tue, 11 Nov 2008 02:31:43 -0800

> -----Original Message-----
> From: sqlalchemy@googlegroups.com 
> [mailto:[EMAIL PROTECTED] On Behalf Of John Hunter
> Sent: 11 November 2008 01:54
> To: sqlalchemy@googlegroups.com
> Subject: [sqlalchemy] Re: select where field=max(field)
> 
> 
> On Mon, Nov 10, 2008 at 11:10 AM, Michael Bayer
> <[EMAIL PROTECTED]> wrote:
> 
> > you need an extra tuple on the join, query.join((q1, s.s==q1.c.s))
> 
> This gets past the syntax error, but does not produce the right
> results.  I had to take some time off today to work on other problems,
> but am now returning to this query.  To better take advantage of all
> of your generous time :-) I wrote a free-standing example that
> populates a test database.  The initial query Simon suggested works
> and produces the desired output -- the goal is to replicate this with
> a sqlalchemy query.  I also include the join we were working on.  Now
> that the syntax is correct, it runs, but gives the wrong output.
> 
> Since someone proposed a bowling example earlier I decided to run with
> that since it fits my problem quite well: instead of finding the
> number of symbols per strategy where the sum(pnl)<-150000, we are
> looking for the number of bowlers per league where the
> sum(frames)>200.  Example below
> 
>


Hi John,

I had to play around with this for a while, but I got there in the end.
The problem with the version I suggested was that the subquery produces
a number of rows for each league. When we join that subquery back to the
'league' table, you get a combinatorial explosion in the number of rows
returned. Joining is generally only useful if the field that you are
joining on is unique in at least one of the tables ('league' wasn't
unique in either).

The actual solution is much simpler than I expected. I didn't realise
that session.query doesn't actually need to start with a mapped class or
attribute at all - you can ask it directly for columns in the subquery.
Here's the result:

    q1 = (session.query(Frame.league, Frame.bowler, total_score)
          .group_by(Frame.league, Frame.bowler)
          .having(total_score<200)).subquery()

    q2 = (session.query(q1.c.league, func.count('*'))
          .group_by(q1.c.league))

By using 'q1.c.league' instead of 'Frame.league', SA doesn't try and put
the 'league' table in the outer query. The SQL looks like this:

SELECT anon_1.league AS anon_1_league,
       count(?) AS count_1
FROM (SELECT frame.league AS league,
      frame.bowler AS bowler,
      sum(frame.score) AS sum_1
      FROM frame
      GROUP BY frame.league, frame.bowler
      HAVING sum(frame.score) < ?) AS anon_1
GROUP BY anon_1.league

Which is pretty much the query we wanted, apart from the names. I hope
it works in your original example as well!

Cheers,

Simon

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~----------~----~----~----~------~----~------~--~---

bowlers.py
Description: bowlers.py

[sqlalchemy] Re: select where field=max(field)

Reply via email to