[sqlalchemy] Re: select where field=max(field)

2008-11-11 Thread John Hunter

On Tue, Nov 11, 2008 at 4:31 AM, King Simon-NFHD78
<[EMAIL PROTECTED]> wrote:

> Which is pretty much the query we wanted, apart from the names. I hope
> it works in your original example as well!


This worked great -- and I learned a bunch of useful sql and
sqlalchemy tricks along the way.  Many thanks for taking the time to
walk me through this one.

JDH

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: select where field=max(field)

2008-11-11 Thread King Simon-NFHD78
> -Original Message-
> From: sqlalchemy@googlegroups.com 
> [mailto:[EMAIL PROTECTED] On Behalf Of John Hunter
> Sent: 11 November 2008 01:54
> To: sqlalchemy@googlegroups.com
> Subject: [sqlalchemy] Re: select where field=max(field)
> 
> 
> On Mon, Nov 10, 2008 at 11:10 AM, Michael Bayer
> <[EMAIL PROTECTED]> wrote:
> 
> > you need an extra tuple on the join, query.join((q1, s.s==q1.c.s))
> 
> This gets past the syntax error, but does not produce the right
> results.  I had to take some time off today to work on other problems,
> but am now returning to this query.  To better take advantage of all
> of your generous time :-) I wrote a free-standing example that
> populates a test database.  The initial query Simon suggested works
> and produces the desired output -- the goal is to replicate this with
> a sqlalchemy query.  I also include the join we were working on.  Now
> that the syntax is correct, it runs, but gives the wrong output.
> 
> Since someone proposed a bowling example earlier I decided to run with
> that since it fits my problem quite well: instead of finding the
> number of symbols per strategy where the sum(pnl)<-15, we are
> looking for the number of bowlers per league where the
> sum(frames)>200.  Example below
> 
> 

Hi John,

I had to play around with this for a while, but I got there in the end.
The problem with the version I suggested was that the subquery produces
a number of rows for each league. When we join that subquery back to the
'league' table, you get a combinatorial explosion in the number of rows
returned. Joining is generally only useful if the field that you are
joining on is unique in at least one of the tables ('league' wasn't
unique in either).

The actual solution is much simpler than I expected. I didn't realise
that session.query doesn't actually need to start with a mapped class or
attribute at all - you can ask it directly for columns in the subquery.
Here's the result:

q1 = (session.query(Frame.league, Frame.bowler, total_score)
  .group_by(Frame.league, Frame.bowler)
  .having(total_score<200)).subquery()

q2 = (session.query(q1.c.league, func.count('*'))
  .group_by(q1.c.league))

By using 'q1.c.league' instead of 'Frame.league', SA doesn't try and put
the 'league' table in the outer query. The SQL looks like this:

SELECT anon_1.league AS anon_1_league,
   count(?) AS count_1
FROM (SELECT frame.league AS league,
  frame.bowler AS bowler,
  sum(frame.score) AS sum_1
  FROM frame
  GROUP BY frame.league, frame.bowler
  HAVING sum(frame.score) < ?) AS anon_1
GROUP BY anon_1.league

Which is pretty much the query we wanted, apart from the names. I hope
it works in your original example as well!

Cheers,

Simon

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



bowlers.py
Description: bowlers.py


[sqlalchemy] Re: select where field=max(field)

2008-11-10 Thread John Hunter

On Mon, Nov 10, 2008 at 11:10 AM, Michael Bayer
<[EMAIL PROTECTED]> wrote:

> you need an extra tuple on the join, query.join((q1, s.s==q1.c.s))

This gets past the syntax error, but does not produce the right
results.  I had to take some time off today to work on other problems,
but am now returning to this query.  To better take advantage of all
of your generous time :-) I wrote a free-standing example that
populates a test database.  The initial query Simon suggested works
and produces the desired output -- the goal is to replicate this with
a sqlalchemy query.  I also include the join we were working on.  Now
that the syntax is correct, it runs, but gives the wrong output.

Since someone proposed a bowling example earlier I decided to run with
that since it fits my problem quite well: instead of finding the
number of symbols per strategy where the sum(pnl)<-15, we are
looking for the number of bowlers per league where the
sum(frames)>200.  Example below


Thanks for any additional input!

import datetime
import sqlalchemy as sa
import sqlalchemy.orm as orm
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import func
Base = declarative_base()

class Frame(Base):
__tablename__ = 'frame'

league = sa.Column(sa.String(12), primary_key=True)
bowler = sa.Column(sa.String(12), primary_key=True)
frame = sa.Column(sa.Integer, primary_key=True)
score =  sa.Column(sa.Integer)


def __init__(self, league, bowler, frame, score):
self.league = league
self.bowler = bowler
self.frame = frame
self.score = score

def __repr__(self):
return "Game('%s', '%s', '%d', '%d')"%(self.league,
self.bowler, self.frame, self.score)

def populate(session):
'add some random bowling data to the dbase'
import random
for league in 'strikers', 'punters', 'plungers', 'scorers':
for i in range(random.randint(3,10)):
bowler = chr(i+65)
for frame in range(1, 11):
score = random.randint(0,30)
session.add(Frame(league, bowler, frame, score))

session.commit()

if __name__=='__main__':
engine = sa.create_engine("sqlite:///test.db")
Base.metadata.bind = engine

Session = orm.sessionmaker()
session = Session(bind=engine)
Base.metadata.drop_all()
Base.metadata.create_all()
populate(session)

# this is what we are trying to achieve
query = """\
SELECT league, COUNT(*)
FROM (SELECT frame.league AS league,
frame.bowler AS frame_bowler,
sum(frame.score) AS sum_1
 FROM frame
 GROUP BY frame.league, frame.bowler
 HAVING sum(frame.score) < 200) AS frames
GROUP BY league
"""
print 'desired', session.execute(query).fetchall()

# this is what Simon suggested

total_score = func.sum(Frame.score)
q1 = (session.query(Frame.league, Frame.bowler,
total_score).group_by(Frame.league,
Frame.bowler).having(total_score<200)).subquery()

q2 = (session.query(Frame.league, func.count('*')).join((q1,
Frame.league==q1.c.league)).group_by(Frame.league))

print q2
print q2.all()

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: select where field=max(field)

2008-11-10 Thread John Hunter

On Mon, Nov 10, 2008 at 10:05 AM, King Simon-NFHD78
<[EMAIL PROTECTED]> wrote:

> Actually, the section after that (Using Subqueries) probably does
> something very close to what you want. What's the result of these lines:
>
> q1 = (session.query(Snapshot.strategy, Snapshot.symbol, sum_pnl)
>  .group_by(Snapshot.strategy, Snapshot.symbol)
>  .having(sum_pnl<-15000)).subquery()
>
> q2 = (session.query(Snapshot.strategy, func.count('*'))
>  .join(q1, Snapshot.strategy = q1.c.strategy)
>  .groupby(Snapshot.strategy))


I assume you mean '=='? I get a foreign key error on the join::

sum_pnl = func.sum(Snapshot.pnl)

q1 = (session.query(Snapshot.strategy, Snapshot.symbol, sum_pnl)
 .group_by(Snapshot.strategy, Snapshot.symbol)
 .having(sum_pnl<-15000)).subquery()

q2 = (session.query(Snapshot.strategy, func.count('*'))
 .join(q1, Snapshot.strategy==q1.c.strategy)
 .groupby(Snapshot.strategy))

  File 
"/home/titan/johnh/dev/lib/python2.4/site-packages/SQLAlchemy-0.5.0rc3dev_r5205-py2.4.egg/sqlalchemy/sql/util.py",
line 109, in join_condition
raise exc.ArgumentError(
ArgumentError: Can't find any foreign key relationships between
'snapshot' and '{ANON 157186924 anon}'

Still playing with the aliases in the link you referred me to above
but haven't gotten there yet...

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: select where field=max(field)

2008-11-10 Thread Michael Bayer


On Nov 10, 2008, at 12:08 PM, John Hunter wrote:

>
> On Mon, Nov 10, 2008 at 10:05 AM, King Simon-NFHD78
> <[EMAIL PROTECTED]> wrote:
>
>> Actually, the section after that (Using Subqueries) probably does
>> something very close to what you want. What's the result of these  
>> lines:
>>
>> q1 = (session.query(Snapshot.strategy, Snapshot.symbol, sum_pnl)
>> .group_by(Snapshot.strategy, Snapshot.symbol)
>> .having(sum_pnl<-15000)).subquery()
>>
>> q2 = (session.query(Snapshot.strategy, func.count('*'))
>> .join(q1, Snapshot.strategy = q1.c.strategy)
>> .groupby(Snapshot.strategy))
>
>
> I assume you mean '=='? I get a foreign key error on the join::
>
>sum_pnl = func.sum(Snapshot.pnl)
>
>q1 = (session.query(Snapshot.strategy, Snapshot.symbol, sum_pnl)
> .group_by(Snapshot.strategy, Snapshot.symbol)
> .having(sum_pnl<-15000)).subquery()
>
>q2 = (session.query(Snapshot.strategy, func.count('*'))
> .join(q1, Snapshot.strategy==q1.c.strategy)
> .groupby(Snapshot.strategy))
>
>  File "/home/titan/johnh/dev/lib/python2.4/site-packages/ 
> SQLAlchemy-0.5.0rc3dev_r5205-py2.4.egg/sqlalchemy/sql/util.py",
> line 109, in join_condition
>raise exc.ArgumentError(
> ArgumentError: Can't find any foreign key relationships between
> 'snapshot' and '{ANON 157186924 anon}'
>
> Still playing with the aliases in the link you referred me to above
> but haven't gotten there yet...
>
you need an extra tuple on the join, query.join((q1, s.s==q1.c.s))

i think I might need to look into raising an error when the arguments  
aren't sent properly, im not sure why it doesn't do that already.



--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: select where field=max(field)

2008-11-10 Thread King Simon-NFHD78

> -Original Message-
> From: sqlalchemy@googlegroups.com 
> [mailto:[EMAIL PROTECTED] On Behalf Of John Hunter
> Sent: 10 November 2008 15:29
> To: sqlalchemy@googlegroups.com
> Subject: [sqlalchemy] Re: select where field=max(field)
> 
> 
> On Mon, Nov 10, 2008 at 8:53 AM, King Simon-NFHD78
> <[EMAIL PROTECTED]> wrote:
> 
> > It should be fairly easy to build that query with SA's underlying
> > expression language. I'm not certain how to do it through 
> session.query,
> > but I'm sure it's possible.
> 
> The snippet you posted does do what I want when inserted directly into
> mysql database.  I am getting close on the sqlalchemy incantation::
> 
> 
> In [106]: q = session.query(Snapshot.strategy, Snapshot.symbol,
> sum_pnl).group_by(Snapshot.strategy,
> Snapshot.symbol).having(sum_pnl<-15000)
> 
> In [107]: newq = session.query(Snapshot.strategy, func.count('*'))
> 
> In [108]: print newq.select_from (q.subquery
> ()).group_by(Snapshot.strategy )
> SELECT anon_1.strategy AS anon_1_strategy, count(%s) AS count_1
> FROM (SELECT snapshot.strategy AS strategy, snapshot.symbol AS
> symbol, sum(snapshot.pnl) AS sum_1
> FROM snapshot GROUP BY snapshot.strategy, snapshot.symbol
> HAVING sum(snapshot.pnl) < %s) AS anon_1 GROUP BY 
> snapshot.strategy
> 
> The main problem is that the last "GROUP BY snapshot.strategy" is not
> what I want, since I need to be doing "GROUP BY anon_1.strategy".  Is
> there some label magic I can employ to explicitly refer to the
> Snapshot.query from the original "q" rather than the one from "newq"?
> 

I haven't used session.query to do anything more than simple queries, so
I don't honestly know. Does this section of the documentation help at
all?

http://www.sqlalchemy.org/docs/05/ormtutorial.html#datamapping_joins_ali
ases

Actually, the section after that (Using Subqueries) probably does
something very close to what you want. What's the result of these lines:

q1 = (session.query(Snapshot.strategy, Snapshot.symbol, sum_pnl)
  .group_by(Snapshot.strategy, Snapshot.symbol)
  .having(sum_pnl<-15000)).subquery()

q2 = (session.query(Snapshot.strategy, func.count('*'))
  .join(q1, Snapshot.strategy = q1.c.strategy)
  .groupby(Snapshot.strategy))

print q2

Simon

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: select where field=max(field)

2008-11-10 Thread John Hunter

On Mon, Nov 10, 2008 at 8:53 AM, King Simon-NFHD78
<[EMAIL PROTECTED]> wrote:

> It should be fairly easy to build that query with SA's underlying
> expression language. I'm not certain how to do it through session.query,
> but I'm sure it's possible.

The snippet you posted does do what I want when inserted directly into
mysql database.  I am getting close on the sqlalchemy incantation::


In [106]: q = session.query(Snapshot.strategy, Snapshot.symbol,
sum_pnl).group_by(Snapshot.strategy,
Snapshot.symbol).having(sum_pnl<-15000)

In [107]: newq = session.query(Snapshot.strategy, func.count('*'))

In [108]: print newq.select_from (q.subquery
()).group_by(Snapshot.strategy )
SELECT anon_1.strategy AS anon_1_strategy, count(%s) AS count_1
FROM (SELECT snapshot.strategy AS strategy, snapshot.symbol AS
symbol, sum(snapshot.pnl) AS sum_1
FROM snapshot GROUP BY snapshot.strategy, snapshot.symbol
HAVING sum(snapshot.pnl) < %s) AS anon_1 GROUP BY snapshot.strategy

The main problem is that the last "GROUP BY snapshot.strategy" is not
what I want, since I need to be doing "GROUP BY anon_1.strategy".  Is
there some label magic I can employ to explicitly refer to the
Snapshot.query from the original "q" rather than the one from "newq"?

Thanks,
JDH

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: select where field=max(field)

2008-11-10 Thread King Simon-NFHD78

> -Original Message-
> From: sqlalchemy@googlegroups.com 
> [mailto:[EMAIL PROTECTED] On Behalf Of John Hunter
> Sent: 10 November 2008 14:07
> To: sqlalchemy@googlegroups.com
> Subject: [sqlalchemy] Re: select where field=max(field)
> 
> 
> On Mon, Nov 10, 2008 at 4:33 AM, King Simon-NFHD78
> <[EMAIL PROTECTED]> wrote:
> 
> > I'm no SQL expert, so please take this with a pinch of 
> salt, but as far
> > as I know, conditions in the 'WHERE' clause of an SQL statement are
> > applied BEFORE any grouping, so you can't use grouping 
> functions (such
> > as SUM, MAX etc) in the WHERE clause. Instead, you add a 'HAVING'
> > statement on the end to filter the rows AFTER the grouping.
> 
> Ahh, that helps a lot.
> 
> > BTW, I think the 'no grouping functions in WHERE clause' 
> rule is also
> > the reason why your MAX query didn't work. The  fix that 
> Mike gave you
> > turned that part of your query into a subquery that only 
> produced that
> > single value. This statement:
> 
> I see.  That is why the select worked in my first test case but not
> the second.  In the max case, there was only a single value to return.
>  In the sum case, there was a sum grouped by (strategy, symbol).
> Indeed, the having clause does what I want::
> 
>   In [39]: q = session.query(Snapshot.strategy, Snapshot.symbol,
>sum_pnl).group_by(Snapshot.strategy, 
> Snapshot.symbol).having(sum_pnl<-15000)
> 
>   In [40]: print q SELECT snapshot.strategy AS snapshot_strategy,
> snapshot.symbol AS
> snapshot_symbol, sum(snapshot.pnl) AS sum_1
> FROM snapshot GROUP BY snapshot.strategy, snapshot.symbol
> HAVING sum(snapshot.pnl) < %s
> 
> This produces a list of (strategy, symbol, sum(pnl)) as desired.
> 
> Now what I'm trying to figure out how to do is get a count over each
> strategy of the number of symbols where the sum(pnl)<-15000.  So I
> need to do one group_by over (strategy, symbol) to get the right sums,
> and then one group_by over strategy alone to get the symbol counts
> where the threshold criterion is met.
> 
> To be honest, I don't really know how to do this in pure SQL, so this
> is part sqlalachemy, part SQL question.  In the past, I have done
> naive sql queries and done the extra logic in python, so this time
> around I am trying to be a little more persistent in figuring out the
> sql way to do things.
> 

I think the query should look something like this:

SELECT strategy, COUNT(*)
FROM (SELECT snapshot.strategy AS strategy,
 snapshot.symbol AS snapshot_symbol,
 sum(snapshot.pnl) AS sum_1
  FROM snapshot
  GROUP BY snapshot.strategy, snapshot.symbol
  HAVING sum(snapshot.pnl) < 1500) AS strategies
GROUP BY strategy

Run that by hand on your database and see if you get the results you
expect. The nested query gets the list of strategies that match the
original criteria, and the outer query uses that to produce the counts.
(Note that there are other ways to get the same result. For example, you
could JOIN your snapshot table to the subquery, which might be useful if
you wanted other columns from it in the outer query)

It should be fairly easy to build that query with SA's underlying
expression language. I'm not certain how to do it through session.query,
but I'm sure it's possible.

Simon

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: select where field=max(field)

2008-11-10 Thread John Hunter

On Mon, Nov 10, 2008 at 4:33 AM, King Simon-NFHD78
<[EMAIL PROTECTED]> wrote:

> I'm no SQL expert, so please take this with a pinch of salt, but as far
> as I know, conditions in the 'WHERE' clause of an SQL statement are
> applied BEFORE any grouping, so you can't use grouping functions (such
> as SUM, MAX etc) in the WHERE clause. Instead, you add a 'HAVING'
> statement on the end to filter the rows AFTER the grouping.

Ahh, that helps a lot.

> BTW, I think the 'no grouping functions in WHERE clause' rule is also
> the reason why your MAX query didn't work. The  fix that Mike gave you
> turned that part of your query into a subquery that only produced that
> single value. This statement:

I see.  That is why the select worked in my first test case but not
the second.  In the max case, there was only a single value to return.
 In the sum case, there was a sum grouped by (strategy, symbol).
Indeed, the having clause does what I want::

  In [39]: q = session.query(Snapshot.strategy, Snapshot.symbol,
   sum_pnl).group_by(Snapshot.strategy, Snapshot.symbol).having(sum_pnl<-15000)

  In [40]: print q SELECT snapshot.strategy AS snapshot_strategy,
snapshot.symbol AS
snapshot_symbol, sum(snapshot.pnl) AS sum_1
FROM snapshot GROUP BY snapshot.strategy, snapshot.symbol
HAVING sum(snapshot.pnl) < %s

This produces a list of (strategy, symbol, sum(pnl)) as desired.

Now what I'm trying to figure out how to do is get a count over each
strategy of the number of symbols where the sum(pnl)<-15000.  So I
need to do one group_by over (strategy, symbol) to get the right sums,
and then one group_by over strategy alone to get the symbol counts
where the threshold criterion is met.

To be honest, I don't really know how to do this in pure SQL, so this
is part sqlalachemy, part SQL question.  In the past, I have done
naive sql queries and done the extra logic in python, so this time
around I am trying to be a little more persistent in figuring out the
sql way to do things.

Thanks for your explanation!

JDH

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: select where field=max(field)

2008-11-10 Thread King Simon-NFHD78

> -Original Message-
> From: sqlalchemy@googlegroups.com 
> [mailto:[EMAIL PROTECTED] On Behalf Of John Hunter
> Sent: 08 November 2008 05:09
> To: sqlalchemy@googlegroups.com
> Subject: [sqlalchemy] Re: select where field=max(field)
> 

[SNIP]

> Here is a query that lists the sum(pnl) for each symbol and strategy
> in my snapshots table
> 
>   session.query(Snapshot.strategy, Snapshot.symbol,
> func.sum(Snapshot.pnl)).group_by(Snapshot.strategy,
> Snapshot.symbol).all()
> 
> That works fine.  But what if I only want to list the rows where the
> sum(pnl)<-15000 ?  I tried a few things:
> 
>   session.query(Snapshot.strategy, Snapshot.symbol,
> func.sum(Snapshot.pnl)).group_by(Snapshot.strategy,
> Snapshot.symbol).filter(func.sum(Snapshot.pnl)<-15000).all()
> 
> but got the error below
> 
> raise exc.DBAPIError.instance(statement, parameters, e,
> connection_invalidated=is_disconnect)
> ProgrammingError: (ProgrammingError) (, 'Invalid use of group
> function') u'SELECT snapshot.strategy AS snapshot_strategy,
> snapshot.symbol AS snapshot_symbol, sum(snapshot.pnl) AS sum_1 \nFROM
> snapshot \nWHERE sum(snapshot.pnl) < %s GROUP BY snapshot.strategy,
> snapshot.symbol' [-15000]
> 

I'm no SQL expert, so please take this with a pinch of salt, but as far
as I know, conditions in the 'WHERE' clause of an SQL statement are
applied BEFORE any grouping, so you can't use grouping functions (such
as SUM, MAX etc) in the WHERE clause. Instead, you add a 'HAVING'
statement on the end to filter the rows AFTER the grouping.

ie. The SQL you want is something like:

SELECT snapshot.strategy AS snapshot_strategy,
   snapshot.symbol AS snapshot_symbol,
   sum(snapshot.pnl) AS sum_1
FROM snapshot
GROUP BY snapshot.strategy, snapshot.symbol
HAVING sum(snapshot.pnl) < 15000

In SA, I think you might be able to write that as (untested):

session.query(Snapshot.strategy,
  Snapshot.symbol,
  func.sum(Snapshot.pnl))
.group_by(Snapshot.strategy, Snapshot.symbol)
.having(func.sum(Snapshot.pnl)<-15000).all()


BTW, I think the 'no grouping functions in WHERE clause' rule is also
the reason why your MAX query didn't work. The  fix that Mike gave you
turned that part of your query into a subquery that only produced that
single value. This statement:

func.max(Snapshot.datetime).select()

...gets turned into something like 'SELECT max(datetime) FROM snapshot'.
This then gets embedded as a subquery into your larger query. It's
probably worth printing the SQL produced by each of the queries so that
you can see the difference.

Hope that helps,

Simon

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: select where field=max(field)

2008-11-07 Thread John Hunter

On Fri, Nov 7, 2008 at 3:57 PM, Michael Bayer <[EMAIL PROTECTED]> wrote:
>
> Theres a good tutorial on the topic of GROUP BY from a SQL
> perspective, here:
>
> http://weblogs.sqlteam.com/jeffs/jeffs/archive/2007/07/20/60261.aspx
>
> in this case you probably want
> query.filter(Snapshot.totalqty==func.max(Snapshot.totalqty).select()).

Indeed, that does work:

  
session.query(Snapshot.symbol).filter(Snapshot.datetime==func.max(Snapshot.datetime).select()).all()

and I read the tutorial you pointed me an certainly get the idea that
everything in the select list must be contained in the group by clause
or used in an aggregate function, but I am still unclear on what the
role of the "select" method is on the filter clause above, and how it
pertains to the tutorial message.

I'm now having a related problem that perhaps will shed light on my
confusions...

Here is a query that lists the sum(pnl) for each symbol and strategy
in my snapshots table

  session.query(Snapshot.strategy, Snapshot.symbol,
func.sum(Snapshot.pnl)).group_by(Snapshot.strategy,
Snapshot.symbol).all()

That works fine.  But what if I only want to list the rows where the
sum(pnl)<-15000 ?  I tried a few things:

  session.query(Snapshot.strategy, Snapshot.symbol,
func.sum(Snapshot.pnl)).group_by(Snapshot.strategy,
Snapshot.symbol).filter(func.sum(Snapshot.pnl)<-15000).all()

but got the error below

raise exc.DBAPIError.instance(statement, parameters, e,
connection_invalidated=is_disconnect)
ProgrammingError: (ProgrammingError) (, 'Invalid use of group
function') u'SELECT snapshot.strategy AS snapshot_strategy,
snapshot.symbol AS snapshot_symbol, sum(snapshot.pnl) AS sum_1 \nFROM
snapshot \nWHERE sum(snapshot.pnl) < %s GROUP BY snapshot.strategy,
snapshot.symbol' [-15000]

So I thought I'd try your "select" magic method, which as noted above
I don't really understand how to use.  But I get a new error when I
call the select method on the sum func, eg

In [47]: session.query(Snapshot.strategy, Snapshot.symbol,
func.sum(Snapshot.pnl)).group_by(Snapshot.strategy,
Snapshot.symbol).filter(-15000>func.sum(Snapshot.pnl).select()).all()

Traceback (most recent call last):
  File "", line 1, in ?
  File "", line 1, in 
  File 
"/home/titan/johnh/dev/lib/python2.4/site-packages/SQLAlchemy-0.5.0rc3dev_r5205-py2.4.egg/sqlalchemy/orm/query.py",
line 52, in generate
fn(self, *args[1:], **kw)
  File 
"/home/titan/johnh/dev/lib/python2.4/site-packages/SQLAlchemy-0.5.0rc3dev_r5205-py2.4.egg/sqlalchemy/orm/query.py",
line 624, in filter
raise sa_exc.ArgumentError("filter() argument must be of type
sqlalchemy.sql.ClauseElement or string")
ArgumentError: filter() argument must be of type
sqlalchemy.sql.ClauseElement or string


Sorry to be flailing around a bit here -- hopefully you can shed some
light for me!

JDH

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: select where field=max(field)

2008-11-07 Thread Bobby Impollonia

If you are okay with only getting one record in the case of ties you can do
session.query(Snapshot).order_by(Snapshot.totalqty.desc()).first()

On Fri, Nov 7, 2008 at 12:22 PM, John Hunter <[EMAIL PROTECTED]> wrote:
>
> I am having trouble writing a sqlalchemy query which selects all rows
> where a field equals the max for that field, eg
>
>
>  q = 
> session.query(Snapshot).filter(Snapshot.totalqty==func.max(Snapshot.totalqty))
>
> When I try and get the results of the query, I get the error below.
> How should I use func.max here?
>
> __version__ = 0.5.0rc3
>
> In [54]: len(q.all())
> 
> Traceback (most recent call last):
>  File "", line 1, in ?
>  File 
> "/home/titan/johnh/dev/lib/python2.4/site-packages/SQLAlchemy-0.5.0rc3dev_r5205-py2.4.egg/sqlalchemy/orm/query.py",
> line 994, in all
>return list(self)
>  File 
> "/home/titan/johnh/dev/lib/python2.4/site-packages/SQLAlchemy-0.5.0rc3dev_r5205-py2.4.egg/sqlalchemy/orm/query.py",
> line 1082, in __iter__
>return self._execute_and_instances(context)
>  File 
> "/home/titan/johnh/dev/lib/python2.4/site-packages/SQLAlchemy-0.5.0rc3dev_r5205-py2.4.egg/sqlalchemy/orm/query.py",
> line 1085, in _execute_and_instances
>result = self.session.execute(querycontext.statement,
> params=self._params, mapper=self._mapper_zero_or_none(),
> _state=self._refresh_state)
>  File 
> "/home/titan/johnh/dev/lib/python2.4/site-packages/SQLAlchemy-0.5.0rc3dev_r5205-py2.4.egg/sqlalchemy/orm/session.py",
> line 749, in execute
>return self.__connection(engine, close_with_result=True).execute(
>  File 
> "/home/titan/johnh/dev/lib/python2.4/site-packages/SQLAlchemy-0.5.0rc3dev_r5205-py2.4.egg/sqlalchemy/engine/base.py",
> line 826, in execute
>return Connection.executors[c](self, object, multiparams, params)
>  File 
> "/home/titan/johnh/dev/lib/python2.4/site-packages/SQLAlchemy-0.5.0rc3dev_r5205-py2.4.egg/sqlalchemy/engine/base.py",
> line 877, in execute_clauseelement
>return self._execute_compiled(elem.compile(dialect=self.dialect,
> column_keys=keys, inline=len(params) > 1), distilled_params=params)
>  File 
> "/home/titan/johnh/dev/lib/python2.4/site-packages/SQLAlchemy-0.5.0rc3dev_r5205-py2.4.egg/sqlalchemy/engine/base.py",
> line 889, in _execute_compiled
>self.__execute_raw(context)
>  File 
> "/home/titan/johnh/dev/lib/python2.4/site-packages/SQLAlchemy-0.5.0rc3dev_r5205-py2.4.egg/sqlalchemy/engine/base.py",
> line 898, in __execute_raw
>self._cursor_execute(context.cursor, context.statement,
> context.parameters[0], context=context)
>  File 
> "/home/titan/johnh/dev/lib/python2.4/site-packages/SQLAlchemy-0.5.0rc3dev_r5205-py2.4.egg/sqlalchemy/engine/base.py",
> line 942, in _cursor_execute
>self._handle_dbapi_exception(e, statement, parameters, cursor)
>  File 
> "/home/titan/johnh/dev/lib/python2.4/site-packages/SQLAlchemy-0.5.0rc3dev_r5205-py2.4.egg/sqlalchemy/engine/base.py",
> line 924, in _handle_dbapi_exception
>raise exc.DBAPIError.instance(statement, parameters, e,
> connection_invalidated=is_disconnect)
> ProgrammingError: (ProgrammingError) (, 'Invalid use of group
> function') u'SELECT snapshot.symbol AS snapshot_symbol,
> snapshot.strategy AS snapshot_strategy, snapshot.longshort AS
> snapshot_longshort, snapshot.datetime AS snapshot_datetime,
> snapshot.date AS snapshot_date, snapshot.year AS snapshot_year,
> snapshot.month AS snapshot_month, snapshot.qty AS snapshot_qty,
> snapshot.totalqty AS snapshot_totalqty, snapshot.price AS
> snapshot_price, snapshot.possize AS snapshot_possize, snapshot.pnl AS
> snapshot_pnl, snapshot.realized AS snapshot_realized,
> snapshot.pnl_hedged AS snapshot_pnl_hedged, snapshot.is_stop AS
> snapshot_is_stop, snapshot.stop_flag AS snapshot_stop_flag \nFROM
> snapshot \nWHERE snapshot.totalqty = max(snapshot.totalqty)' []
>
> >
>

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: select where field=max(field)

2008-11-07 Thread Michael Bayer

Theres a good tutorial on the topic of GROUP BY from a SQL  
perspective, here:

http://weblogs.sqlteam.com/jeffs/jeffs/archive/2007/07/20/60261.aspx

in this case you probably want  
query.filter(Snapshot.totalqty==func.max(Snapshot.totalqty).select()).



On Nov 7, 2008, at 3:22 PM, John Hunter wrote:

>
> I am having trouble writing a sqlalchemy query which selects all rows
> where a field equals the max for that field, eg
>
>
>  q =  
> session 
> .query 
> (Snapshot).filter(Snapshot.totalqty==func.max(Snapshot.totalqty))
>
> When I try and get the results of the query, I get the error below.
> How should I use func.max here?
>
> __version__ = 0.5.0rc3
>
> In [54]: len(q.all())
> 
> Traceback (most recent call last):
>  File "", line 1, in ?
>  File "/home/titan/johnh/dev/lib/python2.4/site-packages/ 
> SQLAlchemy-0.5.0rc3dev_r5205-py2.4.egg/sqlalchemy/orm/query.py",
> line 994, in all
>return list(self)
>  File "/home/titan/johnh/dev/lib/python2.4/site-packages/ 
> SQLAlchemy-0.5.0rc3dev_r5205-py2.4.egg/sqlalchemy/orm/query.py",
> line 1082, in __iter__
>return self._execute_and_instances(context)
>  File "/home/titan/johnh/dev/lib/python2.4/site-packages/ 
> SQLAlchemy-0.5.0rc3dev_r5205-py2.4.egg/sqlalchemy/orm/query.py",
> line 1085, in _execute_and_instances
>result = self.session.execute(querycontext.statement,
> params=self._params, mapper=self._mapper_zero_or_none(),
> _state=self._refresh_state)
>  File "/home/titan/johnh/dev/lib/python2.4/site-packages/ 
> SQLAlchemy-0.5.0rc3dev_r5205-py2.4.egg/sqlalchemy/orm/session.py",
> line 749, in execute
>return self.__connection(engine, close_with_result=True).execute(
>  File "/home/titan/johnh/dev/lib/python2.4/site-packages/ 
> SQLAlchemy-0.5.0rc3dev_r5205-py2.4.egg/sqlalchemy/engine/base.py",
> line 826, in execute
>return Connection.executors[c](self, object, multiparams, params)
>  File "/home/titan/johnh/dev/lib/python2.4/site-packages/ 
> SQLAlchemy-0.5.0rc3dev_r5205-py2.4.egg/sqlalchemy/engine/base.py",
> line 877, in execute_clauseelement
>return self._execute_compiled(elem.compile(dialect=self.dialect,
> column_keys=keys, inline=len(params) > 1), distilled_params=params)
>  File "/home/titan/johnh/dev/lib/python2.4/site-packages/ 
> SQLAlchemy-0.5.0rc3dev_r5205-py2.4.egg/sqlalchemy/engine/base.py",
> line 889, in _execute_compiled
>self.__execute_raw(context)
>  File "/home/titan/johnh/dev/lib/python2.4/site-packages/ 
> SQLAlchemy-0.5.0rc3dev_r5205-py2.4.egg/sqlalchemy/engine/base.py",
> line 898, in __execute_raw
>self._cursor_execute(context.cursor, context.statement,
> context.parameters[0], context=context)
>  File "/home/titan/johnh/dev/lib/python2.4/site-packages/ 
> SQLAlchemy-0.5.0rc3dev_r5205-py2.4.egg/sqlalchemy/engine/base.py",
> line 942, in _cursor_execute
>self._handle_dbapi_exception(e, statement, parameters, cursor)
>  File "/home/titan/johnh/dev/lib/python2.4/site-packages/ 
> SQLAlchemy-0.5.0rc3dev_r5205-py2.4.egg/sqlalchemy/engine/base.py",
> line 924, in _handle_dbapi_exception
>raise exc.DBAPIError.instance(statement, parameters, e,
> connection_invalidated=is_disconnect)
> ProgrammingError: (ProgrammingError) (, 'Invalid use of group
> function') u'SELECT snapshot.symbol AS snapshot_symbol,
> snapshot.strategy AS snapshot_strategy, snapshot.longshort AS
> snapshot_longshort, snapshot.datetime AS snapshot_datetime,
> snapshot.date AS snapshot_date, snapshot.year AS snapshot_year,
> snapshot.month AS snapshot_month, snapshot.qty AS snapshot_qty,
> snapshot.totalqty AS snapshot_totalqty, snapshot.price AS
> snapshot_price, snapshot.possize AS snapshot_possize, snapshot.pnl AS
> snapshot_pnl, snapshot.realized AS snapshot_realized,
> snapshot.pnl_hedged AS snapshot_pnl_hedged, snapshot.is_stop AS
> snapshot_is_stop, snapshot.stop_flag AS snapshot_stop_flag \nFROM
> snapshot \nWHERE snapshot.totalqty = max(snapshot.totalqty)' []
>
> >


--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---