Re: [sqlalchemy] DISTINCT with LIMIT problem
Thanks very much! I got it to work apparently fine using from_self(). I didn't seem to need anything special for eager loads to continue to function... were you only expecting I'd have troubles with eager loads if I used subquery()? On 5/30/2013 6:29 PM, Michael Bayer wrote: On May 30, 2013, at 6:06 PM, Kent jkentbo...@gmail.com wrote: Thank you, I'll try that, but quick concern: I specifically skipped trying to use .subquery() because the docs say Eager JOIN generation within the query is disabled. Doesn't that mean I won't get my joinedload() results from the inner query? Or does that refer to the outer query having eager join disabled? if you want to eager load also from that subquery, you need to sitck it into an aliased: MySubqClass = aliased(MyClass, subq) query(x, MySubqClass).options(joinedload(MySubqClass.foobar)) On 5/30/2013 5:54 PM, Michael Bayer wrote: On May 30, 2013, at 5:19 PM, Kent jkentbo...@gmail.com mailto:jkentbo...@gmail.com wrote: Solution A: Group by all columns (yielding the same effect as distinct), but which makes the window analytical function process *after* the group by and yields the correct count (17 instead of 72): are all those columns indexed? even if they are, crappy query... OR Solution B: Put the count(*) over () in an outer select, like this: select count(*) over () as recordcount, anon.* from ( select distinct tablea.colx, tableb.coly from tablea, tableb ) as anon limit 100 this is very much how SQLAlchemy wants you to do it. Either solution yields the correct answer I believe, but I'm having difficulty translating the SQL that I know will work into sqlalchemy land. For Solution B, I don't know how to wrap my query in an outer select (similar to the LIMIT implementation for Oracle) in a way that will still allow sqlalchemy to extract rows into instances: from_self() can do this (probably use add_column() for the window function), or subquery() should work very well. subq = q.subquery(); q = query(func.count('*').over().label(..), subq); . send me a quick example if that's not working and I'll work out the query. -- You received this message because you are subscribed to a topic in the Google Groups sqlalchemy group. To unsubscribe from this topic, visit https://groups.google.com/d/topic/sqlalchemy/_U28GXXR6sg/unsubscribe?hl=en. To unsubscribe from this group and all its topics, send an email to sqlalchemy+unsubscr...@googlegroups.com. To post to this group, send email to sqlalchemy@googlegroups.com. Visit this group at http://groups.google.com/group/sqlalchemy?hl=en. For more options, visit https://groups.google.com/groups/opt_out. -- You received this message because you are subscribed to the Google Groups sqlalchemy group. To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com. To post to this group, send email to sqlalchemy@googlegroups.com. Visit this group at http://groups.google.com/group/sqlalchemy?hl=en. For more options, visit https://groups.google.com/groups/opt_out. -- You received this message because you are subscribed to the Google Groups sqlalchemy group. To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com. To post to this group, send email to sqlalchemy@googlegroups.com. Visit this group at http://groups.google.com/group/sqlalchemy?hl=en. For more options, visit https://groups.google.com/groups/opt_out.
Re: [sqlalchemy] DISTINCT with LIMIT problem
Am 30.05.2013, 23:19 Uhr, schrieb Kent jkentbo...@gmail.com: For example, a query may look like this: select distinct count(*) over () as recordcount, tablea.colx, tableb.coly from tablea, tableb where limit 100 This doesn't *quite* work because the analytical window function count(*) over() is applied *before* the distinct, so the count returns the wrong number (a Cartesian effect, returning 72 instead of 17, in this example). Why are you generating Cartesian products? DISTINCT is designed to work on denormalised result sets, ie. those which can contain duplicates. Can't you avoid this with a join between your tables? Charlie -- Charlie Clark Managing Director Clark Consulting Research German Office Kronenstr. 27a Düsseldorf D- 40217 Tel: +49-211-600-3657 Mobile: +49-178-782-6226 -- You received this message because you are subscribed to the Google Groups sqlalchemy group. To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com. To post to this group, send email to sqlalchemy@googlegroups.com. Visit this group at http://groups.google.com/group/sqlalchemy?hl=en. For more options, visit https://groups.google.com/groups/opt_out.
Re: [sqlalchemy] DISTINCT with LIMIT problem
I allow the user to join with other tables for the purpose of filtering (even though the joined tables won't be selected). Cartesian is probably the wrong term for the effect, but in the end, I get duplicate rows. I could get rid of the need for distinct by extensively using EXISTS clauses instead of joins; this is true. But when several tables are chained to together with joins, I expect using EXISTS to become less manageable and to perform poorer. (I could be wrong on both accounts.) For example, our interface may allow the query of Employee records. But the user might join with the EmailAddress table to strictly filter results. Employee records: idname 1 kent 2 charlie EmailAddress records: empid address === 1k...@mymail.goo 1k...@mymail.goo 1k...@gmail.de 2char...@gmail.de session.query(Employee).join(EmailAddress).filter(EmailAddress.contains('@')) Remember, we are only selecting emp.id, emp.name (but joining with another table). So without DISTINCT: idname 1 kent 1 kent 1 kent 2 charlie With DISTINCT: idname 1 kent 2 charlie Like I say, using EXISTS would remove the need for DISTINCT, but I haven't gone down that path... On 5/31/2013 8:41 AM, Charlie Clark wrote: Am 30.05.2013, 23:19 Uhr, schrieb Kent jkentbo...@gmail.com: For example, a query may look like this: select distinct count(*) over () as recordcount, tablea.colx, tableb.coly from tablea, tableb where limit 100 This doesn't *quite* work because the analytical window function count(*) over() is applied *before* the distinct, so the count returns the wrong number (a Cartesian effect, returning 72 instead of 17, in this example). Why are you generating Cartesian products? DISTINCT is designed to work on denormalised result sets, ie. those which can contain duplicates. Can't you avoid this with a join between your tables? Charlie -- You received this message because you are subscribed to the Google Groups sqlalchemy group. To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com. To post to this group, send email to sqlalchemy@googlegroups.com. Visit this group at http://groups.google.com/group/sqlalchemy?hl=en. For more options, visit https://groups.google.com/groups/opt_out.
[sqlalchemy] DISTINCT with LIMIT problem
We use func.count().over() in order to help support result pagination. When attempting to limit the result set, I have found that if other tables are being joined (for the where clause, but not selected), then I need to add DISTINCT to the query or else the Cartesian result of my query messes up LIMIT. (There are, say, 72 rows returned, where this only represents 17 distinct records, for example.) For example, a query may look like this: select distinct count(*) over () as recordcount, tablea.colx, tableb.coly from tablea, tableb where limit 100 This doesn't *quite* work because the analytical window function count(*) over() is applied *before* the distinct, so the count returns the wrong number (a Cartesian effect, returning 72 instead of 17, in this example). I have two potential solutions: Solution A: Group by all columns (yielding the same effect as distinct), but which makes the window analytical function process *after* the group by and yields the correct count (17 instead of 72): select count(*) over () as recordcount, tablea.colx, tableb.coly from tablea, tableb where ... group by tablea.colx, tableb.coly *[all columns]* limit 100 OR Solution B: Put the count(*) over () in an outer select, like this: select count(*) over () as recordcount, anon.* from ( select distinct tablea.colx, tableb.coly from tablea, tableb ) as anon limit 100 Either solution yields the correct answer I believe, but I'm having difficulty translating the SQL that I know will work into sqlalchemy land. For Solution A, in the case of wanting to group by, I don't know how to get the full list of all selected columns to add to the group_by in such a way that even joinedload() will be included in the group by: q = Session.query(class).join(joins).filter(...).option(joinedload(...)) q = q.group_by(* ??? How to tell sqlalchemy to group by all selected columns, even those which will be join loaded ???* ) q = q.add_column(func.count().over().label('recordcount')) For Solution B, I don't know how to wrap my query in an outer select (similar to the LIMIT implementation for Oracle) in a way that will still allow sqlalchemy to extract rows into instances: This renders the correct SQL, I think: qry = Session.query(qry.with_labels().statement, func.count().over().label('recordcount')) But I'm using SQL statement here so sqlalchemy won't translate result rows into object instances. Can you point me in the right direction for one of these 2 solutions, please? Many thanks, Kent -- You received this message because you are subscribed to the Google Groups sqlalchemy group. To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com. To post to this group, send email to sqlalchemy@googlegroups.com. Visit this group at http://groups.google.com/group/sqlalchemy?hl=en. For more options, visit https://groups.google.com/groups/opt_out.
Re: [sqlalchemy] DISTINCT with LIMIT problem
On May 30, 2013, at 5:19 PM, Kent jkentbo...@gmail.com wrote: Solution A: Group by all columns (yielding the same effect as distinct), but which makes the window analytical function process after the group by and yields the correct count (17 instead of 72): are all those columns indexed? even if they are, crappy query... OR Solution B: Put the count(*) over () in an outer select, like this: select count(*) over () as recordcount, anon.* from ( select distinct tablea.colx, tableb.coly from tablea, tableb ) as anon limit 100 this is very much how SQLAlchemy wants you to do it. Either solution yields the correct answer I believe, but I'm having difficulty translating the SQL that I know will work into sqlalchemy land. For Solution B, I don't know how to wrap my query in an outer select (similar to the LIMIT implementation for Oracle) in a way that will still allow sqlalchemy to extract rows into instances: from_self() can do this (probably use add_column() for the window function), or subquery() should work very well. subq = q.subquery(); q = query(func.count('*').over().label(..), subq); . send me a quick example if that's not working and I'll work out the query. -- You received this message because you are subscribed to the Google Groups sqlalchemy group. To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com. To post to this group, send email to sqlalchemy@googlegroups.com. Visit this group at http://groups.google.com/group/sqlalchemy?hl=en. For more options, visit https://groups.google.com/groups/opt_out.
Re: [sqlalchemy] DISTINCT with LIMIT problem
Thank you, I'll try that, but quick concern: I specifically skipped trying to use .subquery() because the docs say Eager JOIN generation within the query is disabled. Doesn't that mean I won't get my joinedload() results from the inner query? Or does that refer to the outer query having eager join disabled? On 5/30/2013 5:54 PM, Michael Bayer wrote: On May 30, 2013, at 5:19 PM, Kent jkentbo...@gmail.com mailto:jkentbo...@gmail.com wrote: Solution A: Group by all columns (yielding the same effect as distinct), but which makes the window analytical function process *after* the group by and yields the correct count (17 instead of 72): are all those columns indexed? even if they are, crappy query... OR Solution B: Put the count(*) over () in an outer select, like this: select count(*) over () as recordcount, anon.* from ( select distinct tablea.colx, tableb.coly from tablea, tableb ) as anon limit 100 this is very much how SQLAlchemy wants you to do it. Either solution yields the correct answer I believe, but I'm having difficulty translating the SQL that I know will work into sqlalchemy land. For Solution B, I don't know how to wrap my query in an outer select (similar to the LIMIT implementation for Oracle) in a way that will still allow sqlalchemy to extract rows into instances: from_self() can do this (probably use add_column() for the window function), or subquery() should work very well. subq = q.subquery(); q = query(func.count('*').over().label(..), subq); . send me a quick example if that's not working and I'll work out the query. -- You received this message because you are subscribed to a topic in the Google Groups sqlalchemy group. To unsubscribe from this topic, visit https://groups.google.com/d/topic/sqlalchemy/_U28GXXR6sg/unsubscribe?hl=en. To unsubscribe from this group and all its topics, send an email to sqlalchemy+unsubscr...@googlegroups.com. To post to this group, send email to sqlalchemy@googlegroups.com. Visit this group at http://groups.google.com/group/sqlalchemy?hl=en. For more options, visit https://groups.google.com/groups/opt_out. -- You received this message because you are subscribed to the Google Groups sqlalchemy group. To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com. To post to this group, send email to sqlalchemy@googlegroups.com. Visit this group at http://groups.google.com/group/sqlalchemy?hl=en. For more options, visit https://groups.google.com/groups/opt_out.
Re: [sqlalchemy] DISTINCT with LIMIT problem
On May 30, 2013, at 6:06 PM, Kent jkentbo...@gmail.com wrote: Thank you, I'll try that, but quick concern: I specifically skipped trying to use .subquery() because the docs say Eager JOIN generation within the query is disabled. Doesn't that mean I won't get my joinedload() results from the inner query? Or does that refer to the outer query having eager join disabled? if you want to eager load also from that subquery, you need to sitck it into an aliased: MySubqClass = aliased(MyClass, subq) query(x, MySubqClass).options(joinedload(MySubqClass.foobar)) On 5/30/2013 5:54 PM, Michael Bayer wrote: On May 30, 2013, at 5:19 PM, Kent jkentbo...@gmail.com mailto:jkentbo...@gmail.com wrote: Solution A: Group by all columns (yielding the same effect as distinct), but which makes the window analytical function process *after* the group by and yields the correct count (17 instead of 72): are all those columns indexed? even if they are, crappy query... OR Solution B: Put the count(*) over () in an outer select, like this: select count(*) over () as recordcount, anon.* from ( select distinct tablea.colx, tableb.coly from tablea, tableb ) as anon limit 100 this is very much how SQLAlchemy wants you to do it. Either solution yields the correct answer I believe, but I'm having difficulty translating the SQL that I know will work into sqlalchemy land. For Solution B, I don't know how to wrap my query in an outer select (similar to the LIMIT implementation for Oracle) in a way that will still allow sqlalchemy to extract rows into instances: from_self() can do this (probably use add_column() for the window function), or subquery() should work very well. subq = q.subquery(); q = query(func.count('*').over().label(..), subq); . send me a quick example if that's not working and I'll work out the query. -- You received this message because you are subscribed to a topic in the Google Groups sqlalchemy group. To unsubscribe from this topic, visit https://groups.google.com/d/topic/sqlalchemy/_U28GXXR6sg/unsubscribe?hl=en. To unsubscribe from this group and all its topics, send an email to sqlalchemy+unsubscr...@googlegroups.com. To post to this group, send email to sqlalchemy@googlegroups.com. Visit this group at http://groups.google.com/group/sqlalchemy?hl=en. For more options, visit https://groups.google.com/groups/opt_out. -- You received this message because you are subscribed to the Google Groups sqlalchemy group. To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com. To post to this group, send email to sqlalchemy@googlegroups.com. Visit this group at http://groups.google.com/group/sqlalchemy?hl=en. For more options, visit https://groups.google.com/groups/opt_out. -- You received this message because you are subscribed to the Google Groups sqlalchemy group. To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com. To post to this group, send email to sqlalchemy@googlegroups.com. Visit this group at http://groups.google.com/group/sqlalchemy?hl=en. For more options, visit https://groups.google.com/groups/opt_out.