I allow the user to join with other tables for the purpose of filtering (even though the joined tables won't be selected). Cartesian is probably the wrong term for the effect, but in the end, I get duplicate rows. I could get rid of the need for distinct by extensively using EXISTS clauses instead of joins; this is true.

But when several tables are chained to together with joins, I expect using EXISTS to become less manageable and to perform poorer. (I could be wrong on both accounts.)

For example, our interface may allow the query of Employee records. But the user might join with the EmailAddress table to strictly filter results.

Employee records:
id    name
====  ========
1     kent
2     charlie

EmailAddress records:
empid   address
=======  ============
1        k...@mymail.goo
1        k...@mymail.goo
1        k...@gmail.de
2        char...@gmail.de

session.query(Employee).join(EmailAddress).filter(EmailAddress.contains('@'))

Remember, we are only selecting emp.id, emp.name (but joining with another table).

So without DISTINCT:

id    name
====  ========
1     kent
1     kent
1     kent
2     charlie

With DISTINCT:

id    name
====  ========
1     kent
2     charlie


Like I say, using EXISTS would remove the need for DISTINCT, but I haven't gone down that path...


On 5/31/2013 8:41 AM, Charlie Clark wrote:
Am 30.05.2013, 23:19 Uhr, schrieb Kent <jkentbo...@gmail.com>:

For example, a query may look like this:

select distinct
  count(*) over () as recordcount, tablea.colx, tableb.coly
from tablea, tableb
where ....
limit 100

This doesn't *quite* work because the analytical window function count(*)
over() is applied *before* the distinct, so the count returns the wrong
number (a Cartesian effect, returning 72 instead of 17, in this example).

Why are you generating Cartesian products? DISTINCT is designed to work
on denormalised result sets, ie. those which can contain duplicates.
Can't you avoid this with a join between your tables?

Charlie

--
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.


Reply via email to