Re: [sqlalchemy] how to tell which tables already joined in a query statement (or best way to dynamically build a query?)

2015-08-07 Thread Mike Bayer



On 8/7/15 10:35 AM, Brian Cherinka wrote:


Hi,

I'm trying to build an SQLalchemy ORM query dynamically based on user 
selected options.  Some of these options come from the same table, but 
the user could select either one or both criteria to filter on.  Since 
I don't know which options the user will select ahead of time, I have 
to join to the same table multiple times.  However this throws an error


ProgrammingError: (psycopg2.ProgrammingError) table name TableB 
specified more than once


when I try to submit the query.  How can I find out which tables have 
already been joined in a query?  Or what's the best way to handle 
building a query based on multiple criteria?   I'm using SQLalchemy 
1.0.0.

Here is my pseudo-code.

Option 1.  Option 2.  Option 3.   (any or all options can be selected, 
and they all come from the same joined table)


// base table
query = session.query(TableA)

// add on new criteria
if option 1: query = query.join(TableB).filter(TableB.option1  X )
if option 2: query = query.join(TableB).filter(TableB.option2  X )
if option 3: query = query.join(TableB).filter(TableB.option3  X )

However, when attempting query.all(), this throws the above error, if 
I have selected any two options.   What I think it should be is 
something like this...


//base
query = session.query(TableA)
//join
query = query.join(TableB)
// add on new criteria
if option 1: query = query.filter(TableB.option1  X )
if option 2: query = query.filter(TableB.option2  X )
if option 3: query = query.filter(TableB.option3  X )

but I don't want to join to TableB if I don't have to.  I have many 
different tables where this kind of situation applies, and it seems 
inefficient to join to all other tables just in case I may need to 
filter on something.


This is a thing for which there is a plan to make this really clear and 
doable.   But right now that is only a plan.   It's not necessarily 
straightforward, in the totally open-ended case, to determine every 
table that will be in the FROM clause, given that if a query has 
filter(X.foo == 'bar'), now X is in the FROM list, which you wouldn't 
know until you generate the core Select statement.The problem of 
determining exactly what outer JOINs and such are present in an easy and 
performant way is a large problem to be solved and you can see the 
proposed, eventual API for this at 
https://bitbucket.org/zzzeek/sqlalchemy/issues/3225/query-heuristic-inspection-system#comment-12988632.


So given that this is a big new feature that's currently targeted at 
least a year away, for now we need to keep things simple.


If I really had simple imperative code all in one place like that, I'd 
probably just refer to the flag twice:


if option1 or option2 or option3:
query = query.join(TableB)

if option1: # etc
if option2: # etc.

But I can hear you cringing, so you'd have to just keep track of what 
you're joining.So still keeping it boringly simple, use a set.   To 
make this maybe less tedious, we'll build a closure function so that we 
can be q a little more succinct:


def joiner():
already_joined = set()
def join(q, ent):
if ent not in already_joined:
already_joined.add(ent)
q = q.join(ent)
return q
return join

def my_filter_thing(option1, option2, option3):
q = session.query(A)
join = joiner()

if option1:
q = join(q, B).filter(B.option1 == foo)
if option2:
q = join(q, B).filter(B.option2 == bar)

Then I hear, OK but I'm passing the Query to other functions and I don't 
want to ship that extra thing along with it everywhere.  So we can stick 
the already_joined on the Query directly.  But the thing to track here 
is that this is an immutable object so that the previous version of the 
Query doesn't have the newer state on it, in case you are forking off a 
Query object into multiple versions of itself:


def join(q, ent):
if not hasattr(q, '_already_joined'):
already_joined = q._already_joined = frozenset()
else:
already_joined = q._already_joined
if ent not in already_joined:
q = q.join(ent)
q._already_joined = q._already_joined.union([ent])
return q


def my_filter_thing(option1, option2, option3):
q = session.query(A)

if option1:
q = join(q, B).filter(B.option1 == foo)
if option2:
q = join(q, B).filter(B.option2 == bar)


Another option is to try to anticipate what we'll be doing in #3225; 
that is, look in query._from_obj.  This is where we're poking around in 
things not 100% stable API over the long term and the #3225 API would be 
very preferable, but it could be:


from sqlalchemy.sql import util

def join(q, ent):
if ent not in set(util.surface_selectables(q._from_obj[0])):
q = q.join(ent)
return q

surface_selectables() is a helper that basically looks at Join objects 
and pulls out the left and right of each recursively, so it can find 
tables.  query._from_obj 

[sqlalchemy] how to tell which tables already joined in a query statement (or best way to dynamically build a query?)

2015-08-07 Thread Brian Cherinka

Hi, 

I'm trying to build an SQLalchemy ORM query dynamically based on user 
selected options.  Some of these options come from the same table, but the 
user could select either one or both criteria to filter on.  Since I don't 
know which options the user will select ahead of time, I have to join to 
the same table multiple times.  However this throws an error 

ProgrammingError: (psycopg2.ProgrammingError) table name TableB specified 
more than once

when I try to submit the query.  How can I find out which tables have 
already been joined in a query?  Or what's the best way to handle building 
a query based on multiple criteria?   I'm using SQLalchemy 1.0.0. 

Here is my pseudo-code.  

Option 1.  Option 2.  Option 3.   (any or all options can be selected, and 
they all come from the same joined table)

// base table
query = session.query(TableA)

// add on new criteria
if option 1: query = query.join(TableB).filter(TableB.option1  X )
if option 2: query = query.join(TableB).filter(TableB.option2  X )
if option 3: query = query.join(TableB).filter(TableB.option3  X )

However, when attempting query.all(), this throws the above error, if I 
have selected any two options.   What I think it should be is something 
like this...

//base
query = session.query(TableA)
//join
query = query.join(TableB)
// add on new criteria
if option 1: query = query.filter(TableB.option1  X )
if option 2: query = query.filter(TableB.option2  X )
if option 3: query = query.filter(TableB.option3  X )

but I don't want to join to TableB if I don't have to.  I have many 
different tables where this kind of situation applies, and it seems 
inefficient to join to all other tables just in case I may need to filter 
on something.  

Any thoughts, help or suggestions?
Thanks, Brian





-- 
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.