Re: [sqlalchemy] Filters on polymorphic unions

Benjamin Gonzalez Tue, 09 Oct 2012 11:17:41 -0700

Thanks for the valuable suggestions. You are actually correct: the scheme 
that they are using is sharding across tables, that's why the structure is 
the same.


And the way I was setting things up is very similar to the example you 
mentioned (http://www.sqlalchemy.org/trac/wiki/UsageRecipes/EntityName) so 
that I could query multiple table names using the same class. I put 
together a small example that using that and a simplified version of 
Manager and Engineer. Constructing the union the way you suggested works 
(the criterion goes in the subquery's WHERE clause instead of outside of 
the union). 

SELECT anon_1.manager_employee_id AS anon_1_manager_employee_id, 
anon_1.manager_name AS anon_1_manager_name 
FROM (SELECT manager.employee_id AS manager_employee_id, manager.name AS 
manager_name 
FROM manager 
WHERE manager.name = :name_1 UNION ALL SELECT engineer.employee_id AS 
engineer_employee_id, engineer.name AS engineer_name 
FROM engineer 
WHERE engineer.name = :name_2) AS anon_1

But now the problem seems to be that SQLAlchemy will only return rows 
matching the first class. 

Here's the example code. In this case there is 1 row in the manager & 
engineer tables that contains the name "test". After constructing the 
union, I was hoping the query would return two results but it will only 
return one. However if I run count() on the query, it correctly states that 
there are two rows. Am I missing something else?

class Employee(object):
    def __init__(self, name):
        self.name = name
    def __repr__(self):
        return self.__class__.__name__ + " " + self.name

managers_table = Table('manager', metadata,
    Column('employee_id', Integer, primary_key=True),
    Column('name', String(50)),
)

engineers_table = Table('engineer', metadata,
    Column('employee_id', Integer, primary_key=True),
    Column('name', String(50)),
)

def map_class_to_some_table(cls, table, entity_name, **kw):
     newcls = type(entity_name, (cls, ), {})
     mapper(newcls, table, **kw)
     return newcls

Manager = map_class_to_some_table(Employee, managers_table, 'Manager')
Engineer = map_class_to_some_table(Employee, engineers_table, 'Engineer')

q1 = session.query(Manager).filter(Manager.name=='test')
q2 = session.query(Engineer).filter(Engineer.name=='test')

u = q1.union_all(q2)
print u.count() #count = 2
rows = u.all()
print len(rows) #len = 1
print rows[0] # only contains the manager result

On Tuesday, October 9, 2012 9:35:32 AM UTC-5, Michael Bayer wrote:
>
>
> On Oct 9, 2012, at 9:54 AM, Benjamin Gonzalez wrote:
>
>
>  Hi,
>
>  I've been using SQLAlchemy for the last couple of weeks. I'll definitely 
> call myself a beginner so I apologize if I'm missing something obvious with 
> this question :)
>
>
>  I'm working with a database where there is data split across tables with 
> the exact same structure, so there are cases where I need to query more 
> than one table and filter by the same criteria.
>
>
> this sounds like you're describing horizontal partitioning.   such as if a 
> particular row of data could equally well be in one table or the other, and 
> its only some kind of sharding scheme that determines which table.
>
>
> Looking around in the documentation I read about Concrete Table 
> Inheritance and thought that could help me in this case. 
>
>
> yeah concrete inheritance is not really intended for this case, though I 
> can see what you're going for.    
>
>
>  And perform a query such as:
>
>  session.query(Employee).filter(Employee.name=='test')
>
>  It will apply the name filter after the union is done:
>
> SELECT pjoin.employee_id AS pjoin_employee_id, pjoin.name AS pjoin_name, 
> pjoin.type AS pjoin_type, pjoin.manager_data AS pjoin_manager_data, 
> pjoin.engineer_info AS pjoin_engineer_info 
> FROM (SELECT employees.employee_id AS employee_id, employees.name AS 
> name, CAST(NULL AS VARCHAR(50)) AS manager_data, CAST(NULL AS VARCHAR(50)) 
> AS engineer_info, 'employee' AS type 
> FROM employees UNION ALL SELECT managers.employee_id AS employee_id, 
> managers.name AS name, managers.manager_data AS manager_data, CAST(NULL 
> AS VARCHAR(50)) AS engineer_info, 'manager' AS type 
> FROM managers UNION ALL SELECT engineers.employee_id AS employee_id, 
> engineers.name AS name, CAST(NULL AS VARCHAR(50)) AS manager_data, 
> engineers.engineer_info AS engineer_info, 'engineer' AS type 
> FROM engineers) AS pjoin 
> WHERE pjoin.name = :name_1
>
>  Is there a way to ask SQLAlchemy to apply the filter to each table in the 
> inner queries? They all have a name column, and in the case of a large 
> dataset it's helpful because it avoids loading all the rows for each table 
> first and then applying the WHERE clause.
>
>
> oh OK so you're taking advantage of the UNION generated by 
> polymorphic_union to try filtering across all those tables.    this is a 
> funny case because we do have an extension that serves as an example for 
> "horizontal partitioning", which is in sqlalchemy.ext.horizontal_shard, but 
> that example is architected around the idea that you're talking to multiple 
> database backends, each of which has just one table in it.   So it doesn't 
> do the UNION thing which of course is preferable if it is possible.   
>
> One example that might be worth peeking at is how this mapping scheme can 
> be set up without necessarily using Concrete inheritance, though really 
> it's probably easier that you are just using the concrete system, that 
> example is at: http://www.sqlalchemy.org/trac/wiki/UsageRecipes/EntityName.   
> It has a "classical" version and a "declarative" version, though the 
> declarative is a bit awkward.    Stick with the concrete if that's easier.
>
> As far as the query, to come up with an ad-hoc, filtered UNION would have 
> to be invented.    I'd start by working it out manually using Query, then 
> perhaps trying to genericize it, either that you have a simple function 
> that comes up with a Query with unions, or if you really need to do this 
> fluently, as a Query subclass.   Here's sort of the idea of that:
>
> def query_all_shards(session, criterion):
>     q = None
>     for subclass in MyConcreteBase.__subclasses__():
>         if q is None:
>             q = session.query(subclass).filter(criterion)
>         else:
>             union_q = session.query(subclass).filter(criterion)
>             q = q.union_all(union_q)
>     return q
>
> I haven't tested this, but something like that.   Then if you wanted to 
> bake this into Query, you could build a Query subclass that does this, 
> giving it a method like "union_on_shards()" that would trigger a "union" 
> operation similar to the above.    There's a Query subclass example in the 
> sqlalchemy/ext/horizontal_shard.py module, and also in some of the wiki 
> examples such as 
> http://www.sqlalchemy.org/trac/wiki/UsageRecipes/PreFilteredQuery .
>
> Overall I'd stay away from polymorphic_union as a mapper-level default 
> since it generates huge beasts of queries, which you often won't want.
>
>
>
>
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To view this discussion on the web visit 
https://groups.google.com/d/msg/sqlalchemy/-/uN72Da50-AoJ.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en.

Re: [sqlalchemy] Filters on polymorphic unions

Reply via email to