Re: [sqlalchemy] subqueryload for a ColumnProperty?
Mapped selects, nice! I wish I'd seen that part in the documentation earlier. Do you think it might be worth a reference near Alternatives to column_property()http://www.sqlalchemy.org/docs/orm/mapper_config.html#alternatives-to-column-propertywhere you also mention object_session()? So there's one more issue. Once there's a lot of customers and orders, this query performs pretty poorly in MySQL (tried 5.5.14 Source and 5.0.92-50-log Percona). To simplify, I'll use joinedload() in my example instead of subqueryload(). f = s.query(Customer).filter_by(id=46).options(joinedload(_num_orders)).all()2011-09-11 03:04:33,598 INFO sqlalchemy.engine.base.Engine BEGIN (implicit)2011-09-11 03:04:33,599 INFO sqlalchemy.engine.base.Engine SELECT customer.id AS customer_id, anon_1.customer_id AS anon_1_customer_id, anon_1.count AS anon_1_count FROM customer LEFT OUTER JOIN (SELECT `order`.customer_id AS customer_id, count(`order`.id) AS count FROM `order` GROUP BY `order`.customer_id) AS anon_1 ON customer.id = anon_1.customer_id WHERE customer.id = %(id_1)s2011-09-11 03:04:33,599 INFO sqlalchemy.engine.base.Engine {'id_1': 46} The EXPLAIN plan: mysql explain SELECT customer.id AS customer_id, anon_1.customer_id AS anon_1_customer_id, anon_1.count AS anon_1_count FROM customer LEFT OUTER JOIN (SELECT `order`.customer_id AS customer_id, count(`order`.id) AS count FROM `order` GROUP BY `order`.customer_id) AS anon_1 ON customer.id = anon_1.customer_id WHERE customer.id = 46 \G *** 1. row *** id: 1 select_type: PRIMARY table: customer type: const possible_keys: PRIMARY key: PRIMARY key_len: 4 ref: const rows: 1 Extra: Using index *** 2. row *** id: 1 select_type: PRIMARY table: derived2 type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 11200 Extra: *** 3. row *** id: 2 select_type: DERIVED table: order type: index possible_keys: NULL key: customer_id key_len: 5 ref: NULL rows: 28725 Extra: Using index3 rows in set (0.04 sec) It results in a huge derived table and a table scan over the order table. Out of curiosity, I ran the same statement in PostgresSQL (9.0.4) and noticed it works much better: sumeet=# explain analyze SELECT customer.id AS customer_id, anon_1.customer_id AS anon_1_customer_id, anon_1.count AS anon_1_count FROM customer LEFT OUTER JOIN (SELECT order.customer_id AS customer_id, count(order.id) AS count FROM order GROUP BY order.customer_id) AS anon_1 ON customer.id = anon_1.customer_id WHERE customer.id = 46; QUERY PLAN -- Nested Loop Left Join (cost=0.00..508.06 rows=1 width=16) (actual time=4.069..4.070 rows=1 loops=1) Join Filter: (customer.id = order.customer_id) - Index Scan using customer_pkey on customer (cost=0.00..8.27 rows=1 width=4) (actual time=0.017..0.018 rows=1 loops=1) Index Cond: (id = 46) - GroupAggregate (cost=0.00..499.77 rows=1 width=8) (actual time=4.046..4.046 rows=1 loops=1) - Seq Scan on order (cost=0.00..499.75 rows=2 width=8) (actual time=0.021..4.042 rows=3 loops=1) Filter: (customer_id = 46) Total runtime: 4.127 ms (8 rows) Not that it helps me any :) I'm on MySQL. I can dramatically improve the query by adding a WHERE clause inside the subquery. I bolded where the change is: mysql explain SELECT customer.id AS customer_id, anon_1.customer_id AS anon_1_customer_id, anon_1.count AS anon_1_count FROM customer LEFT OUTER JOIN (SELECT `order`.customer_id AS customer_id, count(`order`.id) AS count FROM `order` *WHERE customer_id = 46* GROUP BY `order`.customer_id) AS anon_1 ON customer.id = anon_1.customer_id WHERE customer.id = 46 \G *** 1. row *** id: 1 select_type: PRIMARY table: customer type: const possible_keys: PRIMARY key: PRIMARY key_len: 4 ref: const rows: 1 Extra: Using index *** 2. row *** id: 1 select_type: PRIMARY table: derived2 type: system possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 1 Extra: *** 3. row *** id: 2 select_type: DERIVED table: order type: ref possible_keys: customer_id key: customer_id key_len: 5 ref: rows: 3 Extra: Using where; Using index3 rows in
Re: [sqlalchemy] subqueryload for a ColumnProperty?
On Sep 11, 2011, at 7:05 AM, Sumeet Agarwal wrote: It results in a huge derived table and a table scan over the order table. Out of curiosity, I ran the same statement in PostgresSQL (9.0.4) and noticed it works much better: Yeah one thing I do when I use MySQL, is that I assume at the gate that joins are out of the question. MySQL is not a system that is reasonably capable of basic relational tasks. If I'm using MySQL, it's usually because I'm integrating with some existing PHP application or some manager forced some bad decision on me. I know this is not helping and you want this: mysql explain SELECT customer.id AS customer_id, anon_1.customer_id AS anon_1_customer_id, anon_1.count AS anon_1_count FROM customer LEFT OUTER JOIN (SELECT `order`.customer_id AS customer_id, count(`order`.id) AS count FROM `order` WHERE customer_id = 46 GROUP BY `order`.custome r_id) AS anon_1 ON customer.id = anon_1.customer_id WHERE customer.id = 46 So to get exactly the behavior of, copy the WHERE clause into the joined load or subq load, that's not directly possible with the JoinedLoader or SubqueryLoader.It's also the kind of thing that's not very easy to do generically, if your lead query were much more complex than a simple WHERE clause for example. By far the easiest way is to just write it out as two queries and assign the value. You don't even need the whole relationship() thing, just emit a second query for the counts, then set the attribute. That's doityourself.py attached, where I purposely did it with no joins whatsoever since if you have the customer_id here, you don't need the JOIN at all - MySQL hates joins so this approach will definitely be the fastest (plus no SQLAlchemy attribute overhead).You can make it a little more generic by creating a function that you give the WHERE clause to, it then applies the WHERE to both the customer and order tables in some way. If you still wanted it with the JOIN to the parent, and optionally deal with the relationship() to OrderCount again, that approach is actually a specialization of the subqueryload recipe at http://www.sqlalchemy.org/trac/wiki/UsageRecipes/DisjointEagerLoading .The recipe then uses q.from_self(Child).join(Parent.children) to create the join from the parent to child. But here the level of hardcoding is already so great I don't see what it buys you except a slower MySQL query. If you were doing joined loading, you can keep using OrderCount and use contains_eager() to write it out explicitly: subq = s.query(OrderCount).filter(OrderCount.customer_id==1).subquery() for c in s.query(Customer).\ filter(Customer.id==1).\ outerjoin(subq, Customer._num_orders).\ options(contains_eager(_num_orders, alias=subq)): print c.id, c.num_orders The query there still has more nesting than MySQL likes. Next approach, hardcode a bind in there, very brittle but would produce this result: order_counts = select([Order.customer_id, func.count(Order.id).label('count')]).\ where(Order.customer_id==bindparam(cust_id)).\ group_by(Order.customer_id).alias() mapper( OrderCount, order_counts, primary_key=order_counts.c.customer_id ) the joinedload scenario would work like this: s.query(Customer).filter(Customer.id==48).options(joinedload(_num_orders)).params(cust_id=48) subqueryload(), not as easy - we need to use the technique at http://www.sqlalchemy.org/trac/wiki/UsageRecipes/GlobalFilter (which also works with the joinedload version): from sqlalchemy.orm.interfaces import MapperOption class SetABindOption(MapperOption): propagate_to_loaders = True def __init__(self, **kw): self.kw = kw def process_query_conditionally(self, query): process query during a lazyload query._params = query._params.union(self.kw) def process_query(self, query): process query during a primary user query self.process_query_conditionally(query) s.query(Customer).filter(Customer.id==1).options(subqueryload(_num_orders), SetABindOption(cust_id=1)) But I'd stick with, just do an extra SELECT statement unless more genericism is needed. Since it appears this case is so special purpose already and MySQL is going to chug to a halt if you fall back on other methods. -- You received this message because you are subscribed to the Google Groups sqlalchemy group. To post to this group, send email to sqlalchemy@googlegroups.com. To unsubscribe from this group, send email to sqlalchemy+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en. from sqlalchemy import * from sqlalchemy.orm import * from sqlalchemy.ext.declarative import declarative_base Base= declarative_base() class Customer(Base): __tablename__ = 'customer' id = Column(Integer, primary_key=True) orders = relationship(Order) class Order(Base):
Re: [sqlalchemy] subqueryload for a ColumnProperty?
For a deferred() itself, we don't have an option that does this.Though this is an unusual request. If you definitely want the deferreds to load, what's the issue having them render inline into the original query ?The advantage to subqueryload is primarily in that it loads multiple rows per parent object efficiently, without re-fetching the full parent row many times, or needing to use a less efficient OUTER JOIN. A deferred is always one row per parent - and no join is needed anyway. On Sep 6, 2011, at 6:49 PM, Sumeet Agarwal wrote: I have a collection of deferred `ColumnProperty`s that I'd like to start loading in subsequent queries. I know you can get this to happen on the first access of a `ColumnProperty` that is deferred, but I wonder if there's any way to specify this in the query. For relationships, `joinedload[_all()]` has a counterpart `subqueryload[_all()]`. Is there nothing similar we can do for `ColumnProperty`? -- You received this message because you are subscribed to the Google Groups sqlalchemy group. To post to this group, send email to sqlalchemy@googlegroups.com. To unsubscribe from this group, send email to sqlalchemy+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en. -- You received this message because you are subscribed to the Google Groups sqlalchemy group. To post to this group, send email to sqlalchemy@googlegroups.com. To unsubscribe from this group, send email to sqlalchemy+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en.
Re: [sqlalchemy] subqueryload for a ColumnProperty?
Still getting the hang of SQLAlchemy terminology. I didn't mean a simple deferred() column, but a deferred column_property() which defines a scalar select(). Like in the example here http://www.sqlalchemy.org/docs/orm/mapper_config.html#sql-expressions-as-mapped-attributes . My goal is to, instead of including a correlated subquery in the SELECT. The column_property looks like Customer.num_orders = orm.column_property( select([func.count(Order.id)])\ .where(Order.customer_id == Customer.id)\ .correlate(Customer.__table__), deferred=True ) I'd rather do a separate query to load num_orders, rather than getting it lazily or by using undefer(). It almost feels like I might want to define num_orders as a relationship somehow? I dunno if what I would like to do maps cleanly to a SQLAlchemy pattern. On Wed, Sep 7, 2011 at 7:51 AM, Michael Bayer mike...@zzzcomputing.comwrote: For a deferred() itself, we don't have an option that does this.Though this is an unusual request. If you definitely want the deferreds to load, what's the issue having them render inline into the original query ?The advantage to subqueryload is primarily in that it loads multiple rows per parent object efficiently, without re-fetching the full parent row many times, or needing to use a less efficient OUTER JOIN. A deferred is always one row per parent - and no join is needed anyway. On Sep 6, 2011, at 6:49 PM, Sumeet Agarwal wrote: I have a collection of deferred `ColumnProperty`s that I'd like to start loading in subsequent queries. I know you can get this to happen on the first access of a `ColumnProperty` that is deferred, but I wonder if there's any way to specify this in the query. For relationships, `joinedload[_all()]` has a counterpart `subqueryload[_all()]`. Is there nothing similar we can do for `ColumnProperty`? -- You received this message because you are subscribed to the Google Groups sqlalchemy group. To post to this group, send email to sqlalchemy@googlegroups.com. To unsubscribe from this group, send email to sqlalchemy+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en. -- You received this message because you are subscribed to the Google Groups sqlalchemy group. To post to this group, send email to sqlalchemy@googlegroups.com. To unsubscribe from this group, send email to sqlalchemy+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en. -- You received this message because you are subscribed to the Google Groups sqlalchemy group. To post to this group, send email to sqlalchemy@googlegroups.com. To unsubscribe from this group, send email to sqlalchemy+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en.
Re: [sqlalchemy] subqueryload for a ColumnProperty?
On Sep 7, 2011, at 5:09 PM, Sumeet Agarwal wrote: Still getting the hang of SQLAlchemy terminology. I didn't mean a simple deferred() column, but a deferred column_property() which defines a scalar select(). Like in the example here http://www.sqlalchemy.org/docs/orm/mapper_config.html#sql-expressions-as-mapped-attributes. My goal is to, instead of including a correlated subquery in the SELECT. The column_property looks like My goal is to..., missing a verb there :). If it were me, I'd think you're asking to unwrap the correlated subquery into a plain JOIN. Since that works more efficiently. I'd rather do a separate query to load num_orders, rather than getting it lazily or by using undefer(). It almost feels like I might want to define num_orders as a relationship somehow? I dunno if what I would like to do maps cleanly to a SQLAlchemy pattern. Using relationship() is kind of the old school approach, but yup I think that actually allows you to get exactly subqueryload() for an attribute, so good idea ! a rare moment that SQLAlchemy amazes even me. here's that since its faster for me to just illustrate from sqlalchemy import * from sqlalchemy.orm import * from sqlalchemy.ext.declarative import declarative_base Base= declarative_base() class Customer(Base): __tablename__ = 'customer' id = Column(Integer, primary_key=True) orders = relationship(Order) @property def num_orders(self): return self._num_orders.count if self._num_orders else 0 class Order(Base): __tablename__ = 'order' id = Column(Integer, primary_key=True) customer_id = Column(Integer, ForeignKey('customer.id')) class OrderCount(object): pass order_counts = select([Order.customer_id, func.count(Order.id).label('count')]).\ group_by(Order.customer_id).alias() mapper( OrderCount, order_counts, primary_key=order_counts.c.customer_id ) Customer._num_orders = relationship(OrderCount, uselist=False) e = create_engine('sqlite://', echo=True) Base.metadata.create_all(e) s = Session(e) s.add_all([ Customer(orders=[Order(), Order(), Order()]), Customer(orders=[Order(), Order()]), Customer(orders=[]), ]) s.commit() for c in s.query(Customer).options(subqueryload(_num_orders)): print c.id, c.num_orders On Wed, Sep 7, 2011 at 7:51 AM, Michael Bayer mike...@zzzcomputing.com wrote: For a deferred() itself, we don't have an option that does this.Though this is an unusual request. If you definitely want the deferreds to load, what's the issue having them render inline into the original query ?The advantage to subqueryload is primarily in that it loads multiple rows per parent object efficiently, without re-fetching the full parent row many times, or needing to use a less efficient OUTER JOIN. A deferred is always one row per parent - and no join is needed anyway. On Sep 6, 2011, at 6:49 PM, Sumeet Agarwal wrote: I have a collection of deferred `ColumnProperty`s that I'd like to start loading in subsequent queries. I know you can get this to happen on the first access of a `ColumnProperty` that is deferred, but I wonder if there's any way to specify this in the query. For relationships, `joinedload[_all()]` has a counterpart `subqueryload[_all()]`. Is there nothing similar we can do for `ColumnProperty`? -- You received this message because you are subscribed to the Google Groups sqlalchemy group. To post to this group, send email to sqlalchemy@googlegroups.com. To unsubscribe from this group, send email to sqlalchemy+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en. -- You received this message because you are subscribed to the Google Groups sqlalchemy group. To post to this group, send email to sqlalchemy@googlegroups.com. To unsubscribe from this group, send email to sqlalchemy+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en. -- You received this message because you are subscribed to the Google Groups sqlalchemy group. To post to this group, send email to sqlalchemy@googlegroups.com. To unsubscribe from this group, send email to sqlalchemy+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en. -- You received this message because you are subscribed to the Google Groups sqlalchemy group. To post to this group, send email to sqlalchemy@googlegroups.com. To unsubscribe from this group, send email to sqlalchemy+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en.
[sqlalchemy] subqueryload for a ColumnProperty?
I have a collection of deferred `ColumnProperty`s that I'd like to start loading in subsequent queries. I know you can get this to happen on the first access of a `ColumnProperty` that is deferred, but I wonder if there's any way to specify this in the query. For relationships, `joinedload[_all()]` has a counterpart `subqueryload[_all()]`. Is there nothing similar we can do for `ColumnProperty`? -- You received this message because you are subscribed to the Google Groups sqlalchemy group. To post to this group, send email to sqlalchemy@googlegroups.com. To unsubscribe from this group, send email to sqlalchemy+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en.