Re: [sqlalchemy] simple relationship join ends up in 1 s
The columns only contain address strings and not much more. Okay, I did some profiling for the customerorder query plus two joins on itemsbought and products:

    1008030 function calls (968253 primitive calls) in 2.220 seconds

    Ordered by: cumulative time

       ncalls      tottime  percall  cumtime  percall  filename:lineno(function)
            1        0.000    0.000    2.211    2.211  sqltest.py:151(test)
            1        0.008    0.008    2.190    2.190  .../sqlalchemy/orm/query.py:2294(all)
            1        0.000    0.000    1.319    1.319  .../sqlalchemy/orm/query.py:2407(__iter__)
            1        0.000    0.000    1.312    1.312  .../sqlalchemy/orm/query.py:2421(_execute_and_instances)
            4        0.000    0.000    1.303    0.326  .../sqlalchemy/engine/base.py:661(execute)
            1        0.000    0.000    1.300    1.300  .../sqlalchemy/sql/elements.py:320(_execute_on_connection)
            1        0.000    0.000    1.300    1.300  .../sqlalchemy/engine/base.py:790(_execute_clauseelement)
            4        0.000    0.000    1.298    0.324  .../sqlalchemy/engine/base.py:877(_execute_context)
            9        0.000    0.000    1.293    0.144  .../MySQLdb/cursors.py:164(execute)
            7        0.000    0.000    1.293    0.185  .../sqlalchemy/engine/default.py:435(do_execute)
            9        0.000    0.000    1.293    0.144  .../MySQLdb/cursors.py:353(_query)
         1709        0.010    0.000    0.864    0.001  .../sqlalchemy/orm/loading.py:27(instances)
   22084/5537        0.248    0.000    0.778    0.000  .../sqlalchemy/orm/loading.py:330(_instance)
   22052/5537        0.122    0.000    0.694    0.000  .../sqlalchemy/orm/loading.py:284(populate_state)
            9        0.000    0.000    0.685    0.076  .../MySQLdb/cursors.py:358(_post_get_result)
            9        0.000    0.000    0.684    0.076  .../MySQLdb/cursors.py:324(_fetch_row)
            9        0.206    0.023    0.684    0.076  {built-in method fetch_row}
            9        0.000    0.000    0.608    0.068  .../MySQLdb/cursors.py:315(_do_query)
         3829        0.009    0.000    0.426    0.000  .../sqlalchemy/orm/strategies.py:1451(load_collection_from_joined_existing_row)
       111604        0.073    0.000    0.409    0.000  .../MySQLdb/connections.py:212(string_decoder)
   10979/5505        0.016    0.000    0.344    0.000  .../sqlalchemy/orm/strategies.py:1475(load_scalar_from_joined_new_row)
       111604        0.124    0.000    0.336    0.000  {method 'decode' of 'str' objects}

(paths abbreviated; all library entries are under /home/dumbass/sources/venv/lib/python2.7/site-packages/)

As far as I can see, creating the objects and the query itself(?) take a lot of CPU power. But why? It is just a simple query. Is it really the object creation? I just want to iterate over that data to display it in the order menu for the admin. Or should I use plain SQL execution for that? Is SQLAlchemy more for specific queries to get specific data, and not for mass data?

On Fri, Sep 12, 2014 at 6:40 PM, Jonathan Vanasco jonat...@findmeon.com wrote: [earlier quoted message trimmed]
-- You received this message because you are subscribed to the Google Groups sqlalchemy group. To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com. To post to this group, send email to sqlalchemy@googlegroups.com. Visit this group at http://groups.google.com/group/sqlalchemy. For more options, visit https://groups.google.com/d/optout.
Re: [sqlalchemy] simple relationship join ends up in 1 s
that jump in time between __iter__ and all() seems kind of odd, and I wonder if some of that is in setting up a larger span of memory to store the entire result in a list; using yield_per() often gives much better results, though it isn't compatible with collection-joined eager loading. I'd normally break up the query in some other way.

Anyway, MySQLdb, which is an ultra fast C library, is taking .2 sec just to read the rows; then setting up objects (note there's more than one object to handle per row w/ joined eager loading), populating them and their collections and establishing their identity within the identity map is taking .8 sec, which is typical. If it's any consolation, the instances()/_instance() pair is 25% faster in 1.0, see http://docs.sqlalchemy.org/en/latest/changelog/migration_10.html#orm-full-object-fetches-25-faster.

Short answer: yes, cPython is a very slow interpreter and creating new objects is expensive; putting them in dictionaries and creating weakrefs to them is expensive (weakrefs are key to how the Session doesn't run out of memory even if you do zillions of queries with it, as long as you throw away the objects); doing lookups to see if we've already seen this identity in the row before is expensive (that storage/lookup is how lots of things get accomplished, including eager loading); and all of these processes deliver features that people really like in the ORM. If you don't want all those features, you can load individual columns and you'll see that's 2-3 times faster (also sped up a little more in 1.0), and if you use Core directly to get the rows there is minimal overhead vs. what MySQLdb incurs directly.
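As a sketch of the batching idea (the one-table model, in-memory SQLite engine, and row count below are hypothetical stand-ins, not the poster's actual schema): yield_per() hands objects out in chunks instead of materializing the whole result as one list.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class Customerorder(Base):
    __tablename__ = "customerorders"
    id = Column(Integer, primary_key=True)
    name = Column(String(100))

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()
session.add_all([Customerorder(name=f"order {i}") for i in range(5000)])
session.commit()

# yield_per(1000) makes the loader emit objects 1000 at a time instead of
# building a single 5000-element list; note it is not compatible with
# collection-based joined eager loading.
count = 0
for order in session.query(Customerorder).yield_per(1000):
    count += 1
print(count)  # 5000
```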
More is coming in 1.0 to educate people about this. For the "load lots of rows" use case, take a look at the performance suite in https://bitbucket.org/zzzeek/sqlalchemy/src/b9d430af752b7cc955932a54a8f8db18f46d89a6/examples/performance/large_resultsets.py?at=ticket_3100 for many varieties of how to load 500K rows. Here are the performance numbers I get on that test with MySQLdb:

    test_orm_full_objects_list : Load fully tracked objects using the ORM, all at once into a list (50 iterations); total time 11.050664 sec
    test_orm_full_objects_chunks : Load fully tracked objects using the ORM using yield per (50 iterations); total time 7.297299 sec
    test_orm_bundles : Load lightweight bundle objects using the ORM. (50 iterations); total time 2.086640 sec
    test_orm_columns : Load individual columns into named tuples using the ORM. (50 iterations); total time 1.534253 sec
    test_core_fetchall : Load Core result rows using fetchall. (50 iterations); total time 1.114845 sec
    test_core_fetchmany_w_streaming : Load Core result rows using fetchmany/streaming. (50 iterations); total time 0.894745 sec
    test_core_fetchmany : Load Core result rows using Core / fetchmany. (50 iterations); total time 0.968827 sec
    test_dbapi_fetchall_plus_append_objects : Load rows using DBAPI fetchall(), generate an object for each row. (50 iterations); total time 1.017718 sec
    test_dbapi_fetchall_no_object : Load rows using DBAPI fetchall(), don't make any objects. (50 iterations); total time 0.767289 sec

I added a test to differentiate between list(query(Class)) vs. for row in query(Class).yield_per(1000) above; the only difference here is that instances() yields out 1000 objects at a time, or just loads the whole thing into a single list at once. You can see a dramatic difference in time spent, 7.29 sec vs. 11 seconds, just for cPython to set up memory big enough to hold all that data at once. That is probably where the 1.3 seconds becomes 2.2 seconds in your example: calling .all() and not batching the result into smaller chunks. So when you deal with large fields of data, chunking can help a lot, but if you reduce the amount of memory per row by not using objects at all, then you do even better.

On Sep 16, 2014, at 5:47 AM, tatütata Okay rogersher...@googlemail.com wrote: [quoted profiling output trimmed]
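The two cheapest styles from that suite can be sketched roughly like this (hypothetical one-table schema, SQLite in memory): individual columns through the ORM come back as plain named tuples with no full-object bookkeeping, and Core rows skip the ORM entirely.

```python
from sqlalchemy import Column, Integer, String, create_engine, select
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class Customer(Base):
    __tablename__ = "customer"
    id = Column(Integer, primary_key=True)
    name = Column(String(100))

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()
session.add_all([Customer(name=f"c{i}") for i in range(1000)])
session.commit()

# ORM, individual columns: named tuples, no full-object construction
# or identity-map bookkeeping.
rows = session.query(Customer.id, Customer.name).all()

# Core: minimal overhead over what the DBAPI itself incurs.
with engine.connect() as conn:
    core_rows = conn.execute(select(Customer.__table__)).fetchall()

print(len(rows), len(core_rows))  # 1000 1000
```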
Re: [sqlalchemy] simple relationship join ends up in 1 s
Slightly OT: why are 5k rows being shown on an admin's order menu? 5k rows sounds like the spec for an offline report (csv/excel). Admin interfaces typically paginate data into much smaller chunks; putting 5k entries on a page is going to create a lot of usability concerns.
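The pagination described here reduces to simple offset/limit arithmetic; a minimal sketch (the 50-row page size is an arbitrary choice for illustration):

```python
# Sketch: page a large listing instead of rendering all rows at once.
PAGE_SIZE = 50

def page_bounds(page_number, page_size=PAGE_SIZE):
    """Map a 1-based page number to (offset, limit) for a LIMIT/OFFSET query."""
    return ((page_number - 1) * page_size, page_size)

offset, limit = page_bounds(3)
print(offset, limit)  # 100 50
```

In SQLAlchemy terms that maps onto query.offset(offset).limit(limit).all() for each page.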
Re: [sqlalchemy] simple relationship join ends up in 1 s
That's what I did before; activating echo=True shows me the correct query too. This joinedload takes 1.3 seconds for these 5000 entries, and only 4 queries in total for the complete page load. Adding one more joinedload ends up at 2 seconds; that's way too much for 1700 customers and 5k bought items.

On Thu, Sep 11, 2014 at 10:12 PM, Simon King si...@simonking.org.uk wrote:

The query that you are doing:

    customerorders = sqlsession.query(Customerorder)\
        .join(Itemsbought.order)\
        .all()

probably doesn't do what you intend. In particular, it doesn't populate the Customerorder.itemsbought collection. So when you iterate over customerorders and access the itemsbought property, it issues a new query for each order. Try starting with something like this:

    customerorders = (sqlsession.query(Customerorder)
        .options(saorm.joinedload('itemsbought'))
        .all())

Hope that helps, Simon

On 11 Sep 2014, at 20:54, Michael Bayer mike...@zzzcomputing.com wrote: then it sounds like you have lazy loading happening, check echo=True

On Sep 11, 2014, at 3:12 PM, tatütata Okay rogersher...@googlemail.com wrote: [earlier quoted messages trimmed]
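Simon's suggestion can be shown end-to-end with a stripped-down, hypothetical version of the two models (SQLite in memory, not the poster's real schema); with joinedload the collection comes back in the same SELECT, so iterating over it issues no extra queries:

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import (declarative_base, joinedload, relationship,
                            sessionmaker)

Base = declarative_base()

class Customerorder(Base):
    __tablename__ = "customerorders"
    id = Column(Integer, primary_key=True)
    name = Column(String(100))
    itemsbought = relationship("Itemsbought", back_populates="order")

class Itemsbought(Base):
    __tablename__ = "itemsbought"
    id = Column(Integer, primary_key=True)
    orderid = Column(Integer, ForeignKey("customerorders.id"))
    order = relationship("Customerorder", back_populates="itemsbought")

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()
session.add(Customerorder(name="alice",
                          itemsbought=[Itemsbought(), Itemsbought()]))
session.commit()
session.expire_all()  # force a fresh load below

# One LEFT OUTER JOIN query loads the orders and their collections together;
# accessing .itemsbought afterwards emits no further SQL.
orders = (session.query(Customerorder)
          .options(joinedload(Customerorder.itemsbought))
          .all())
print(len(orders), len(orders[0].itemsbought))  # 1 2
```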
Re: [sqlalchemy] simple relationship join ends up in 1 s
OK, keep going with the profiling. What's the total number of rows you are seeing with the joined load, 5000 total? Sometimes a join is actually scrolling through many more rows than that, duplicate rows in the case of a cartesian product; echo='debug' would show that. Or there could be very large textual fields in rows that are being fetched repeatedly (in which case subqueryload() might be better). There's a lot of reasons this could happen.

On Sep 12, 2014, at 3:16 AM, tatütata Okay rogersher...@googlemail.com wrote: [earlier quoted messages trimmed]
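The subqueryload() alternative mentioned above emits a second SELECT for the collection instead of joining, so a wide parent column is not re-fetched once per child row. A minimal sketch with hypothetical models (SQLite in memory):

```python
from sqlalchemy import Column, ForeignKey, Integer, Text, create_engine
from sqlalchemy.orm import (declarative_base, relationship,
                            sessionmaker, subqueryload)

Base = declarative_base()

class Order(Base):
    __tablename__ = "orders"
    id = Column(Integer, primary_key=True)
    comment = Column(Text)  # wide field: a join would repeat it per item
    items = relationship("Item")

class Item(Base):
    __tablename__ = "items"
    id = Column(Integer, primary_key=True)
    orderid = Column(Integer, ForeignKey("orders.id"))

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()
session.add(Order(comment="x" * 10000, items=[Item(), Item(), Item()]))
session.commit()
session.expire_all()

# Two SELECTs total: one for the orders, one for all of their items;
# the 10000-character comment travels over the wire only once per order.
orders = session.query(Order).options(subqueryload(Order.items)).all()
print(len(orders[0].items))  # 3
```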
Re: [sqlalchemy] simple relationship join ends up in 1 s
I think you need to profile the application like Mike said. Just to be thorough, I'd suggest you time the Query execution and the ORM population separately. Do you have an example of the faster raw SQL that you're trying to generate? When you do a query on the command line it is often faster because the database is only displaying the initial data on the screen. When you're using something like SQLAlchemy, the app needs to get all the data and then create objects. This is usually lightweight, but if you have some columns with a lot of data, there will be a hit. When you profile, you'll have a better idea if the query, the loader, or the connection is to blame.
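Splitting the timing that way can be sketched like this (hypothetical one-table model, SQLite in memory): time the raw row fetch through Core, then the same query through the ORM; the difference approximates the object-population cost.

```python
import time

from sqlalchemy import Column, Integer, String, create_engine, select
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class Customer(Base):
    __tablename__ = "customer"
    id = Column(Integer, primary_key=True)
    name = Column(String(100))

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()
session.add_all([Customer(name=f"c{i}") for i in range(2000)])
session.commit()
session.expire_all()

t0 = time.time()
with engine.connect() as conn:          # query execution + raw fetch only
    raw = conn.execute(select(Customer.__table__)).fetchall()
t1 = time.time()
objs = session.query(Customer).all()    # same query plus ORM population
t2 = time.time()

print(len(raw), len(objs))  # 2000 2000
print(f"core: {t1 - t0:.4f}s  orm: {t2 - t1:.4f}s")
```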
Re: [sqlalchemy] simple relationship join ends up in 1 s
for some tips on isolating where the speed issue is, follow the steps at: http://docs.sqlalchemy.org/en/latest/faq.html#how-can-i-profile-a-sqlalchemy-powered-application. For too many joins, you also want to look into EXPLAIN PLAN.

On Sep 11, 2014, at 12:57 PM, tatütata Okay rogersher...@googlemail.com wrote:

Hey, I've got some shop tables: customerorders, itemsbought, products. The whole query with three joins takes nothing in SQL on the command line. When I try to achieve the same thing with the ORM and my relationships it takes 1 second, and joining all 3 tables takes 2 seconds. May I first come up with my table and mapper definitions:

    class Products(Base):
        __tablename__ = 'products'
        id = Column(Integer, primary_key=True)
        productname = Column(Unicode(255))
        productname_en = Column(Unicode(255), default=u'nice product')
        path = Column(Unicode(300))
        pic = Column(Unicode(300), unique=True)
        bigpic = Column(Unicode(300), unique=True)
        sold = Column(Boolean)
        price = Column(Integer)
        locked = Column(Boolean)
        sort = Column(Integer)
        catid = Column(Integer)
        sizeid = Column(Integer, ForeignKey('productsizes.sizeid'))
        productsizes = relationship(ProductSizes, backref='products', lazy='joined')
        itemsbought = relationship(Itemsbought, backref='products')

        def __init__(self, productname, path, pic, bigpic, sold, price,
                     sizeid, locked, sort, catid):
            self.productname = productname
            self.path = path
            self.pic = pic
            self.bigpic = bigpic
            self.sold = sold
            self.price = price
            self.sizeid = sizeid
            self.locked = locked
            self.sort = sort
            self.catid = catid

    class Customerorder(Base):
        __tablename__ = 'customerorders'
        id = Column(Integer, primary_key=True)
        ip = Column(Unicode(40))
        comment = Column(Unicode(1000))
        plz = Column(Unicode(30))
        name = Column(Unicode(100))
        street = Column(Unicode(200))
        ort = Column(Unicode(100))
        date = Column(DateTime, default=func.now())
        money_received = Column(Boolean)
        shipped = Column(Boolean)

        def __init__(self, comment, ip, street, plz, ort, name, money_received):
            self.ip = ip
            self.customerid = customerid  # bug: 'customerid' is not a parameter here
            self.comment = comment
            self.street = street
            self.plz = plz
            self.ort = ort
            self.name = name
            self.money_received = money_received
            self.shipped = shipped  # bug: 'shipped' is not a parameter here

    class Itemsbought(Base):
        __tablename__ = 'itemsbought'
        id = Column(Integer, primary_key=True)
        orderid = Column(Integer, ForeignKey('customerorders.id'))
        order = relationship('Customerorder', backref=backref('itemsbought'))
        productid = Column(Integer, default=0)

        def __init__(self, productid, orderid):
            self.productid = productid
            self.orderid = orderid

I try to query only Customerorder and Itemsbought, without those relations to products:

    customerorders = sqlsession.query(Customerorder)\
        .join(Itemsbought.order)\
        .all()

or with joinedload, outerjoins, all the same thing; the log always shows me SQLAlchemy making left outer joins. Trying to use this data in a loop ends up in 1700 queries:

    for order in customerorders:
        print order.name,
        for item in order.itemsbought:
            print item.productid

I managed to get all this correct data with sizes and products in 2.5 s with joinedload, but this is way too long for 5k entries in itemsbought and 1680 orders. Can someone point to what I am doing wrong? And what's the best way to get all orders in a clearly arranged way with products, price, sizes etc.? Cheers
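The FAQ's profiling recipe boils down to wrapping the suspect call in cProfile; a generic sketch (the helper name `profiled` is made up here, and the trivial `sum` call stands in for a real query):

```python
import cProfile
import io
import pstats

def profiled(fn, *args, **kwargs):
    """Run fn under cProfile and print the 20 hottest entries by cumulative time."""
    pr = cProfile.Profile()
    pr.enable()
    result = fn(*args, **kwargs)
    pr.disable()
    s = io.StringIO()
    pstats.Stats(pr, stream=s).sort_stats("cumulative").print_stats(20)
    print(s.getvalue())
    return result

# In a real app you would profile the query itself, e.g.:
#   profiled(lambda: session.query(Customerorder).all())
total = profiled(sum, range(1000))
print(total)  # 499500
```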
Re: [sqlalchemy] simple relationship join ends up in 1 s
Yeah, that would have been my next step. I hoped more for some hints on whether the setup is correct, and whether there are issues with the relationship configuration. I thought speed-related problems only come up at far larger row counts? We're talking about 5000 here; that's why I think I made a mistake in the configuration of the relationship patterns.

On Thu, Sep 11, 2014 at 8:12 PM, Michael Bayer mike...@zzzcomputing.com wrote: [earlier quoted messages trimmed]
Re: [sqlalchemy] simple relationship join ends up in 1 s
three joins over 100K rows is going to be very slow, two seconds seems in the ballpark. EXPLAIN will show if there are any table scans taking place. as far as why the relationship is faster, it might be doing something different, check the echo=True output. On Sep 11, 2014, at 2:35 PM, tatütata Okay rogersher...@googlemail.com wrote: Yeah that would have been my next steps. I hoped more for some hints if the setup is correct more if there are issues with the relationship configuration? I think speed related things come up when there are over 10 entries and more? We speak here about 5000, thats why I think I did a mistake on the configuration of the relation patterns? On Thu, Sep 11, 2014 at 8:12 PM, Michael Bayer mike...@zzzcomputing.com wrote: for some tips on isolating where the speed issue is, follow the steps at: http://docs.sqlalchemy.org/en/latest/faq.html#how-can-i-profile-a-sqlalchemy-powered-application for too many joins, you also want to look into EXPLAIN PLAN. On Sep 11, 2014, at 12:57 PM, tatütata Okay rogersher...@googlemail.com wrote: Hey, I got some shop tables like: customerorders, itemsbought, products whole query with three joins takes nothing on command line sql. When I trying to achieve same thing with ORM and my relationship it takes 1 second, joining all three 3 tables 2 seconds? 
May I come first up with my table and mapper definitions: class Products(Base): __tablename__ = 'products' id = Column(Integer, primary_key=True) productname = Column(Unicode(255)) productname_en =Column(Unicode(255), default = u'nice product') path = Column(Unicode(300)) pic = Column(Unicode(300), unique=True) bigpic = Column(Unicode(300), unique=True) sold = Column(Boolean) price = Column(Integer) locked = Column(Boolean) sort = Column(Integer) catid = Column(Integer) sizeid = Column(Integer, ForeignKey('productsizes.sizeid')) productsizes = relationship(ProductSizes, backref=products, lazy='joined') itemsbought = relationship(Itemsbought, backref=products) def __init__(self, productname, path, pic, bigpic, sold, price, sizeid, locked, sort, catid): self.productname = productname self.path = path self.pic = pic self.bigpic = bigpic self.sold = sold self.price = price self.sizeid = sizeid self.locked = locked self.sort = sort self.catid = catid class Customerorder(Base): __tablename__ = 'customerorders' id = Column(Integer, primary_key=True) ip = Column(Unicode(40)) comment = Column(Unicode(1000)) plz = Column(Unicode(30)) name = Column(Unicode(100)) street = Column(Unicode(200)) ort = Column(Unicode(100)) date = Column(DateTime, default = func.now()) money_received = Column(Boolean) shipped = Column(Boolean) def __init__(self, comment, ip, street, plz, ort, name, money_received): self.ip = ip self.customerid = customerid self.comment = comment self.street = street self.plz = plz self.ort = ort self.name = name self.money_received = money_received self.shipped = shipped class Itemsbought(Base): __tablename__ = 'itemsbought' id = Column(Integer, primary_key=True) orderid = Column(Integer, ForeignKey('customerorders.id')) order = relationship('Customerorder', backref=backref(itemsbought)) productid = Column(Integer, default = 0) def __init__(self, productid, orderid): self.productid = productid self.orderid = orderid I try to query only Customerorder and Itemsbought 
without having those relations to products:

    customerorders = sqlsession.query(Customerorder)\
        .join(Itemsbought.order)\
        .all()

With joinedload or outer joins it is all the same thing; the log shows that SQLAlchemy always emits LEFT OUTER JOINs. Using this data in a loop ends up in 1700 queries:

    for order in customerorders:
        print order.name,
        for item in order.itemsbought:
            print item.productid

I managed to get all of this data correctly, with sizes and products, in 2.5 s with joinedload, but that is way too long for 5k entries in itemsbought and 1680 orders. Can someone point out what I am doing wrong? And what is the best way to get all orders in a clearly arranged way with products, price, sizes, etc.?

Cheers

--
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.
Okay, I will try it. Maybe you got it wrong: I have only 5k rows.

On Thu, Sep 11, 2014 at 8:55 PM, Michael Bayer mike...@zzzcomputing.com wrote:

Three joins over 100K rows is going to be very slow; two seconds seems in the ballpark. EXPLAIN will show if there are any table scans taking place. As far as why the relationship is faster, it might be doing something different; check the echo=True output.

On Sep 11, 2014, at 2:35 PM, tatütata Okay rogersher...@googlemail.com wrote:

Yeah, those would have been my next steps. I was hoping more for hints on whether the setup is correct, or whether there are issues with the relationship configuration. I thought speed-related problems only come up with many more entries than this? We are talking about 5000 here; that is why I think I made a mistake in the configuration of the relationship patterns.

On Thu, Sep 11, 2014 at 8:12 PM, Michael Bayer mike...@zzzcomputing.com wrote:

For some tips on isolating where the speed issue is, follow the steps at: http://docs.sqlalchemy.org/en/latest/faq.html#how-can-i-profile-a-sqlalchemy-powered-application. For too many joins, you also want to look into EXPLAIN PLAN.

On Sep 11, 2014, at 12:57 PM, tatütata Okay rogersher...@googlemail.com wrote:

Hey, I have some shop tables: customerorders, itemsbought, products. The whole query with three joins takes almost no time in command-line SQL. When I try to achieve the same thing with the ORM and my relationships, it takes 1 second, and joining all three tables takes 2 seconds.
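[Editor's note: the profiling approach from the FAQ linked above can be sketched with the standard library alone. This is a generic cProfile harness, where run_query is a hypothetical stand-in for whatever function actually runs query(...).all(); it produces the same kind of cumulative-time listing shown at the top of this thread.]

```python
import cProfile
import io
import pstats

def run_query():
    # Stand-in workload; replace with the code that runs the ORM query.
    return sum(i * i for i in range(1000))

profiler = cProfile.Profile()
profiler.enable()
run_query()
profiler.disable()

# Sort by cumulative time, as in the trace posted earlier in the thread,
# and show the five most expensive call sites.
buf = io.StringIO()
stats = pstats.Stats(profiler, stream=buf)
stats.sort_stats("cumulative").print_stats(5)
report = buf.getvalue()
print(report)
```

Reading the report top-down shows where the time accumulates; in the trace above, most of it sits inside MySQLdb's cursor execute, i.e. in the database round trips rather than in Python.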
The query that you are doing:

    customerorders = sqlsession.query(Customerorder)\
        .join(Itemsbought.order)\
        .all()

probably doesn't do what you intend. In particular, it doesn't populate the Customerorder.itemsbought collection, so when you iterate over customerorders and access the itemsbought property, it issues a new query for each order. Try starting with something like this:

    customerorders = (sqlsession.query(Customerorder)
                      .options(saorm.joinedload('itemsbought'))
                      .all())

Hope that helps,

Simon

On 11 Sep 2014, at 20:54, Michael Bayer mike...@zzzcomputing.com wrote:

Then it sounds like you have lazy loading happening; check echo=True.
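[Editor's note: Simon's point about the collection not being populated is the classic N+1 query pattern. Independent of SQLAlchemy, the arithmetic can be reproduced with the stdlib sqlite3 module; the table and column names below are borrowed from the thread, and the sample data is made up.]

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customerorders (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE itemsbought (id INTEGER PRIMARY KEY,
                              orderid INTEGER REFERENCES customerorders(id),
                              productid INTEGER);
    INSERT INTO customerorders VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO itemsbought VALUES (1, 1, 10), (2, 1, 11), (3, 2, 12);
""")

# Lazy-loading pattern: one query for the orders, then one per order
# when each itemsbought collection is first touched (N+1 in total).
lazy_queries = 1
orders = conn.execute("SELECT id, name FROM customerorders").fetchall()
for oid, _name in orders:
    conn.execute("SELECT productid FROM itemsbought WHERE orderid = ?",
                 (oid,)).fetchall()
    lazy_queries += 1

# Eager pattern: a single LEFT OUTER JOIN, which is what joinedload() emits;
# every order/item pair comes back in one round trip.
eager_rows = conn.execute("""
    SELECT o.id, o.name, i.productid
    FROM customerorders o
    LEFT OUTER JOIN itemsbought i ON i.orderid = o.id
""").fetchall()

print(lazy_queries)     # 3 queries for 2 orders
print(len(eager_rows))  # 3 joined rows from 1 query
```

With 1680 orders the lazy pattern issues 1681 queries, which matches the roughly 1700 queries reported earlier in the thread; the joined version stays at one.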