[sqlalchemy] How to best select from a large data set
Hi, I'm in a particular situation where our ORM setup is taking way too long to perform a selection from a large data set. In my database I have 5000 customers who made purchases and made some form of payment. I need to find the names of all customers who made payments by cash. My SQL query looks like this: SELECT customers.name, payments.payid FROM customers, purchases, payments WHERE customers.cid = purchases.cid AND purchases.payid = payments.payid AND payments.pay_type = 'cash' Currently my code looks like this: customers = Customer.select() for customer in customers: for purchase in customer.purchases: if purchase.payment.pay_type == 'cash': print %s, %s % (customer.name, purchase.payment.payid) Customer is a class with a mapper assigned to it using assign_mapper() (please see the source[1] for details). In my file (below) I perform the query three ways: using mappers, using a statement which I assemble manually, and using a MySQL query. The timings are as follows: $ time python dbprofile.py mapper /dev/null real0m19.256s user0m11.501s sys 0m5.416s $ time python dbprofile.py raw sql /dev/null real0m0.186s user0m0.128s sys 0m0.044s $ time python dbprofile.py statement /dev/null real0m0.346s user0m0.244s sys 0m0.084s What's the best way to execute this query fast? I have read the Working with Large Collections section of the documentation, but it seems to deal mostly with deletion. Is there a custom mapper I could define which would select the data I need in one DB query? [1] Source here: http://rafb.net/p/vaELsx86.html Thanks! Mike --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups sqlalchemy group. To post to this group, send email to sqlalchemy@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en -~--~~~~--~~--~--~---
[sqlalchemy] Re: SQLAlchemy 0.4 beta5 (OperationalError) database table is locked '\nDROP TABLE os' () sqlite 3.4.2
--On 14. September 2007 10:07:27 -0700 jason kirtland [EMAIL PROTECTED] wrote: jean-marc pouchoulon wrote: helo, I don't understand why my sqlite database is locked on a drop ( the code is following) thanks for your explanation. [...snip...] ordinateurs = computers_table.select(computers_table.c.prix 100.0).execute() ordinateur_pas_cher = ordinateurs.fetchone() print (marque d'un ordi pas cher: %s ) % ordinateur_pas_cher.marque ordinateurs = computers_table.select(and_((computers_table.c.prix 1000.0),(computers_table.c.prix 1000.0))).execute() ordinateur_medium_price = ordinateurs.fetchone() print (marque d'un ordi medium price: %s ) % ordinateur_pas_cher.marque ordinateurs = select([computers_table.c.marque,computers_table.c.prix]).execute() for ordi in ordinateurs: print marque:%s prix:%s % (ordi[0],ordi[1]) result_os.close() ordinateurs.close() I am currently facing a similar issue using SA 0.4b5. The following unittest (taken from z3c.sqlalchemy) works with SA 0.3.X but fails with SA 0.4b5 (fails using SQLite but works against a Postgres database): def testSimplePopulation(self): db = createSAWrapper(self.dsn) # obtain mapper for table 'users' User = db.getMapper('users') session = db.session session.save(User(firstname='udo', lastname='juergens')) session.save(User(firstname='heino', lastname='n/a')) session.flush() rows = session.query(User).all() self.assertEqual(len(rows), 2) == ERROR: testSimplePopulation (tests.testSQLAlchemy.WrapperTests) -- Traceback (most recent call last): File /opt/python-2.4.4/lib/python2.4/unittest.py, line 260, in run testMethod() File /Users/ajung_data/sandboxes/haufe.sqlalchemy/lib/python/z3c/sqlalchemy/tests/testSQLAlchemy.py, line 79, in testSimplePopulation session.flush() File /Users/ajung_data/sandboxes/haufe.sqlalchemy/lib/python/sqlalchemy/orm/session.py, line 665, in flush self.uow.flush(self, objects) File /Users/ajung_data/sandboxes/haufe.sqlalchemy/lib/python/sqlalchemy/orm/unitofwork.py, line 202, in flush flush_context.execute() File /Users/ajung_data/sandboxes/haufe.sqlalchemy/lib/python/sqlalchemy/orm/unitofwork.py, line 414, in execute UOWExecutor().execute(self, head) File /Users/ajung_data/sandboxes/haufe.sqlalchemy/lib/python/sqlalchemy/orm/unitofwork.py, line 1033, in execute self.execute_save_steps(trans, task) File /Users/ajung_data/sandboxes/haufe.sqlalchemy/lib/python/sqlalchemy/orm/unitofwork.py, line 1047, in execute_save_steps self.save_objects(trans, task) File /Users/ajung_data/sandboxes/haufe.sqlalchemy/lib/python/sqlalchemy/orm/unitofwork.py, line 1038, in save_objects task.mapper.save_obj(task.polymorphic_tosave_objects, trans) File /Users/ajung_data/sandboxes/haufe.sqlalchemy/lib/python/sqlalchemy/orm/mapper.py, line 1155, in save_obj c = connection.execute(statement.values(value_params), params) File /Users/ajung_data/sandboxes/haufe.sqlalchemy/lib/python/sqlalchemy/engine/base.py, line 789, in execute return Connection.executors[c](self, object, multiparams, params) File /Users/ajung_data/sandboxes/haufe.sqlalchemy/lib/python/sqlalchemy/engine/base.py, line 821, in _execute_clauseelement return self._execute_compiled(elem.compile(dialect=self.dialect, parameters=param, inline=executemany), multiparams, params) File /Users/ajung_data/sandboxes/haufe.sqlalchemy/lib/python/sqlalchemy/engine/base.py, line 832, in _execute_compiled self.__execute_raw(context) File /Users/ajung_data/sandboxes/haufe.sqlalchemy/lib/python/sqlalchemy/engine/base.py, line 850, in __execute_raw self._cursor_execute(context.cursor, context.statement, parameters, context=context) File /Users/ajung_data/sandboxes/haufe.sqlalchemy/lib/python/sqlalchemy/engine/base.py, line 867, in _cursor_execute raise exceptions.DBAPIError.instance(statement, parameters, e) OperationalError: (OperationalError) database is locked u'INSERT INTO users (firstname, lastname) VALUES (?, ?)' ['udo', 'juergens'] pgpCYWmKk8b7Y.pgp Description: PGP signature
[sqlalchemy] Re: How to best select from a large data set
In my database I have 5000 customers who made purchases and made some form of payment. I need to find the names of all customers who made payments by cash. My SQL query looks like this: SELECT customers.name, payments.payid FROM customers, purchases, payments WHERE customers.cid = purchases.cid AND purchases.payid = payments.payid AND payments.pay_type = 'cash' have a look at 'querying with joins' in the docs. Also, 0.4 docs are better explaning some things than 0.3 (but check for feature availability). non-orm: q = customers_table.join( purchases_table).join( payments_table).select( payments_table.c.pay_type=='cash').execute() for a in q: print a u have something like this anyway. orm: q = Customer.query().add_entity( Payment).join( ['purchases', 'payments'] ).filter( Payment.pay_type=='cash') for a in q: print a here u'll get tuples (objCustomer,objPayment) in both cases, for further optimization, u could (somehow?) preselect the columns u need instead of getting all. Especialy for the orm/Customers, u'll get whole purchases[] relations loaded unless they are lazy. u can also make a mapper directly over the table-join-select: class CusPay: pass m= Mapper( CusPay, customers_table.join( purchases_table).join( payments_table).select( payments_table.c.pay_type=='cash') ) for a in m.query(): print a or something of sorts, avoiding any extra columns/relation/whatever. Currently my code looks like this: for customer in customers: for purchase in customer.purchases: if purchase.payment.pay_type == 'cash': print %s, %s % (customer.name, purchase.payment.payid) u are not using any db-filtering, u are filtering all by hand in python. so u're instantiating all them customers, then for each loading all the purchases, then printing those =='cash' - essentialy loading whole db. ciao svilen --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups sqlalchemy group. To post to this group, send email to sqlalchemy@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en -~--~~~~--~~--~--~---