Hi, I did some profiling of a simple select-type query to check SA performance for such use cases.
The test consists of a very simple Order record:

    order = Table('tblorder', metadata,
        Column('order_id', Integer, primary_key=True),
        Column('order_user_id', Integer),
        Column('order_state', Integer),
        Column('order_price', Integer),
        Column('order_delivery_cost', Integer),
        Column('order_creation_date', DateTime),
        Column('order_modification_date', DateTime),
        Column('order_last_payment_check', DateTime),
        Column('order_delivery_address_id', Integer),
    )

and some related log lines that keep track of what happens to the order:

    orderlog = Table('tblorderlog', metadata,
        Column('orderlog_id', Integer, primary_key=True),
        Column('orderlog_order_id', Integer, ForeignKey('tblorder.order_id')),
        Column('orderlog_creation_date', DateTime),
        Column('orderlog_message', String),
        Column('orderlog_priority', Integer),
    )

On the ORM side of things these two tables have a simple one-to-one mapping to the classes Order and OrderLog, and the log lines are mapped to the order using:

    'orderLogs': relation(OrderLog, backref='order')

The main loop of the test program is the following:

    times = 1000
    start = time.time()
    for i in range(times):
        order = Session.query(Order).get(72244)
        for log in order.orderLogs:
            pass
        Session.clear()
    end = time.time()

Note that it is the same Order I am fetching every time, because I am interested in the SA overhead and not in the speed of the database (MySQL) itself; this way MySQL will fetch from memory and not be disk bound (I also made sure this test is not network bound). Also note that the session is cleared on each iteration through the loop, to force SA to perform the two queries and the corresponding mapping overhead (one query to fetch the order, one query to fetch the 17 corresponding orderlog lines) on every iteration.

Profiling with cProfile (Python 2.5.1) gave the following results (sorted by total time spent in each function):

    2923724 function calls (2880671 primitive calls) in 5.965 CPU seconds

    Ordered by: internal time
    List reduced from 498 to 20 due to restriction <20>

    ncalls          tottime  percall  cumtime  percall  filename:lineno(function)
    2000            0.585    0.000    0.585    0.000    {method 'query' of '_mysql.connection' objects}
    18000           0.192    0.000    1.715    0.000    mapper.py:1407(_instance)
    18000           0.160    0.000    0.801    0.000    mapper.py:1381(populate_state)
    2000            0.153    0.000    0.259    0.000    base.py:1448(_init_metadata)
    20000           0.153    0.000    0.198    0.000    times.py:43(DateTime_or_None)
    36000           0.147    0.000    0.147    0.000    identity.py:91(__contains__)
    16000           0.139    0.000    0.270    0.000    compiler.py:239(visit_column)
    112000          0.130    0.000    0.240    0.000    base.py:1363(__getitem__)
    94000           0.127    0.000    0.318    0.000    strategies.py:71(new_execute)
    18000           0.122    0.000    0.122    0.000    attributes.py:745(__init__)
    42000/2000      0.114    0.000    1.233    0.001    compiler.py:176(process)
    162036/162022   0.113    0.000    0.134    0.000    {getattr}
    14000           0.106    0.000    0.541    0.000    compiler.py:228(visit_label)
    2000            0.103    0.000    1.222    0.001    compiler.py:466(visit_select)
    112000          0.103    0.000    0.110    0.000    base.py:1617(_get_col)
    32000           0.089    0.000    0.158    0.000    compiler.py:401(_truncated_identifier)
    72000           0.086    0.000    0.096    0.000    util.py:806(add)
    135101          0.085    0.000    0.085    0.000    {isinstance}
    18000           0.073    0.000    0.181    0.000    identity.py:104(add)
    46000           0.068    0.000    0.080    0.000    {built-in method sub}

What we see here is that of the total of 5.965 seconds, the test spent only about 0.585 seconds performing the actual SQL queries, or about 10% of the total time. The other 90% of the time is spent on SA overhead (except for the times.py line, which is somewhere in the MySQLdb driver).
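For reference, a profile like the one above can be gathered with a cProfile/pstats invocation along these lines (just a sketch; run_test() is a hypothetical wrapper around the loop shown earlier, and the exact invocation I used is not reproduced here):

    import cProfile, pstats

    # run the test loop under the profiler and dump the raw stats to a file
    cProfile.run('run_test()', 'order_profile')

    # print the top 20 entries sorted by internal (total) time
    stats = pstats.Stats('order_profile')
    stats.sort_stats('time').print_stats(20)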
If I look at the cumulative output of the profiler (omitted for brevity), I see for instance that SA spends 1.3 seconds compiling the SQL expressions (it apparently recompiles the same SQL expression on each iteration). My question to the SA community would be: how much ORM overhead do you find acceptable? And how does SA compare in this regard to, for instance, Java's Hibernate or other ORM solutions? I would love to help out trying to reduce this overhead, but I have no idea what the SA devs have in mind for the future of SA with regard to speed (e.g. speed vs. feature completeness). The tutorials and documentation currently place a strong emphasis on use cases highlighting the very dynamic nature of building complex queries with SA's ORM tools, and not so much on how to perform simple (and static) queries fast using the ORM. Maybe there is a way to memoize the result of the compile step so that it does not need to be redone every time while the Query stays the same?
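Conceptually I imagine something along the lines of the sketch below. This is purely illustrative on my part: the cache dict and the compile_cached() helper are not existing SA APIs, and hooking something like this into Query internals is exactly the part I don't know how to do.

    # Illustration of the memoization idea only, not an existing SA feature:
    # cache the Compiled object per statement so compilation happens once.
    _compiled_cache = {}

    def compile_cached(stmt):
        key = id(stmt)  # assumes the identical statement object is reused each time
        if key not in _compiled_cache:
            _compiled_cache[key] = stmt.compile()  # ClauseElement.compile()
        return _compiled_cache[key]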