[sqlalchemy] Simple select is really slow when executed via ORM

2011-08-12 Thread Massi
Hi everyone,

I'm doing some test to evaluate the performance of querying with
sqlalchemy via ORM. I wrote a simple script to measure the execution
time of a simple select query made on relatively small table (300 000
records, 6 columns) in sqlite. Here is the script:

from sqlalchemy import *
from sqlalchemy.orm import mapper, sessionmaker
from time import clock

engine = create_engine('sqlite:///test.db', echo=False)
metadata = MetaData()

table = Table('projects', metadata,
Column('id', Integer, primary_key=True),
Column('inp1', String(50)),
Column('inp2', String(50)),
Column('inp3', String(50)),
Column('inp4', String(50)),
Column('inp5', String(50)),
)

class Project(object) :
pass

mapper(Project, table)
metadata.create_all(engine)

t = []
for i in range(30) :
t.append({inp1:str(i), inp2:str(i), inp3:str(i),
  inp4:str(i), inp5:str(i)})

c = clock()
engine.execute(table.insert(), t)
print Insert: +str(clock()-c)
session = sessionmaker(bind=engine)()
c = clock()
res = engine.execute(table.select()).fetchall()
print Sql query: +str(clock()-c)
c = clock()
res = session.query(Project).all()
print Session query: +str(clock()-c)

On my PC (windows 7, 64-bit, intel i7 2.93 Ghz) this is the output:
Insert: 3.41080167807
Sql query: 1.26728367673
Session query: 19.6452334842

The execution time of the ORM query is about 20 times the SQL one, and
this is definitely discouraging. So I guess if I'm doing something
wrong or if there are some tricks when using ORM that I'm not
considering. Any help is really appreciated.
Thanks in advance!

-- 
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com.
To unsubscribe from this group, send email to 
sqlalchemy+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en.



Re: [sqlalchemy] Simple select is really slow when executed via ORM

2011-08-12 Thread Michael Bayer
Here's a comparison that begins to be slightly fair regarding the work of 
fetching raw rows versus generating and identity-managing full object rows.   
On my mac the SQL query takes 7 seconds and the Session query 13.7, so 
twice as slow, rather than 20.  The difference is we are actually fetching the 
*data* from the SQLite result row, and additionally generating some real Python 
objects, with a check in the session identity map to simulate a small amount of 
ORM bookkeeping.The ORM maintains an additional object known as 
InstanceState for each mapped object, and establishing this object adds a 
significant chunk of time as well.  Python is very slow in creating objects, 
and in calling functions.   Here, each tiny little additional thing the ORM 
does is multiplied by 30, so it adds up. As you can see, just adding a 
small handful of steps to the SQL version gets it much closer very quickly.

For a 300K row grab you are better off fetching tuples, that is 
query(Project.id, Project.inp1, Project.inp2), etc., which eliminates all the 
ORM bookkeeping associated with mapped objects.

from sqlalchemy import *
from sqlalchemy.orm import mapper, sessionmaker
from time import clock
import os

if os.path.exists(test.db):
os.remove(test.db)

engine = create_engine('sqlite:///test.db', echo=False)
metadata = MetaData()

table = Table('projects', metadata,
   Column('id', Integer, primary_key=True),
   Column('inp1', String(50)),
   Column('inp2', String(50)),
   Column('inp3', String(50)),
   Column('inp4', String(50)),
   Column('inp5', String(50)),
)

class Project(object) :
   pass

mapper(Project, table)
metadata.create_all(engine)

t = []
for i in range(30) :
   t.append({inp1:str(i), inp2:str(i), inp3:str(i),
 inp4:str(i), inp5:str(i)})

c = clock()
engine.execute(table.insert(), t)
print Insert: +str(clock()-c)
session = sessionmaker(bind=engine)()

class UnmappedProject(object):
pass

c = clock()
res = []
for row in engine.execute(table.select()):
identity_key = row[table.c.id]
if identity_key in session.identity_map:
# here we'd use existing object
assert False
else:
obj = UnmappedProject()
for col in table.c:
setattr(obj, col.key, row[col])
res.append(obj)

print Sql query: +str(clock()-c)
c = clock()
res = session.query(Project).all()
print Session query: +str(clock()-c)



On Aug 12, 2011, at 8:51 AM, Massi wrote:

 Hi everyone,
 
 I'm doing some test to evaluate the performance of querying with
 sqlalchemy via ORM. I wrote a simple script to measure the execution
 time of a simple select query made on relatively small table (300 000
 records, 6 columns) in sqlite. Here is the script:
 
 from sqlalchemy import *
 from sqlalchemy.orm import mapper, sessionmaker
 from time import clock
 
 engine = create_engine('sqlite:///test.db', echo=False)
 metadata = MetaData()
 
 table = Table('projects', metadata,
Column('id', Integer, primary_key=True),
Column('inp1', String(50)),
Column('inp2', String(50)),
Column('inp3', String(50)),
Column('inp4', String(50)),
Column('inp5', String(50)),
 )
 
 class Project(object) :
pass
 
 mapper(Project, table)
 metadata.create_all(engine)
 
 t = []
 for i in range(30) :
t.append({inp1:str(i), inp2:str(i), inp3:str(i),
  inp4:str(i), inp5:str(i)})
 
 c = clock()
 engine.execute(table.insert(), t)
 print Insert: +str(clock()-c)
 session = sessionmaker(bind=engine)()
 c = clock()
 res = engine.execute(table.select()).fetchall()
 print Sql query: +str(clock()-c)
 c = clock()
 res = session.query(Project).all()
 print Session query: +str(clock()-c)
 
 On my PC (windows 7, 64-bit, intel i7 2.93 Ghz) this is the output:
 Insert: 3.41080167807
 Sql query: 1.26728367673
 Session query: 19.6452334842
 
 The execution time of the ORM query is about 20 times the SQL one, and
 this is definitely discouraging. So I guess if I'm doing something
 wrong or if there are some tricks when using ORM that I'm not
 considering. Any help is really appreciated.
 Thanks in advance!
 
 -- 
 You received this message because you are subscribed to the Google Groups 
 sqlalchemy group.
 To post to this group, send email to sqlalchemy@googlegroups.com.
 To unsubscribe from this group, send email to 
 sqlalchemy+unsubscr...@googlegroups.com.
 For more options, visit this group at 
 http://groups.google.com/group/sqlalchemy?hl=en.
 

-- 
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com.
To unsubscribe from this group, send email to 
sqlalchemy+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en.