On May 12, 2014, at 6:33 PM, Seth P <spadow...@gmail.com> wrote:

> Is it possible that the (primary key index (which is a composite index that 
> begins with gvkey, and is the only index on the table) isn't being used 
> because the the gvkey parameter is somehow passed as an integer rather than 
> as a string?

There's nothing in SQLAlchemy that coerces strings to integers.  If the actual 
type of the column on the DB is an integer, then there might be some conversion 
within pyodbc or the ODBC driver.

if you've got it narrowed down this much the next step is to figure out a raw 
pyodbc script that illustrates what the problem is. 


> The first EXEC below is pretty much instantaneous, whereas the second takes 
> about 8 seconds (and produces the same results).
> 
> EXEC sp_executesql
> N'SELECT sec_dprc.datadate AS sec_dprc_datadate, sec_dprc.prcod AS 
> sec_dprc_prcod
> FROM sec_dprc WHERE sec_dprc.gvkey = @gvkey ORDER BY sec_dprc.datadate',
> N'@gvkey VARCHAR(6)', '001045'
> 
> EXEC sp_executesql
> N'SELECT sec_dprc.datadate AS sec_dprc_datadate, sec_dprc.prcod AS 
> sec_dprc_prcod
> FROM sec_dprc WHERE sec_dprc.gvkey = @gvkey ORDER BY sec_dprc.datadate',
> N'@gvkey INT', 001045
> 
> 
> 
> On Monday, May 12, 2014 5:00:27 PM UTC-4, Michael Bayer wrote:
> 
> well there's only one parameter being processed here so there is clearly 
> negligible difference in time spent within Python as far as getting the 
> statement ready to execute and then executing it.
> 
> So the time is either in what SQL Server spends fetching the rows, or the 
> number of rows being fetched (which seems to be the same).   Which leaves 
> pretty much that SQL Server is making a different choice about the query plan 
> for this SELECT statement, this is typically due to an INDEX being used or 
> not.    You'd need to analyze the plan being used.   With SQL Server, the 
> option to get a plan within programmatic execution seems to be per this 
> answer 
> http://stackoverflow.com/questions/7359702/how-do-i-obtain-a-query-execution-plan
>  to execute "SET SHOWPLAN_TEXT ON" ahead of time.
> 
> Besides that, you can confirm where the time is being spent exactly using 
> Python profiling.   A description on how to achieve that is here: 
> http://stackoverflow.com/questions/1171166/how-can-i-profile-a-sqlalchemy-powered-application/1175677#1175677
> 
> 
> 
> On May 12, 2014, at 3:48 PM, Seth P <spad...@gmail.com> wrote:
> 
>> After tracking down some extreme slowness in loading a one-to-many 
>> relationship (e.g. myobject.foobars), I seem to have isolated the issue to 
>> engine.execute() being much slower with parameterized queries than with 
>> explicit queries. The following is actual code and output for loading 10,971 
>> rows from a database table. (The actual table has more columns than I'm 
>> including here, and is not designed by me.) Note that each explicit query 
>> (where I explicitly set the WHERE clause parameter and pass the resulting 
>> SQL statement to engine.execute()) runs in under 0.1 seconds, whereas each 
>> parameterized query (where I let SQLAlchemy bind the WHERE clause parameter) 
>> takes over 8 seconds.
>> 
>> The difference in runtimes is smaller when the number of rows returned is 
>> smaller, which seems odd since I would have thought that the binding of the 
>> WHERE clause parameters is just done once and would be virtually 
>> instantaneous.
>> 
>> Any thoughts?
>> 
>> Thanks,
>> 
>> Seth
>> 
>> 
>> import sqlalchemy as sa
>> from sqlalchemy.orm import sessionmaker
>> from sqlalchemy.ext.declarative import declarative_base
>> import time
>> 
>> engine = sa.create_engine('mssql+pyodbc://Compustat/Compustat')
>> session = sessionmaker(bind=engine, autoflush=False, 
>> expire_on_commit=False)()
>> 
>> class FooBar(declarative_base()):
>>     __tablename__ = 'sec_dprc'
>>     gvkey = sa.Column(sa.String(6), primary_key=True)
>>     datadate = sa.Column(sa.DateTime, primary_key=True)
>>     value = sa.Column(sa.Float, name='prcod')
>> 
>> def print_timing(func):
>>     def wrapper(*arg):
>>         t1 = time.time()
>>         rows = func(*arg)
>>         t2 = time.time()
>>         print("%30s() len=%d, last=%s, runtime=%0.3fs" % (str(func).split(' 
>> at')[0][10:], len(rows), rows[-1], t2 - t1))
>>         return t2 - t1
>>     return wrapper
>> 
>> if __name__ == '__main__':
>> 
>>     key = '001045'
>>     query = session.query(FooBar.datadate, 
>> FooBar.value).filter(sa.and_(FooBar.gvkey == key)).order_by(FooBar.datadate)
>>     sql_select_statement_base = str(query)
>>     print(sql_select_statement_base)
>> 
>>     @print_timing
>>     def execute_explicit_query():
>>         sql_select_statement_explicit = 
>> sql_select_statement_base.replace(":gvkey_1", "'%s'" % key)
>>         rows = 
>> engine.execute(sa.text(sql_select_statement_explicit)).fetchall()
>>         return rows
>> 
>>     @print_timing
>>     def execute_parameterized_query():
>>         rows = engine.execute(sa.text(sql_select_statement_base), 
>> {'gvkey_1':key}).fetchall()
>>         return rows
>> 
>>     num_iterations = 5
>>     explicit_runtime = 0.0
>>     parameterized_runtime = 0.0
>>     for i in range(num_iterations):
>>         explicit_runtime += execute_explicit_query()
>>         parameterized_runtime += execute_parameterized_query()
>>     print("Total runtime for %d explicit queries = %0.3fs." % 
>> (num_iterations, explicit_runtime))
>>     print("Total runtime for %d parameterized queries = %0.3fs." % 
>> (num_iterations, parameterized_runtime))
>> 
>> 
>> SELECT sec_dprc.datadate AS sec_dprc_datadate, sec_dprc.prcod AS 
>> sec_dprc_prcod 
>> FROM sec_dprc 
>> WHERE sec_dprc.gvkey = :gvkey_1 ORDER BY sec_dprc.datadate
>>         execute_explicit_query() len=10971, last=(datetime.datetime(2014, 5, 
>> 9, 0, 0), 37.96), runtime=0.082s
>>    execute_parameterized_query() len=10971, last=(datetime.datetime(2014, 5, 
>> 9, 0, 0), 37.96), runtime=8.852s
>>         execute_explicit_query() len=10971, last=(datetime.datetime(2014, 5, 
>> 9, 0, 0), 37.96), runtime=0.032s
>>    execute_parameterized_query() len=10971, last=(datetime.datetime(2014, 5, 
>> 9, 0, 0), 37.96), runtime=8.754s
>>         execute_explicit_query() len=10971, last=(datetime.datetime(2014, 5, 
>> 9, 0, 0), 37.96), runtime=0.039s
>>    execute_parameterized_query() len=10971, last=(datetime.datetime(2014, 5, 
>> 9, 0, 0), 37.96), runtime=9.182s
>>         execute_explicit_query() len=10971, last=(datetime.datetime(2014, 5, 
>> 9, 0, 0), 37.96), runtime=0.028s
>>    execute_parameterized_query() len=10971, last=(datetime.datetime(2014, 5, 
>> 9, 0, 0), 37.96), runtime=9.416s
>>         execute_explicit_query() len=10971, last=(datetime.datetime(2014, 5, 
>> 9, 0, 0), 37.96), runtime=0.080s
>>    execute_parameterized_query() len=10971, last=(datetime.datetime(2014, 5, 
>> 9, 0, 0), 37.96), runtime=8.425s
>> Total runtime for 5 explicit queries = 0.260s.
>> Total runtime for 5 parameterized queries = 44.629s.
>> 
>> 
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "sqlalchemy" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to sqlalchemy+...@googlegroups.com.
>> To post to this group, send email to sqlal...@googlegroups.com.
>> Visit this group at http://groups.google.com/group/sqlalchemy.
>> For more options, visit https://groups.google.com/d/optout.
> 
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to sqlalchemy+unsubscr...@googlegroups.com.
> To post to this group, send email to sqlalchemy@googlegroups.com.
> Visit this group at http://groups.google.com/group/sqlalchemy.
> For more options, visit https://groups.google.com/d/optout.

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Reply via email to