On Thu, Dec 15, 2011 at 19:52, Jon Nelson <jnel...@jamponi.net> wrote:
> On Thu, Dec 15, 2011 at 12:01 PM, Michael Bayer
> <mike...@zzzcomputing.com> wrote:
>>
>> On Dec 15, 2011, at 12:51 PM, Jon Nelson wrote:
>>
>>> Up front, I'm not using the ORM at all, and I'm using SQLAlchemy 0.7.4
>>> with psycopg2 2.4.3 on PostgreSQL 8.4.10 on Linux x86_64.
>>>
>>> I did some performance testing. Selecting 75 million rows (a straight
>>> up SELECT colA from tableA) from a 5GB table yielded some interesting
>>> results.
>>> psycopg2 averaged between 485,000 and 585,000 rows per second.
>>> Using COPY (via psycopg2) the average was right around 585,000.
>>> sqlalchemy averaged between 160,000 and 190,000 rows per second.
>>> That's a pretty big difference.

Weird, IIRC, SA was much closer than raw psycopg2 (without using
COPY), in the range of SA adding a 50% overhead, not a 200% overhead.

>>> I briefly looked into what the cause could be, but I didn't see
>>> anything jump out at me (except RowProxy, maybe).
>>> Thoughts?
>>
>> Performance tests like this are fraught with complicating details (such as, 
>> did you fully fetch each column in each row in both cases?  Did you have 
>> equivalent unicode and numeric conversions in place in both tests ? ).   In 
>> this case psycopg2 is written in pure C and SQLAlchemy's result proxy only 
>> partially (did you use the C extensions ?).    You'd use the Python 
>> profiling module to get a clear picture for what difference there is in 
>> effort.   But using any kind of abstraction layer, especially one written in 
>> Python, will always add latency versus a pure C program.
>
> I pretty much did this:
> for row in rows:
>  count += 1

That test is probably flawed, as you don't fetch actual values. You
should try to access individual elements (either by iterating over the
row, or indexing it one way or another -- the speed difference can
vary quite a bit depending on that). You might get even worse results
with a proper test though ;-).

-- 
Gaëtan de Menten

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To post to this group, send email to sqlalchemy@googlegroups.com.
To unsubscribe from this group, send email to 
sqlalchemy+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en.

Reply via email to