OK, well, this is all quite disturbing!

I'd want to examine this more closely, but let's assume this is
really how it is, since that's how it looks (eesh).

So, you don't really want to use named cursors per se; you just want
psycopg2 to use server-side cursors in its underlying operations so that
it operates more efficiently... and the magical incantation you send to
psycopg2 in order for it to use them is conn.cursor("x") as
opposed to conn.cursor() (which to me is totally arbitrary... and
at this point I'm getting pretty annoyed with how utterly
undocumented psycopg2 is... but then again, so is MySQLdb).  So this is
total artifact-land, and it has no implication for SA growing some
new "fine-grained" cursor-based API; here we're just looking for a
"hack workaround" internally.

My first instinct is to say, "OK, then we just use the call
conn.cursor('x') to get cursors when using the postgres dialect."
However, I'm told this probably won't work for CRUD (i.e. INSERT,
UPDATE, etc.) operations, so we'd have to keep that in mind (SA
does scan incoming SQL for INSERT/UPDATE etc., so that's not an issue).
Also, is conn.cursor("x") more efficient in all cases?  Or is the
overhead of creating a named cursor significant, and only worthwhile when
you know you are going to fetch only a subset of rows from an
enormous result set?
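For what it's worth, the way a server-side cursor pays off is when you fetch in batches instead of all at once.  Here's a rough sketch of the kind of loop I mean (`stream_results` is just a name I'm making up here; the cursor can be any DB-API cursor, named or not):

```python
def stream_results(cursor, arraysize=100):
    """Yield rows from a DB-API cursor, fetching `arraysize` at a time.

    With a psycopg2 named (server-side) cursor, only `arraysize` rows
    sit in client memory at once; with a plain cursor, the driver has
    apparently already pulled the whole result set over the wire by
    the time execute() returns.
    """
    while True:
        rows = cursor.fetchmany(arraysize)
        if not rows:
            break
        for row in rows:
            yield row
```

With a named cursor, iterating `for row in stream_results(cur): ...` over Michele's 400k-row table should stay flat in memory; with a regular cursor, the 60MB hit has already happened at execute() time.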

So my second instinct is to go with the approach of using
conn.cursor("x") for all SELECT statements when using the postgres
dialect, *if* you tell the dialect you want that.  I'm thinking of
a flag, "use_named_cursors=True", passed to create_engine().  I'd really
rather not pollute the connect()/Connection/execute() API with this
concept if at all possible... I was initially thinking of doing
something like conn.with_cursor(mycursor).execute(), but that's
pretty explicit for something that should really be an internal
optimization.
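To make that concrete, the gating inside the dialect would look something like this.  This is only a sketch, not actual SA code; `choose_cursor`, `use_named_cursors`, and the naming scheme are all names I'm making up, and the SELECT check is as crude as the scan SA already does for INSERT/UPDATE:

```python
import itertools

# psycopg2 named cursors need a per-connection-unique name, so just
# number them off a counter (hypothetical scheme, not SA's).
_cursor_names = itertools.count(1)

def is_select(statement):
    # Crude detection, in the spirit of SA's existing statement scan.
    return statement.lstrip().upper().startswith("SELECT")

def choose_cursor(conn, statement, use_named_cursors=False):
    """Return a named (server-side) cursor for SELECTs when the
    engine-level flag is on; a plain client-side cursor otherwise."""
    if use_named_cursors and is_select(statement):
        # conn.cursor(<name>) is psycopg2's spelling for
        # "give me a server-side cursor".
        return conn.cursor("sa_%d" % next(_cursor_names))
    return conn.cursor()
```

That keeps the whole thing out of the public connect()/Connection/execute() surface: the user flips one engine flag and never sees a cursor name.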

Those are my thoughts so far.  Just FYI, if you want to "hack" your SA to
use named cursors in all cases, it's lines 282 and 322 of lib/
sqlalchemy/engine/base.py.


On Jan 9, 2007, at 11:40 AM, Michele Petrazzo wrote:

>
> Michael Bayer wrote:
>>
>> On Jan 9, 2007, at 5:24 AM, Michele Petrazzo wrote:
>>> Because, like what I wrote on the psycopg2 ml (lists.initd.org/
>>> pipermail/psycopg/2007-January/005250.html),
>>> without the "server side cursor", the memory usage on the client
>>> side is proportional to the number of records returned by the
>>> query (in my example, 400k records, 60MB RAM). I think this is
>>> because the driver fetches all the data from the server and makes
>>> it available via the fetchXXX methods.
>>
>> really? I'm not sure about psycopg2 specifically,
>
> Here, sure! Assuming I'm on linux and have the "ps" utility locally.
> The DB is on the same machine where I'm running the tests: PG 8.1.5
> and psycopg2 2.0.4.
>
> This is my test:
>
> <-code->
> import time
> import subprocess as SP
> import psycopg2
>
> NAMED_CURSOR = 1
>
> def get_mem():
>     # Wait for the memory to stabilize.
>     time.sleep(0.5)
>
>     # Get the memory usage with ps.
>     output = SP.Popen(["ps", "-eo", "rss,args"],
>                       stdout=SP.PIPE).communicate()[0]
>     for line in output.split('\n'):
>         if not line: continue
>         mem, pr_name = line.split(None, 1)
>         # Show only my python exe.
>         if "python " + __file__ not in pr_name: continue
>         return mem
>
> def create_cur():
>     if NAMED_CURSOR:
>         return con.cursor("test")
>     else:
>         return con.cursor()
>
> def exec_q(q):
>     cur = create_cur()
>     print "empty cursor ", get_mem()
>     cur.execute(q)
>     print "cursor after execute ", get_mem()
>     print "result --", cur.fetchone()
>     print "after fetchone ", get_mem()
>     cur.close()
>     print "cursor closed ", get_mem()
>
> print "python and modules ", get_mem()
>
> con = psycopg2.connect('user=test password=test dbname=book')
> print "connection ", get_mem()
>
> exec_q("SELECT COUNT(id) FROM book")
>
> exec_q("SELECT * FROM book")
>
>
> </-code->
>
> ## with NAMED_CURSOR = 1
>
> michele:~/tmp$ time python test_sql_mem.py
> python and modules  3928
> connection  4048
> empty cursor  4068
> cursor after execute  4080
> result -- (406500L,)
> after fetchone  4088
> cursor closed  4092
> empty cursor  4092
> cursor after execute  4092
> result -- (6087, 'title', 15, '18', 'Sonzogno', 508126)
> after fetchone  4092
> cursor closed  4092
>
> real    0m6.107s
> user    0m0.080s
> sys     0m0.076s
> michele:~/tmp$
>
> ## with NAMED_CURSOR = 0
>
> michele:~/tmp$ time python test_sql_mem.py
> python and modules  3932
> connection  4048
> empty cursor  4068
> cursor after execute  4080
> result -- (406500L,)
> after fetchone  4092
> cursor closed  4092
> empty cursor  4092
> cursor after execute  58440
> result -- (6087, 'title', 15, '18', 'Sonzogno', 508126)
> after fetchone  58452
> cursor closed  58452
>
> real    0m11.499s
> user    0m0.544s
> sys     0m0.176s
> michele:~/tmp$
>
>> but it's been my
>> experience with virtually all database client libraries that as you
>> fetch rows and discard them, you only pull as much as you fetch into
>> memory.  i.e. if you had a result set of 4000 rows and just fetched 5
>> rows, only 5 rows get pulled over the wire.  that's definitely how the
>> lower-level APIs work
>
>
> I'm with you, but my code shows other results.
>
>> and it would be pretty bad if psycopg2 didnt maintain that behavior.
>
> I'll send another email to the psycopg2 ml with these results, and
> I'll reply to you in the near future.
>
> Michele
>


--~--~---------~--~----~------------~-------~--~----~
 You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~----------~----~----~----~------~----~------~--~---
