[sqlalchemy] Re: [Sqlalchemy-users] named cursor
Server-side cursors (i.e. named cursors) for all Postgres functionality are currently the default in the trunk, and you can turn them off by saying client_side_cursors=True in create_engine(). However, one user reported that table reflection breaks, which I cannot reproduce. So can some folks please try out the trunk with Postgres and see if we can go with server-side cursors by default?

I've run the unit tests many times with both settings and I don't perceive any performance hit from using server-side cursors. I'd like this to be in the next release, but if there's some issue with using server-side cursors in all cases, then I have to break open the code and make it a lot more complex to detect the conditions where server-side cursors are appropriate.

You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To post to this group, send email to sqlalchemy@googlegroups.com. To unsubscribe from this group, send email to [EMAIL PROTECTED]. For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en
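For reference, the toggle described above would look something like this (a sketch against the trunk of that era; the connection URL is a placeholder, and client_side_cursors is the flag named in the message):

```python
from sqlalchemy import create_engine

# Trunk default at this point: server-side (named) cursors for all
# Postgres operations.
engine = create_engine('postgres://user:pass@localhost/mydb')

# Opt back out to client-side cursors if server-side cursors cause
# trouble (e.g. the reported table-reflection breakage).
engine = create_engine('postgres://user:pass@localhost/mydb',
                       client_side_cursors=True)
```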
[sqlalchemy] Re: [Sqlalchemy-users] named cursor
Hi Mike,

I've been lurking here in the shadows for a while... this particular topic looks very interesting to me and I'd love to upgrade to HEAD to try it out. However, I'm wondering what the risk factor is in doing that. Have there been any other major (potentially destabilizing) changes lately? I'm using r2168 right now, and it's been very stable. I've upgraded to HEAD quite a few times over the past year and I've always been impressed with (1) the overall stability of SA and (2) the speed with which bugs were fixed when they were found. I'm just asking for a general idea of how stable you feel the trunk is right now. Don't worry, I'll take responsibility for my actions and will in no way hold you responsible for anything that may occur if I decide to upgrade right now; just looking for a general indicator. Thanks.

~ Daniel

Michael Bayer wrote:
> server-side cursors (i.e. named) for all postgres functionality by
> default is currently in the trunk, and you can turn it off by saying
> client_side_cursors=True in create_engine(). [...]
[sqlalchemy] Re: [Sqlalchemy-users] named cursor
It should be pretty stable; we haven't had anything too dramatic going on. I moved some things around a bit in r2174 with regard to eager loading, but I think those changes have ironed out pretty well.

On Jan 10, 2007, at 10:39 PM, Daniel Miller wrote:
> Hi Mike, I've been lurking here in the shadows for a while... this
> particular topic looks very interesting to me and I'd love to upgrade
> to HEAD to try it out. [...]
[sqlalchemy] Re: [Sqlalchemy-users] named cursor
On Jan 9, 2007, at 5:24 AM, Michele Petrazzo wrote:
> Because, like what I wrote on the psycopg2 ml
> (lists.initd.org/pipermail/psycopg/2007-January/005250.html), without
> the server-side cursor, the memory usage on the client side is
> proportional to the number of records returned by the query (in my
> example, 400k records, 60MB of RAM). I think this is because the
> driver fetches all the data from the server and leaves it available
> to the fetchXXX methods.

Really? I'm not sure about psycopg2 specifically, but it's been my experience with virtually all database client libraries that as you fetch rows and discard them, you only pull as much as you fetch into memory. I.e., if you had a result set of 4000 rows and just fetched 5 rows, only 5 rows get pulled over the wire. That's definitely how the lower-level APIs work, and it would be pretty bad if psycopg2 didn't maintain that behavior.
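As a rough analogy for the two behaviors being debated here (plain Python, not psycopg2): a client-side cursor that buffers everything behaves like a fully materialized list, while a server-side cursor behaves like a generator that produces rows on demand:

```python
import sys

def client_side_rows(n):
    # Client-side cursor analogy: the whole result set is materialized
    # in memory before fetchone() is ever called.
    return [(i, "row") for i in range(n)]

def server_side_rows(n):
    # Server-side (named) cursor analogy: rows are produced on demand,
    # so the memory footprint stays flat regardless of result-set size.
    for i in range(n):
        yield (i, "row")

buffered = client_side_rows(400_000)
streamed = server_side_rows(400_000)

# The list's backing store holds 400k row pointers; the generator
# object itself is tiny because no rows exist yet.
print(sys.getsizeof(buffered) > 1_000_000)  # True
print(sys.getsizeof(streamed) < 1_000)      # True
print(next(streamed))                       # rows still arrive one at a time
```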
[sqlalchemy] Re: [Sqlalchemy-users] named cursor
Michael Bayer wrote:
> really? I'm not sure about psycopg2 specifically,

Here, sure! Assuming I'm on Linux and have the ps utility available. The DB is on the same machine where I'm running the tests. PG is 8.1.5 and psycopg2 is 2.0.4. This is my test:

-code-
import time
import subprocess as SP

import psycopg2

NAMED_CURSOR = 1

def get_mem():
    # Wait for the memory to stabilize
    time.sleep(0.5)
    # Get the memory usage with ps
    output = SP.Popen(["ps", "-eo", "rss,args"],
                      stdout=SP.PIPE).communicate()[0]
    for line in output.split('\n'):
        if not line:
            continue
        mem, pr_name = line.split(None, 1)
        # Show only my python executable
        if not "python " + __file__ in pr_name:
            continue
        return mem

def create_cur():
    if NAMED_CURSOR:
        return con.cursor("test")
    else:
        return con.cursor()

def exec_q(q):
    cur = create_cur()
    print "empty cursor        ", get_mem()
    cur.execute(q)
    print "cursor after execute", get_mem()
    print "result --", cur.fetchone()
    print "after fetchone      ", get_mem()
    cur.close()
    print "cursor closed       ", get_mem()

print "python and modules", get_mem()
con = psycopg2.connect('user=test password=test dbname=book')
print "connection        ", get_mem()
exec_q("SELECT COUNT(id) FROM book")
exec_q("SELECT * FROM book")
/-code-

## with NAMED_CURSOR = 1
michele:~/tmp$ time python test_sql_mem.py
python and modules   3928
connection           4048
empty cursor         4068
cursor after execute 4080
result -- (406500L,)
after fetchone       4088
cursor closed        4092
empty cursor         4092
cursor after execute 4092
result -- (6087, 'title', 15, '18', 'Sonzogno', 508126)
after fetchone       4092
cursor closed        4092

real    0m6.107s
user    0m0.080s
sys     0m0.076s

## with NAMED_CURSOR = 0
michele:~/tmp$ time python test_sql_mem.py
python and modules   3932
connection           4048
empty cursor         4068
cursor after execute 4080
result -- (406500L,)
after fetchone       4092
cursor closed        4092
empty cursor         4092
cursor after execute 58440
result -- (6087, 'title', 15, '18', 'Sonzogno', 508126)
after fetchone       58452
cursor closed        58452

real    0m11.499s
user    0m0.544s
sys     0m0.176s

> but its been my experience with virtually all database client
> libraries that as you fetch rows and discard them, you only pull as
> much as you fetch into memory. i.e. if you had a resultset of 4000
> rows and just fetched 5 rows, only 5 rows get pulled over the wire.
> thats definitely how the lower level APIs work

I'm with you, but my code shows other results

> and it would be pretty bad if psycopg2 didnt maintain that behavior.

I'll send another email to the psycopg2 ml with these results, and I'll reply to you in the near future.

Michele
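Michele's ps-based measurement can also be approximated in-process with the stdlib (a sketch, assuming Linux semantics where ru_maxrss is reported in kilobytes; the numbers won't match his exactly, but the jump for a fully buffered result set is just as visible):

```python
import resource

def peak_rss_kb():
    # Peak resident set size of the current process.
    # On Linux this is in kilobytes (on macOS it is in bytes).
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

before = peak_rss_kb()
# Stand-in for a client-side cursor materializing ~400k rows at once.
rows = [(i, "title", "publisher") for i in range(400_000)]
after = peak_rss_kb()

# Buffering the whole result set shows up directly in the process
# footprint, which is what the NAMED_CURSOR = 0 run above demonstrates.
print(after >= before)  # peak RSS never decreases
```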
[sqlalchemy] Re: [Sqlalchemy-users] named cursor
OK, well this is all quite disturbing! I would want to examine this more closely. But let's assume this is really how it is, since that's how it looks (eesh).

So, you don't really want named cursors per se; you just want psycopg2 to use cursors in its underlying operations so that it operates more efficiently... and the magical incantation you send to psycopg2 in order for it to use cursors is conn.cursor('x') as opposed to conn.cursor() (which to me is, like, totally arbitrary... and at this point I am getting pretty annoyed with how utterly undocumented psycopg2 is... but then again, so is MySQLdb). So this is total artifact-land, and it has no implication for SA to grow some new fine-grained cursor-based API; we are just looking for a hack workaround internally.

My first instinct is to say, OK, then we just use the call conn.cursor('x') to get cursors when using the postgres dialect. However, I'm told that this probably won't work for CRUD (i.e. INSERT, UPDATE, etc.) operations, so we would have to keep that in mind (SA does scan incoming SQL for INSERT/UPDATE etc., so that's not an issue). Also, is conn.cursor('x') more efficient in all cases? Or is the overhead of creating a cursor significant, and only worthwhile when you know you are going to fetch only a subset of rows from an enormous result set?

So my second instinct is to perhaps go with the approach of using conn.cursor('x') for all SELECT statements when using the postgres dialect, *if* you tell the dialect you want that. I'm thinking a flag use_named_cursors=True when you create_engine(). I'd really rather not pollute the connect()/Connection/execute() API with this concept if at all possible... I was initially thinking of doing something like conn.with_cursor(mycursor).execute(), but that's pretty explicit for something that should really be an internal optimization.

Those are my thoughts so far.
Just FYI, if you want to hack your SA to use named cursors in all cases, it's lines 282 and 322 of lib/sqlalchemy/engine/base.py.

On Jan 9, 2007, at 11:40 AM, Michele Petrazzo wrote:
> Michael Bayer wrote:
>> really? I'm not sure about psycopg2 specifically,
> Here, sure! Assuming I'm on Linux and have the ps utility available.
> The DB is on the same machine where I'm running the tests. PG is
> 8.1.5 and psycopg2 is 2.0.4. [...]
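The SELECT-only approach Mike sketches above could look roughly like this (a hypothetical helper, not actual SA code; FakeConnection stands in for a psycopg2 connection just to show which cursor call gets made):

```python
import itertools
import re

_counter = itertools.count()

def cursor_for(conn, statement, use_named_cursors=True):
    # Hypothetical helper: use a psycopg2 named (server-side) cursor
    # only for SELECT statements, falling back to a plain client-side
    # cursor for INSERT/UPDATE/DELETE, where named cursors don't apply.
    if use_named_cursors and re.match(r"\s*SELECT\b", statement, re.I):
        # psycopg2 named cursors must have distinct names per connection.
        return conn.cursor("sa_%d" % next(_counter))
    return conn.cursor()

class FakeConnection:
    # Stand-in for a psycopg2 connection; records which call was made.
    def cursor(self, name=None):
        return ("named", name) if name else ("client", None)

conn = FakeConnection()
print(cursor_for(conn, "SELECT * FROM book")[0])         # named
print(cursor_for(conn, "UPDATE book SET title='x'")[0])  # client
```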