[sqlalchemy] Re: psycopg2 default transaction inefficiency

2007-02-12 Thread JP

Yeah, I think I explained badly.

What I was trying to show with the sqlalchemy vs postgres query logs
is that extra BEGIN that psycopg2 is sending before the SELECT that
sqlalchemy executes. The BEGIN is implicitly added by psycopg2 because
it's in its default transaction isolation state of read comitted.
Which I didn't know was the DBAPI default -- so strike what I said
about changing any default behavior.

But It would be nice to be able to have total control over the
transactional state of the connection, so that when I know that I'm
just doing a select or two I don't have to have the overhead of a
BEGIN that I know is useless, but as things stand I can't do that,
because do_begin isn't implemented for any of the dialects. I guess
that's because sqlalchemy is depending on the underlying DBAPI
implementation to handle sending BEGIN statements before the first
query that is outside of a transaction?

I noticed a TODO about implementing true autocommit behavior, which is
what I need -- a way to tell the dbapi module *not* to start those
implicit transactions. If the engine/connection could be put into true
autocommit mode, and issue connection.begin() from do_begin when in
that mode, then I could do everything I want to do and I don't think
anything would break, since the orm calls do_begin before each flush.
I know how to change the isolation level in the sqlite and postgres
dialects -- would you be interested in a patch or work on a branch
where I added an isolation_level property to the engine, implemented
it for those two, and for those two dialects had do_begin issue a
connection.begin() when the isolation_level was set to autocommit? I
could probably tackle mysql as well, but I have no access to mssql or
oracle, so I can't do those.

Hopefully I'm making some more sense this time...

JP


On Feb 12, 12:42 am, Michael Bayer [EMAIL PROTECTED] wrote:
 im confused.  the django thread seems to be saying that it is setting
 psycopg2 *into* autocommit mode, which causes the problem; the default
 setting of autocommit for DBAPI (and maintained in SA) is False.  When
 i wrote frameworks years ago I always thought autocommit mode was
 the way to go but i since learned that issues like this arise  so SA
 *never* uses autocommit mode on the connection; instead it implements
 its own autocommitting behavior in a manner similar to hibernate -
 it looks at the statement being executed, determines if its a CRUD/
 CREATE/DROP expression, and then explicitly calls COMMIT if no SA
 transaction is already in progress.

 the log you have above doesnt prove that anything unusual is going on,
 since you are illustrating a transactional operation, then a SELECT,
 then another transactional operation, then another SELECT. SA does an
 explicit COMMIT for the CREATE statements since they are required to
 complete the table creation operation.

 this test confirms that psycopg2 defaults to autocommit as false and
 doesnt do anything with transaction modes after the connection is
 opened:

 import psycopg2 as psycopg

 conn = psycopg.connect(user='scott', password='tiger',
 host='127.0.0.1', database='test')

 for x in range(0, 5):
 curs = conn.cursor()
 curs.execute(SELECT 1)
 curs.close()

 log output:

 LOG:  statement: SET DATESTYLE TO 'ISO'
 LOG:  statement: SHOW client_encoding
 LOG:  statement: SHOW default_transaction_isolation
 LOG:  statement: BEGIN; SET TRANSACTION ISOLATION LEVEL READ COMMITTED
 LOG:  statement: SELECT 1
 LOG:  statement: SELECT 1
 LOG:  statement: SELECT 1
 LOG:  statement: SELECT 1
 LOG:  statement: SELECT 1
 LOG:  statement: ABORT

 one connection, five new cursors, only one set of setup
 corresponding to the connection.

 On Feb 9, 3:52 pm, JP [EMAIL PROTECTED] wrote:

  I noticed this thread on django-devs:

 http://groups.google.com/group/django-developers/browse_frm/thread/52...

  Which notes that psycopg2 by default starts transactions behind the
  scenes, even for select statements. If you happen to be running a web
  app where each hit starts a new cursor, and you only do a few selects,
  this adds a *lot* of overhead relative to autocommit.

  I wondered if sqlalchemy suffered from the same problem with psycopg2,
  and it does. Take a look at what sqlalchemy thinks it was doing for a
  short example, vs what the postgres query log contains.

  The code:

   import sqlalchemy as sa
   meta = sa.DynamicMetaData()
   users = sa.Table('users', meta, sa.Column('id', sa.Integer, 
   primary_key=True, autoincrement=True), sa.Column('name', sa.String(50)))
   meta.connect('postgres://snip', debug=True)
   meta.create_all()
   users.select().execute()

  sqlalchemy query log:
  select relname from pg_class where lower(relname) = %(name)s
  CREATE TABLE users (
  id SERIAL NOT NULL,
  name VARCHAR(50),
  PRIMARY KEY (id)
  )
  COMMIT
  SELECT users.id, users.name FROM users

  postgres query log:
  LOG:  statement: SET DATESTYLE TO 'ISO'
  LOG:  

[sqlalchemy] Re: psycopg2 default transaction inefficiency

2007-02-12 Thread Michael Bayer


On Feb 12, 10:30 am, JP [EMAIL PROTECTED] wrote:

 But It would be nice to be able to have total control over the
 transactional state of the connection, so that when I know that I'm
 just doing a select or two I don't have to have the overhead of a
 BEGIN that I know is useless, but as things stand I can't do that,
 because do_begin isn't implemented for any of the dialects. I guess
 that's because sqlalchemy is depending on the underlying DBAPI
 implementation to handle sending BEGIN statements before the first
 query that is outside of a transaction?

DBAPI, including psycopg2, has no begin() method; there is only
rollback() and commit().   like the example illustrates, there is no
BEGIN being issued for every SELECT statement when using psycopg2 in
non-autocommit mode, which applies to SQLAlchemy as well.   therefore
there is no performance bottleneck; this is a django issue only.


 I noticed a TODO about implementing true autocommit behavior, which is
 what I need -- a way to tell the dbapi module *not* to start those
 implicit transactions.

it doesnt.  run the example - no implicit transaction beyond the
initial connection (which is required, since it must implement
rollback() and commit()).  SA has been doing its own autocommit on top
of non-autocommitting connections since the 0.2 series at least.

running in non-auto-commit mode is standard practice for databases
like oracle.  theres always a transaction in place.  but its not
started for every select statement.  only at connection time and after
each rollback or commit.


--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: psycopg2 default transaction inefficiency

2007-02-12 Thread JP

 like the example illustrates, there is no
 BEGIN being issued for every SELECT statement when using psycopg2 in
 non-autocommit mode, which applies to SQLAlchemy as well.   therefore
 there is no performance bottleneck; this is a django issue only.

I guess we're reading the example differently. I see this BEGIN:

LOG:  statement: BEGIN; SET TRANSACTION ISOLATION LEVEL READ COMMITTED
LOG:  statement: SELECT users.id, users.name FROM users

in the postgres log, but as far as I can tell sqlalchemy is not
sending it; it's being implicitly inserted by psycopg2 because the
isolation level of the connection is read-committed and it sees a
statement outside of a transaction. If the connection were in the
autocommit isolation level, doing the same thing in sqlalchemy would
not result in that BEGIN, only the SELECT would be issued.

  I noticed a TODO about implementing true autocommit behavior, which is
  what I need -- a way to tell the dbapi module *not* to start those
  implicit transactions.

 it doesnt.  run the example - no implicit transaction beyond the
 initial connection (which is required, since it must implement
 rollback() and commit()).

Either I'm very confused or I'm having lots of trouble getting my
point across. In my experience, it is absolutely the case that the
dbapi drivers are starting implicit transactions when they aren't in
autocommit mode -- otherwise I don't see how any transactions would
work correctly, since do_begin is a no-op everywhere -- so if the
dbapi modules aren't inserting the BEGIN to start transactions, I
don't see where it could be coming from. What am I missing? If I'm
wrong about what's going on here, how does sqlalchemy start
transactions when it needs to, and where is that BEGIN before the
select in my example coming from?

I have an example that I think illustrates what I'm talking about
pretty well, so I'm going to file a ticket and attach the example and
the postgres query log I see when I run it. Hopefully that will clear
things up one way or another.

JP


--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: psycopg2 default transaction inefficiency

2007-02-11 Thread Michael Bayer

im confused.  the django thread seems to be saying that it is setting
psycopg2 *into* autocommit mode, which causes the problem; the default
setting of autocommit for DBAPI (and maintained in SA) is False.  When
i wrote frameworks years ago I always thought autocommit mode was
the way to go but i since learned that issues like this arise  so SA
*never* uses autocommit mode on the connection; instead it implements
its own autocommitting behavior in a manner similar to hibernate -
it looks at the statement being executed, determines if its a CRUD/
CREATE/DROP expression, and then explicitly calls COMMIT if no SA
transaction is already in progress.

the log you have above doesnt prove that anything unusual is going on,
since you are illustrating a transactional operation, then a SELECT,
then another transactional operation, then another SELECT. SA does an
explicit COMMIT for the CREATE statements since they are required to
complete the table creation operation.

this test confirms that psycopg2 defaults to autocommit as false and
doesnt do anything with transaction modes after the connection is
opened:

import psycopg2 as psycopg

conn = psycopg.connect(user='scott', password='tiger',
host='127.0.0.1', database='test')

for x in range(0, 5):
curs = conn.cursor()
curs.execute(SELECT 1)
curs.close()


log output:

LOG:  statement: SET DATESTYLE TO 'ISO'
LOG:  statement: SHOW client_encoding
LOG:  statement: SHOW default_transaction_isolation
LOG:  statement: BEGIN; SET TRANSACTION ISOLATION LEVEL READ COMMITTED
LOG:  statement: SELECT 1
LOG:  statement: SELECT 1
LOG:  statement: SELECT 1
LOG:  statement: SELECT 1
LOG:  statement: SELECT 1
LOG:  statement: ABORT

one connection, five new cursors, only one set of setup
corresponding to the connection.


On Feb 9, 3:52 pm, JP [EMAIL PROTECTED] wrote:
 I noticed this thread on django-devs:

 http://groups.google.com/group/django-developers/browse_frm/thread/52...

 Which notes that psycopg2 by default starts transactions behind the
 scenes, even for select statements. If you happen to be running a web
 app where each hit starts a new cursor, and you only do a few selects,
 this adds a *lot* of overhead relative to autocommit.

 I wondered if sqlalchemy suffered from the same problem with psycopg2,
 and it does. Take a look at what sqlalchemy thinks it was doing for a
 short example, vs what the postgres query log contains.

 The code:

  import sqlalchemy as sa
  meta = sa.DynamicMetaData()
  users = sa.Table('users', meta, sa.Column('id', sa.Integer, 
  primary_key=True, autoincrement=True), sa.Column('name', sa.String(50)))
  meta.connect('postgres://snip', debug=True)
  meta.create_all()
  users.select().execute()

 sqlalchemy query log:
 select relname from pg_class where lower(relname) = %(name)s
 CREATE TABLE users (
 id SERIAL NOT NULL,
 name VARCHAR(50),
 PRIMARY KEY (id)
 )
 COMMIT
 SELECT users.id, users.name FROM users

 postgres query log:
 LOG:  statement: SET DATESTYLE TO 'ISO'
 LOG:  statement: SHOW client_encoding
 LOG:  statement: SHOW default_transaction_isolation
 LOG:  statement: BEGIN; SET TRANSACTION ISOLATION LEVEL READ COMMITTED
 LOG:  statement: select relname from pg_class where lower(relname) =
 'users'
 LOG:  statement:
 CREATE TABLE users (
 id SERIAL NOT NULL,
 name VARCHAR(50),
 PRIMARY KEY (id)
 )
 LOG:  statement: END
 LOG:  statement: BEGIN; SET TRANSACTION ISOLATION LEVEL READ COMMITTED
 LOG:  statement: SELECT users.id, users.name FROM users

 I think it would be better for sqlalchemy to set psycopg2 to
 autocommit by default, and implement do_begin in the dialect so that
 transactions are only started when desired, rather than implicitly on
 the first statment seen by a cursor when there's no current
 transaction, as seems to be the case now.

 Mike, would you be interested in a patch that implemented that
 behavior? Does anyone disagree that that would be better?

 JP


--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---