Re: Must COMMIT after SELECT

2008-02-07 Thread M.-A. Lemburg
On 2008-02-07 16:46, Carsten Haese wrote:
> On Thu, 2008-02-07 at 16:33 +0100, M.-A. Lemburg wrote:
>> mxODBC has support for named cursors that you can later
>> use for positioned updates. 
> 
> Since we're on the topic of shameless plugs, InformixDB has this
> feature, too :)
> 
>> However, it's usually better to do updates in the classical
>> way, i.e. by referencing a primary key.
> 
> That depends on the notion of "better". It's entirely possible that a
> "WHERE CURRENT OF <cursor name>" clause finds the row in question faster than
> even an indexed access using a primary key could. If you're processing a
> single row, I would agree that using an update cursor is overkill, but
> if you're processing a large number of rows, using an update cursor
> might be beneficial.

Agreed.

This approach is also very useful if you have to do extra analysis
of the rows in question *outside* the database, before selecting them
for update.
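That pattern reads roughly like this in DB-API terms (sqlite3 is used so the
sketch actually runs; sqlite3 has no positioned updates, so the primary key
stands in for WHERE CURRENT OF, and the table and column names are invented):

```python
import sqlite3

# Pull rows into Python, apply a selection that SQL can't easily
# express, then write back only the selected rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, value REAL)")
conn.executemany("INSERT INTO items VALUES (?, ?, ?)",
                 [(1, "a", 10.0), (2, "bb", 20.0), (3, "ccc", 30.0)])

cur = conn.cursor()
cur.execute("SELECT id, name FROM items")
# The "extra analysis outside the database" step - any Python-only logic:
targets = [row_id for row_id, name in cur.fetchall() if len(name) % 2 == 1]

conn.executemany("UPDATE items SET value = value * 2 WHERE id = ?",
                 [(i,) for i in targets])
conn.commit()
result = conn.execute("SELECT id, value FROM items ORDER BY id").fetchall()
print(result)   # [(1, 20.0), (2, 20.0), (3, 60.0)]
```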

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Feb 07 2008)
>>> Python/Zope Consulting and Support ...http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/


 Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! 


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
   Registered at Amtsgericht Duesseldorf: HRB 46611
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Must COMMIT after SELECT

2008-02-07 Thread Carsten Haese
On Thu, 2008-02-07 at 16:33 +0100, M.-A. Lemburg wrote:
> mxODBC has support for named cursors that you can later
> use for positioned updates. 

Since we're on the topic of shameless plugs, InformixDB has this
feature, too :)

> 
> However, it's usually better to do updates in the classical
> way, i.e. by referencing a primary key.

That depends on the notion of "better". It's entirely possible that a
"WHERE CURRENT OF <cursor name>" clause finds the row in question faster than
even an indexed access using a primary key could. If you're processing a
single row, I would agree that using an update cursor is overkill, but
if you're processing a large number of rows, using an update cursor
might be beneficial.
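The shape of such an update-cursor loop, sketched with sqlite3 so it runs
anywhere (sqlite3 has neither named cursors nor positioned updates, so the key
column stands in for "WHERE CURRENT OF <cursor name>"; all names here are
invented for the example):

```python
import sqlite3

# One cursor scans, a second cursor writes each qualifying row as it is
# seen - the structure a positioned update would have.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (k INTEGER PRIMARY KEY, v INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(i, i * 10) for i in range(5)])

scan = conn.cursor()
write = conn.cursor()
scan.execute("SELECT k, v FROM t ORDER BY k")
for k, v in scan.fetchall():
    if v >= 20:                                  # per-row decision
        write.execute("UPDATE t SET v = v + 1 WHERE k = ?", (k,))
conn.commit()
rows = conn.execute("SELECT v FROM t ORDER BY k").fetchall()
print(rows)   # [(0,), (10,), (21,), (31,), (41,)]
```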

-- 
Carsten Haese
http://informixdb.sourceforge.net




Re: Must COMMIT after SELECT

2008-02-07 Thread M.-A. Lemburg
On 2008-02-07 14:29, Steve Holden wrote:
> Paul Boddie wrote:
>> With PostgreSQL, my impression is that the intended way of using
>> cursors is not entirely compatible with the DB-API: you declare
>> cursors only when you know what the query will be, not in advance, and
>> they can only be used with certain kinds of operations. As far as I
>> recall, pyPgSQL supports cursors fairly transparently, albeit through
>> various ad-hoc measures, whereas psycopg2 only does so for "named
>> cursors" - a concept missing from the DB-API as far as I can see.
>>
> Yes, unfortunately the nomenclature of the DB API conflicts with that of 
> SQL's DECLARE CURSOR, where the named cursor is effectively a sequence 
> of query results that (under certain isolation levels and patterns of 
> usage) can reflect database changes as they occur. Different 
> implementers have chosen different relationships between DB API cursors 
> and SQL cursors since it was introduced in the SQL 92 standard.
> 
> I believe, without documentary justification, that named cursors were 
> introduced into SQL to support stored procedures, and therefore weren't 
> intended to be used for queries whose results were communicated outside 
> the server.

Cursors defined using DECLARE CURSOR usually live in the
scope of the database engine. They are different from the cursors
defined and used with the database APIs.

MS even warns against mixing them:

http://msdn2.microsoft.com/en-us/library/aa172576(SQL.80).aspx

The Python DB-API is defined at the database API level, so the
same applies in the context of DB-API cursors.

In practice, I've not had problems with accessing named cursors
using DB-API cursors.

The main reason for the MS warning is that
cursors can be used for lots of interesting optimizations
such as auto-updating result sets and scrolling, positioned
updates or deletions, etc. The DB-API doesn't expose all these
nifty features, so it doesn't easily run into trouble.

mxODBC has support for named cursors that you can later
use for positioned updates. You declare the name of the
cursor when creating it:

cursor1 = connection.cursor('mycursor')
cursor1.execute('select id, name, value from mytable')

# Position mycursor on row 10 (fetch and discard the first ten rows)
cursor1.fetchmany(10)

# Update row 10; new_value supplies the ? placeholder
cursor2 = connection.cursor()
cursor2.execute('update mytable set value = ? where current of mycursor',
                (new_value,))

cursor1.close()
cursor2.close()

However, it's usually better to do updates in the classical
way, i.e. by referencing a primary key.

-- 
Marc-Andre Lemburg
eGenix.com


Re: Must COMMIT after SELECT

2008-02-07 Thread Steve Holden
Paul Boddie wrote:
[...]
> 
> I don't use psycopg2 at the moment, but I did patch it to allow more
> transparent usage of cursors, and there's an unapplied patch for this
> floating around in the bug tracker. Lately, I've been using pyPgSQL
> instead and not really doing huge selects from Python code anyway, but
> I'm still using fetchmany for one or two things.
> 
[Makes note to look at pyPgSQL]

regards
  Steve
-- 
Steve Holden        +1 571 484 6266   +1 800 494 3119
Holden Web LLC  http://www.holdenweb.com/



Re: Must COMMIT after SELECT

2008-02-07 Thread Paul Boddie
On 7 Feb, 14:29, Steve Holden <[EMAIL PROTECTED]> wrote:
>
> That's true, and your remarks clarify cursor usage in the DB API very
> well. Most people most of the time tend to ignore the existence of
> cursor.fetchmany() in the DB API, despite the fact that it can provide
> huge efficiency gains over both .fetchone() (can slow processing by
> requiring too many database interactions per query) and .fetchmany()

fetchall ;-)

> (which can generate large memory overhead in the case of huge result sets).

Indeed. I managed to run into a problem with DB-API-compliant code and
psycopg2 in this respect: select a large number of rows, watch
PostgreSQL do its work, see the Python process suck down the entire
result set. It's a situation reminiscent of that incident involving a
python and an alligator in the Florida swamps, but where the alligator
survives.

I don't use psycopg2 at the moment, but I did patch it to allow more
transparent usage of cursors, and there's an unapplied patch for this
floating around in the bug tracker. Lately, I've been using pyPgSQL
instead and not really doing huge selects from Python code anyway, but
I'm still using fetchmany for one or two things.

Paul


Re: Must COMMIT after SELECT

2008-02-07 Thread Steve Holden
Paul Boddie wrote:
> On 7 Feb, 08:52, Frank Aune <[EMAIL PROTECTED]> wrote:
>> On Wednesday 06 February 2008 16:16:45 Paul Boddie wrote:
>>
>>> Really, the rule is this: always (where the circumstances described
>>> above apply) make sure that you terminate a transaction before
>>> attempting to read committed, updated data.
>> How exactly do you terminate a transaction then? Do you terminate a
>> transaction by closing the cursor?
> 
> No, the transaction is controlled using the connection object in the
> DB-API, specifically by the commit and rollback methods.
> 
>> Would this indicate that for each query you perform
>> against the db, you should:
>>
>> - Open cursor
>> - Perform query
>> - Close cursor
> 
> I tend to open a separate cursor for each query whose result set I
> want to keep around. In other words, if I'm doing a number of queries
> whose results will not be manipulated via the DB-API (using fetchone,
> fetchmany, fetchall) after the next query is executed, then I only
> open one cursor - it's like this:
> 
>   cr = c.cursor()
>   try:
> cr.execute(query1)
> do stuff with cr.fetchone/many/all
> cr.execute(query2)
> ...
>   finally:
> cr.close()
> 
> As the DB-API indicates, if you want to manipulate more than one
> result set, you need more than one cursor.
> 
Clearly you need multiple cursors if you want to be reading from the 
database on both cursors simultaneously. Like you, though, I tend to use 
a single cursor whenever I can get away with it (and now I think about 
it, that's probably because of the greater likelihood of seeing db 
consistency due to the read repeatability, though I can honestly say I 
hadn't really formalized that principle).

> With PostgreSQL, my impression is that the intended way of using
> cursors is not entirely compatible with the DB-API: you declare
> cursors only when you know what the query will be, not in advance, and
> they can only be used with certain kinds of operations. As far as I
> recall, pyPgSQL supports cursors fairly transparently, albeit through
> various ad-hoc measures, whereas psycopg2 only does so for "named
> cursors" - a concept missing from the DB-API as far as I can see.
> 
Yes, unfortunately the nomenclature of the DB API conflicts with that of 
SQL's DECLARE CURSOR, where the named cursor is effectively a sequence 
of query results that (under certain isolation levels and patterns of 
usage) can reflect database changes as they occur. Different 
implementers have chosen different relationships between DB API cursors 
and SQL cursors since it was introduced in the SQL 92 standard.

I believe, without documentary justification, that named cursors were 
introduced into SQL to support stored procedures, and therefore weren't 
intended to be used for queries whose results were communicated outside 
the server.

MySQL (never the most reliable indication of standardized SQL) 
restricts them to use inside stored procedures and triggers. PostgreSQL 
makes them more generally available, and I suppose it is possible to use 
a DB API cursor to execute a FETCH statement to retrieve data from a SQL 
CURSOR though I have never tried to do it.
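What that would look like is sketched below, hedged heavily: it drives a
server-side SQL CURSOR from an ordinary DB-API cursor by issuing DECLARE /
FETCH / CLOSE yourself (PostgreSQL syntax). `conn` is assumed to be any DB-API
connection to PostgreSQL; the cursor name and batch size are invented, and in
PostgreSQL a plain DECLARE only works inside a transaction, which the DB-API
normally opens for you.

```python
def fetch_via_sql_cursor(conn, query, batch=1000):
    """Yield rows of `query` through an explicit server-side SQL CURSOR."""
    cur = conn.cursor()
    cur.execute("DECLARE batch_cur CURSOR FOR " + query)
    try:
        while True:
            cur.execute("FETCH FORWARD %d FROM batch_cur" % batch)
            rows = cur.fetchall()
            if not rows:           # FETCH returned nothing: cursor exhausted
                break
            for row in rows:
                yield row
    finally:
        cur.execute("CLOSE batch_cur")
        cur.close()
```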

SQL Server provides two types of cursor, one for use in the standard 
manner and one that conforms more closely with the ODBC cursor concept.

I tend to treat the database as a purely relational store, and not use 
stored procedures, but this is a luxury not everyone can afford - it's 
just necessary to use them for efficiency reasons sometimes, though not 
in the applications I am mostly involved with. Consequently I haven't 
made much use of SQL CURSORs.

>> The concept of cursor in MySQL is apparently very different from what I
>> originally thought. I always thought the cursor is the "handler" for a
>> certain connection, and that you needed to commit before closing down this
>> handler - else changes were automatically rolled back.
> 
> It's easy to believe this, given the "hierarchical" nature of the API.
> However, cursors are just things which keep track of result sets, and
> I suppose that they are most useful when you perform a query which
> produces a large number of result rows, but where you only want to
> read a limited number of those rows at a time.
> 
That's true, and your remarks clarify cursor usage in the DB API very 
well. Most people most of the time tend to ignore the existence of 
cursor.fetchmany() in the DB API, despite the fact that it can provide 
huge efficiency gains over both .fetchone() (can slow processing by 
requiring too many database interactions per query) and .fetchmany() 
(which can generate large memory overhead in the case of huge result sets).

regards
  Steve


Re: Must COMMIT after SELECT (was: Very weird behavior in MySQLdb "execute")

2008-02-07 Thread Paul Boddie
On 7 Feb, 08:52, Frank Aune <[EMAIL PROTECTED]> wrote:
> On Wednesday 06 February 2008 16:16:45 Paul Boddie wrote:
>
> > Really, the rule is this: always (where the circumstances described
> > above apply) make sure that you terminate a transaction before
> > attempting to read committed, updated data.
>
> How exactly do you terminate a transaction then? Do you terminate a transaction
> by closing the cursor?

No, the transaction is controlled using the connection object in the
DB-API, specifically by the commit and rollback methods.
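A minimal demonstration of that point, using sqlite3 so it runs anywhere (the
table and file names are invented): the transaction belongs to the connection,
not the cursor, so closing a cursor commits nothing.

```python
import sqlite3, os, tempfile

# Only connection.commit() makes the change visible to a second connection.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
writer = sqlite3.connect(path)
writer.execute("PRAGMA journal_mode=WAL")   # lets the reader read concurrently
writer.execute("CREATE TABLE log (msg TEXT)")
writer.commit()
reader = sqlite3.connect(path)

cur = writer.cursor()
cur.execute("INSERT INTO log VALUES ('hello')")
cur.close()   # cursor closed, transaction still open
before_commit = reader.execute("SELECT COUNT(*) FROM log").fetchone()[0]

writer.commit()
after_commit = reader.execute("SELECT COUNT(*) FROM log").fetchone()[0]
print(before_commit, after_commit)   # 0 1
```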

> Would this indicate that for each query you perform
> against the db, you should:
>
> - Open cursor
> - Perform query
> - Close cursor

I tend to open a separate cursor for each query whose result set I
want to keep around. In other words, if I'm doing a number of queries
whose results will not be manipulated via the DB-API (using fetchone,
fetchmany, fetchall) after the next query is executed, then I only
open one cursor - it's like this:

  cr = c.cursor()
  try:
cr.execute(query1)
do stuff with cr.fetchone/many/all
cr.execute(query2)
...
  finally:
cr.close()

As the DB-API indicates, if you want to manipulate more than one
result set, you need more than one cursor.
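A small runnable illustration of why (sqlite3; names invented): a second
execute() on the same cursor would discard its pending result set, but two
cursors each keep their own position and can be fetched from alternately.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE n (x INTEGER)")
conn.executemany("INSERT INTO n VALUES (?)", [(i,) for i in range(6)])

evens = conn.cursor()
odds = conn.cursor()
evens.execute("SELECT x FROM n WHERE x % 2 = 0 ORDER BY x")
odds.execute("SELECT x FROM n WHERE x % 2 = 1 ORDER BY x")

# Interleave fetches; each cursor tracks its own result set.
pairs = [(evens.fetchone()[0], odds.fetchone()[0]) for _ in range(3)]
print(pairs)   # [(0, 1), (2, 3), (4, 5)]
```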

With PostgreSQL, my impression is that the intended way of using
cursors is not entirely compatible with the DB-API: you declare
cursors only when you know what the query will be, not in advance, and
they can only be used with certain kinds of operations. As far as I
recall, pyPgSQL supports cursors fairly transparently, albeit through
various ad-hoc measures, whereas psycopg2 only does so for "named
cursors" - a concept missing from the DB-API as far as I can see.
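For reference, a hedged sketch of the psycopg2 "named cursor" API mentioned
above: passing a name to connection.cursor() makes psycopg2 create a
server-side cursor and fetch rows lazily, `itersize` rows per round trip,
instead of pulling the whole result set into client memory. It needs a live
PostgreSQL connection, so it is not runnable as-is; the cursor name is a
placeholder.

```python
def stream_rows(conn, query, params=()):
    cur = conn.cursor(name="stream_cur")   # a name makes it server-side
    cur.itersize = 2000                    # rows fetched per round trip
    cur.execute(query, params)
    try:
        for row in cur:                    # iteration fetches in batches
            yield row
    finally:
        cur.close()                        # also closes the server cursor
```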

> The concept of cursor in MySQL is apparently very different from what I
> originally thought. I always thought the cursor is the "handler" for a
> certain connection, and that you needed to commit before closing down this
> handler - else changes were automatically rolled back.

It's easy to believe this, given the "hierarchical" nature of the API.
However, cursors are just things which keep track of result sets, and
I suppose that they are most useful when you perform a query which
produces a large number of result rows, but where you only want to
read a limited number of those rows at a time.
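That reading of cursors, in miniature (sqlite3; names invented): fetchmany in
a loop reads a large result a slice at a time, the middle ground between
fetchone (one row per call) and fetchall (everything in memory at once).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE big (x INTEGER)")
conn.executemany("INSERT INTO big VALUES (?)", [(i,) for i in range(10)])

cur = conn.cursor()
cur.execute("SELECT x FROM big ORDER BY x")
total, batches = 0, 0
while True:
    rows = cur.fetchmany(4)          # at most 4 rows per call
    if not rows:                     # empty list: result set exhausted
        break
    batches += 1
    total += sum(x for (x,) in rows)
print(total, batches)   # 45 3
```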

Paul


Re: Must COMMIT after SELECT (was: Very weird behavior in MySQLdb "execute")

2008-02-06 Thread Frank Aune
On Wednesday 06 February 2008 16:16:45 Paul Boddie wrote:
> Really, the rule is this: always (where the circumstances described
> above apply) make sure that you terminate a transaction before
> attempting to read committed, updated data.

How exactly do you terminate a transaction then? Do you terminate a transaction 
by closing the cursor? Would this indicate that for each query you perform 
against the db, you should:

- Open cursor
- Perform query
- Close cursor

The concept of cursor in MySQL is apparently very different from what I 
originally thought. I always thought the cursor is the "handler" for a 
certain connection, and that you needed to commit before closing down this 
handler - else changes were automatically rolled back.

Thanks,
Frank


Re: Must COMMIT after SELECT (was: Very weird behavior in MySQLdb "execute")

2008-02-06 Thread Paul Boddie
On 6 Feb, 16:04, Frank Aune <[EMAIL PROTECTED]> wrote:
>
> Whenever I did a SELECT() on the first connection, the cursor would
> stop "seeing" new entries committed in the log table by the other connection.
> I always assumed you needed COMMIT() after adding new content to the
> database, not after every single query, but this perhaps indicates otherwise?

No, you just need to be aware that the database module has probably
started a transaction on your behalf and that, as John points out, for
certain database systems in certain configurations, any repetition of
the same query in the same transaction will produce the same results,
thus avoiding the "nonrepeatable read" problem.

Really, the rule is this: always (where the circumstances described
above apply) make sure that you terminate a transaction before
attempting to read committed, updated data. If you religiously perform
a COMMIT (or ROLLBACK) after every query, you'll just create lots of
transactions for no good reason.
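The stale-snapshot effect itself can be shown in a few lines; SQLite's WAL
mode mimics REPEATABLE READ closely enough for the purpose (file and table
names invented):

```python
import sqlite3, os, tempfile

# Once a read transaction starts, the connection keeps seeing the
# database as of that moment until the transaction is terminated.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
writer = sqlite3.connect(path)
writer.execute("PRAGMA journal_mode=WAL")
writer.execute("CREATE TABLE t (x INTEGER)")
writer.commit()

reader = sqlite3.connect(path)
reader.execute("BEGIN")                       # open a read transaction
before = reader.execute("SELECT COUNT(*) FROM t").fetchone()[0]

writer.execute("INSERT INTO t VALUES (1)")
writer.commit()                               # committed elsewhere

stale = reader.execute("SELECT COUNT(*) FROM t").fetchone()[0]   # old snapshot
reader.commit()                               # terminate the transaction
fresh = reader.execute("SELECT COUNT(*) FROM t").fetchone()[0]   # now visible
print(before, stale, fresh)   # 0 0 1
```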

Paul


Re: Must COMMIT after SELECT (was: Very weird behavior in MySQLdb "execute")

2008-02-06 Thread Frank Aune
On Tuesday 05 February 2008 18:58:49 John Nagle wrote:
> So you really do have to COMMIT after a SELECT, if you are reusing
> the database connection.  CGI programs usually don't have this issue,
> because their connections don't live long, but long-running FCGI (and maybe
> Twisted) programs do.

I've experienced the same thing for long-running tasks even when using 
different connections/cursors against the same db for db queries and log 
writing dbhandlers respectively.

Whenever I did a SELECT() on the first connection, the cursor would 
stop "seeing" new entries committed in the log table by the other connection. 
I always assumed you needed COMMIT() after adding new content to the 
database, not after every single query, but this perhaps indicates otherwise?

Regards,
Frank





Must COMMIT after SELECT (was: Very weird behavior in MySQLdb "execute")

2008-02-05 Thread John Nagle
Steve Holden wrote:
> John Nagle wrote:
>> Carsten Haese wrote:
>>> On Mon, 2008-02-04 at 11:30 -0800, John Nagle wrote:
>>>> Restarting the MySQL instance changes the database.  The entry
>>>> "google.com" disappears, and is replaced by "www.google.com".
>>>> This must indicate a hanging transaction that wasn't committed.
>>>>
>>>> But that transaction didn't come from the Python IDLE session I've been
>>>> making test calls from.  Those queries should match the graphical
>>>> client exactly.
>>>>
>>>> So why don't they agree?
>>> I don't have a definitive answer, but I do have a piece of generic
>>> advice. If two database sessions receive differing results, the cause
>>> could be any of the following factors:
>>>
>>> 1) One session is in a transaction, but the other is not.
>>>
>>> 2) Both sessions are in a transaction, but they are at different
>>> isolation levels.
>>>
>>> 3) Both sessions are in a transaction, but the transactions were started
>>> at different times.
>>
>>  I see what's wrong, I think.  I haven't been doing a commit after
>> a SELECT.  I've been careful to commit after write-type actions,
>> but not after read-only actions.  I'm using InnoDB in default mode,
>> which is REPEATABLE READ, and I've recently switched to long-running
>> processes which keep the database connection open for hours.  So the
>> data view for a given connection never changes, regardless of what's
>> happening in other threads.
>>
> 
> 
> I believe you are correct in your belief that a commit() is never 
> required after SELECT. Once a database change is committed it should 
> become visible to all other connections (under normal circumstances).

No.  If you're using InnoDB in default mode, you're in "repeatable read"
mode, which means the same SELECT on the same connection without an
intervening commit will produce the same result, regardless of any
other transactions.

I have one FCGI program which does only SELECT calls, for processing quick
lookup-only requests. This program wasn't picking up changes made by
transactions from other programs, because it's a long-running program
with a persistent database connection, and it didn't do a COMMIT.
Once I made it do a COMMIT after each SELECT, it started picking up
the changes being made elsewhere.

So you really do have to COMMIT after a SELECT, if you are reusing
the database connection.  CGI programs usually don't have this issue,
because their connections don't live long, but long-running FCGI (and maybe
Twisted) programs do.
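The fix described above, as a reusable pattern (sqlite3 in WAL mode standing
in for InnoDB REPEATABLE READ; file and table names invented): a long-lived
connection ends its read transaction after every SELECT so the next poll sees
commits made on other connections.

```python
import sqlite3, os, tempfile

path = os.path.join(tempfile.mkdtemp(), "poll.db")
init = sqlite3.connect(path)
init.execute("PRAGMA journal_mode=WAL")
init.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY)")
init.commit()
init.close()

def poll(conn):
    """SELECT, then COMMIT so the read snapshot is released."""
    cur = conn.cursor()
    cur.execute("BEGIN")
    cur.execute("SELECT COUNT(*) FROM jobs")
    (n,) = cur.fetchone()
    conn.commit()        # the COMMIT-after-SELECT from the post
    cur.close()
    return n

poller = sqlite3.connect(path)   # long-lived connection
other = sqlite3.connect(path)    # some other program
counts = [poll(poller)]
other.execute("INSERT INTO jobs VALUES (1)")
other.commit()
counts.append(poll(poller))
print(counts)   # [0, 1]
```

Without the conn.commit() inside poll(), the second poll's BEGIN would fail
("cannot start a transaction within a transaction"), which is the long-running
connection problem in another guise.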

John Nagle