Re: Must COMMIT after SELECT
On 2008-02-07 16:46, Carsten Haese wrote:
> On Thu, 2008-02-07 at 16:33 +0100, M.-A. Lemburg wrote:
>> mxODBC has support for named cursors that you can later
>> use for positioned updates.
>
> Since we're on the topic of shameless plugs, InformixDB has this
> feature, too :)
>
>> However, it's usually better to do updates in the classical
>> way, ie. by referencing a primary key.
>
> That depends on the notion of "better". It's entirely possible that a
> "WHERE CURRENT OF <cursor>" clause finds the row in question faster
> than even an indexed access using a primary key could. If you're
> processing a single row, I would agree that using an update cursor is
> overkill, but if you're processing a large number of rows, using an
> update cursor might be beneficial.

Agreed. This approach is also very useful if you have to do extra
analysis of the rows in question *outside* the database before
selecting them for update.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Feb 07 2008)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/

Try mxODBC.Zope.DA for Windows, Linux, Solaris, MacOSX for free!

eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str. 48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611

-- 
http://mail.python.org/mailman/listinfo/python-list
Re: Must COMMIT after SELECT
On Thu, 2008-02-07 at 16:33 +0100, M.-A. Lemburg wrote:
> mxODBC has support for named cursors that you can later
> use for positioned updates.

Since we're on the topic of shameless plugs, InformixDB has this
feature, too :)

> However, it's usually better to do updates in the classical
> way, ie. by referencing a primary key.

That depends on the notion of "better". It's entirely possible that a
"WHERE CURRENT OF <cursor>" clause finds the row in question faster
than even an indexed access using a primary key could. If you're
processing a single row, I would agree that using an update cursor is
overkill, but if you're processing a large number of rows, using an
update cursor might be beneficial.

-- 
Carsten Haese
http://informixdb.sourceforge.net
Re: Must COMMIT after SELECT
On 2008-02-07 14:29, Steve Holden wrote:
> Paul Boddie wrote:
>> With PostgreSQL, my impression is that the intended way of using
>> cursors is not entirely compatible with the DB-API: you declare
>> cursors only when you know what the query will be, not in advance,
>> and they can only be used with certain kinds of operations. As far
>> as I recall, pyPgSQL supports cursors fairly transparently, albeit
>> through various ad-hoc measures, whereas psycopg2 only does so for
>> "named cursors" - a concept missing from the DB-API as far as I can
>> see.
>
> Yes, unfortunately the nomenclature of the DB API conflicts with that
> of SQL's DECLARE CURSOR, where the named cursor is effectively a
> sequence of query results that (under certain isolation levels and
> patterns of usage) can reflect database changes as they occur.
> Different implementers have chosen different relationships between DB
> API cursors and SQL cursors since the latter was introduced in the
> SQL 92 standard.
>
> I believe, without documentary justification, that named cursors were
> introduced into SQL to support stored procedures, and therefore
> weren't intended to be used for queries whose results were
> communicated outside the server.

Cursors defined using DECLARE CURSOR usually live in the scope of the
database engine. They are different from the cursors defined and used
with the database APIs. MS even warns against mixing them:

http://msdn2.microsoft.com/en-us/library/aa172576(SQL.80).aspx

The Python DB-API is defined at the database API level, so the same
applies in the context of DB-API cursors. In practice, I've not had
problems with accessing named cursors using DB-API cursors.

The main reason for the MS warning is that cursors can be used for lots
of interesting optimizations such as auto-updating result sets,
scrolling, positioned updates or deletions, etc. The DB-API doesn't
expose all these nifty features, so it doesn't easily run into trouble.
mxODBC has support for named cursors that you can later use for
positioned updates. You declare the name of the cursor when creating
it:

    cursor1 = connection.cursor('mycursor')
    cursor1.execute('select id, name, value from mytable')

    # Position mycursor on row 10
    cursor1.fetchmany(10)

    # Update row 10; new_value supplies the '?' parameter
    cursor2 = connection.cursor()
    cursor2.execute('update mytable set value = ? where current of mycursor',
                    (new_value,))

    cursor1.close()
    cursor2.close()

However, it's usually better to do updates in the classical way, ie.
by referencing a primary key.

-- 
Marc-Andre Lemburg
eGenix.com
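The "classical way" Marc-Andre recommends can be sketched with the
stdlib sqlite3 module (a DB-API-compliant driver); the table and column
names here are invented for illustration. The idea is to read the rows,
decide outside the database what to change, then update each row by its
primary key:

```python
import sqlite3

# In-memory database with a sample table (names are illustrative only)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, name TEXT, value INTEGER)")
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)",
                 [(1, "a", 10), (2, "b", 20), (3, "c", 30)])
conn.commit()

# Select the rows and decide outside the database which ones to change...
cur = conn.cursor()
cur.execute("SELECT id, value FROM mytable")
to_update = [(value * 2, row_id)
             for row_id, value in cur.fetchall() if value >= 20]

# ...then update each one by referencing its primary key
cur.executemany("UPDATE mytable SET value = ? WHERE id = ?", to_update)
conn.commit()

print(conn.execute("SELECT id, value FROM mytable ORDER BY id").fetchall())
# -> [(1, 10), (2, 40), (3, 60)]
```

Unlike WHERE CURRENT OF, this works on any DB-API driver, at the cost of
an indexed lookup per updated row.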
Re: Must COMMIT after SELECT
Paul Boddie wrote:
[...]
> I don't use psycopg2 at the moment, but I did patch it to allow more
> transparent usage of cursors, and there's an unapplied patch for this
> floating around in the bug tracker. Lately, I've been using pyPgSQL
> instead and not really doing huge selects from Python code anyway,
> but I'm still using fetchmany for one or two things.

[Makes note to look at pyPgSQL]

regards
Steve
-- 
Steve Holden        +1 571 484 6266   +1 800 494 3119
Holden Web LLC              http://www.holdenweb.com/
Re: Must COMMIT after SELECT
On 7 Feb, 14:29, Steve Holden <[EMAIL PROTECTED]> wrote:
>
> That's true, and your remarks clarify cursor usage in the DB API very
> well. Most people most of the time tend to ignore the existence of
> cursor.fetchmany() in the DB API, despite the fact that it can provide
> huge efficiency gains over both .fetchone() (can slow processing by
> requiring too many database interactions per query) and .fetchmany()

fetchall ;-)

> (which can generate large memory overhead in the case of huge result
> sets).

Indeed. I managed to run into a problem with DB-API-compliant code and
psycopg2 in this respect: select a large number of rows, watch
PostgreSQL do its work, see the Python process suck down the entire
result set. It's a situation reminiscent of that incident involving a
python and an alligator in the Florida swamps, but where the alligator
survives.

I don't use psycopg2 at the moment, but I did patch it to allow more
transparent usage of cursors, and there's an unapplied patch for this
floating around in the bug tracker. Lately, I've been using pyPgSQL
instead and not really doing huge selects from Python code anyway, but
I'm still using fetchmany for one or two things.

Paul
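The fetchmany pattern Paul alludes to can be sketched with the stdlib
sqlite3 module (the table is invented for illustration). Instead of
letting the client slurp the whole result set, the loop pulls a
fixed-size batch per call, bounding memory use:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE big (n INTEGER)")
conn.executemany("INSERT INTO big VALUES (?)", [(i,) for i in range(1000)])
conn.commit()

cur = conn.cursor()
cur.execute("SELECT n FROM big")

total = 0
batches = 0
while True:
    # Pull at most 100 rows at a time instead of the full result set
    rows = cur.fetchmany(100)
    if not rows:
        break
    batches += 1
    total += sum(n for (n,) in rows)

print(batches, total)
# -> 10 499500
```

Whether fetchmany actually limits the driver's memory footprint depends
on the driver: with psycopg2's ordinary cursors the server sends the
whole result set up front, which is why the named-cursor support Paul
mentions matters there.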
Re: Must COMMIT after SELECT
Paul Boddie wrote:
> On 7 Feb, 08:52, Frank Aune <[EMAIL PROTECTED]> wrote:
>> On Wednesday 06 February 2008 16:16:45 Paul Boddie wrote:
>>
>>> Really, the rule is this: always (where the circumstances described
>>> above apply) make sure that you terminate a transaction before
>>> attempting to read committed, updated data.
>>
>> How exactly do you terminate a transaction then? Do you terminate a
>> transaction by closing the cursor?
>
> No, the transaction is controlled using the connection object in the
> DB-API, specifically by the commit and rollback methods.
>
>> Would this indicate that for each query you perform
>> against the db, you should:
>>
>> - Open cursor
>> - Perform query
>> - Close cursor
>
> I tend to open a separate cursor for each query whose result set I
> want to keep around. In other words, if I'm doing a number of queries
> whose results will not be manipulated via the DB-API (using fetchone,
> fetchmany, fetchall) after the next query is executed, then I only
> open one cursor - it's like this:
>
>     cr = c.cursor()
>     try:
>         cr.execute(query1)
>         # do stuff with cr.fetchone/many/all
>         cr.execute(query2)
>         ...
>     finally:
>         cr.close()
>
> As the DB-API indicates, if you want to manipulate more than one
> result set, you need more than one cursor.

Clearly you need multiple cursors if you want to be reading from the
database on both cursors simultaneously. Like you, though, I tend to
use a single cursor whenever I can get away with it (and now I think
about it, that's probably because of the greater likelihood of seeing
db consistency due to read repeatability, though I can honestly say I
hadn't really formalized that principle).

> With PostgreSQL, my impression is that the intended way of using
> cursors is not entirely compatible with the DB-API: you declare
> cursors only when you know what the query will be, not in advance,
> and they can only be used with certain kinds of operations.
> As far as I recall, pyPgSQL supports cursors fairly transparently,
> albeit through various ad-hoc measures, whereas psycopg2 only does so
> for "named cursors" - a concept missing from the DB-API as far as I
> can see.

Yes, unfortunately the nomenclature of the DB API conflicts with that
of SQL's DECLARE CURSOR, where the named cursor is effectively a
sequence of query results that (under certain isolation levels and
patterns of usage) can reflect database changes as they occur.
Different implementers have chosen different relationships between DB
API cursors and SQL cursors since the latter was introduced in the SQL
92 standard.

I believe, without documentary justification, that named cursors were
introduced into SQL to support stored procedures, and therefore weren't
intended to be used for queries whose results were communicated outside
the server. MySQL (never the most reliable indication of standardized
SQL) restricts them to use inside stored procedures and triggers.
PostgreSQL makes them more generally available, and I suppose it is
possible to use a DB API cursor to execute a FETCH statement to
retrieve data from a SQL CURSOR, though I have never tried to do it.
SQL Server provides two types of cursor, one for use in the standard
manner and one that conforms more closely with the ODBC cursor concept.

I tend to treat the database as a purely relational store and not use
stored procedures, but this is a luxury not everyone can afford - it's
sometimes necessary to use them for efficiency reasons, though not in
the applications I am mostly involved with. Consequently I haven't made
much use of SQL CURSORs.

>> The concept of cursor in MySQL is apparently very different from
>> what I originally thought. I always thought the cursor is the
>> "handler" for a certain connection, and that you needed to commit
>> before closing down this handler - else changes were automatically
>> rolled back.
>
> It's easy to believe this, given the "hierarchical" nature of the API.
> However, cursors are just things which keep track of result sets, and
> I suppose that they are most useful when you perform a query which
> produces a large number of result rows, but where you only want to
> read a limited number of those rows at a time.

That's true, and your remarks clarify cursor usage in the DB API very
well. Most people, most of the time, tend to ignore the existence of
cursor.fetchmany() in the DB API, despite the fact that it can provide
huge efficiency gains over both .fetchone() (can slow processing by
requiring too many database interactions per query) and .fetchmany()
(which can generate large memory overhead in the case of huge result
sets).

regards
Steve
-- 
Steve Holden
Holden Web LLC
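The FETCH route Steve speculates about would look roughly like this in
PostgreSQL (cursor and table names are invented, and this sketch has
not been run against a live server); each statement could be passed to
a DB API cursor's execute method inside one transaction:

```sql
BEGIN;
DECLARE mycur CURSOR FOR SELECT id, name FROM mytable;
FETCH FORWARD 10 FROM mycur;   -- retrieve the next 10 rows
CLOSE mycur;
COMMIT;
```

The SQL cursor only exists inside the enclosing transaction (unless
declared WITH HOLD), which is one of the incompatibilities with the
DB-API cursor model discussed above.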
Re: Must COMMIT after SELECT (was: Very weird behavior in MySQLdb "execute")
On 7 Feb, 08:52, Frank Aune <[EMAIL PROTECTED]> wrote:
> On Wednesday 06 February 2008 16:16:45 Paul Boddie wrote:
>
>> Really, the rule is this: always (where the circumstances described
>> above apply) make sure that you terminate a transaction before
>> attempting to read committed, updated data.
>
> How exactly do you terminate a transaction then? Do you terminate a
> transaction by closing the cursor?

No, the transaction is controlled using the connection object in the
DB-API, specifically by the commit and rollback methods.

> Would this indicate that for each query you perform
> against the db, you should:
>
> - Open cursor
> - Perform query
> - Close cursor

I tend to open a separate cursor for each query whose result set I want
to keep around. In other words, if I'm doing a number of queries whose
results will not be manipulated via the DB-API (using fetchone,
fetchmany, fetchall) after the next query is executed, then I only open
one cursor - it's like this:

    cr = c.cursor()
    try:
        cr.execute(query1)
        # do stuff with cr.fetchone/many/all
        cr.execute(query2)
        ...
    finally:
        cr.close()

As the DB-API indicates, if you want to manipulate more than one result
set, you need more than one cursor.

With PostgreSQL, my impression is that the intended way of using
cursors is not entirely compatible with the DB-API: you declare cursors
only when you know what the query will be, not in advance, and they can
only be used with certain kinds of operations. As far as I recall,
pyPgSQL supports cursors fairly transparently, albeit through various
ad-hoc measures, whereas psycopg2 only does so for "named cursors" - a
concept missing from the DB-API as far as I can see.

> The concept of cursor in MySQL is apparently very different from what
> I originally thought. I always thought the cursor is the "handler"
> for a certain connection, and that you needed to commit before
> closing down this handler - else changes were automatically rolled
> back.
It's easy to believe this, given the "hierarchical" nature of the API.
However, cursors are just things which keep track of result sets, and I
suppose that they are most useful when you perform a query which
produces a large number of result rows, but where you only want to read
a limited number of those rows at a time.

Paul
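Paul's point that one cursor tracks one result set - so interleaving two
live result sets takes two cursors - can be shown with the stdlib
sqlite3 module (tables invented for the example). Each cursor keeps its
own position, so the two queries can be consumed in lockstep:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (x INTEGER)")
conn.execute("CREATE TABLE b (y INTEGER)")
conn.executemany("INSERT INTO a VALUES (?)", [(1,), (2,), (3,)])
conn.executemany("INSERT INTO b VALUES (?)", [(10,), (20,), (30,)])
conn.commit()

# Two cursors, one per live result set; each tracks its own position
cur_a = conn.cursor()
cur_b = conn.cursor()
cur_a.execute("SELECT x FROM a ORDER BY x")
cur_b.execute("SELECT y FROM b ORDER BY y")

pairs = []
while True:
    row_a = cur_a.fetchone()
    row_b = cur_b.fetchone()
    if row_a is None or row_b is None:
        break
    pairs.append((row_a[0], row_b[0]))

print(pairs)
# -> [(1, 10), (2, 20), (3, 30)]
```

Had a single cursor been reused, the second execute would have discarded
the first query's remaining rows.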
Re: Must COMMIT after SELECT (was: Very weird behavior in MySQLdb "execute")
On Wednesday 06 February 2008 16:16:45 Paul Boddie wrote:

> Really, the rule is this: always (where the circumstances described
> above apply) make sure that you terminate a transaction before
> attempting to read committed, updated data.

How exactly do you terminate a transaction then? Do you terminate a
transaction by closing the cursor? Would this indicate that for each
query you perform against the db, you should:

- Open cursor
- Perform query
- Close cursor

The concept of cursor in MySQL is apparently very different from what I
originally thought. I always thought the cursor is the "handler" for a
certain connection, and that you needed to commit before closing down
this handler - else changes were automatically rolled back.

Thanks,
Frank
Re: Must COMMIT after SELECT (was: Very weird behavior in MySQLdb "execute")
On 6 Feb, 16:04, Frank Aune <[EMAIL PROTECTED]> wrote:
>
> Whenever I did a SELECT() on the first connection, the cursor would
> stop "seeing" new entries committed in the log table by the other
> connection. I always assumed you needed COMMIT() after adding new
> content to the database, not after every single query, but this
> perhaps indicates otherwise?

No, you just need to be aware that the database module has probably
started a transaction on your behalf and that, as John points out, for
certain database systems in certain configurations, any repetition of
the same query in the same transaction will produce the same results,
thus avoiding the "nonrepeatable read" problem.

Really, the rule is this: always (where the circumstances described
above apply) make sure that you terminate a transaction before
attempting to read committed, updated data. If you religiously perform
a COMMIT (or ROLLBACK) after every query, you'll just create lots of
transactions for no good reason.

Paul
Re: Must COMMIT after SELECT (was: Very weird behavior in MySQLdb "execute")
On Tuesday 05 February 2008 18:58:49 John Nagle wrote:

> So you really do have to COMMIT after a SELECT, if you are reusing
> the database connection. CGI programs usually don't have this issue,
> because their connections don't live long, but long-running FCGI (and
> maybe Twisted) programs do.

I've experienced the same thing for long-running tasks, even when using
different connections/cursors against the same db for db queries and
log-writing dbhandlers respectively. Whenever I did a SELECT() on the
first connection, the cursor would stop "seeing" new entries committed
in the log table by the other connection.

I always assumed you needed COMMIT() after adding new content to the
database, not after every single query, but this perhaps indicates
otherwise?

Regards,
Frank
Must COMMIT after SELECT (was: Very weird behavior in MySQLdb "execute")
Steve Holden wrote:
> John Nagle wrote:
>> Carsten Haese wrote:
>>> On Mon, 2008-02-04 at 11:30 -0800, John Nagle wrote:
>>>> Restarting the MySQL instance changes the database. The entry
>>>> "google.com" disappears, and is replaced by "www.google.com". This
>>>> must indicate a hanging transaction that wasn't committed.
>>>>
>>>> But that transaction didn't come from the Python IDLE session I've
>>>> been making test calls from. Those queries should match the
>>>> graphical client exactly. So why don't they agree?
>>>
>>> I don't have a definitive answer, but I do have a piece of generic
>>> advice. If two database sessions receive differing results, the
>>> cause could be any of the following factors:
>>>
>>> 1) One session is in a transaction, but the other is not.
>>>
>>> 2) Both sessions are in a transaction, but they are at different
>>> isolation levels.
>>>
>>> 3) Both sessions are in a transaction, but the transactions were
>>> started at different times.
>>
>> I see what's wrong, I think. I haven't been doing a commit after
>> a SELECT. I've been careful to commit after write-type actions,
>> but not after read-only actions. I'm using InnoDB in default mode,
>> which is REPEATABLE READ, and I've recently switched to long-running
>> processes which keep the database connection open for hours. So the
>> data view for a given connection never changes, regardless of what's
>> happening in other threads.
>
> I believe you are correct in your belief that a commit() is never
> required after SELECT. Once a database change is committed it should
> become visible to all other connections (under normal circumstances).

No. If you're using InnoDB in default mode, you're in "repeatable read"
mode, which means the same SELECT on the same connection without an
intervening commit will produce the same result, regardless of any
other transactions.

I have one FCGI program which does only SELECT calls, for processing
quick lookup-only requests.
This program wasn't picking up changes made by transactions from other
programs, because it's a long-running program with a persistent
database connection, and it didn't do a COMMIT. Once I made it do a
COMMIT after each SELECT, it started picking up the changes being made
elsewhere.

So you really do have to COMMIT after a SELECT, if you are reusing the
database connection. CGI programs usually don't have this issue,
because their connections don't live long, but long-running FCGI (and
maybe Twisted) programs do.

John Nagle
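John's fix can be sketched with the stdlib sqlite3 module and two
connections to the same database file (table and path are invented for
the example). One caveat: sqlite3 does not default to REPEATABLE READ
the way InnoDB does, so the stale-snapshot symptom itself won't
reproduce here; the sketch only shows where the commit() after each
SELECT goes in a long-running reader:

```python
import os
import sqlite3
import tempfile

# A throwaway database file shared by a "writer" and a long-lived "reader"
path = os.path.join(tempfile.mkdtemp(), "demo.db")
writer = sqlite3.connect(path)
reader = sqlite3.connect(path)

writer.execute("CREATE TABLE log (msg TEXT)")
writer.execute("INSERT INTO log VALUES ('first')")
writer.commit()

def count_rows(conn):
    n = conn.execute("SELECT COUNT(*) FROM log").fetchone()[0]
    # End the read transaction so the next SELECT sees fresh commits
    conn.commit()
    return n

before = count_rows(reader)            # reader's persistent connection
writer.execute("INSERT INTO log VALUES ('second')")
writer.commit()
after = count_rows(reader)

print(before, after)
# -> 1 2
```

Under InnoDB's REPEATABLE READ, omitting the commit() inside
count_rows is exactly what would pin the reader to its first snapshot
and make `after` come back stale.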