On May 13, 2008, at 5:15 PM, Randall Nortman wrote:

> handling.  I'm curious how SA handles this, especially transient
> errors where a retry is the right thing to do.  For example, in
> several of my current sqlite apps (using bare DB-API), I get
> occasional locking errors ("Database is locked"), which happens
> sometimes with sqlite when there are concurrent processes accessing
> the DB.  The right thing to do is retry, but it's a real pain to write
> *every* database operation with retry logic.  Does SA handle retries
> in cases like this automatically?

no, if a DBAPI error occurs, the error is thrown.   We wrap it in a  
DBAPI-neutral wrapper which has the same type as the underlying  
error (e.g. OperationalError, IntegrityError).

Retries are definitely something a user-application needs to work out  
on its own; SQLAlchemy is not an application framework, just a library.
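As a rough sketch of what application-level retry logic could look like (this uses the stdlib sqlite3 module directly rather than SQLAlchemy, and the with_retry name is made up for illustration):

```python
import functools
import sqlite3
import time

def with_retry(retries=3, delay=0.1):
    """Retry a DB operation when SQLite reports a transient lock."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(retries):
                try:
                    return fn(*args, **kwargs)
                except sqlite3.OperationalError as e:
                    # Only retry transient lock errors, and give up
                    # after the final attempt.
                    if "locked" not in str(e) or attempt == retries - 1:
                        raise
                    time.sleep(delay * (attempt + 1))
        return wrapper
    return decorator
```

Note that each operation wrapped this way has to be safe to re-run from the top, since the failed attempt's transaction is gone.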

> (Admittedly, I think that more
> recent versions of pysqlite handle locked DBs better than the somewhat
> stale version I'm currently using, but I'm after the general principle
> rather than trying to solve that particular problem.)

I don't think "database is locked" errors are all that common in  
practice.  SQLite is supposed to block until the file is available.   
If you have a highly concurrent transactional application, SQLite is  
the wrong choice anyway.

> Aside from that, how in general does SA handle errors returned by the
> underlying DB?

For most situations, you write your application such that errors don't  
occur within normal use cases (e.g. don't use IntegrityError as a way  
to check whether a row is present; select from the table explicitly  
instead.  Don't overuse locking, and/or use a unit of work so that  
deadlocks don't occur).  An error thrown by the DB should be  
considered an application failure.
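For instance, an explicit existence check (sketched here with the stdlib sqlite3 module; ensure_user is a hypothetical helper) avoids relying on IntegrityError as control flow:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT PRIMARY KEY)")
conn.execute("INSERT INTO users VALUES ('alice')")

def ensure_user(conn, name):
    """Insert a user only if absent, using an explicit SELECT rather
    than catching a primary key violation."""
    row = conn.execute(
        "SELECT 1 FROM users WHERE name = ?", (name,)).fetchone()
    if row is None:
        conn.execute("INSERT INTO users VALUES (?)", (name,))
        return True
    return False
```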

> Does it automatically do a rollback and

When using the ORM, if a flush() fails, a ROLLBACK is issued  
unconditionally, and the error is raised.  At that point, the end user  
must roll back the Session under normal usage (what the Session calls  
"transactional=True", or in 0.5 "autocommit=False").  This is  
actually not the most ideal behavior; it would be nice someday for a  
flush() to be able to make partial progress, raise the error without  
rolling back, and then be able to complete after the state of the  
Session has been corrected.  It's something we might do someday.   
However, some databases themselves prevent this from being doable;  
current versions of Postgres, for example, require that you ROLLBACK  
the transaction if an IntegrityError (the most common error) occurs,  
and further statements will raise errors unconditionally until the ROLLBACK.
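The same rule applies at the plain DBAPI level.  A minimal illustration with the stdlib sqlite3 module (insert_row is a made-up helper): the failed statement's transaction is rolled back before the connection is used again.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY)")

def insert_row(conn, pk):
    """Attempt an INSERT in its own transaction; roll back on
    IntegrityError so the connection stays usable."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO t (id) VALUES (?)", (pk,))
        return True
    except sqlite3.IntegrityError:
        return False
```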


> /or reconnect
> if necessary?

Here's the one place we do something with a raised error.  We do  
detect those errors which indicate a dropped connection, and in  
response to that we recycle the entire connection pool.  The  
assumption here is that if a disconnect occurs, it is probably due to  
the database being restarted, and it is harmless in any case for the  
connection pool to be recycled.

However, we still raise the error, albeit exactly once as opposed to  
once for each connection in the pool (this is a distinct advantage of  
using a pool).   In this case it's virtually impossible for us  
to "retry": when a connection is lost, the entire transaction is  
gone.   SQLA is not capable of replaying the full script of SQL  
statements which occurred within the transaction, since it does not take  
on the complexity and overhead intrinsic to such an operation.  But if  
you have a multithreaded web app, and you bounce the DB, you would get  
a minimum of failures.

There is also a way to have connections "pre-ping" before each usage  
(and recycle if the connection is lost), which is another way to prevent  
most "disconnect" errors.  We have a "connection pool events" API which  
I believe some people have used to achieve this result.
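The general "pre-ping" idea, independent of any particular pool API, can be sketched in a few lines (PingingPool is invented for illustration, and a real pool would of course manage multiple connections):

```python
import sqlite3

class PingingPool:
    """Minimal sketch: test a pooled connection with a cheap query
    before handing it out; replace it if the ping fails."""

    def __init__(self, dsn):
        self.dsn = dsn
        self._conn = sqlite3.connect(dsn)

    def connect(self):
        try:
            self._conn.execute("SELECT 1")
        except sqlite3.Error:
            # Stale or dropped connection: open a fresh one.
            self._conn = sqlite3.connect(self.dsn)
        return self._conn
```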

There's another option on the pool called "pool_recycle=<num of  
seconds>" which is used to deal with database client libraries (like  
MySQL's) that time out connections after a certain number of hours.  So  
again, the emphasis here with reconnecting is to minimize the number of  
errors thrown in the first place.
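As a configuration sketch, the option is passed straight to create_engine() (the connection URL here is a placeholder):

```python
from sqlalchemy import create_engine

# Recycle any pooled connection older than one hour, so MySQL's
# server-side timeout never gets a chance to drop it mid-use.
engine = create_engine(
    "mysql://scott:tiger@localhost/test",
    pool_recycle=3600,
)
```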

> If you're using the ORM layer is it able to still
> figure out what objects/fields are dirty if an error was encountered
> while trying to flush changes to disk?

the unit of work executes its SQL within a transaction, but does not  
change the bookkeeping on any affected objects until the transaction  
is committed.   In the 0.4 series, there is one exception to this:  
primary key and foreign key identifiers which are set during  
the transaction currently do not get rolled back on instances, which  
can create issues.   In 0.4 we've generally recommended closing out  
the session when a flush error occurs (as a side note, this is the  
long-standing behavior of Hibernate as well, which also can't roll back  
the state of in-memory instances).

In version 0.5, which is available now via the trunk, the rollback  
procedure fully expires all objects present in the session, expunges  
new objects, and promotes deleted objects back to persistent  
status.  So while a flush() error still implies a rollback() must  
occur, the state of the session is fully rolled back in tandem with  
the transaction and may be reused (this is described here: 
http://www.sqlalchemy.org/docs/05/session.html#unitofwork_using_rolling 
  ).  We also have a branch where we worked with an in-memory rollback  
of attributes, but I think this is overly complex; SQLA's Session, in  
0.5 more than ever, uses the underlying transaction as a guide to the  
current database conversation, and the Python side is held closely to  
that.

> Is the programmer responsible
> for differentiating between errors that indicate that a reconnect is
> necessary and errors where the connection can continue to be used?

SQLA handles reconnect for you automatically, but still propagates the  
error for those connections which receive it.  If you are working at  
the SQL level, and you're trying to be able to "retry" things (which,  
as you might be noticing, is a practice we don't encourage), then yes,  
you'd have to figure out that a dropped connection has also lost your  
transaction, whereas an IntegrityError has not (unless you're on  
Postgres, in which case you'd get more errors if you try to continue).

> including not losing track of what changes still need to be committed
> at the ORM layer.

if you're using the session in autocommit mode and disable autoexpire,  
whatever changes are pending on objects still remain after an error is  
raised, and the transaction rollback is limited to just the flush()  
itself, since there's otherwise no transaction going on.  But as I said,  
if you don't expire instances, new PK identifiers will remain assigned,  
which may or may not create problems depending on the DB backend.


> Ideally, if the reconnect is successful, I never
> want to know anything went wrong -- I just want SA to retry a
> reasonable number of times and only throw an exception if it continues
> to fail.

can't be done, since the transaction in progress is thrown away.    I  
think it would be inappropriate for SQLA to log the entire transaction  
in its own buffers (enormous overhead and complexity) and then run the  
whole thing a second time without being told to (debugging failures  
here would be extremely complicated).


>  But if the DB just threw a primary key constraint violation,
> the connection is still healthy, and I want my DB layer to just let me
> know so I can handle the situation appropriately (such as switching
> the insert to an update, or deleting the existing record, or passing
> the error on up to the user).

The patterns encouraged by the ORM remove the need to guess whether a  
row needs an INSERT or an UPDATE, and do not use database exceptions  
to implement use cases.   If you are writing your application with a  
heavy emphasis on "retrying things that failed within the  
transaction", the SQLA ORM is not designed around that paradigm.
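As a sketch of deciding INSERT vs UPDATE without catching a constraint violation (stdlib sqlite3 again; save_user is a hypothetical helper):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT PRIMARY KEY, email TEXT)")

def save_user(conn, name, email):
    """Choose INSERT vs UPDATE with an explicit SELECT rather than
    by catching a primary key violation."""
    exists = conn.execute(
        "SELECT 1 FROM users WHERE name = ?", (name,)).fetchone()
    if exists:
        conn.execute("UPDATE users SET email = ? WHERE name = ?",
                     (email, name))
    else:
        conn.execute("INSERT INTO users (name, email) VALUES (?, ?)",
                     (name, email))
```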

- mike
