On May 13, 2008, at 5:15 PM, Randall Nortman wrote:
> handling. I'm curious how SA handles this, especially transient
> errors where a retry is the right thing to do. For example, in
> several of my current sqlite apps (using bare DB-API), I get
> occasional locking errors (Database is locked), which happens
> sometimes with sqlite when there are concurrent processes accessing
> the DB. The right thing to do is retry, but it's a real pain to write
> *every* database operation with retry logic. Does SA handle retries
> in cases like this automatically?
No; if a DBAPI error occurs, the error is thrown. We wrap the error
in a DBAPI-neutral wrapper which has the same type as the underlying
error (e.g. OperationalError, IntegrityError).
Retries are definitely something a user application needs to work out
on its own; SQLAlchemy is not an application framework, just a library.
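Since retries belong in application code, one common approach is a small decorator that retries only on the specific "database is locked" error. This is a minimal sketch using the stdlib sqlite3 module; the names `retry_on_locked` and `flaky_write` are illustrative, not anything SQLAlchemy provides:

```python
import sqlite3
import time
from functools import wraps

def retry_on_locked(attempts=3, delay=0.1):
    """Retry a callable when SQLite reports 'database is locked'."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(attempts):
                try:
                    return fn(*args, **kwargs)
                except sqlite3.OperationalError as err:
                    # Re-raise anything that isn't a lock error, and
                    # give up after the final attempt.
                    if "locked" not in str(err) or attempt == attempts - 1:
                        raise
                    time.sleep(delay)
        return wrapper
    return decorator

# Demonstration: a function that "fails" twice, then succeeds.
calls = {"n": 0}

@retry_on_locked(attempts=3, delay=0)
def flaky_write():
    calls["n"] += 1
    if calls["n"] < 3:
        raise sqlite3.OperationalError("database is locked")
    return "ok"

result = flaky_write()
```

Wrapping each top-level operation (rather than each statement) keeps the retry logic in one place instead of scattered across every query.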
> (Admittedly, I think that more
> recent versions of pysqlite handle locked DBs better than the somewhat
> stale version I'm currently using, but I'm after the general principle
> rather than trying to solve that particular problem.)
I don't think "database is locked" errors are all that common in
practice. SQLite is supposed to block until the file is available.
If you have a highly concurrent transactional application, SQLite is
the wrong choice anyway.
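That blocking behavior is configurable from Python: the stdlib sqlite3 module's `connect()` accepts a `timeout` parameter (in seconds) that sets how long the library waits for a lock before raising "database is locked". A short sketch:

```python
import sqlite3

# sqlite3 will block for up to 5 seconds waiting for a lock to clear
# before raising "database is locked", rather than failing right away.
# (5.0 is also the module's default; shown here for explicitness.)
conn = sqlite3.connect(":memory:", timeout=5.0)
row = conn.execute("SELECT 1").fetchone()
```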
> Aside from that, how in general does SA handle errors returned by the
> underlying DB?
For most situations, you write your application such that these errors
don't occur within normal use cases (e.g. don't use IntegrityError as
a way to check if a row is present; select from the table explicitly
instead. Don't overuse locking, and/or use a unit of work so that
deadlocks don't occur). An error thrown by the DB should be
considered an application failure.
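The "select explicitly instead" advice above can be sketched in plain DBAPI code. This is an illustrative example (the `users` table and `add_user` helper are hypothetical); note that a check-then-insert is not race-free under heavy concurrency, but it keeps IntegrityError out of normal control flow:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT PRIMARY KEY)")
conn.execute("INSERT INTO users VALUES ('alice')")

def add_user(conn, name):
    # Check for the row explicitly rather than catching
    # IntegrityError as a control-flow signal.
    exists = conn.execute(
        "SELECT 1 FROM users WHERE name = ?", (name,)
    ).fetchone()
    if exists is None:
        conn.execute("INSERT INTO users VALUES (?)", (name,))
        return True
    return False

added = add_user(conn, "alice")    # already present, nothing inserted
added2 = add_user(conn, "bob")     # new row inserted
```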
> Does it automatically do a rollback and
When using the ORM, if a flush() fails, a ROLLBACK is issued
unconditionally, and the error is thrown. At that point, the end user
must roll back the Session under normal usage (what Session calls
transactional=True, or in 0.5 is called autocommit=False). This is
actually not the most ideal behavior; it would be nice someday for a
flush() to be able to make partial progress, raise the error without
rolling back, and then be able to complete after the state of the
Session has been corrected. It's something we might do someday.
However, some databases themselves prevent this from being doable;
current versions of Postgres, for example, require that you ROLLBACK
the transaction if an IntegrityError (the most common error) occurs -
further statements will raise errors unconditionally until ROLLBACK.
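The rollback-then-continue requirement can be sketched at the raw DBAPI level (with the ORM, the equivalent step is rolling back the Session). This example uses sqlite3 for portability, though the strict "must ROLLBACK first" rule described above is Postgres behavior:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY)")
conn.execute("INSERT INTO t VALUES (1)")
conn.commit()

try:
    conn.execute("INSERT INTO t VALUES (1)")   # duplicate primary key
    conn.commit()
except sqlite3.IntegrityError:
    # Once the error is raised, roll the transaction back before
    # issuing further statements. Postgres enforces this; doing it
    # everywhere keeps the pattern portable.
    conn.rollback()

# After the rollback the connection is usable again.
conn.execute("INSERT INTO t VALUES (2)")
conn.commit()
rows = [r[0] for r in conn.execute("SELECT id FROM t ORDER BY id")]
```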
> /or reconnect
> if necessary?
Here's the one place we do something with a raised error. We do
detect those errors which indicate a dropped connection, and in
response to that we recycle the entire connection pool. The
assumption here is that if a disconnect occurs, it is probably due to
the database being restarted, and it is in any case harmless to
recycle the connection pool.
However, we still raise the error, albeit exactly once as opposed to
once for each connection in the pool (this is a distinct advantage of
using a pool). In this case it's virtually impossible for us
to retry, since when a connection is lost, the entire transaction is
gone. SQLA is not capable of replaying the full script of SQL
statements which occurred within the transaction, since it does not
take on the complexity and overhead intrinsic in such an operation.
But if you have a multithreaded web app and you bounce the DB, you
would get a minimum of failures.
There is also a way to have connections pre-ping before each usage
(and recycle if the connection is lost), which is another way to
prevent most disconnect errors. We have a connection pool events API
which I believe some people have used to achieve this result.
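The pre-ping idea can be illustrated without SQLAlchemy at all. This is a toy sketch, not SQLAlchemy's pool or its events API: a hypothetical `PingingPool` runs a trivial query before handing out each connection and silently discards any that fail:

```python
import sqlite3

class PingingPool:
    """Toy pool that pings each connection before handing it out,
    replacing dead connections transparently (a sketch of the
    pre-ping idea only)."""

    def __init__(self, creator, size=2):
        self._creator = creator
        self._idle = [creator() for _ in range(size)]

    def checkout(self):
        while self._idle:
            conn = self._idle.pop()
            try:
                conn.execute("SELECT 1")   # the "ping"
                return conn
            except sqlite3.Error:
                pass  # dead connection: discard it and try the next
        return self._creator()  # none left alive: make a fresh one

    def checkin(self, conn):
        self._idle.append(conn)

pool = PingingPool(lambda: sqlite3.connect(":memory:"))
dead = pool.checkout()
dead.close()               # simulate a dropped connection
pool.checkin(dead)
revived = pool.checkout()  # ping fails, so a fresh connection comes back
```

The cost is one extra round trip per checkout, which is why it is an option rather than the default.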
There's another option on the pool called pool_recycle (a number of
seconds) which is used to deal with database client libraries (like
MySQL's) that time out after a certain number of hours. So again, the
emphasis here with reconnecting is to minimize the number of errors
thrown in the first place.
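The age-based recycling behind pool_recycle can be sketched the same way. Again a toy illustration (the `RecyclingPool` class is hypothetical, not SQLAlchemy's implementation): each connection carries its creation time, and any connection older than the recycle interval is closed and replaced at checkout:

```python
import sqlite3
import time

class RecyclingPool:
    """Toy pool that discards connections older than `recycle`
    seconds, mimicking the idea behind pool_recycle (a sketch only)."""

    def __init__(self, creator, recycle=3600):
        self._creator = creator
        self._recycle = recycle
        self._idle = []  # list of (created_at, connection)

    def checkout(self):
        while self._idle:
            created_at, conn = self._idle.pop()
            if time.time() - created_at < self._recycle:
                return created_at, conn
            conn.close()  # too old: recycle it
        return time.time(), self._creator()

    def checkin(self, created_at, conn):
        self._idle.append((created_at, conn))

pool = RecyclingPool(lambda: sqlite3.connect(":memory:"), recycle=0.05)
stamp, conn = pool.checkout()
pool.checkin(stamp, conn)
time.sleep(0.1)                  # let the connection "age out"
stamp2, conn2 = pool.checkout()  # old one is closed, fresh one returned
```

Set the interval comfortably below the client library's timeout (e.g. below MySQL's wait_timeout) so connections are always replaced before the server drops them.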
> If you're using the ORM layer is it able to still
> figure out what objects/fields are dirty if an error was encountered
> while trying to flush changes to disk?
The unit of work executes its SQL within a transaction, but does not
change the bookkeeping on any affected objects until the transaction
is committed. In the 0.4 series, there is one exception to this:
primary key and foreign key identifiers which are set during
the transaction currently do not get rolled back on instances, which
can create issues. In 0.4 we've generally recommended closing out
the session when a flush error occurs (as a side note, this is the
long-standing behavior of Hibernate as well, which also can't roll
back the state of