Re: [sqlalchemy] Long transactions with SQLite (and of course locks)

Eric Smith Wed, 01 Apr 2015 09:09:07 -0700

Thanks Michael -- I appreciate how responsive you are to questions.

Yes, it would have been more precise to talk about pysqlite's locking model 
rather than SQLite's. I'm hesitant to increase the lock timeout because I 
don't think I want any transactions that are that long, and would rather 
root out any that are.


To address the false positives, I'm looking at a different approach this 
morning. Since the engine log seems to match precisely what is going on 
with the database (the SQL being sent), why not put in a logging Filter 
instead of using events? Then I'd have something like:

    def filter(self, record):
        first_word = record.msg.split(' ')[0]
        if first_word in ['INSERT', 'UPDATE', 'DELETE']:
            self._begin_timing()
        elif first_word in ['COMMIT', 'ROLLBACK']:
            self._stop_timing()
        return True

Of course it only works if logging is on, but this is kind of a debugging 
tool anyway.


On Tuesday, March 31, 2015 at 9:38:35 PM UTC-6, Michael Bayer wrote:
>
>  
>
> On 3/31/15 7:23 PM, Eric Smith wrote:
>  
> Environment: SQLAlchemy 0.9.7, SQLite back-end, autoflush=True, mostly 
> using scoped session, multi-platform Python 2.7 
>
>  I'm working on a multi-threaded app, but with fairly low concurrency 
> (user actions and scheduled tasks are on different threads, for example). I 
> think I have a fair understanding of SQLite's locking model, the 
> implications of having autoflush=True, etc. but I still occasionally get 
> the "OperationError database is locked". 
>
>  The challenge with this error is that it is raised on a thread that 
> could be doing everything right while some other thread is keeping a 
> transaction open for too long. Plus, unless another thread tries to write, 
> the "misbehaving" thread goes mostly unnoticed.
>
>  So I thought I'd try to write a "long transaction detector" that would 
> start a timer when when a flush happens, and cancel it if a commit or 
> rollback happens. If the timer expires (after a few seconds), a warning is 
> logged with the stack trace of the flush. The idea is that I can see 
> information about the misbehaving thread rather than the innocent victim, 
> and I'll see problems even without actual contention.
>
>  While I'm still experimenting to see if this strategy will work, it 
> looks like I'm getting some false positives. It seems like sometimes I get 
> an after_flush event even when there are no modifications to the database. 
>  
> A flush can proceed when objects are marked as dirty, but in some cases it 
> turns out the changes on those objects are not net changes; e.g. an object 
> attribute was set to a new value that matches the old one.  that's the case 
> where you might see a flush that doesn't actually do anything.
>
> As far as database is locked, I'd just increase the timeout (Pysqlite 
> setting, defaults to 5 seconds), so that the locking acts just like any 
> other mutex that you'd place into your application.   
>
> As far as "SQLite's locking model", note that Pysqlite has extremely 
> specific behavior here which makes the locking model much more forgiving.  
> It only locks the database when the SQL on the current transaction turns 
> into DML (e.g. INSERT / UPDATE / DELETE).   So it should not be hard to 
> write the app such that things that actually emit DML aren't taking up that 
> much time.   Note that when "rollback" or "commit" happens, the transaction 
> is over, nothing is locked.
>
> If the app truly needs DML on multiple threads lots of the time and you're 
> getting a lot of contention (even with Pysqlite's forgiving model), then 
> SQLite isn't appropriate for that level of concurrency.
>
>
>
>  This is based on having the engine logging turned on so I can see the 
> SQL being emitted.  I'll see something like:
>
>  BEGIN (implicit)
>  SELECT
>  SELECT
>  ...
> after_flush event
>  SELECT
>  SELECT
>  ...
> My timer expires
>
>  Code is below -- the class is passed as the class_ parameter to 
> sessionmaker. 
>
>  Am I doing something dumb? Is this a reasonable strategy for my goals? 
> Any ideas on the false positive?
>
>  Thanks,
> Eric
>
>  
>  
>  class LongTransactionDetector(sqlalchemy.orm.Session):
>
>      TIMEOUT = 3
>
>      def __init__(self, *args, **kwargs):
>         super(LongTransactionDetector, self).__init__(*args, **kwargs)
>         self._timer = None
>         self._stack = None
>         self._flush_time = None
>         event.listen(self, 'after_flush', self._after_flush)
>         event.listen(self, 'after_commit', self._after_commit)
>         event.listen(self, 'after_rollback', self._after_rollback)
>         event.listen(self, 'after_soft_rollback', 
> self._after_soft_rollback)
>
>      def close(self):
>         # Calling close on a session automatically emits a ROLLBACK, but we
>         # don't seem to get an event for it, so we'll just hook it here.
>         super(LongTransactionDetector, self).close()
>         self._stop_timing()
>
>      def _after_flush(self, session, flush_context):
>         if any([session.new, session.dirty, session.deleted]):
>             self._begin_timing()
>
>      def _after_commit(self, session):
>         self._stop_timing()
>
>      def _after_rollback(self, session):
>         self._stop_timing()
>
>      def _after_soft_rollback(self, session, previous_transaction):
>         self._stop_timing()
>
>      def _begin_timing(self):
>         if self._timer is None:
>             logger.debug('set transaction timer')
>             self._flush_time = datetime.datetime.now()
>             self._stack = traceback.extract_stack()
>             self._thread = threading.current_thread()
>             self._timer = threading.Timer(self.TIMEOUT, self._on_timer)
>             self._timer.start()
>
>      def _stop_timing(self):
>         if self._timer:
>             logger.debug('clear transaction timer')
>             self._timer.cancel()
>             self._timer = None
>             self._stack = None
>
>      def _on_timer(self):
>         trace = ''.join(traceback.format_list(self._stack))
>         logger.warning('long transaction detected on {!r} {}\n{}\n{}'
>                        .format(self._thread, self._flush_time, trace))
>  -- 
> You received this message because you are subscribed to the Google Groups 
> "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to sqlalchemy+...@googlegroups.com <javascript:>.
> To post to this group, send email to sqlal...@googlegroups.com 
> <javascript:>.
> Visit this group at http://groups.google.com/group/sqlalchemy.
> For more options, visit https://groups.google.com/d/optout.
>
>
>  

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Re: [sqlalchemy] Long transactions with SQLite (and of course locks)

Reply via email to