ashb opened a new pull request, #44839:
URL: https://github.com/apache/airflow/pull/44839

   Since 2010(!) SQLite has had a WAL (Write-Ahead Log) journalling mode,
   which allows multiple concurrent readers and one writer. More than good enough
   for us for "local" use.
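
To illustrate what "multiple concurrent readers and one writer" means in practice, here is a minimal sketch using Python's standard `sqlite3` module. It is not code from this PR: the helper names `open_wal` and `demo_reader_does_not_block_writer` and the table `t` are made up for the example. A reader holds a read transaction open while a writer commits. In WAL mode the commit goes through, and the reader keeps seeing its snapshot until its transaction ends.

```python
import os
import sqlite3
import tempfile


def open_wal(path, timeout=1.0):
    """Open a connection in autocommit mode and switch the DB file to WAL.

    Hypothetical helper for this example only.
    """
    # isolation_level=None: no implicit BEGIN, we control transactions by hand.
    conn = sqlite3.connect(path, timeout=timeout, isolation_level=None)
    conn.execute("PRAGMA journal_mode=WAL")
    return conn


def demo_reader_does_not_block_writer():
    """Return the row counts a reader sees before, during and after a write."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        writer = open_wal(path)
        reader = open_wal(path)
        try:
            writer.execute("CREATE TABLE t (x INTEGER)")
            writer.execute("INSERT INTO t VALUES (1)")

            # The reader's snapshot starts at its first SELECT after BEGIN.
            reader.execute("BEGIN")
            before = reader.execute("SELECT count(*) FROM t").fetchone()[0]

            # The writer commits while the read transaction is still open.
            writer.execute("INSERT INTO t VALUES (2)")

            # Same read transaction: the reader still sees its old snapshot.
            during = reader.execute("SELECT count(*) FROM t").fetchone()[0]
            reader.execute("COMMIT")

            # A fresh read picks up the committed row.
            after = reader.execute("SELECT count(*) FROM t").fetchone()[0]
        finally:
            reader.close()
            writer.close()
        return before, during, after
```

Two writers still serialise against each other, so a connection timeout (or retry logic) is still worth having for write contention.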
   
   The primary driver for this change was the realisation that it is possible
   to reduce the amount of code and complexity in DagProcessorManager before
   reworking it for AIP-72 support: we have a lot of code in the
   DagProcessorManager to support `if async_mode`, which makes the flow hard
   to understand.
   
   Some useful docs and articles about this mode:
   
   - [The official docs](https://sqlite.org/wal.html)
   - [Simon Willison's TIL](https://til.simonwillison.net/sqlite/enabling-wal-mode)
   - [fly.io article about scaling read concurrency](https://fly.io/blog/sqlite-internals-wal/)
   
   This still keeps the warning against using SQLite in production, but it
   greatly reduces the restrictions on which combinations and settings can be
   used with it. In short, when using an SQLite DB it is now possible to:
   
   - use LocalExecutor, including with more than 1 concurrent worker slot
   - have multiple DAG parsing processes (even before AIP-72/TaskSDK changes to
     that)
   
   We execute `PRAGMA journal_mode` every time we connect, which is more
   often than is strictly needed, as this is one of the few modes that is
   persistent and a property of the DB file. We do it anyway for ease, and to
   ensure that the DB is in the mode we want.
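
As a rough sketch of the connect-time pragma and of the persistence it relies on, the example below again uses the standard `sqlite3` module rather than the code in this PR. The `connect` and `journal_mode_after_reopen` helpers and the file name are assumptions made for illustration. The first connection sets WAL, and a second, plain connection that never runs the pragma still reports WAL because the mode is stored in the DB file.

```python
import os
import sqlite3
import tempfile


def connect(path):
    """Open a connection and make sure the DB file is in WAL mode.

    Hypothetical helper: it runs the pragma on every connect, which is
    redundant after the first time but cheap and keeps the mode guaranteed.
    """
    conn = sqlite3.connect(path)
    # The pragma returns the journal mode that is actually in effect.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    if mode.lower() != "wal":
        conn.close()
        raise RuntimeError(f"could not enable WAL, journal_mode is {mode!r}")
    return conn


def journal_mode_after_reopen():
    """Set WAL once, then report the mode seen by a plain connection."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.db")

        first = connect(path)
        first.execute("CREATE TABLE t (x INTEGER)")
        first.commit()
        first.close()

        # No pragma is issued here: the mode comes from the file itself.
        plain = sqlite3.connect(path)
        try:
            return plain.execute("PRAGMA journal_mode").fetchone()[0]
        finally:
            plain.close()
```

Note that WAL needs a file-backed database; an in-memory database reports `memory` instead, which is why the sketch checks the returned mode.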
   
   I have tested this with `breeze -b sqlite start_airflow` and kicking off a
   lot of tasks concurrently.
   
   Will this be without problems? No, not entirely, but due to the
   scheduler+webserver+api server processes we've _already_ got the case where
   multiple processes are operating on the DB file. This change just makes the
   best use of that, following the guidance of the SQLite project: ensuring that
   only a single process accesses the DB at a time is not a requirement
   anymore!
   

