On Thu, 2 Jul 2015 10:09:12 -0400 Kathleen Alexander <katcalex at mit.edu> wrote:
> Essentially, I have written an application in C++ that interfaces > (reads and writes) with a SQLite database, and I am getting lots of > 'database is locked' errors. [...] > > My application runs on Linux (ubuntu 13.10), and is driven by a bash > script that spawns many (~60 on a 64 core workstation) instances of a > serial, C++ program, each of which opens its own connection to the > database and performs reads and writes. It may be that SQLite is not a good fit for your application. Concurrent update is SQLite's Achilles heel. Each insert/update/delete in SQLite requires exclusive access. In WAL mode, it requires exclusive access to the table; else it requires exclusive access to the whole database. That means, by default, only one process can update the database at a time. If you have 60 processes, 59 will wait. Depending on your requirements, that may still be OK. If the updates run quickly enough for your purposes, then increasing the timeout may do the trick. Estimate the processing time and number of updates to compute an overall throughput. If that's acceptable, problem solved. Most DBMSs, the heavy kind, devote extensive resources to support concurrent update. Usually contention is managed at the page or row level, and a whole section of the manual covers how the system implements SQL's "isolation levels". Just exactly how many processes can update the database at once is a function of almost everything: DBMS configuration, table design, index definition, and isolation level, not to mention hardware capacity and the OS I/O subsystem. Your mileage will most certainly vary. > B. If SQLite is not a good fit for my program, do you have any > suggestions of an alternative database engine that is free or might > be free or inexpensive for academic use? *If* that's the case, your best option IMO is Postgres. If you want to handle ~1000 concurrent updates, though, you will have to pay attention to the details, and may have to arrange to minimize resource contention in the DBMS. It all depends on the throughput and response-time requirements. HTH. --jkl