Hello,
I'm currently working on a project which is very likely to use SQLite as data storage. Since performance is important for us, I already found many ways on the SQLite website to optimise the code (always working in transactions where possible, using a page size of 4096 since it's running on the Windows platform, using integer primary keys, ...). But I have one problem that I "solved" in an unorthodox way; it works, but maybe you have a better solution than mine?

I've been doing some tests with a ~100 MB database containing three tables: one for structured data (Objects, 2000 entries), one for the blobs we have to store (ObjectBlobs: ID INTEGER PRIMARY KEY, Data BLOB; 8000 entries), and one which binds the structured data to the blobs (8000 entries). As you can imagine, each Object has 4 blobs linked to it; the blobs can be quite large (let's say up to 1 MB). My (C++) application just has to read the table "Objects", and one of the blobs for each Object.

The first time I run my application, it takes quite a long time (30s) to load the blobs. But if I re-run the app, it only takes 1s to load them. It's clearly a disk caching issue: if I copy huge files to the disk between two runs, it takes 30s again to load the blobs (i.e. the DB is no longer in the disk cache). Profiling the application indicates that SQLite's winRead() is the bottleneck.

I then had the following idea: SQLite is probably reading the file randomly, depending on where the data lies. If I can force the DB file to be cached, everything should be fine. So before connecting to the database, I first read the whole file sequentially (using a C++ ifstream) until end-of-file. This solves the problem nicely, even though I still notice a difference (3s to read the file on the 1st run, 0.2s later). But 3s is OK, where 30s was worrying me.

I hope I've explained the situation clearly enough. My question: is this the only way to do it? I find the trick a bit nasty and don't like it; maybe I missed something?
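For reference, here is a minimal sketch of the pre-read trick I'm using. The helper name warmFileCache and the 1 MB chunk size are just my illustrative choices; the point is only to touch every byte of the file sequentially so the OS file cache is warm before SQLite starts issuing random reads:

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Read the whole file sequentially so the OS file cache is warm before
// SQLite opens it. Returns the number of bytes read, or 0 on failure.
std::size_t warmFileCache(const std::string& path)
{
    std::ifstream in(path.c_str(), std::ios::binary);
    if (!in)
        return 0;

    std::vector<char> buffer(1 << 20); // read in 1 MB chunks
    std::size_t total = 0;
    // read() fails on the final partial chunk, but gcount() still
    // reports how many bytes were actually transferred.
    while (in.read(&buffer[0], buffer.size()) || in.gcount() > 0)
        total += static_cast<std::size_t>(in.gcount());
    return total;
}
```

I simply call this on the database file path right before sqlite3_open(); the return value is only there so the caller can check that the file was actually readable.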
Before you ask: I tried to VACUUM the DB; it only reduced the timing to 16s, which was still too slow for our requirements. Tests with a larger DB (it can get much bigger than my example) and on different machines tend to confirm my theory.

Thanks in advance (and a big thank-you for SQLite, which is really nice and easy to use!),

Julien Renggli

P.S.: Some technical information:
- SQLite v3.3.16 (we will upgrade to the latest version later), C API
- Windows XP SP2
- Timings on a Pentium 4 3.4 GHz, 2 GB RAM