Remember to make the blob column the very last column in your table schema, and avoid "SELECT * FROM ...", otherwise you can create performance problems. The data for a record is stored in the same order as the columns are defined, and overflow pages form a linked list. So if you have your 100 MB blob first and your ID second, SQLite has to read through all 100 MB of the blob just to get to your ID.
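A minimal sketch of what that looks like in practice (table and column names here are invented for illustration):

    /* build with: cc example.c -lsqlite3 */
    #include <sqlite3.h>
    #include <stdio.h>

    int main(void) {
        sqlite3 *db;
        sqlite3_stmt *stmt;
        if (sqlite3_open("attachments.db", &db) != SQLITE_OK) return 1;

        /* The large blob column goes last, so the small columns can be
           read without walking the blob's overflow-page chain. */
        sqlite3_exec(db,
            "CREATE TABLE IF NOT EXISTS attachment("
            " id INTEGER PRIMARY KEY,"
            " doc_id TEXT NOT NULL,"
            " mime TEXT,"
            " body BLOB)",
            0, 0, 0);

        /* Name the columns you need instead of SELECT *, so the blob
           pages are never touched when you only want metadata. */
        if (sqlite3_prepare_v2(db,
                "SELECT id, mime FROM attachment WHERE doc_id = ?1",
                -1, &stmt, 0) == SQLITE_OK) {
            sqlite3_bind_text(stmt, 1, "doc-1", -1, SQLITE_STATIC);
            while (sqlite3_step(stmt) == SQLITE_ROW)
                printf("id=%lld mime=%s\n",
                       sqlite3_column_int64(stmt, 0),
                       (const char *)sqlite3_column_text(stmt, 1));
            sqlite3_finalize(stmt);
        }
        sqlite3_close(db);
        return 0;
    }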
For freeing up space, check out the incremental version of auto_vacuum. Rather than freeing it all at once, it may be better to run it incrementally to avoid one big performance hit. Maybe have something run every so often to free up a set number of pages, similar to a scheduled WAL checkpointer (a rough sketch follows the quoted message below).

-----Original Message-----
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On Behalf Of Jens Alfke
Sent: Wednesday, March 29, 2017 1:14 PM
To: SQLite mailing list
Subject: [sqlite] Questions on big blobs and blob I/O

I’m thinking of transitioning from external to internal storage of large blobs. That is, currently I store these as individual files in a directory next to the SQLite database. But it sounds like today’s new improved SQLite is better at managing big blobs, and supports streaming reads; and it would simplify my code to not have to deal with all those files.

(These blobs are created by a “document attachments” API of the library I work on. What’s in them is up to the developer. They’re often fairly small, e.g. a thumbnail JPEG, but they could be large media files like movies, in the tens or hundreds of megabytes. There’s no upper limit.)

I have a few questions:

* Do big blobs cause any performance problems for the db as a whole, like fragmentation? In particular, is the blob data stored contiguously in the file?

* After a big blob is deleted, my understanding is that the db file won’t shrink, so the space isn’t reclaimed to the filesystem. Instead the freed pages in the db file will be reused later on. If I urgently need to shrink the file, do I have to run a full (expensive) vacuum? (The problem with vacuum is that it requires lots of free space to work, but blob deletions may be happening on behalf of the user, to free up storage when the [mobile] device is running out of free space, creating a catch-22…)

* Is there any way to avoid a sqlite3_blob handle being invalidated when the blob’s row is deleted (via a DELETE on another db handle)? I’d prefer to have behavior like file deletion in Unix, where open file handles keep working even after a file is deleted. I’m wondering whether I can accomplish this by opening a read transaction before opening the blob, then leaving it open until the blob is closed. Since I’m using WAL, will this db handle keep a snapshot of the time before the blob’s row was deleted, allowing the blob handle to keep working?

Thanks!

—Jens
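To make the incremental-vacuum suggestion above concrete, here is a rough, untested sketch (the helper name and the page count are arbitrary). Note that auto_vacuum must be set to INCREMENTAL before the first table is created, or you have to VACUUM once after changing it:

    #include <sqlite3.h>

    /* Call this from a periodic maintenance task (much like a scheduled
       WAL checkpoint) to release a bounded number of free pages per run
       instead of taking one big hit. */
    static int reclaim_some_pages(sqlite3 *db, int pages) {
        char sql[64];
        sqlite3_snprintf((int)sizeof(sql), sql,
                         "PRAGMA incremental_vacuum(%d);", pages);
        return sqlite3_exec(db, sql, 0, 0, 0);
    }

    /* One-time setup on a new database, before any tables exist:
         PRAGMA auto_vacuum = INCREMENTAL;
       Then, every so often (or after large blob deletions):
         reclaim_some_pages(db, 200);   e.g. ~200 pages per run */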
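And regarding the last question in the quoted message (keeping a blob readable after its row is deleted on another connection): the approach Jens describes would look roughly like the sketch below. This only illustrates the idea, it is not a confirmation that it works; the table/column names and the sink callback are invented for the example.

    #include <sqlite3.h>

    /* Stream a blob with the incremental-blob API while a read
       transaction pins this connection's snapshot, so a concurrent
       DELETE on another connection (in WAL mode) is not visible to
       the reader for the duration of the read. */
    static int stream_blob(sqlite3 *db, sqlite3_int64 rowid,
                           int (*sink)(const void *buf, int len, void *ctx),
                           void *ctx) {
        sqlite3_blob *blob = 0;
        int rc;

        /* Hold a read transaction for the lifetime of the blob handle. */
        rc = sqlite3_exec(db, "BEGIN", 0, 0, 0);
        if (rc != SQLITE_OK) return rc;

        rc = sqlite3_blob_open(db, "main", "attachment", "body",
                               rowid, 0 /* read-only */, &blob);
        if (rc == SQLITE_OK) {
            int size = sqlite3_blob_bytes(blob);
            char buf[8192];
            int off = 0;
            while (rc == SQLITE_OK && off < size) {
                int n = size - off < (int)sizeof(buf)
                          ? size - off : (int)sizeof(buf);
                rc = sqlite3_blob_read(blob, buf, n, off);
                if (rc == SQLITE_OK) rc = sink(buf, n, ctx);
                off += n;
            }
            sqlite3_blob_close(blob);
        }
        sqlite3_exec(db, "COMMIT", 0, 0, 0);   /* end the read transaction */
        return rc;
    }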