Remember to make the blob column the very last column in your table schema, and 
avoid "SELECT * FROM"; otherwise you can create performance problems. The data 
for a row is stored in the same order the columns are declared, and overflow 
pages form a linked list. So if your 100 MB blob comes first and your ID second, 
SQLite has to walk through all 100 MB of overflow pages just to reach the ID.
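
For example, a hypothetical attachments table (the names are purely 
illustrative) might be laid out like this:

  -- Keep the large blob as the last column so the fixed-size fields
  -- can be read without touching the blob's overflow page chain.
  CREATE TABLE attachment (
    id     INTEGER PRIMARY KEY,
    doc_id TEXT NOT NULL,
    length INTEGER,
    body   BLOB              -- the big attachment data, declared last
  );

  -- Name only the columns you need instead of SELECT *.
  SELECT id, doc_id, length FROM attachment WHERE doc_id = ?;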

For freeing up space, check out the incremental version of auto_vacuum. 
Rather than releasing all of the free pages at once, it may be better to run it 
"incrementally" to avoid one big performance hit: have something run every 
so often that frees a set number of pages, similar to a scheduled WAL 
checkpointer.
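
A rough sketch of that approach (the batch size of 1000 pages is arbitrary):

  -- auto_vacuum must be set before the first table is created, or be
  -- followed by a full VACUUM to rebuild an existing database.
  PRAGMA auto_vacuum = INCREMENTAL;

  -- Later, on a timer or after large deletes: check how many pages are
  -- on the freelist, then release a bounded batch of them.
  PRAGMA freelist_count;
  PRAGMA incremental_vacuum(1000);  -- free at most 1000 pages this pass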


-----Original Message-----
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of Jens Alfke
Sent: Wednesday, March 29, 2017 1:14 PM
To: SQLite mailing list
Subject: [sqlite] Questions on big blobs and blob I/O

I’m thinking of transitioning from external to internal storage of large blobs. 
That is, currently I store these as individual files in a directory next to the 
SQLite database. But it sounds like today’s new improved SQLite is better at 
managing big blobs, and supports streaming reads; and it would simplify my code 
to not have to deal with all those files.

(These blobs are created by a “document attachments” API of the library I work 
on. What’s in them is up to the developer. They’re often fairly small, e.g. a 
thumbnail JPEG, but they could be large media files like movies, in the tens or 
hundreds of megabytes. There’s no upper limit.)

I have a few questions:

* Do big blobs cause any performance problems for the db as a whole, like 
fragmentation? In particular, is the blob data stored contiguously in the file?

* After a big blob is deleted, my understanding is that the db file won’t 
shrink, so the space isn’t reclaimed to the filesystem. Instead the freed pages 
in the db file will be reused later on. If I urgently need to shrink the file, 
do I have to run a full (expensive) vacuum? (The problem with vacuum is that it 
requires lots of free space to work, but blob deletions may be happening on 
behalf of the user, to free up storage when the [mobile] device is running out 
of free space, creating a catch-22…)

* Is there any way to avoid a sqlite3_blob handle being invalidated when the 
blob’s row is deleted (via a DELETE on another db handle)? I’d prefer to have 
behavior like file deletion in Unix, where open file handles keep working even 
after a file is deleted. I’m wondering whether I can accomplish this by opening 
a read transaction before opening the blob, then leaving it open until the blob 
is closed. Since I’m using WAL, will this db handle keep a snapshot of the time 
before the blob’s row was deleted, allowing the blob handle to keep working?

Thanks!

—Jens
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users