Re: [sqlite] Writing in a blob

Richard Hipp Tue, 23 Apr 2013 07:29:09 -0700

On Mon, Apr 22, 2013 at 7:39 AM, Paolo Bolzoni <
paolo.bolzoni.br...@gmail.com> wrote:


> I noticed that sqlite3_blob_write cannot increase the size of the pointed
> open blob. So I ask, there is a way to treat a blob as a stream so I can
> write
> or read values in it with ease?
>

Short answer:  No.

SQLite packs information tightly together to conserve space in the database
file.  So if you have a row with 5 columns, A through E, then the content
of those 5 columns are immediately adjacent to each other.  If you increase
the size of the blob in column C, then content of columns D and E have to
be moved out of the way.  Similarly, multiple rows are packed tightly
together, so if you increase the size of the blob in column E, than means
subsequent rows have to be moved out of the way, or else the entire row
being changed has to be rewritten into a different spot in the file.

Going even further into the weeds: even if you happen to have some free
space in file after the current row and the blob is the last column in the
row, due to some encoding decisions that were made way back in 2004, we
cannot increase the size of a row without rewriting the entire row.  To do
otherwise would result in an incompatible file format.

So, no, you cannot arbitrarily increase the size of a BLOB using
sqlite3_blob_write().  However, as Roger mentioned, you can populate the
blob with empty space initially using zeroblob() and then overwrite it with
the content you need, incrementally.

Or, if you really want file-system semantics on top of your database (and
this is a reasonably idea, actually) you could write a wrapper library that
implemented file-system semantics on top of SQLite.  I propose an API
something like this:

  SQLFILE *sqlite3fs_fopen(const char *zPath, const char *zMode);
  size_t   sqlite3fs_fread(void*, size_t, size_t, SQLFILE*);
  size_t   sqlite3fs_fwrite(const void*, size_t, size_t, SQLFILE*);
  int      sqlite3fs_fseek(SQLFILE*, size_t offset, int whence);
  size_t   sqlite3fs_ftell(SQLFILE*);
  void     sqlite3fs_rewind(SQLFILE*);
  int      sqlite3fs_ftruncate(SQLFILE*, size_t);
  int      sqlite3fs_rename(const char *zFrom, const char *zTo);
  int      sqlite3fs_unlink(const char *zPath);
  int      sqlite3fs_stat(const char *zPath, struct stat*);
  SQLDIR  *sqlite3fs_opendir(const char *zPath);
  dirent  *sqlite3fs_readdir(SQLDIR*);
  int      sqlite3fs_closedir(SQLDIR*);

The "files" in this wrapper could be stored as one or more page-sized
chunks in separate rows of some backing-store table with a designated
name.  The multiple small blobs could then be pulled together transparently
by the sqlite3fs_fread() routine.  And large writes would be split apart
into several separate INSERTs by sqlite3fs_write().  Since each chunk is a
separate row, files could be increased or decreased in size simply by
adding and deleting rows from the backing store table.  Changing or adding
a single byte to a file means rewriting the last chunk, but for a large
file that is a lot cheaper than rewriting the whole file.  Note also that
real filesystems also have to write in chunks because disk drives work in
chunks.  There really is no such thing as adding a single byte to a file -
you always really overwrite the whole last page.

Putting together a library such as outlined above would provide a
convenient and simple migration path for applications that wanted to
transition from a pile-of-files storage format toward using an SQLite
database as their application file format.

Imagine what could happen if, for example, git where to start using this
library.  Instead of the ".git" directory containing lots and lots of
separate files, your repository would be a single ".git" file which was
really an SQLite database accessed through the "sqlite3fs" wrapper.
Minimal changes (renaming file I/O routines) would be required to the git
core.  But once you have an SQLite database used in place of a
pile-of-files, you can start adding new tables to do lots of interesting
things like hosting bug reports or a wiki as part of your repository (as
Fossil does).

As another example, consider the OpenOffice presentation application whose
current file format (the *.odp file) is a ZIP archive containing lots of
image and XML files.  If the *.odp file where replaced by an SQLite
database with the "sqlite3fs" wrapper, then you would be able to make
incremental changes to an OpenOffice presentation, instead of having to
recreate the entire multi-megabyte ZIP archive in the *.odp file simply
because of a one-character change on one slide.  And your changes would be
atomic, obviating the need for those annoying backups that OpenOffice makes
periodically.

In summary:  No, a bare SQLite blob does not provide file-system
semantics.  But you can write a wrapper library around SQLite that does
provide file-system semantics for large blobs, and doing so would have many
advantages and be a worth-while project, I think.

If anyone reading this wants to undertake such a project, I can provide you
with technical advice, website space with a suitable domain name (ex:
filesystem.sqlite.org), and a Fossil repository to keep the code in.
-- 
D. Richard Hipp
d...@sqlite.org
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Writing in a blob

Reply via email to