Thomas Fjellstrom wrote:
On February 21, 2007, [EMAIL PROTECTED] wrote:

"Brett Keating" <[EMAIL PROTECTED]> wrote:

Hi,

I'm curious about what the effect of having a blob in the database may
be on performance. I have two design options: 1) put a small image file
(15-30kbyte) into the database as a blob, and 2) store the image in a
separate file on disk and hold the filename in the database. My table
has around 20 rows in it, about half are strings/smaller blobs and half
are integers.

Option number one, for various reasons, is far more elegant and simple
in terms of its impact on the rest of the code. However, I am concerned
that holding such large amounts of data per record might impact
performance. I could be worried about nothing though, which is why I'm
writing to this list :).

When I was designing the SQLite file format, I made the assumption
that BLOBs would be used infrequently and would not be very big.
The file format design is not optimized for storing BLOBs.  Indeed,
BLOBs are stored as a singly-linked list of database pages.  It
is hard to imagine a more inefficient design.

Much to my surprise, people begin putting multi-megabyte BLOBs
in SQLite databases and reporting that performance really was not
an issue.  I have lately taken up this practice myself and routinely
uses SQLite database with BLOBs that are over 10MiB is size.  And
it all seems to work pretty well here on my Linux workstation.  I
have no explanation for why it works so well, but it does so I'm not
going to complain.

If your images are only 30KiB, you should have no problems.

Here's a hint though - make the BLOB columns the last column in
your tables.  Or even store the BLOBs in a separate table which
only has two columns: an integer primary key and the blob itself,
and then access the BLOB content using a join if you need to.
If you put various small integer fields after the BLOB, then
SQLite has to scan through the entire BLOB content (following
the linked list of disk pages) to get to the integer fields at
the end, and that definitely can slow you down.


I still wonder about the utility of storing binary data in the db itself. Maybe it makes it more easy to distribute that way, but how often does one distribute an entire database in a "vendor specific" format?

I'm quite interested in hearing people's reasoning for going the blob route, when you have a perfectly good "database" format for "blobs" already (various filesystems).

The BLOB method has two major advantages. The first is that the data all reside in one file. The second is that it does not have a directory size limit, a curse when storing files and one which requires extra logic to build a tree structure of directories or some other way of limiting directory size.

The downside is the relative slowness of retrieving large BLOBs, although the anecdotal evidence is that this is not as much of a problem as one would expect.

On balance I would use BLOBs for binary storage except in the case where there is a predominance of very large files.


--
D. Richard Hipp  <[EMAIL PROTECTED]>


---------------------------------------------------------------------------
-- To unsubscribe, send email to [EMAIL PROTECTED]
---------------------------------------------------------------------------
--






-----------------------------------------------------------------------------
To unsubscribe, send email to [EMAIL PROTECTED]
-----------------------------------------------------------------------------

Reply via email to