One suggestion I would make is to keep your blobs in a separate table,
or at least in a table in which you don't anticipate searching on
anything but the primary key. The reason is due to how SQLite manages
blobs. If a blob grows larger than a page's payload (approx page
size), it will spill over onto a special "overflow" pages, which are
just blank pages to hold the extra binary data. Depending on how big
the blob is, it may create multiple overflow pages.

While this in itself doesn't really hurt you, it will potentially
affect queries on that table, specifically queries which search on any
non-indexed or non-key columns. And as your table grows, or your blobs
(or both) the problem only gets worse. For example, say you have the
following table:

create table pictures (
id integer primary key,
name text,
date text,
description text,
data blob );

And say you do a query like:

SELECT name from pictures where date > '2007-11-01' and date < '2008-01-01';

Even though this query only has constraints on the name column, the
data (blob) field will impose a performance hit by its very existence
in the table. The reason for this is the overflow pages. Specifically,
as SQLite scans the table, it must read in the entire row in order to
obtain its values. When a row has overflow pages, they must all be
fetched from disk and read in as part of the row. (While this approach
is a good thing for handling variable length columns, blobs are the
one case where this doesn't work out so well.) As this query is
handled by a sequential scan (no indexes defined here), overflow pages
must be fetched for *every* row in the table. All in all, this
ultimately requires

1. additional page fetches (proportional to the size of each blob field)
2. more memory consumption (to hold the blob field you don't even care about),
3. more page swapping (to make room for each new blob field when the
page cache fills up)

and generally just having to deal with more IO -- all to do a simple
sequential scan that has nothing to do with the blob column at all
(and for rows that may not even match the query: what if you were only
looking for one row? ).

Now, you could possibly avoid this scenario if you put an index on the
name column (I won't go into why), but then the general rule would be
that you must index every column you might ever possibly search, which
just introduces a whole new set of issues you'd rather avoid.

The better way to deal with this is to split blobs out into their own table:

create table pictures (
id integer primary key,
name text,
date text,
description text,
data_id integer
);

create table data (
id integer primary key,
data blob
);

Now when you do your queries, you can still join the two tables
together so that they appear as a single table, e.g:

SELECT p.name from pictures p, data d where p.data_id=d.id and date >
'2007-11-01' and date < '2008-01-01';

The difference in this case is that only the blobs for the matching
records will be read in. Why? because SQLite scans the picture table
first, pulling out only the matching records, then joins them to the
data table (do an EXPLAIN on an example query and you will see this --
my VDBE reading is rusty but it looked that way to me). And better
yet, you can now just search the picture table by itself without
incurring any overhead of blobs or overflow pages.

I'm pretty sure this is correct, but I'll include the standard
disclaimer that I could be just plain wrong. I did happen to pass
through this part of the internals (row handling) some months ago and
this how it seemed to work then.

-- Mike




On Sun, Mar 30, 2008 at 5:03 AM, lukasz p. <[EMAIL PROTECTED]> wrote:
>
>
>
>  Martin Engelschalk wrote:
>  >
>  > Hello Lukasz,
>  >
>  > to insert a picture (or any binary data for that matter), you can read
>  > the data from the file and use sqlite3_bind_blob to insert (or update) it.
>  > When selecting it from the database, use sqlite_column_blob to retrieve
>  > the data and sqlite3_column_bytes to get the length.
>  >
>  > An alternative is to convert the binary data to a string before storing
>  > it in the database. For this you can use base64 - encoding or sonething
>  > like this.
>  >
>  > See
>  > http://www.sqlite.org/capi3ref.html#sqlite3_bind_blob
>  > http://www.sqlite.org/capi3ref.html#sqlite3_column_blob
>  > and the sqlite3_blob* - functions.
>  >
>  > On how to design your database, that would depend on what you are
>  > planning to to and how to use the data. Perhaps you could provide more
>  > information.
>  >
>  > Hope this helps,
>  > Martin
>  >
>  > lukasz p. wrote:
>  >> Hi.
>  >> I would like to write an application wich helps me to manage with my
>  >> cd/dvd
>  >> collections. It will scan all files on cd/dvd add it to a datebase and it
>  >> could also insert pictures (cd/dvd covers). My questions are:
>  >> - is there any possiblity to add in sqlite datebase pictures like gif,
>  >> jpg ?
>  >> - how to design datebase to scan cd/dvd and make tree layout (Folders,
>  >> files
>  >> etc.) ?
>  >> Thx for any sugestions.
>  >>
>  >
>  > _______________________________________________
>  > sqlite-users mailing list
>  > sqlite-users@sqlite.org
>  > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>  >
>  >
>
>  Thx for help Martin.
>  When I'm scanning my CD/DVD for a files I'll scan all path to it for
>  example:
>  \etc\lilo.conf
>  \etc\x11\xorg.conf
>  and after that I'll be creating tree layout in application, but I don't know
>  is it a good idea maybe  there is something simpler. How to keep those data
>  in datebase ? Could it be a blob or string?
>  --
>  View this message in context: 
> http://www.nabble.com/Sqlite-catalog---datebase-with-pictures-tp16369126p16380197.html
>
> Sent from the SQLite mailing list archive at Nabble.com.
>
>  _______________________________________________
>
>
> sqlite-users mailing list
>  sqlite-users@sqlite.org
>  http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to