The main reason why SQLite's practical limit is in the 10s of GBs as
opposed to TBs (theoretical) is due to how it tracks dirty pages. This
is described in the "Appropriate Uses" page
(http://www.sqlite.org/whentouse.html) but I'll rehash it here for
convenience. SQLite tracks dirty pages with a bitmap which is
allocated before each transaction. The size of the bitmap is
proportional to the size (not in rows but in pages) of the database
(256 bytes for every 1Mb of database), so as the database grows, the
amount of memory allocated before each transaction grows. When you get
into the GB range, you are starting to allocate in the MB range of
dirty page map memory per transaction, which starts to take its toll
on performance.

I could be wrong, but from what I know about virtual tables, there is
no such correlation between virtual table size and the dirty page
bitmap, as SQLite has no idea how big a virtual table is, nor does it
manage the data within the vtable. Furthermore, all SQLite really does
in a SELECT statement on a vtable is call your code to iterate over
it. So really the only performance issue is how long it takes your
code to iterate over your vtable. Thus, your table could be in the TB
range, and as long as you are fine with iterating over its contents,
there is no additional performance issues to speak of. There are ways
to implement virtual tables such that you can limit how much of the
table is scanned for certain queries, avoiding having to scan the
whole thing every time. I wrote an article that touches on this using
the match() function. Its available online:

http://www.ddj.com/database/202802959

IMO, virtual tables are one of the most powerful and unique features
of SQLite. There is a bit of a learning curve, but it's amazing what
you can do with them. It sounds like you going to have to iterate over
your external table one way or the other. I see no reason why the
vtable approach would be any slower than any other approach that
iterates over the data.

Having said that, while iterating over a large vtable is not a big
deal (as your program will just step through it one row at a time),
you need to be careful about getting too fancy with your SQL as you
may end up triggering a lot of background IO. For example, if you tack
on an ORDER BY which sorts one of the columns of your vtable, SQLite
will end up essentially copying the vtable contents into a temporary
file and sorting it, which may or may not be a strain on your system
depending on how big your table is (e.g. your vtable is 30Gb and your
/tmp folder is on a 10Gb partition). So think through what you are
doing when going beyond a simple SELECT * from big_vtable.

-- Mike

On Tue, Apr 1, 2008 at 3:12 PM, Aladdin Lampé <[EMAIL PROTECTED]> wrote:
>
>  Hi all!
>
>  Very often, when people ask this list why they have trouble managing in 
> sqlite a "big table" (50 million lines or more than 10 Go), they are told 
> that sqlite is an embedded database and is not meant to be used for very big 
> databases/tables.
>
>  I'm currently in the process of designing a specific, read-only, sqlite 
> "virtual table" in order to enable sqlite to access data stored in an 
> external database which is specially designed to handle very big tables.
>
>  My final objective is to be able to easily query a big external table 
> (stored in another database) through the - excellent - sqlite interface.
>
>  Now I have this terrible doubt: will the existing sqlite "limitations" for 
> big sqlite tables also apply to my read-only virtual tables?
>
>  Thus... am I currently losing my time developing such a "virtual table" with 
> this objective in mind? Or is there a better way to achieve my objective?
>
>  Thank you for your help!
>
>
>  _________________________________________________________________
>  Votre contact a choisi Hotmail, l'e-mail ultra sécurisé. Créez un compte 
> gratuitement !
>  http://www.windowslive.fr/hotmail/default.asp
>  _______________________________________________
>  sqlite-users mailing list
>  sqlite-users@sqlite.org
>  http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to