What are the business-logic reasons for wanting to split the database? Will your application be able to handle data that exists in multiple places?
When I know I'm going to be putting large blobs in a database, those blob tables get two fields: one for the ID, and one for the actual content. Perhaps even a field indicating what kind of blob it is, though I'm also the type who'd just make a new table for each new kind of blob. The ID forms a 1:1 PK:FK relationship with a field in the "master" table, so the master table has a direct reference to the blob record. That way, everyday queries pull from the master table only, and IF I need the blob's content (say an image, an MP3, or whatever the case may be), I open a separate query against the blob table, get my data, then shut it down. I'd also put a unique constraint on the blob ID field, which automatically creates an index against it for fast lookups. (A rough schema sketch is at the very end of this message.)

On Wed, Sep 7, 2016 at 12:08 PM, Dominique Devienne <ddevie...@gmail.com> wrote:
> Imagine there's a 25GB SQLite DB file in version v1 of an application.
>
> Because of the way this DB is structured, with large to very large blobs,
> this leads to problems for the app. That DB has 35 tables, but 5 of those
> represent 95% of the DB size (because of those blobs).
>
> So for v2 of the app, there's a new design where the 30 "lightweight"
> tables move into a new DB file (along with additional new tables), and the
> 5 "blob-heavy" tables remain in the existing DB file (but the 30 "moved"
> tables should be dropped from it).
>
> The initial design was to copy the DB file (the app is "shut down", so no
> connection to that DB file), drop the 5 big tables, vacuum it (the new
> DB), and add the new tables. But copying the 25GB was deemed too long in
> that case (several minutes). (After all, copying 95% of 25GB just to
> discard all those GBs is inefficient.)
>
> So the alternate design was to create the schema empty in the new DB
> file, attach the old one, and insert the data from the 30 tables into the
> new empty tables (with identical schema/structure). But that's also very
> slow, apparently.
>
> Even though it's not my app, I know there are indexes and triggers on
> those tables (though the triggers are mostly on deletes, so they don't
> really apply here), and I suggested adding those after the INSERT...SELECT
> copies, but I doubt it's going to make a dramatic difference time-wise.
>
> Conceptually, the ideal situation would be to copy just the DB header,
> the sqlite_master page(s) (dropping the 5 big tables' definitions and
> related indexes/triggers from sqlite_master), and only the 5% of pages
> belonging to the other tables (and related objects). (The chaining of
> pages and the root pages in sqlite_master would have to be updated, of
> course.) It's almost a form of VACUUM, except it's not in-place and works
> on a subset of the tables. IO-wise, that has the potential to be 20x
> faster, I imagine.
>
> But of course there's no such "out-of-place" "partial" VACUUM...
>
> Then I thought maybe .backup, which also works at the page level (I
> believe), could perhaps be limited to a subset of the tables. But no luck
> there either: backup works on the whole DB, not a subset.
>
> Am I missing other ways to achieve this "split" efficiently?
> Any chance the backup API could grow a table-specific (and related
> objects) alternate form?
>
> Any advice would be appreciated.
> Thanks, --DD
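For what it's worth, the "attach and copy" approach described above would look roughly like the sketch below. All file, table, and column names here are made up for illustration. The two things most likely to help are creating the indexes only after the bulk copy, and relaxing durability for the duration of this one-off migration:

    -- Run against the NEW, initially empty DB file.
    -- One-off migration, so durability is deliberately traded for speed:
    PRAGMA journal_mode = OFF;   -- no rollback journal for this one-shot copy
    PRAGMA synchronous = OFF;    -- no fsync on commit

    ATTACH DATABASE 'old_v1.db' AS old;

    BEGIN;
    -- Create each lightweight table first, WITHOUT its indexes/triggers:
    CREATE TABLE t1 (id INTEGER PRIMARY KEY, name TEXT);  -- same definition as old.t1
    INSERT INTO t1 SELECT id, name FROM old.t1;
    -- ...repeat for the other 29 tables...
    COMMIT;

    -- Build the indexes once, in bulk, after the data is in place:
    CREATE INDEX t1_name_idx ON t1(name);

    DETACH DATABASE old;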
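And to make my earlier point about blob side-tables concrete, here's a minimal sketch of that layout, again with hypothetical names. One SQLite-specific caveat: the foreign key is only enforced when PRAGMA foreign_keys is ON (it's off by default):

    PRAGMA foreign_keys = ON;

    CREATE TABLE blobs (
      id      INTEGER PRIMARY KEY,
      kind    TEXT,               -- e.g. 'image' or 'mp3'; or one table per kind
      content BLOB
    );

    CREATE TABLE master (
      id      INTEGER PRIMARY KEY,
      title   TEXT,
      -- Direct reference to the blob record; UNIQUE keeps the relationship
      -- 1:1 and automatically creates an index for fast lookups:
      blob_id INTEGER UNIQUE REFERENCES blobs(id)
    );

    -- Everyday queries touch only the master table:
    SELECT id, title FROM master WHERE title LIKE 'abc%';

    -- Blob content (image, MP3, etc.) is fetched on demand, separately:
    SELECT b.content FROM blobs b JOIN master m ON m.blob_id = b.id WHERE m.id = 42;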