On Wed, Sep 01, 2010 at 11:41:00AM -0700, Taras Glek scratched on the wall:
> Hi,
> Currently VACUUM takes care of sqlite-level fragmentation. Unfortunately
> it does little for fs-level fragmentation since the same file is being
> reused. It would be really beneficial for Mozilla performance if we
> could get a vacuum/hotcopy mode.
>
> As I understand it, currently vacuum works by copying out the data to a
> new file, then copying it back in and truncating the original db file.
>
> It would be really nice to instead do something like:
>
> copy the data to a new file
You're good up to here. This is basically the "VACUUM TO <file>"
idea I threw out some months ago, and Max referenced.
> swap the underlying filehandles to point at new file
>
> remove old file
>
> rename the new file to old name.
...and watch every other database connection that was accessing the
database file get very, very confused.
You're also breaking transactional integrity. You need the option of
backing-out of the operation right up until the moment it works, and
this procedure can't do that. For example, if you lose power right
after "remove old file", your database is no longer there.
...And before someone says something like "but the application can just
open the other one", I agree-- that it is an application problem.
> Currently we can copy the data to a new file, but it is a pretty
> invasive change to swap all of the current sqlite connections to the new
> file.
You'd have to do that anyways, for all connections other than the one
that issued the VACUUM command. Coordinating FD swaps across multiple
connections in the same process would be confusing enough-- there
would be no possible way to do it across database connections in
multiple processes.
And I really doubt you could do any of this in a transactional-safe way.
It's application level stuff.
> Things like prepared statements, etc need to be updated for every
> single db consumer. Thus it would make sense to have this feature on the
> sqlite side.
I'm not sure I agree that it makes sense. As soon as you start to
talk about filesystem operations (such as rename) you're way outside
of the scope of SQLite. Again, doing this kind of thing in a
transactional-safe way is nearly impossible.
Even if we stick with the existing procedure, I'm wondering if it is
really that bad (that's an honest question; I don't have data one way
or the other). In the last step of the VACUUM process, the data is
copied back to the original file, page by page. If I understand it
correctly, this is more or less a blanket write, blasting pages from
the temp DB to the original DB, front to back (intermixed with copies
to the journal file). There is no reason to assume the filesystem
will over-write the existing allocations, rather than just create new
ones, especially if the pages are shuffled in groups... although if
the goal is to defragment the file, we're kind of depending on that,
which is unwise.
Maybe there would be some way to pre-populate the rollback journal
with the full contents of the original database. Then the file could
be truncated before the copy-back procedure. That would make it
clear to the OS that it is free to allocate whatever file blocks it
wants, hopefully in better patterns. The copy back could also be
done in very large chunks.
I'm not totally sure, but I'm guessing that would require several
modifications and enhancements to the rollback and B-tree code.
You would need the ability to force a full rollback copy and the
ability do a blind, non-journaled copy, but it seems like it could
be done without modifications to the existing VACUUM semantics, all
while keeping the transactional integrity of the VACUUM command.
On a personal level, I don't think it is worth it. In the end, you're
still hoping the OS and filesystem will make smart choices about block
allocations. An application shouldn't need to be spending a lot
of time worrying about this level of filesystem performance. No
matter what, you're just hinting and setting up conditions that
should allow the filesystem driver to do something smart and fast.
It may, or it may not, actually do so.
-j
--
Jay A. Kreibich < J A Y @ K R E I B I.C H >
"Intelligence is like underwear: it is important that you have it,
but showing it to the wrong people has the tendency to make them
feel uncomfortable." -- Angela Johnson
_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users