On Mon, Feb 13, 2017 at 12:13 PM, Richard Hipp <d...@sqlite.org> wrote:
>
> Scott:  The motivation for your patch seem to be to get auto-vacuum to
> run a little faster.  But if performance is your goal, why not just
> turn auto-vacuum off?  Or, failing that, set it to INCREMENTAL and
> then run "PRAGMA incremental_vacuum" when "PRAGMA freelist_count"
> reaches some threshold?
>

My first implementation did this manually using incremental-vacuum in a
layer above SQLite, because I already had a convenient function which
detected when updates had happened.  For explicit transactions, this is
pretty reasonable, albeit inelegant depending on whether you provide a
Commit() function or have to detect a COMMIT statement using string
operations.  But for auto-commit statements you can't get your incremental
vacuum into the commit scope.  [AFAICT, the commit and update hooks do not
allow running anything against the database.]  This version is certainly
doable if my suggested patch is considered beyond the pale.

I then reimplemented as more of an auto-incremental-vacuum, basically the
same code as I posted except that it keyed off the incremental-vacuum
flag.  It was pretty clean, but I did find myself thinking that it wouldn't
work well if you wrote code assuming the periodic cleaning was happening
automatically, but someone had neglected to setup the slack-sizing pragma.
In that case it would just stop collecting garbage, and your code would
never do it explicitly.  With the auto-vacuum version, it would just fall
back to auto-vacuum-every-time.  [I'm not sure which is actually worse, in
the end.]

Mostly, my motivation was that it felt like there was a gap between the
SQLITE_FCNTL_CHUNK_SIZE feature and auto_vacuum/incremental_vacuum, and
when I looked the gap was pretty clean to fill.  Having it key off of the
actual "Should I vacuum free pages" decision seemed more reasonable than
writing external code which makes educated guesses about what's going on.

I'm not really looking at "Should I rewrite all of the auto-vacuum
databases I can find to use manual vacuum".  I think a change like this
would make auto-vacuum databases behave a bit more like non-auto-vacuum
databases in terms of reuse of free space.

---

WRT your list of reasons for "Why even bother", another consideration to
add is that fragmentation on SSDs may not be the big problem it is on hard
drives.  If you have your page sizes reasonably aligned with the units of
the underlying filesystem and hardware, it may not matter much whether a
particular page is next to other pages which are logically adjacent in the
btree.  Of course, there's a lot of complexity in there, like whether the
OS continues to do read-ahead for SSDs.

That said, AFAICT, there's no code in place to make page placement
decisions based on locality, so I would expect that a
non-auto/incremental-vacuum database being actively updated would also
generate fragmentation.

---

I'm not going to get into the "Why not just use VACUUM".  My experience is
that this is a tough issue with databases in general, and developers often
have troubles wrapping their heads around it.  In my case, I'm thinking of
how to mitigate some concerns developers had with existing code, not how
they might write greenfield code going forward.  [As if they'd even ask my
advice before writing code and shipping it.  They only ask me things after
they've already shipped a problem :-).]

-scott
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to