Re: [sqlite] sqlite versus MySQL automatic defragmentation on SSDs?

Richard Hipp Mon, 29 Oct 2012 06:28:50 -0700

On Mon, Oct 29, 2012 at 5:58 AM, David Barrett <dbarr...@expensify.com>wrote:


> On Mon, Oct 29, 2012 at 2:16 AM, Richard Hipp <d...@sqlite.org> wrote:
>
> > It sounds like you are pushing SQLite well beyond what it was intended to
> > do.  Remember the motto:  SQLite is not intended to replace Oracle, it is
> > intended to replace fopen().  SQLite does a great job for roles such as
> > data storage for a desktop application, or for databases in cellphones or
> > other gadgets, or as a local cache to a enterprise network database.  But
> > SQLite was never designed or intended to replace an enterprise-level
> > RDBMS.  And I think you probably need an enterprise-level RDBMS at this
> > point.
> >
>
> So what specifically do you feel is the problem with sqlite at scale?
>
> And to be clear, I'd like to pre-empt "well it doesn't do X, which you'll
> probably also want to do" -- all those X's are already covered and working
> great.  I just mean, what in particular do you feel about sqlite works
> great for a 50MB database, but doesn't work at a 50GB database?  I'm very
> open to being convinced -- and you'd be the person to convince me.  But I
> don't like to make decisions based on vague fears.  "Best practices" often
> aren't.
>
> Because in practice, as someone actually doing it (as opposed to theorizing
> about it), it works great.  The MySQL portions of our service are always in
> a semi-constant state of emergency, while our sqlite portions just hum
> along   And given that we're switching to SSDs, I expect they will hum even
> better.  What problems would you expect me to be seeing that I can happily
> report I'm not, or what problems have I not yet encountered but will -- at
> 100GB, or 1TB?
>

Generally the argument in favor of client/server databases versus SQLite
comes down to (1) Concurrency, (2) Replication, and (3) Network access.
The size of the database file shouldn't really be a factor.  Or it least it
hasn't as far as we know.  On the other hand, we haven't heard from people
pushing terabyte databases into SQLite before....

SQLite supports some concurrency (multiple readers, one writer) but not as
much as typical client/server databases.  SQLite is optimized more for the
single-user case.

SQLite has its backup API, which provides some basic replication
capabilities.  But it isn't the same thing has having a full-blown
real-time replicator like you find in most client/server installations.

And finally, SQLite, really really prefers to have its content on the local
disk.  Any application that uses a database can be conceptually divided
into three pieces:  (A) the application (B) the database engine and (C) the
content on disk.  If you have to cross a network to go between A and C (if
the content is on a different machine from where the application is
running) then it is better to cross that network at the A-B junction rather
than at the B-C junction because the A-B junction requires less bandwidth.
Client/server database engines use the A-B junction whereas SQLite on a
network filesystem uses the B-C junction.


>
> Thanks!
>
> -david
>
> PS: Also, if anybody does know anything about MySQL fragmentation, I'd
> still love some pointers.  I'm not sure how my casual request became
> interpreted as an obsession, but either way, I'd still love the benefit of
> your knowledge.
>
>
>
>
>
> >
> > MySQL is a good choice.  But here is another data point to consider:
>  When
> > we were writing the SqlLogicTest test suite for SQLite, we ran the test
> > vectors on a wide variety of server-class database engines in addition to
> > SQLite.  And in every case (including SQLite) we found cases that would
> > crash the server.  Every case, that is, except one.  We were never able
> to
> > crash PostgreSQL, nor find a case where PostgreSQL gave the wrong answer.
> >
> > Furthermore, whenever there is a question about what the behavior of some
> > obscure SQL construct ought to be and whether or not SQLite is doing it
> > right, usually the first thing we check is how PostgreSQL responds to the
> > same query.  When in doubt, we try to get SQLite to do the same thing as
> > PostgreSQL.
> >
> > Far be it from me to recommend one client/server database engine over
> > another.  But in my experience.....  well, you can fill in the rest,
> > probably...
> >
> > On Sun, Oct 28, 2012 at 10:48 AM, David Barrett <dbarr...@expensify.com
> > >wrote:
> >
> > > Wow, I didn't realize this was such a controversial question.
> > >
> > > I'm a huge sqlite fan.  Expensify is built on sqlite.  We have a 40GB
> > > database, replicated using our custom distributed transaction layer
> > across
> > > 5 severs in three different datacenters.[1]  It's been powering all of
> > > Expensify (including our direct deposit reimbursement engine and credit
> > > card import layer -- both of which contain incredibly sensitive
> > > information, with mistakes causing millions of dollars to move in the
> > wrong
> > > direction).  On the back of sqlite, we've grown to over million users,
> > > processing millions of dollars in expense reports every day.
> > >
> > > However, we're starting to see problems.  There is so much activity on
> > some
> > > servers that there is never a chance for our checkpointing thread to do
> > its
> > > thing, so our WAL file often ballons up to 30GB or more.  This makes
> > query
> > > times plummet.  We regularly checkpoint manually, and often vacuum, all
> > in
> > > an effort to keep queries moving quick.  We also do things to trick out
> > our
> > > indexes in order to ensure proper disk ordering, pay particular
> attention
> > > to block and cache amounts, etc.  This isn't premature optimization for
> > the
> > > sake of having fun, these are in response to real performance problems
> > > affecting our product.
> > >
> > > In light of that, there is a contingent pushing to drop sqlite in favor
> > of
> > > MySQL.  There are a wide range of reasons -- it has its own
> replication,
> > > better write concurrency, clustered indexes, and better edge-case data
> > > integrity (because we use 2 DBs and WAL, ATTACH doesn't provide atomic
> > > commit advantages).  And for each I have a corresponding answer --
> > MySQL's
> > > replication isn't as good as ours, concurrency doesn't matter because
> we
> > > serialize writes and have a single threaded server anyway, clustered
> > > indexes would be nice but we can get close enough with custom ROWIDs,
> and
> > > the extremely rare situation where there's a cross-database integrity
> > > problem, we can detect and recover from any of the other slaves.  And I
> > > also add in that sqlite can never crash because it's built into the
> > server;
> > > its performance is fantastic because it runs in the same process; in
> > years
> > > of operation we've never once seen it corrupt data; it's so easy to
> use;
> > > etc etc.
> > >
> > > But there's an argument I've heard come up to which I don't have a
> > > response: MySQL handles fragmentation better, and by extension would
> give
> > > us better performance on the same hardware.   I'd like to know more
> about
> > > it, which is why I've asked.  Thanks!
> > >
> > > -david
> > >
> > > [1] http://permalink.gmane.org/gmane.comp.db.sqlite.general/71868
> > >
> > >
> > > On Sun, Oct 28, 2012 at 2:12 PM, Pavel Ivanov <paiva...@gmail.com>
> > wrote:
> > >
> > > > OK. Curiosity is a good thing in certain situations. But could you
> > > > kindly tell me what will you do with this information (assuming it's
> > > > possible to obtain it of course)?
> > > >
> > > > Pavel
> > > >
> > > > On Sat, Oct 27, 2012 at 11:54 PM, David Barrett <
> > dbarr...@expensify.com>
> > > > wrote:
> > > > > I completely understand the wide and varied differences.  I'm just
> > > *also*
> > > > > interested in this very specific issue.
> > > > >
> > > > > -david
> > > > >
> > > > > On Sun, Oct 28, 2012 at 12:38 PM, Pavel Ivanov <paiva...@gmail.com
> >
> > > > wrote:
> > > > >
> > > > >> > That said, I'd still welcome any quick summary of the
> differences
> > > > between
> > > > >> > sqlite and mysql when it comes to fragmentation.
> > > > >>
> > > > >> This is far from main differences between sqlite and mysql that
> you
> > > > >> should consider if you want to choose between them unless of
> course
> > > > >> your question is just about an academic interest. As you are
> talking
> > > > >> about employees I guess you are not in some purely academic
> > exercise.
> > > > >> In this case think more about in-process code vs pumping through
> > > > >> TCP/IP stack, designed mostly to be accessible from machine-local
> > > > >> processes only vs accessible to anyone on the network, plain
> access
> > to
> > > > >> everything vs versatile and complicated authorization and
> > > > >> authentication mechanisms, and so on and so forth. Database format
> > is
> > > > >> never a part of the decision which DBMS you want to use.
> > > > >>
> > > > >> Pavel
> > > > >>
> > > > >>
> > > > >> On Sat, Oct 27, 2012 at 9:32 PM, David Barrett <
> > > dbarr...@expensify.com>
> > > > >> wrote:
> > > > >> > Thanks, this is really helpful!  (And I lecture my employees
> about
> > > the
> > > > >> > evils of premature optimization all the time.  In fact, I'll
> > lecture
> > > > >> anyone
> > > > >> > in earshot, so frequently that it's the butt of jokes.)
> > > > >> >
> > > > >> > That said, I'd still welcome any quick summary of the
> differences
> > > > between
> > > > >> > sqlite and mysql when it comes to fragmentation.  I often get in
> > > > debates
> > > > >> > about sqlite versus other datbases, and I'm always eager to be
> > > > informed.
> > > > >> >  Thanks!
> > > > >> >
> > > > >> > -david
> > > > >> >
> > > > >> > On Sat, Oct 27, 2012 at 11:42 PM, Simon Slavin <
> > > slav...@bigfraud.org>
> > > > >> wrote:
> > > > >> >
> > > > >> >>
> > > > >> >> On 27 Oct 2012, at 11:38am, David Barrett <
> > dbarr...@expensify.com>
> > > > >> wrote:
> > > > >> >>
> > > > >> >> > I'm trying to learn more about MySQL versus sqllite when it
> > comes
> > > > to
> > > > >> >> > vacuuming and fragmentation, especially as it relates to SSD
> > > > storage.
> > > > >> >>
> > > > >> >>
> > > > >> >> Rather than answer your questions point-by-point, I'm going to
> > give
> > > > you
> > > > >> >> the current state of play.  Your understanding of how the
> various
> > > > DBMSes
> > > > >> >> work is right, but your excellent question "in a world with
> SSDs
> > > and
> > > > a
> > > > >> >> tremendous amount of RAM, does vacuuming matter nearly as much
> as
> > > on
> > > > a
> > > > >> >> spinning disk with constrained RAM?" cuts to the heart of all
> > your
> > > > other
> > > > >> >> points.  The following involves a little simplification and
> > > > handwaving
> > > > >> >> because otherwise it would be two chapters long and you'd have
> to
> > > do
> > > > >> >> homework.
> > > > >> >>
> > > > >> >> Fragmentation made a big difference to apps running on Windows,
> > but
> > > > very
> > > > >> >> little on any other platform.  This is because Windows does
> > > something
> > > > >> >> called 'read-ahead caching' which assumes that if you read
> block
> > B,
> > > > >> you're
> > > > >> >> soon going to want to read block B+1, so at quite a low level
> it
> > > > >> helpfully
> > > > >> >> pre-reads it for you.  Other operating systems don't make this
> > > > >> assumption.
> > > > >> >>  This is why Windows users talk about defragmentation so much,
> > but
> > > > Unix
> > > > >> >> users don't care about it.
> > > > >> >>
> > > > >> >> SSDs negate the whole point of defragmentation.  On a
> rotational
> > > disk
> > > > >> it's
> > > > >> >> faster to read blocks B, B+1, B+2, B+4, B+5 than it is to read
> > five
> > > > >> random
> > > > >> >> blocks from the disk, because the read heads are already
> > positioned
> > > > in
> > > > >> the
> > > > >> >> right place, and the disk is going to rotate to show those five
> > > > blocks
> > > > >> in
> > > > >> >> order.  SSDs are just like RAM: they're Random Access Memory.
> > > >  Reading
> > > > >> any
> > > > >> >> five blocks in any order takes roughly the same time.  So
> nobody
> > > > cares
> > > > >> >> about fragmentation on an SSD.  Read whatever blocks you want
> in
> > > > >> whatever
> > > > >> >> order you want.
> > > > >> >>
> > > > >> >> As to clever management of disk block alignment with respect to
> > > rows
> > > > and
> > > > >> >> columns, this is rarely worth attention these days.  The amount
> > of
> > > > >> >> programming and debugging time it takes to get this right, and
> > the
> > > > >> amount
> > > > >> >> of extra processing and disk access you need to do, give you
> less
> > > > >> return on
> > > > >> >> investment than if you spent the same money on buying a faster
> > hard
> > > > >> disk.
> > > > >> >>  It's "premature optimization" (look up the term) except for
> two
> > > > cases:
> > > > >> >> overnight runs and realtime 3D graphics.  If your overnight run
> > > takes
> > > > >> more
> > > > >> >> than one night, you have a problem.  If you're programming
> > realtime
> > > > 3D
> > > > >> >> graphics and they're jerky, your users won't enjoy your
> > simulation.
> > > >  But
> > > > >> >> you wouldn't be using a SQL engine for 3D graphics anyway.
> > > > >> >>
> > > > >> >> The matters you mentioned were all worth attention back in the
> > > 1980s
> > > > >> when
> > > > >> >> storage and bandwidth were expensive.  As you pointed out near
> > the
> > > > end
> > > > >> of
> > > > >> >> your post, these things matter less now.
> > > > >> >>
> > > > >> >> Simon.
> > > > >> >>
> > > > >> >> _______________________________________________
> > > > >> >> sqlite-users mailing list
> > > > >> >> sqlite-users@sqlite.org
> > > > >> >> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> > > > >> >>
> > > > >> > _______________________________________________
> > > > >> > sqlite-users mailing list
> > > > >> > sqlite-users@sqlite.org
> > > > >> > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> > > > >> _______________________________________________
> > > > >> sqlite-users mailing list
> > > > >> sqlite-users@sqlite.org
> > > > >> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> > > > >>
> > > > > _______________________________________________
> > > > > sqlite-users mailing list
> > > > > sqlite-users@sqlite.org
> > > > > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> > > > _______________________________________________
> > > > sqlite-users mailing list
> > > > sqlite-users@sqlite.org
> > > > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> > > >
> > > _______________________________________________
> > > sqlite-users mailing list
> > > sqlite-users@sqlite.org
> > > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> > >
> >
> >
> >
> > --
> > D. Richard Hipp
> > d...@sqlite.org
> > _______________________________________________
> > sqlite-users mailing list
> > sqlite-users@sqlite.org
> > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> >
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
D. Richard Hipp
d...@sqlite.org
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] sqlite versus MySQL automatic defragmentation on SSDs?

Reply via email to