On 2018/07/17 1:20 PM, Rob Willett wrote:
> Thanks for this. You make some interesting points about cache hits and
> misses. Up until now we hadn't seen a massive performance hit; whilst
> some database operations didn't run as fast as we would like them to,
> nothing runs as fast as we would like.
This is actually a common thread in this industry. It's quite normal for
engineers or devs to put a system together (of which they typically
have only one at the start) and then simply assume that whatever speed
it runs at is the correct speed, and often only bother to check for
actual speed deficiencies once they hit some real-time barrier (which
typically only happens much later) - and this is mostly just fine.
I propose, as a test in future, making a great big DB in whatever format
you intend to use, running it on some fast hardware and then on the
typical VM, seeing the difference, and deciding whether or not it's
horrible and needs further investigation.
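A rough sketch of what such a test could look like, in Python with the
built-in sqlite3 module (the table shape, row count, and file names are
all made up for illustration - in a real test you'd scale ROWS up until
the DB file is larger than the VM's RAM, which is where the interesting
behaviour starts):

```python
import os
import sqlite3
import time

# Hypothetical sizing knob -- increase until the DB file dwarfs RAM.
ROWS = 100_000

con = sqlite3.connect("benchmark.db")
con.execute("CREATE TABLE big (id INTEGER PRIMARY KEY, payload TEXT)")

# Time bulk inserts of fixed-size rows.
t0 = time.perf_counter()
con.executemany(
    "INSERT INTO big (payload) VALUES (?)",
    (("x" * 200,) for _ in range(ROWS)),
)
con.commit()
insert_s = time.perf_counter() - t0

# Time the DROP TABLE -- the operation that blew up in this thread.
t0 = time.perf_counter()
con.execute("DROP TABLE big")
con.commit()
drop_s = time.perf_counter() - t0

con.close()
os.remove("benchmark.db")
print(f"insert: {insert_s:.2f}s  drop: {drop_s:.2f}s")
```

Run the identical script on the bare metal and on the candidate VM and
compare the two sets of numbers.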
That said, even in your current problem, you probably wouldn't have
noticed any real performance hit right until you decided to drop that
49GB table. Even now I'm thinking, you've been using your system for
ages, it's been working great, one sunny afternoon you had to wait half
a day for one maintenance operation... is that really worth changing an
architecture for? Half-days are ten-a-penny.
> I'm not familiar with these issues with virtualisation. The VPS we use
> has OpenVZ at its core (no pun intended). We can see a little on the
> internet about this (well, one article specifically about MySQL and
> OpenVZ) but nothing else.
> We are in the process of evaluating whether to move to a different
> VPS; some use OpenVZ, some use KVM, and internally we use VMware ESXi.
> Very little real DB work is done on the actual metal; it's all
> virtualised.
I am not familiar with usage figures on OpenVZ, but I don't think it
should be worse than any other virtualisation. The problems described by
others will exist for all of them, and if it were particularly worse in
any area, Google would find a zillion results on the problem. The very
scarcity of results tells me it probably works just fine.
> Do you have any pointers to stuff we can read up on? We don't
> understand your comment "SQLite spent that whole time accessing your
> 50GB database file in an apparently random order." and would like to
> try and get more information about it.
Simon (and previously Richard) simply pointed out that the way an
SQLite table stores information is not similar to the way a file system
stores it, and as such it may take hits (especially from cache misses),
because the VM is optimised to handle file access in the way that normal
programs (the other 98% of stuff out there) read files - sequentially.
Random-access reads within a file do happen, but they are not common, so
if you make a VM and have to choose which model to cater for, SQLite's
model never wins. It's usually not horrible either, but a 49GB SQLite
table drop will bring out the worst in every VM - that's the only point.
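The sequential-versus-random distinction can be sketched in a few lines
of Python - this is only an illustration, not a faithful model of what
SQLite does internally, and the demo file here is tiny enough to sit
entirely in the OS page cache, so the gap between the two patterns will
be far smaller than on a 49GB database on a VM:

```python
import os
import random
import time

def read_blocks(path, offsets, block=4096):
    """Read `block` bytes at each offset; return total bytes read."""
    total = 0
    with open(path, "rb") as f:
        for off in offsets:
            f.seek(off)
            total += len(f.read(block))
    return total

# Build a small demo file of NBLOCKS page-sized blocks.
path = "demo.bin"
BLOCK, NBLOCKS = 4096, 256
with open(path, "wb") as f:
    f.write(os.urandom(BLOCK * NBLOCKS))

# Same set of reads, two orders: in file order, and shuffled.
sequential = [i * BLOCK for i in range(NBLOCKS)]
shuffled = sequential[:]
random.shuffle(shuffled)

t0 = time.perf_counter()
seq_bytes = read_blocks(path, sequential, BLOCK)
t1 = time.perf_counter()
rnd_bytes = read_blocks(path, shuffled, BLOCK)
t2 = time.perf_counter()

print(f"sequential: {t1 - t0:.5f}s  random: {t2 - t1:.5f}s")
os.remove(path)
```

Both runs read exactly the same bytes; only the order differs - and on
spinning disks, or through a virtualisation layer tuned for sequential
readers, that order is what the readahead and caching machinery bets on.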
> We have the option of moving off OpenVZ to KVM or ESXi, so if we can
> understand the issue we can make a more informed choice. Whilst our
> DB has dropped down to 500MB, we still need to do a fair amount of
> testing and checking to make sure there are no unusual edge cases (or
> bugs) before we promote it to live.
Accurate testing will save you every time.
May I ask out of curiosity... what kind of data did you store (maybe
some examples if possible) that you could condense it by ~99% like that?
Did you just throw away stuff? Were fields duplicated? Did you discover
the World's bestest new compression method? Did you simply elect to
store stuff that was unneeded or implied or somehow irrelevant and now
simply don't store it anymore? Do you possess Alien technology?
Cheers,
Ryan
_______________________________________________
sqlite-users mailing list
[email protected]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users