On 2018/07/17 1:20 PM, Rob Willett wrote:

Thanks for this. You make some interesting points about cache hits and misses. Up until now, we hadn't seen a massive performance hit, whilst some database operations didn't work as fast as we would like them to, nothing works as fast we would like.

This is actually a common thread seen in this industry. It's quite normal for engineers or devs to put a system (of which they typically have only one at the start) and then simply assume that whatever speed it is going at is the correct speed, and often only bother to check actual speed deficiencies once they hit some real time barrier (which typically only happens much later) - and this is mostly just fine.

I propose as a test in future, to make a great big DB in whatever format you intend to use, then run it on some fast hardware, and then the typical VM, see the difference and decide if it's horrible or not and need further investigation or not. That said, even in your current problem, you probably wouldn't have noticed any real performance hit right until you decided to drop that 49GB table. Even now I'm thinking, you've been using your system for ages, it's been working great, one sunny afternoon you had to wait half a day for one maintenance operation... is that really worth changing an architecture for? Half-days are ten-a-penny.


I'm not familiar with these issues with virtualisation. The VPI we use has OpenVZ at it's core (no pun intended). We can see a little on the internet about this (well one article specifically about MySQL and OpenVZ) but nothing else.

We are in the process of evaluating whether to move to a different VPS, some use OpenVZ, some use KVM, internally we use VMWare ESXI. Very little real DB work is done on the actual metal, its all virtualised.

I am not familiar with usage figures on OpenVZ, but I don't think it should be worse than any other virtualization. The problems described by others will exist for all of them, and if it is particularly worse in any area, Google should find a zillion results on the problem. The mere scantness of results tells me it probably works just fine.



Do you have any pointers to stuff we can read up on? We don't understand your comment "SQLite spent that whole time accessing your 50GB database file in an apparently random order." and would like to try and get more information about it.

Simon (and previously Richard) simply referred to the way in which an SQLite table stores information is not similar to the way a file system stores it, and as such may take hits (especially in the cache misses) because the VM is optimized to handle file access in the way that normal programs (the other 98% of stuff out there) read files - sequentially. Doing random-access reads in a file happens, but is not common and so if you make a VM and have to choose which model to cater for, SQLIte's model never wins. It's usually not horrible either, but a 49GB sqlite table drop will bring out the worst in every VM - that's the only point.


We have the option of moving off OpenVZ to KVM or ESXI so if we can understand the issue, we can make a more informed choice. Whilst our DB has dropped down to 500MB we still need to do a fair amount of testing and checking to make sure there are no unusual edge cases (or bugs) based before we promote it to live.

Accurate testing will save you every time.
May I ask a curiosity... What kind of data did you store (maybe some examples if possible) that you could condense it by ~99% like that? Did you just throw away stuff? Were fields duplicate? Did you discover the World's bestest new compression method? Did you simply elect to store stuff that were unneeded or implied or somehow irrelevant and now simply don't store it anymore? Do you possess Alien technology?



Cheers,
Ryan

_______________________________________________
sqlite-users mailing list
[email protected]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to