Re: [sqlite] Any ideas on how long it takes to drop a large table...rather an important question now...
Simon,

No, we knew about normalisation; the database is normalised, and that part of the design hasn't changed in years. The reason for the massive reduction in database size is that we worked out how to handle repeating lines of data that change just enough that we thought they needed to be kept. With some small tweaks we could eliminate data that we thought we needed to preserve. We had assumed that we needed to keep everything, but by going back in and looking at what actual data we now had, we worked out we could store a small amount of extra information that captures the state change, and this meant we could dump most of the database. It was something that we hadn't appreciated a few years ago, and the ever-increasing database meant we had to do something about the DB size. It's taken a few months of work, mainly in a dark room with a wet towel on our foreheads, to make the tiny changes necessary to make a big difference.

In hindsight, the fact that SQLite is so easy and fast to use was a slight disadvantage to us: it allowed us to be a bit quick and dirty with designs when we should have thought through some of the issues. However, that's what startups are like ;)

I think all the explanations to date have been helpful, and we appreciate the time taken to answer. We're going to think a little more carefully about how we manage our database on a VM. I'm in the process of moving home, so the only real hardware (of any note) is sitting in storage, and the only testing we can do is either virtualised or on Mac laptops. Neither of which will help us in this instance.

Rob

On 17 Jul 2018, at 13:16, Simon Slavin wrote:

> On 17 Jul 2018, at 1:10pm, R Smith wrote:
>
>> What kind of data did you store (maybe some examples if possible) that you could condense it by ~99% like that?
>
> I think that the OP's organisation discovered the 'relational' part of RDBMS and implemented normalisation.
>
> To Rob Willett: Ryan Smith's explanation is better than mine. Please read his post and ignore my woolly one. I tried three ways to get across the sequential-access vs. random-access point and wasn't really happy with anything I wrote.
>
> Simon.
Re: [sqlite] Any ideas on how long it takes to drop a large table...rather an important question now...
On 17 Jul 2018, at 1:10pm, R Smith wrote:

> What kind of data did you store (maybe some examples if possible) that you could condense it by ~99% like that?

I think that the OP's organisation discovered the 'relational' part of RDBMS and implemented normalisation.

To Rob Willett: Ryan Smith's explanation is better than mine. Please read his post and ignore my woolly one. I tried three ways to get across the sequential-access vs. random-access point and wasn't really happy with anything I wrote.

Simon.
Re: [sqlite] Any ideas on how long it takes to drop a large table...rather an important question now...
On 2018/07/17 1:20 PM, Rob Willett wrote:

> Thanks for this. You make some interesting points about cache hits and misses. Up until now, we hadn't seen a massive performance hit; whilst some database operations didn't work as fast as we would like them to, nothing works as fast as we would like.

This is actually a common thread seen in this industry. It's quite normal for engineers or devs to put up a system (of which they typically have only one at the start) and then simply assume that whatever speed it is going at is the correct speed, and often only bother to check actual speed deficiencies once they hit some real-time barrier (which typically only happens much later) - and this is mostly just fine.

I propose as a test in future: make a great big DB in whatever format you intend to use, run it on some fast hardware and then on the typical VM, see the difference, and decide whether it's horrible or not and needs further investigation or not.

That said, even in your current problem, you probably wouldn't have noticed any real performance hit right until you decided to drop that 49GB table. Even now I'm thinking: you've been using your system for ages, it's been working great, and one sunny afternoon you had to wait half a day for one maintenance operation... is that really worth changing an architecture for? Half-days are ten-a-penny.

> I'm not familiar with these issues with virtualisation. The VPS we use has OpenVZ at its core (no pun intended). We can see a little on the internet about this (well, one article specifically about MySQL and OpenVZ) but nothing else. We are in the process of evaluating whether to move to a different VPS; some use OpenVZ, some use KVM, and internally we use VMware ESXi. Very little real DB work is done on the actual metal; it's all virtualised.

I am not familiar with usage figures on OpenVZ, but I don't think it should be worse than any other virtualisation. The problems described by others will exist for all of them, and if it were particularly worse in any area, Google should find a zillion results on the problem. The mere scantness of results tells me it probably works just fine.

> Do you have any pointers to stuff we can read up on? We don't understand your comment "SQLite spent that whole time accessing your 50GB database file in an apparently random order." and would like to try and get more information about it.

Simon (and previously Richard) simply referred to the fact that the way an SQLite table stores information is not similar to the way a file system stores it, and as such it may take hits (especially cache misses) because the VM is optimised to handle file access in the way that normal programs (the other 98% of stuff out there) read files - sequentially. Doing random-access reads in a file happens, but is not common, so if you make a VM and have to choose which model to cater for, SQLite's model never wins. It's usually not horrible either, but a 49GB SQLite table drop will bring out the worst in every VM - that's the only point.

> We have the option of moving off OpenVZ to KVM or ESXi, so if we can understand the issue, we can make a more informed choice. Whilst our DB has dropped down to 500MB, we still need to do a fair amount of testing and checking to make sure there are no unusual edge cases (or bugs) before we promote it to live.

Accurate testing will save you every time.

May I ask a curiosity... What kind of data did you store (maybe some examples if possible) that you could condense it by ~99% like that? Did you just throw away stuff? Were fields duplicated? Did you discover the world's bestest new compression method? Did you simply elect to store stuff that was unneeded or implied or somehow irrelevant and now simply don't store it anymore? Do you possess Alien technology?

Cheers,
Ryan
Re: [sqlite] Any ideas on how long it takes to drop a large table...rather an important question now...
On 17 Jul 2018, at 12:20pm, Rob Willett wrote:

> Do you have any pointers to stuff we can read up on? We don't understand your comment "SQLite spent that whole time accessing your 50GB database file in an apparently random order." and would like to try and get more information about it.

I don't know much about specific virtualisation packages and have no recommendation. I was just comparing virtual machines in general with operations on "actual metal", as you put it.

SQLite database files are split up into pages. Each page contains data for exactly one table or index. So even just a table with a primary index requires two different kinds of pages: one for the row of data, one for the corresponding entry in the primary key index. And making changes to them both requires changes at other locations: the places where the page indexes are stored.

Worse still, unless all your indexes are monotonic (with values in those indexes only ever increasing), anything progressing through them in order would not be reading through the database file in order. This means you get continuous cache misses: the next piece of data you need is rarely already in the virtual computer's cache, because it's usually in a part of the file far away from the one you just read.

On a real computer, storage is only one motherboard connection and storage driver away from the CPU. On a virtual computer the virtual storage is on a file server far away from the processing, and each time data isn't already in the processor's cache something has to work out where, in the racks of file servers, that page of the file is stored, talk to that file server, and route traffic back and forth to the virtual processor. So file operations are slower and more expensive than they are for real computers. And an operation which generates cache misses for 50GB of data ends up doing more work than it would on a simple "real metal" setup.

Simon.
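[An illustrative aside, not part of Simon's message: one way to see the page scatter he describes is SQLite's dbstat virtual table, which reports one row per database page. It is only available in builds compiled with SQLITE_ENABLE_DBSTAT_VTAB (many sqlite3 shell builds include it); the query below is a minimal sketch under that assumption.]

    -- Minimal sketch (assumes a build with SQLITE_ENABLE_DBSTAT_VTAB):
    -- show how many pages each table/index uses and how widely they are
    -- spread through the file. A wide first_page..last_page range for a
    -- modest page count means an in-order traversal jumps around the file.
    SELECT name,
           count(*)    AS pages,
           min(pageno) AS first_page,
           max(pageno) AS last_page
    FROM dbstat
    GROUP BY name
    ORDER BY pages DESC;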
Re: [sqlite] Any ideas on how long it takes to drop a large table...rather an important question now...
Simon,

Thanks for this. You make some interesting points about cache hits and misses. Up until now, we hadn't seen a massive performance hit; whilst some database operations didn't work as fast as we would like them to, nothing works as fast as we would like.

I'm not familiar with these issues with virtualisation. The VPS we use has OpenVZ at its core (no pun intended). We can see a little on the internet about this (well, one article specifically about MySQL and OpenVZ) but nothing else. We are in the process of evaluating whether to move to a different VPS; some use OpenVZ, some use KVM, and internally we use VMware ESXi. Very little real DB work is done on the actual metal; it's all virtualised.

Do you have any pointers to stuff we can read up on? We don't understand your comment "SQLite spent that whole time accessing your 50GB database file in an apparently random order." and would like to try and get more information about it.

We have the option of moving off OpenVZ to KVM or ESXi, so if we can understand the issue, we can make a more informed choice. Whilst our DB has dropped down to 500MB, we still need to do a fair amount of testing and checking to make sure there are no unusual edge cases (or bugs) before we promote it to live.

Many thanks

Rob

On 17 Jul 2018, at 12:05, Simon Slavin wrote:

> On 17 Jul 2018, at 8:37am, Rob Willett wrote:
>
>> I suspect that part of the issue is the VPS provider we use has a rate limiter on IOPS which is not normally an issue for us, but that might have slowed it down somewhat. However I don't think that it would have slowed it down by hours.
>
> Actually I think the VPS had a lot to do with the time the operation took. Any kind of virtual machine takes a terrible hit during the sort of storage access involved in dropping the table. SQLite spent that whole time accessing your 50GB database file in an apparently random order. So you had nine hours of cache misses, causing the virtual machine to continually write virtual pages back to real storage and read other pages into memory. Virtual systems are optimized for cache hits, not cache misses. I can't prove it without a lot of pointless data manipulation on your type of VPS, but I think you found its least optimal operation.
>
> The good part is that now your database is less than 1GB long, you're going to see a massive increase in speed, since the whole database may well fit in the cache of your virtual machine.
>
> Must remember in future, when people report unusually slow operations, to ask whether they're using a virtual machine or real hardware.
>
> Simon.
Re: [sqlite] Any ideas on how long it takes to drop a large table...rather an important question now...
On 17 Jul 2018, at 8:37am, Rob Willett wrote:

> I suspect that part of the issue is the VPS provider we use has a rate limiter on IOPS which is not normally an issue for us, but that might have slowed it down somewhat. However I don't think that it would have slowed it down by hours.

Actually I think the VPS had a lot to do with the time the operation took. Any kind of virtual machine takes a terrible hit during the sort of storage access involved in dropping the table. SQLite spent that whole time accessing your 50GB database file in an apparently random order. So you had nine hours of cache misses, causing the virtual machine to continually write virtual pages back to real storage and read other pages into memory. Virtual systems are optimized for cache hits, not cache misses. I can't prove it without a lot of pointless data manipulation on your type of VPS, but I think you found its least optimal operation.

The good part is that now your database is less than 1GB long, you're going to see a massive increase in speed, since the whole database may well fit in the cache of your virtual machine.

Must remember in future, when people report unusually slow operations, to ask whether they're using a virtual machine or real hardware.

Simon.
Re: [sqlite] Any ideas on how long it takes to drop a large table...rather an important question now...
The top table finally finished around 07:00 this morning. It took 9.5 hours to drop the circa 190M rows. I suspect that part of the issue is that the VPS provider we use has a rate limiter on IOPS, which is not normally an issue for us, but that might have slowed it down somewhat. However, I don't think that it would have slowed it down by hours.

Lessons to be learnt here:

1. Design your database correctly at the start. This was our biggest problem: we failed to understand how large a single table would grow, and we didn't manage the administration correctly and in good time.

2. Think carefully about the data you actually need. We managed to throw away 99% of our data and combine what was left to get a 50GB database down to 500MB.

3. Dropping a table may not be the fastest method; creating all the other, smaller tables in a separate database would have been far quicker. There's always more than one way to solve a problem.

4. Turn off secure_delete if it's on.

5. Use the mailing group for advice. It's brilliant!

6. Assess, plan, act. That's a diving phrase, but very pertinent here: assess what the problem actually is before you plan and execute.

Now we have other checks to do to assure us the database is accurate for our needs.

Thanks to all the people who offered advice and help.

Rob

On 17 Jul 2018, at 7:02, Rob Willett wrote:

> Richard,
>
> Thanks for the comprehensive update. We appreciate that there are trade-offs and that dropping a table is a rarer operation than most others. The upside of the SQLite design is that we treat a database as a single file which we can copy around with ease. You cannot do that with many other databases; this has a downside, though, when you need to drop a table.
>
> A less charitable person would have stated that it was our own fault for designing the database and using it with approx 190M records in a single table with little thought about how we would manage it. We now know this was a dumb idea; however, you live and learn.
>
> The impact of this is simply time. We've worked out how to shrink the database from 50GB to approx 1GB; after this we can prune the database on a weekly basis so this never becomes an issue.
>
> I suppose the key question is: would we give up the portability of SQLite for a fast table drop? The answer is No.
>
> Rob
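[An illustrative aside: Rob doesn't show the weekly prune he mentions. A minimal sketch of the idea is below, with invented names - a "disruptions" table with a "recorded_at" timestamp, neither of which is from the thread.]

    -- Hedged sketch of a weekly prune; table and column names are invented.
    -- Deleting a week at a time keeps the -wal file bounded by one week's
    -- worth of rows instead of the whole table.
    BEGIN;
    DELETE FROM disruptions WHERE recorded_at < datetime('now', '-7 days');
    COMMIT;
    -- Optionally reclaim the freed pages afterwards (rewrites the whole
    -- file, so it needs roughly the database's size in free disk space):
    -- VACUUM;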
Re: [sqlite] Any ideas on how long it takes to drop a large table...rather an important question now...
Richard,

Thanks for the comprehensive update. We appreciate that there are trade-offs and that dropping a table is a rarer operation than most others. The upside of the SQLite design is that we treat a database as a single file which we can copy around with ease. You cannot do that with many other databases; this has a downside, though, when you need to drop a table.

A less charitable person would have stated that it was our own fault for designing the database and using it with approx 190M records in a single table with little thought about how we would manage it. We now know this was a dumb idea; however, you live and learn.

The impact of this is simply time. We've worked out how to shrink the database from 50GB to approx 1GB; after this we can prune the database on a weekly basis so this never becomes an issue.

I suppose the key question is: would we give up the portability of SQLite for a fast table drop? The answer is No.

Rob

On 16 Jul 2018, at 22:59, Richard Hipp wrote:

> On 7/16/18, Rob Willett wrote:
>
>> It does look as if one of sqlite's weaknesses is dropping very, very large tables.
>
> Right. If every table were stored in a separate file, a DROP TABLE could be translated to a relatively fast unlink(). But then a database would be a directory full of files rather than a single file, which would undermine a lot of the usefulness of SQLite. Furthermore, DROP TABLE is an uncommon operation. We prefer to provide a database where all content is contained in a single file and that is optimized for SELECTs and for INSERTs and DELETEs of a subset of the rows in the table, as that seems to be far more useful in most cases. There are always engineering trade-offs. SQLite provides single-file databases and fast queries in exchange for slower DROP TABLEs.
>
> A filesystem is able to implement unlink() quickly because it has the entire partition available for laying out the locations of files. Space can be allocated to a file in large chunks, which makes deallocation faster. In other words, in a filesystem the files can be positioned sparsely on disk, with lots of unused space in between the various files to accommodate growth. But in SQLite, the goal is to keep the database file size as small as possible, and to not have unnecessary unused pages in the middle of the database. Hence, space for tables must be allocated in relatively small 1-page chunks, which means that there will be a large number of chunks to deallocate when dropping a large table.
>
> If you were to construct a filesystem that tried to keep all file content tightly packed at the beginning of a partition (say, for example, to make shrinking of a partition faster) then unlink() would necessarily be slower on that filesystem. That seems like a bad engineering trade-off for a filesystem, but it is (for most applications) a good trade-off for a database such as SQLite.
>
> You can work around this. If you have one or more tables that you think might need to be DROP-ed frequently, then consider storing them in separate database files and ATTACH-ing those separate database files to your main database. Then, to drop those tables, you can DETACH them and then unlink() the corresponding database file. That should go much faster.
>
> --
> D. Richard Hipp
> d...@sqlite.org
Re: [sqlite] Any ideas on how long it takes to drop a large table...rather an important question now...
Jay,

I think your approach would have been quicker. Ten hours so far and it's still deleting the table :(

Rob

On 17 Jul 2018, at 2:16, Jay Kreibich wrote:

> On Jul 16, 2018, at 3:01 PM, rob.sql...@robertwillett.com wrote:
>
>> Hi,
>>
>> We're doing a massive tidy on our database, which is approx 50GB.
>>
>> One table is approx 49GB of that 50GB, which we need to delete as we have recorded the information in a far better format that's approx 99% more efficient. If only we had been this clever when we started….
>
> Depending on the complexity of the schema, it sounds like it might be easier to just dump the other tables, re-create the database without this jumbo table, and re-import the other data.
>
> -j
Re: [sqlite] Any ideas on how long it takes to drop a large table...rather an important question now...
> On Jul 16, 2018, at 3:01 PM, rob.sql...@robertwillett.com wrote:
>
> Hi,
>
> We're doing a massive tidy on our database, which is approx 50GB.
>
> One table is approx 49GB of that 50GB, which we need to delete as we have recorded the information in a far better format that's approx 99% more efficient. If only we had been this clever when we started….

Depending on the complexity of the schema, it sounds like it might be easier to just dump the other tables, re-create the database without this jumbo table, and re-import the other data.

-j
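[An illustrative aside: Jay doesn't spell out the steps. One plausible sequence with the sqlite3 shell is sketched below; "big.db", "fresh.db" and the table names are invented.]

    # Hedged sketch of the dump-and-rebuild approach.
    # 1. Dump only the tables worth keeping (schema + data as SQL text).
    sqlite3 big.db ".dump keep_this keep_that" > keep.sql
    # 2. Build a fresh database from that SQL; the jumbo table never exists here.
    sqlite3 fresh.db < keep.sql
    # 3. Swap the new file into place once it has been verified.
    mv fresh.db big.db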
Re: [sqlite] Any ideas on how long it takes to drop a large table...rather an important question now...
On 7/16/18, Rob Willett wrote:

> It does look as if one of sqlite's weaknesses is dropping very, very large tables.

Right. If every table were stored in a separate file, a DROP TABLE could be translated to a relatively fast unlink(). But then a database would be a directory full of files rather than a single file, which would undermine a lot of the usefulness of SQLite. Furthermore, DROP TABLE is an uncommon operation. We prefer to provide a database where all content is contained in a single file and that is optimized for SELECTs and for INSERTs and DELETEs of a subset of the rows in the table, as that seems to be far more useful in most cases. There are always engineering trade-offs. SQLite provides single-file databases and fast queries in exchange for slower DROP TABLEs.

A filesystem is able to implement unlink() quickly because it has the entire partition available for laying out the locations of files. Space can be allocated to a file in large chunks, which makes deallocation faster. In other words, in a filesystem the files can be positioned sparsely on disk, with lots of unused space in between the various files to accommodate growth. But in SQLite, the goal is to keep the database file size as small as possible, and to not have unnecessary unused pages in the middle of the database. Hence, space for tables must be allocated in relatively small 1-page chunks, which means that there will be a large number of chunks to deallocate when dropping a large table.

If you were to construct a filesystem that tried to keep all file content tightly packed at the beginning of a partition (say, for example, to make shrinking of a partition faster) then unlink() would necessarily be slower on that filesystem. That seems like a bad engineering trade-off for a filesystem, but it is (for most applications) a good trade-off for a database such as SQLite.

You can work around this. If you have one or more tables that you think might need to be DROP-ed frequently, then consider storing them in separate database files and ATTACH-ing those separate database files to your main database. Then, to drop those tables, you can DETACH them and then unlink() the corresponding database file. That should go much faster.

--
D. Richard Hipp
d...@sqlite.org
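[An illustrative aside: a minimal sketch of the workaround Richard describes, with invented names ("bulk.db" and "bulk_log").]

    -- Keep the droppable table in its own file, attached to the main DB.
    ATTACH DATABASE 'bulk.db' AS bulk;
    CREATE TABLE IF NOT EXISTS bulk.bulk_log(ts INTEGER, payload TEXT);
    -- ... bulk.bulk_log can be queried and joined like any other table ...
    -- Later, "dropping" the table is a DETACH plus an OS-level delete:
    DETACH DATABASE bulk;
    -- then, from a shell: rm bulk.db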
Re: [sqlite] Any ideas on how long it takes to drop a large table...rather an important question now...
Mmm. It's still taking an awful long time, though the -wal file is very small. It does look as if one of SQLite's weaknesses is dropping very, very large tables.

Oh well, let's let it run through the night.

Rob

On 16 Jul 2018, at 21:25, Rob Willett wrote:

> Dan,
>
> We've killed the process (kill -9). Fired up sqlite3 again, closed it down normally. The -wal files were removed. Fired up sqlite3 again, turned off secure_delete, started to drop the table again, and reniced it down (long story to do with IOPS and our VPS provider).
>
> The -wal file is empty and the -shm file is a steady 32768 bytes. No idea if anything is actually happening now, but at least we don't have a massive -wal file. Suspect this could take the night (it's 21:24 in London), so we'll detach the screen session and come back later.
>
> Many thanks for the help.
>
> Rob
>
> On 16 Jul 2018, at 21:17, Dan Kennedy wrote:
>
>> That sounds fine. Without secure-delete, the wal file should be pretty small.
>>
>> You can always safely kill a process in the middle of an SQLite transaction, so long as you don't do anything foolhardy like deleting wal or journal files afterwards.
>>
>> Dan.
Re: [sqlite] Any ideas on how long it takes to drop a large table...rather an important question now...
Dan,

We've killed the process (kill -9). Fired up sqlite3 again, closed it down normally. The -wal files were removed. Fired up sqlite3 again, turned off secure_delete, started to drop the table again, and reniced it down (long story to do with IOPS and our VPS provider).

The -wal file is empty and the -shm file is a steady 32768 bytes. No idea if anything is actually happening now, but at least we don't have a massive -wal file. Suspect this could take the night (it's 21:24 in London), so we'll detach the screen session and come back later.

Many thanks for the help.

Rob

On 16 Jul 2018, at 21:17, Dan Kennedy wrote:

> On 07/17/2018 03:12 AM, Rob Willett wrote:
>
>> Dan,
>>
>> Thanks for the reply.
>>
>> sqlite> pragma secure_delete;
>> 1
>> sqlite>
>>
>> Which is a bit of a surprise as we have never seen it before. We had to look it up using your reference. No idea why that is set (if 1 means it is). Should we simply ctrl-C the deletion, turn secure_delete off and then try again? My working assumption is that since there is a -wal file we are safe to do this.
>
> That sounds fine. Without secure-delete, the wal file should be pretty small.
>
> You can always safely kill a process in the middle of an SQLite transaction, so long as you don't do anything foolhardy like deleting wal or journal files afterwards.
>
> Dan.
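[An illustrative aside: the thread never shows the renice invocation. A hedged sketch of lowering a long-running sqlite3 shell's CPU and I/O priority on Linux might be:]

    # Assumes a Linux box with util-linux; the PID lookup is illustrative.
    renice -n 19 -p "$(pgrep -x sqlite3)"   # lowest CPU priority
    ionice -c 3 -p "$(pgrep -x sqlite3)"    # idle I/O class, where the scheduler supports it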
Re: [sqlite] Any ideas on how long it takes to drop a large table...rather an important question now...
On 07/17/2018 03:12 AM, Rob Willett wrote:

> Dan,
>
> Thanks for the reply.
>
> sqlite> pragma secure_delete;
> 1
> sqlite>
>
> Which is a bit of a surprise as we have never seen it before. We had to look it up using your reference. No idea why that is set (if 1 means it is). Should we simply ctrl-C the deletion, turn secure_delete off and then try again? My working assumption is that since there is a -wal file we are safe to do this.

That sounds fine. Without secure-delete, the wal file should be pretty small.

You can always safely kill a process in the middle of an SQLite transaction, so long as you don't do anything foolhardy like deleting wal or journal files afterwards.

Dan.
Re: [sqlite] Any ideas on how long it takes to drop a large table...rather an important question now...
Dan,

Thanks for the reply.

sqlite> pragma secure_delete;
1
sqlite>

Which is a bit of a surprise as we have never seen it before. We had to look it up using your reference. No idea why that is set (if 1 means it is).

Should we simply ctrl-C the deletion, turn secure_delete off and then try again? My working assumption is that since there is a -wal file we are safe to do this.

Rob

On 16 Jul 2018, at 21:07, Dan Kennedy wrote:

> On 07/17/2018 03:01 AM, rob.sql...@robertwillett.com wrote:
>
>> Hi,
>>
>> We're doing a massive tidy on our database, which is approx 50GB.
>>
>> One table is approx 49GB of that 50GB, which we need to delete as we have recorded the information in a far better format that's approx 99% more efficient. If only we had been this clever when we started.
>>
>> We've just 'dropped' the table and were assuming that dropping the table would be quite quick. It's not. So far we've been waiting for 30 mins and nothing has come back yet. We can see that the -wal file is up to 2.5GB. We have this terrible feeling that it'll need to get to 49GB or so before the table gets dropped. We can just about handle that in the current filesystem.
>>
>> We're now getting nervous about dropping this table. We had assumed that it would be a really quick and easy operation, based on absolutely no checking whatsoever. When we looked online, all we could see was a reference to a very, very old and outdated page (https://sqlite.org/speed.html) which talks about speed, and at the bottom of that page the comment:
>>
>> "SQLite is slower than the other databases when it comes to dropping tables. This probably is because when SQLite drops a table, it has to go through and erase the records in the database file that deal with that table. MySQL and PostgreSQL, on the other hand, use separate files to represent each table so they can drop a table simply by deleting a file, which is much faster. On the other hand, dropping tables is not a very common operation so if SQLite takes a little longer, that is not seen as a big problem."
>>
>> Is this still the case? Is it going to take a long time? If we assume that the table is 49GB, then will we need to wait until the -wal file is at 49GB? By our estimates that's approximately 10 hours away.
>>
>> Any help, ideas or suggestions welcomed, but please be quick.
>
> Is secure-delete turned on?
>
> https://www.sqlite.org/pragma.html#pragma_secure_delete
>
> Dan.
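[An illustrative aside: the check-and-retry sequence being discussed, as a minimal sqlite3-shell sketch; the table name is invented.]

    -- secure_delete=1 means freed pages are overwritten with zeros, which
    -- is why the drop was writing so much to the -wal file.
    PRAGMA secure_delete;        -- returns 1 if on
    PRAGMA secure_delete = OFF;  -- per-connection setting
    DROP TABLE big_table;        -- "big_table" is an invented name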
Re: [sqlite] Any ideas on how long it takes to drop a large table...rather an important question now...
On 07/17/2018 03:01 AM, rob.sql...@robertwillett.com wrote:

> Hi,
>
> We're doing a massive tidy on our database, which is approx 50GB.
>
> One table is approx 49GB of that 50GB, which we need to delete as we have recorded the information in a far better format that's approx 99% more efficient. If only we had been this clever when we started.
>
> We've just 'dropped' the table and were assuming that dropping the table would be quite quick. It's not. So far we've been waiting for 30 mins and nothing has come back yet. We can see that the -wal file is up to 2.5GB. We have this terrible feeling that it'll need to get to 49GB or so before the table gets dropped. We can just about handle that in the current filesystem.
>
> We're now getting nervous about dropping this table. We had assumed that it would be a really quick and easy operation, based on absolutely no checking whatsoever. When we looked online, all we could see was a reference to a very, very old and outdated page (https://sqlite.org/speed.html) which talks about speed, and at the bottom of that page the comment:
>
> "SQLite is slower than the other databases when it comes to dropping tables. This probably is because when SQLite drops a table, it has to go through and erase the records in the database file that deal with that table. MySQL and PostgreSQL, on the other hand, use separate files to represent each table so they can drop a table simply by deleting a file, which is much faster. On the other hand, dropping tables is not a very common operation so if SQLite takes a little longer, that is not seen as a big problem."
>
> Is this still the case? Is it going to take a long time? If we assume that the table is 49GB, then will we need to wait until the -wal file is at 49GB? By our estimates that's approximately 10 hours away.
>
> Any help, ideas or suggestions welcomed, but please be quick.

Is secure-delete turned on?

https://www.sqlite.org/pragma.html#pragma_secure_delete

Dan.