Marcus - thanks. I will experiment with those two PRAGMAs (a sample invocation is sketched at the end of this message). Meanwhile - I was trying to update some columns in a table with 130 million rows and got this error -
"Error: out of memory" I am not sure why. I thought the "UPDATE" just writes out new rows and maybe a journal of the old rows for recoverability - but I am not sure why it would need very much memory. Be that as it may - and with regard to your suggestion - and in light of this error message - given that I only have 4GB of RAM on my PC - is this really enough RAM to handle tables of this size ? Or am I giving Sqlite an unreasonably small amount of resources and it's time for a serious hardware upgrade? Thanks, Udi On Thu, Feb 2, 2012 at 10:03 PM, Marcus Grimm <mgr...@medcom-online.de>wrote: > > Given how clever and compelling Sqlite is - I am testing how it scales to > > tables in the 100GB / 200 million row range. This is for a strictly "read > > only" application - but first the tables must be populated in a one time > > process. As is often the case with Big Data - the data is a little dirty > - > > so the process involves importing - selecting - counting - inspecting - > > updating some rows - deleting some rows - selecting - counting - > > inspecting, etc. until clean. > > > > Placing the Sqlite database on a traditional C: drive - IO was too slow. > > At > > 15 MB/sec - reading a 50GB table would take an hour. So I moved it to > > external Raid array where I ran across an interesting find. IO wasn't > that > > much faster - until I vaccuumed the database - which increase IO 10X to > > 150 > > MB/sec - with the same CPU utilization. > > > > This is good news for the final implementation of this read-only database > > - > > but still a dilemma at the data load phase. After a ".vaccuum" - issueing > > a > > single DML against a table - even a DELETE which deletes no rows at all - > > causes IO to drop back down to 15 MB/sec - on the table I'm selecting / > > DMLing - which makes the data loading / cleansing phase very long. > > > > So I have 2 questions - > > > > (1) Why would simple DML cause such an extreme slowdown as compared with > > "post vaccuum" speeds ? > > > > (2) Any knobs to turn to try and maintain the higher speeds post DML - > > without resorting to ".vaccuum" ? > > > You didn't tell if you already set the usual tricks to speed up > your load phase. That would be to increase the page cache and try > with reduced syncs. > See > PRAGMA cache_size > PRAGMA synchronous > > In particular the page cache should be increased dramatically > for huge DB files. > > Marcus > > > > > Thanks, > > > > Udi > > _______________________________________________ > > sqlite-users mailing list > > sqlite-users@sqlite.org > > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > > > > > _______________________________________________ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users