> Marcus - thanks. I will experiment with those 2 PRAGMAs.
>
> Meanwhile - I was trying to update some columns in a table with 130 million
> rows and got this error -
>
> "Error: out of memory"
Did you get that after you increased the cache? Strange, I never saw that
during my huge-DB tests. But I'm not sure what happens if you specify a
cache that would exceed the hardware (RAM) capabilities. Interesting
question, though.

Be careful: the cache pragma counts pages, not bytes, so with the default
1K page size the number you give is effectively in 1K blocks (see the
first sketch at the end of this mail).

> I am not sure why. I thought the "UPDATE" just writes out new rows and
> maybe a journal of the old rows for recoverability - but I am not sure
> why it would need very much memory.

It may depend on the update, for example whether it also has to load and
update indexes (see the second sketch at the end of this mail).

> Be that as it may - and with regard to your suggestion - and in light of
> this error message - given that I only have 4GB of RAM on my PC - is this
> really enough RAM to handle tables of this size? Or am I giving Sqlite an
> unreasonably small amount of resources and it's time for a serious
> hardware upgrade?

I don't think you will be able to specify more than 2-3 GB of cache,
depending on your OS, unless you compile a 64-bit version of sqlite
(I never did that). However, a 100 GB sqlite DB file as such should not
be a problem, apart from the slow-down you reported.

Marcus

> Thanks,
>
> Udi
>
> On Thu, Feb 2, 2012 at 10:03 PM, Marcus Grimm <mgr...@medcom-online.de> wrote:
>
>> > Given how clever and compelling Sqlite is - I am testing how it scales
>> > to tables in the 100GB / 200 million row range. This is for a strictly
>> > "read only" application - but first the tables must be populated in a
>> > one time process. As is often the case with Big Data - the data is a
>> > little dirty - so the process involves importing - selecting -
>> > counting - inspecting - updating some rows - deleting some rows -
>> > selecting - counting - inspecting, etc. until clean.
>> >
>> > Placing the Sqlite database on a traditional C: drive - IO was too
>> > slow. At 15 MB/sec - reading a 50GB table would take an hour. So I
>> > moved it to an external Raid array where I ran across an interesting
>> > find. IO wasn't that much faster - until I vacuumed the database -
>> > which increased IO 10X to 150 MB/sec - with the same CPU utilization.
>> >
>> > This is good news for the final implementation of this read-only
>> > database - but still a dilemma at the data load phase. After a
>> > ".vacuum" - issuing a single DML against a table - even a DELETE which
>> > deletes no rows at all - causes IO to drop back down to 15 MB/sec - on
>> > the table I'm selecting / DMLing - which makes the data loading /
>> > cleansing phase very long.
>> >
>> > So I have 2 questions -
>> >
>> > (1) Why would simple DML cause such an extreme slowdown as compared
>> > with "post vacuum" speeds?
>> >
>> > (2) Any knobs to turn to try and maintain the higher speeds post DML -
>> > without resorting to ".vacuum"?
>>
>> You didn't say whether you have already applied the usual tricks to
>> speed up your load phase. That would be to increase the page cache and
>> try with reduced syncs. See
>>   PRAGMA cache_size
>>   PRAGMA synchronous
>>
>> In particular the page cache should be increased dramatically
>> for huge DB files.
>>
>> Marcus
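P.S. Since the cache pragma confuses many people, here is a minimal
sketch of what I mean by "increase the page cache dramatically" during
the load phase. The numbers are only assumptions (default 1K page size,
roughly 1 GB of cache); check your own page_size and adjust:

    -- cache_size is counted in pages, so check the page size first
    PRAGMA page_size;             -- typically returns 1024

    -- with 1K pages, roughly 1 GB of cache is about 1,000,000 pages
    PRAGMA cache_size = 1000000;

    -- fewer syncs during the one-time load; restore it afterwards
    PRAGMA synchronous = OFF;

    -- ... run the bulk import / cleanup here ...

    PRAGMA synchronous = FULL;

Keep in mind that with synchronous = OFF a crash or power loss during
the load can corrupt the file, so only use it while you can still
rebuild everything from the source data.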
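P.P.S. Regarding the out-of-memory UPDATE: I have not tried this on 130
million rows myself, but if a single statement has to touch most of the
table (and its indexes) it may help to update in slices, so that each
transaction stays small. A rough sketch - the table and column names
(bigtable, col1) are made up, use your own:

    BEGIN;
    UPDATE bigtable SET col1 = trim(col1)
     WHERE rowid BETWEEN 1 AND 10000000;
    COMMIT;

    BEGIN;
    UPDATE bigtable SET col1 = trim(col1)
     WHERE rowid BETWEEN 10000001 AND 20000000;
    COMMIT;

    -- ... and so on, until the highest rowid is covered.

Whether that actually avoids the error I cannot say, but it is cheap to
try.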
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users