> Given how clever and compelling Sqlite is - I am testing how it scales to
> tables in the 100GB / 200 million row range. This is for a strictly "read
> only" application - but first the tables must be populated in a one time
> process. As is often the case with Big Data - the data is a little dirty -
> so the process involves importing - selecting - counting - inspecting -
> updating some rows - deleting some rows - selecting - counting -
> inspecting, etc. until clean.
>
> Placing the Sqlite database on a traditional C: drive - IO was too slow.
> At
> 15 MB/sec - reading a 50GB table would take an hour. So I moved it to
> external Raid array where I ran across an interesting find. IO wasn't that
> much faster - until I vaccuumed the database - which increase IO 10X to
> 150
> MB/sec - with the same CPU utilization.
>
> This is good news for the final implementation of this read-only database
> -
> but still a dilemma at the data load phase. After a ".vaccuum" - issueing
> a
> single DML against a table - even a DELETE which deletes no rows at all -
> causes IO to drop back down to 15 MB/sec - on the table I'm selecting /
> DMLing - which makes the data loading / cleansing phase very long.
>
> So I have 2 questions -
>
> (1) Why would simple DML cause such an extreme slowdown as compared with
> "post vaccuum" speeds ?
>
> (2) Any knobs to turn to try and maintain the higher speeds post DML -
> without resorting to ".vaccuum" ?


You didn't tell if you already set the usual tricks to speed up
your load phase. That would be to increase the page cache and try
with reduced syncs.
See
PRAGMA cache_size
PRAGMA synchronous

In particular the page cache should be increased dramatically
for huge DB files.

Marcus

>
> Thanks,
>
> Udi
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>


_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to