> Marcus - thanks. I will experiment with those 2 PRAGMAs.
>
> Meanwhile - I was trying to update some columns in a table with
> 130 million rows and got this error:
>
> "Error: out of memory"

Did you get that after you increased the cache?

Strange, I never got that during my huge-DB tests.
But I'm not sure what happens if you specify a cache
that exceeds the machine's physical RAM.
Interesting question, though.

Be careful: the cache pragma works in units of pages, not bytes -
with the default page size it counts in 1 KB blocks.
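
For example, something like this (the numbers are only illustrative -
check the actual page size of your DB first):

  PRAGMA page_size;             -- e.g. returns 1024 (bytes per page)
  PRAGMA cache_size = 500000;   -- 500000 pages * 1 KB = roughly 500 MB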

>
> I am not sure why. I thought an "UPDATE" just writes out new rows and
> maybe a journal of the old rows for recoverability - so I don't see why
> it would need very much memory.

It may depend on the UPDATE itself - for example whether it has to
load and rewrite indexes on the updated columns.
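
If that is the case, a common trick is to drop the indexes on the
affected columns first, run the bulk UPDATE, and recreate them
afterwards. A sketch (the table and index names are made up):

  DROP INDEX IF EXISTS idx_big_col;            -- hypothetical index
  UPDATE big_table SET col = trim(col);        -- hypothetical bulk update
  CREATE INDEX idx_big_col ON big_table(col);  -- rebuild once, at the end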

>
> Be that as it may - and with regard to your suggestion - and in light
> of this error message - given that I only have 4 GB of RAM on my PC -
> is this really enough RAM to handle tables of this size? Or am I
> giving SQLite an unreasonably small amount of resources, and is it
> time for a serious hardware upgrade?

I don't think you will be able to specify more than 2-3 GB of
cache, depending on your OS, unless you compile a 64-bit version
of SQLite (I never did that).
However, a 100 GB SQLite DB file as such should not be a problem,
apart from the slowdown you reported.
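
If you want to try the upper end anyway, it would be something like
this (again assuming the default 1 KB page size):

  PRAGMA cache_size = 2000000;  -- 2000000 pages * 1 KB = ~2 GB of cache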

Marcus

>
> Thanks,
>
> Udi
>
> On Thu, Feb 2, 2012 at 10:03 PM, Marcus Grimm
> <mgr...@medcom-online.de> wrote:
>
>> > Given how clever and compelling SQLite is - I am testing how it
>> > scales to tables in the 100 GB / 200 million row range. This is for
>> > a strictly "read only" application - but first the tables must be
>> > populated in a one-time process. As is often the case with Big Data -
>> > the data is a little dirty - so the process involves importing -
>> > selecting - counting - inspecting - updating some rows - deleting
>> > some rows - selecting - counting - inspecting, etc. until clean.
>> >
>> > Placing the SQLite database on a traditional C: drive - IO was too
>> > slow. At 15 MB/sec - reading a 50 GB table would take an hour. So I
>> > moved it to an external RAID array, where I ran across an interesting
>> > find: IO wasn't that much faster - until I vacuumed the database -
>> > which increased IO 10X to 150 MB/sec - with the same CPU utilization.
>> >
>> > This is good news for the final implementation of this read-only
>> > database - but still a dilemma at the data load phase. After a
>> > ".vacuum" - issuing a single DML statement against a table - even a
>> > DELETE which deletes no rows at all - causes IO to drop back down to
>> > 15 MB/sec on the table I'm selecting from / DMLing - which makes the
>> > data loading / cleansing phase very long.
>> >
>> > So I have 2 questions -
>> >
>> > (1) Why would simple DML cause such an extreme slowdown as compared
>> > with "post vacuum" speeds?
>> >
>> > (2) Any knobs to turn to try and maintain the higher speeds post-DML -
>> > without resorting to ".vacuum"?
>>
>>
>> You didn't say whether you have already applied the usual tricks to
>> speed up your load phase: increase the page cache and try with
>> reduced syncs.
>> See
>> PRAGMA cache_size
>> PRAGMA synchronous
>>
>> In particular, the page cache should be increased dramatically
>> for huge DB files.
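>>
>> For a bulk load that would be something like this (a sketch - the
>> numbers are only a starting point, and synchronous = OFF risks a
>> corrupt DB if the machine crashes mid-load):
>>
>>   PRAGMA cache_size = 1000000;  -- ~1 GB with the default 1 KB pages
>>   PRAGMA synchronous = OFF;     -- skip fsyncs; use only while loading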
>>
>> Marcus
>>
>> >
>> > Thanks,
>> >
>> > Udi
>

