On Sat, Mar 21, 2009 at 2:07 PM, Nicolas Williams <[email protected]> wrote:
> On Sat, Mar 21, 2009 at 01:55:32PM -0500, P Kishor wrote:
>> On Sat, Mar 21, 2009 at 1:25 PM, Nicolas Williams
>> <[email protected]> wrote:
>> > On Sat, Mar 21, 2009 at 12:14:43PM -0500, P Kishor wrote:
>> >> If I can't improve 33 ms per query, then I can experiment with
>> >
>> > 33 ms per query sounds like you're not caching enough of the
>> > database in memory. What's the cache size? Can you jack it up?
>>
>> hmmm... never thought of the cache size. From the docs...
>>
>> "PRAGMA default_cache_size = Number-of-pages;
>> [...]
>>
>> So, any suggestions on what cache size I should experiment with? And
>> does this have to be set *before* the db is created? From the above
>> description it sounds like I can set the cache_size at any time.
>
> Well, take the number of pages in your DB (for a vacuumed DB: file size
> / page size) and see if setting cache_size == that improves things.
>
> (default_cache_size is stored in the DB so that connections that don't
> specify a cache_size get what default_cache_size says.)
>
>> > Does the entire dataset fit in memory? If so, why isn't it all in
>> > memory? Or, if it is all in memory, what's SQLite3 spending its
>> > time on?
>>
>> How do I take a db on disk and load it all in memory? How is that
>> done? I have never done that before. Seems like I can create a db in
>> memory with
>
> "The online-backup interface can be used to copy content from a disk
> file into an in-memory database or vice versa and it can make a hot
> backup of a live database."
>
> http://sqlite.org/backup.html
>
> But the thing is, you might just set the cache size large enough and
> let it warm up as you go -- the effect should be the same if your DB
> doesn't grow very fast.
>
>> Also, isn't there a 2 GB limit to the amount of RAM that 32-bit
>> processes can address?
>
> Even so, 1 GB of cache is much better than the 2000-page (x 1 KB page
> size == 2 MB) default.
>
> Also, you might consider going 64-bit.
>
>> By the way, even though I have a lot of computing horsepower, I would
>> like to work toward a solution that works reasonably well even
>> without access to a cluster. While no one would expect lightning-fast
>> responses for model runs over millions of cells, it would be nice to
>> cut the time from several hours down to sub-hour levels. But that is
>> a longer road to tread.
>
> Memory is the key for a large DB using SQLite3.
>
> If you're building a distributed application, then SQLite3 is probably
> the wrong tool to use (though you could use SQLite3 with a local copy
> of a DB, if replication is easy because your dataset is mostly
> read-only, say).
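For reference, bumping the cache from Perl/DBI looks something like this
(a minimal sketch, not my actual carbonmodel.pl; 'carbon.db' is a
placeholder file name -- in the end I just picked a fixed number of pages):

    use DBI;

    # open the on-disk database ('carbon.db' is a placeholder name)
    my $dbh = DBI->connect("dbi:SQLite:dbname=carbon.db", "", "",
                           { RaiseError => 1 });

    # pages in the db; for a vacuumed db this is file size / page size
    my ($page_count) = $dbh->selectrow_array("PRAGMA page_count");
    my ($page_size)  = $dbh->selectrow_array("PRAGMA page_size");

    # size the cache to hold the whole database (cache_size is in pages)
    $dbh->do("PRAGMA cache_size = $page_count");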
So, I increased the cache_size to 1048576 but got the same results: 30-odd
SELECTs per second. Then I created an in-memory db and copied all the data
from the on-disk db into memory. I didn't use the backup API; I simply
opened a db connection to an in-memory db, created all the tables, ATTACHed
the on-disk db, did an INSERT ... SELECT * FROM the attached db, and then
built the indexes. Interestingly, the results were the same --

[04:24 PM] ~/Data/carbonmodel$perl carbonmodel.pl
Creating in memory tables... done.
Transferring data to memory... done.
Took: 90 wallclock secs (75.88 usr + 8.44 sys = 84.32 CPU)
Creating indexes... done.
Took: 38 wallclock secs (23.82 usr + 13.36 sys = 37.18 CPU)
Prepare load testing ...
timethis 1000: 33 wallclock secs (30.74 usr + 1.02 sys = 31.76 CPU) @ 31.49/s (n=1000)

Maybe that's what it is then... with the amount of data I fetch per SELECT,
I get the same result whether the db is on disk or in memory.

I will try this with PostgreSQL and report back on what I get.
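In case it is useful to anyone else, the in-memory copy was along these
lines (again just a sketch with made-up table and file names; the real
schema has more tables and indexes):

    use DBI;

    # open an in-memory database
    my $mem = DBI->connect("dbi:SQLite:dbname=:memory:", "", "",
                           { RaiseError => 1, AutoCommit => 1 });

    # recreate the schema in memory ('cells' is a made-up example table)
    $mem->do("CREATE TABLE cells (cell_id INTEGER PRIMARY KEY, carbon REAL)");

    # attach the on-disk db and bulk-copy its rows in one transaction
    $mem->do("ATTACH DATABASE 'carbon.db' AS disk");
    $mem->begin_work;
    $mem->do("INSERT INTO main.cells SELECT * FROM disk.cells");
    $mem->commit;
    $mem->do("DETACH DATABASE disk");

    # build the indexes after the bulk load, not before
    $mem->do("CREATE INDEX ix_cells_carbon ON cells (carbon)");

Doing the copy in a single transaction and creating the indexes afterwards
keeps the load step reasonably fast.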
--
Puneet Kishor http://www.punkish.org/
Nelson Institute for Environmental Studies http://www.nelson.wisc.edu/
Carbon Model http://carbonmodel.org/
Open Source Geospatial Foundation http://www.osgeo.org/
Sent from: Madison WI United States.
