On Sat, Mar 21, 2009 at 2:07 PM, Nicolas Williams <[email protected]> wrote:
> On Sat, Mar 21, 2009 at 01:55:32PM -0500, P Kishor wrote:
>> On Sat, Mar 21, 2009 at 1:25 PM, Nicolas Williams
>> <[email protected]> wrote:
>> > On Sat, Mar 21, 2009 at 12:14:43PM -0500, P Kishor wrote:
>> >> If I can't improve 33 ms per query, then I can experiment with
>> >
>> > 33 ms per query sounds like you're not caching enough of the
>> > database in memory. What's the cache size? Can you jack it up?
>>
>> hmmm... never thought of the cache size. From the docs...
>>
>> "PRAGMA default_cache_size = Number-of-pages;
>> [...]
>>
>> So, any suggestions on what cache size I should experiment with? And
>> does this have to be set *before* the db is created? From the above
>> description it sounds like I can set the cache_size at any time.
>
> Well, take the number of pages in your DB (for a vacuumed DB: file size
> / page size) and see if setting cache_size == that improves things.
>
> (default_cache_size is stored in the DB so that connections that don't
> specify a cache_size get what default_cache_size says.)
>
>> > Does the entire dataset fit in memory? If so, why isn't it all in
>> > memory? Or, if it is all in memory, what's SQLite3 spending its
>> > time on?
>>
>> How do I take a db on disk and load it all in memory? How is that
>> done? I have never done that before. Seems like I can create a db in
>> memory with
>
> "The online-backup interface can be used to copy content from a disk
> file into an in-memory database or vice versa and it can make a hot
> backup of a live database."
>
> http://sqlite.org/backup.html
>
> But the thing is, you might just set the cache size large enough and
> let it warm up as you go -- the effect should be the same if your DB
> doesn't grow very fast.
>
>> Also, isn't there a 2 GB limit to the amount of RAM that 32-bit
>> processes can address?
>
> Even so, 1 GB of cache is much better than the 2000-page (x 1 KB page
> size == 2 MB) default.
>
> Also, you might consider going 64-bit.
>
>> By the way, even though I have a lot of computing horsepower, I would
>> like to work toward a solution that works reasonably well even
>> without access to a cluster. While no one would expect lightning-fast
>> responses for model runs over millions of cells, it would be nice to
>> cut the time from several hours down to sub-hour levels. But that is
>> a longer road to tread.
>
> Memory is the key for a large DB using SQLite3.
>
> If you're building a distributed application, then SQLite3 is probably
> the wrong tool to use (though you could use SQLite3 with a local copy
> of a DB, if replication is easy because your dataset is mostly
> read-only, say).
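For reference, bumping the cache from Perl/DBI looks something like this
(a minimal sketch, not my actual carbonmodel.pl; 'carbon.db' is a
placeholder file name -- in the end I just picked a fixed number of pages):

    use DBI;

    # open the on-disk database ('carbon.db' is a placeholder name)
    my $dbh = DBI->connect("dbi:SQLite:dbname=carbon.db", "", "",
                           { RaiseError => 1 });

    # pages in the db; for a vacuumed db this is file size / page size
    my ($page_count) = $dbh->selectrow_array("PRAGMA page_count");
    my ($page_size)  = $dbh->selectrow_array("PRAGMA page_size");

    # size the cache to hold the whole database (cache_size is in pages)
    $dbh->do("PRAGMA cache_size = $page_count");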
So, I increased the cache_size to 1048576 but got the same results: 30-odd
SELECTs per second. Then I created an in-memory db and copied all the data
from the on-disk db into memory. I didn't use the backup API; I simply
opened a db connection to an in-memory db, created all the tables, ATTACHed
the on-disk db, did an INSERT ... SELECT * FROM the attached db, and then
built the indexes. Interestingly, the results were the same --

[04:24 PM] ~/Data/carbonmodel$perl carbonmodel.pl
Creating in memory tables... done.
Transferring data to memory... done.
Took: 90 wallclock secs (75.88 usr + 8.44 sys = 84.32 CPU)
Creating indexes... done.
Took: 38 wallclock secs (23.82 usr + 13.36 sys = 37.18 CPU)
Prepare load testing ...
timethis 1000: 33 wallclock secs (30.74 usr + 1.02 sys = 31.76 CPU) @ 31.49/s (n=1000)

Maybe that's what it is then... with the amount of data I fetch per SELECT,
I get the same result whether the db is on disk or in memory.

I will try this with PostgreSQL and report back on what I get.
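In case it is useful to anyone else, the in-memory copy was along these
lines (again just a sketch with made-up table and file names; the real
schema has more tables and indexes):

    use DBI;

    # open an in-memory database
    my $mem = DBI->connect("dbi:SQLite:dbname=:memory:", "", "",
                           { RaiseError => 1, AutoCommit => 1 });

    # recreate the schema in memory ('cells' is a made-up example table)
    $mem->do("CREATE TABLE cells (cell_id INTEGER PRIMARY KEY, carbon REAL)");

    # attach the on-disk db and bulk-copy its rows in one transaction
    $mem->do("ATTACH DATABASE 'carbon.db' AS disk");
    $mem->begin_work;
    $mem->do("INSERT INTO main.cells SELECT * FROM disk.cells");
    $mem->commit;
    $mem->do("DETACH DATABASE disk");

    # build the indexes after the bulk load, not before
    $mem->do("CREATE INDEX ix_cells_carbon ON cells (carbon)");

Doing the copy in a single transaction and creating the indexes afterwards
keeps the load step reasonably fast.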
--
Puneet Kishor http://www.punkish.org/
Nelson Institute for Environmental Studies http://www.nelson.wisc.edu/
Carbon Model http://carbonmodel.org/
Open Source Geospatial Foundation http://www.osgeo.org/
Sent from: Madison WI United States.
