On Mon, Apr 14, 2008 at 09:51:40AM +0200, Ralf Junker scratched on
the wall:

> >> * OPTION 1: PRAGMA cache_size = 10000000;
> >>
> >> Advantage: SQLite will use ample memory, but no more than that.
> >>
> >> Disadvantage: Difficulty establishing exact memory requirements
> >> in advance. The help states that "Each page uses about 1.5K of
> >> memory.", but I found this to be wrong. Memory usage obviously
> >> depends on the page size, and my measurements show that there is
> >> an additional small overhead of undocumented size. Is there a
> >> formula to calculate the required memory for a cache_size of x?
>
> I'd be curious if you know an answer to this, too?

  Given that the default page size is 1K, it seems like the statement
  "Each page uses about 1.5K of memory" would account for most of the
  overhead.
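  In other words, a back-of-the-envelope formula is something like
  required_memory ~= cache_size * (page_size + overhead), with the
  overhead around half a K for the default 1K page. Here's a tiny
  sketch of that estimate. The caveat: the page_size/2 overhead term
  is an assumption read off the documented 1K-page figure, and
  whether it actually scales with larger pages is exactly the
  undocumented part, so measure your own build if you need exact
  numbers.

    /* Rough page-cache memory estimate.  The page_size/2 overhead
     * is an assumption taken from the "about 1.5K per 1K page"
     * figure in the docs, not a documented constant. */
    #include <stdio.h>

    static long long cache_bytes(long long cache_pages,
                                 long long page_size)
    {
        long long overhead = page_size / 2;   /* assumed */
        return cache_pages * (page_size + overhead);
    }

    int main(void)
    {
        /* default cache: 2000 pages of 1K => ~3 MB */
        printf("%lld\n", cache_bytes(2000, 1024));

        /* the quoted cache_size = 10000000 at 1K pages => ~15 GB */
        printf("%lld\n", cache_bytes(10000000, 1024));
        return 0;
    }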
> >> * OPTION 2: sqlite3_soft_heap_limit(max_mem_bytes);
> >>
> >> Disadvantage: My tests indicate that SQLite slows down
> >> drastically when it hits the memory limit. Inserts drop from a
> >> few hundred per second to just one or two per second.
> >
> > That is an odd result. How did you test it?
>
> I set up a high cache_size and a lower sqlite3_soft_heap_limit()
> and started inserting blobs.

  Be aware that the default setting for this is "unlimited", so any
  value you set is only going to reduce the amount of available
  memory. I understand you might try to do this to keep the whole
  memory image in RAM, but as long as the page cache is something
  realistic, I'm not sure any additional limits are advisable.

  Also, I'm not sure I would consider this test valid. If the system
  runs up against the soft heap limit, it is going to recover memory
  from the page cache. Setting up a large page cache and a low soft
  heap limit is going to cause the memory allocator and the page
  cache to fight with each other.

  Things may really go south in this specific case, as the first
  pages the recovery system is going to go after are those that
  don't require a sync, i.e. those pages that have been read but not
  written to. That would include most of the internal B-Tree index
  pages that are providing your performance gain for the indexed
  inserts.

===================================================

On Mon, Apr 14, 2008 at 09:38:50AM +0200, Ralf Junker scratched on
the wall:

> Roger Binns wrote:
>
> > Are you using a 32 bit or 64 bit process?
>
> 32, but it does not matter to the problem.

  When you give examples like "PRAGMA cache_size = 10000000;", or
  10M pages, which would take something on the order of 15 GB of
  memory to service with the default 1K page size, a few of us start
  to wonder. Not only is that impossible on a 32-bit system, but if
  your database is only 6 GB, that cache setting is much larger than
  the database itself, which only has about six million pages.

> > Also, is there a requirement to create the database in the
> > filesystem?
>
> Yes.
>
> > If not, you could ensure your swap is sufficiently large (I use
> > a minimum of 16GB on my machines :-) and create in a tmpfs
> > filesystem, and then copy the database to persistent storage
> > when you are done.
>
> The aim is to avoid slow swap memory but use fast RAM only.

  Yes, but that's impossible with a 6 GB database and a 32-bit
  system. Some of it is going to sit on disk. The big advantage of
  using tmpfs is that it is linked directly to the operating system,
  so it will use as much RAM as possible (but never more) and
  require very little tuning.

> > I am curious why you think memory is the bottleneck anyway!
>
> It has often been pointed out on this list that inserts into
> indexed tables (regular or FTS) run faster with a high page cache.
> My own tests second this. A few hundred MB more or less can make a
> difference of more than 100%.

  Given that the default page cache is 2000 pages, or on the order
  of 3 MB, it seems that you're hitting some serious limits. If
  hundreds of megabytes (!) are giving you a return on the order of
  2x, then there is no magic bullet: you aren't going to find a
  setting that suddenly gives you a 10x speedup. You're hitting
  diminishing returns in a serious kind of way.

  Personally, I'd pick a number, like half your RAM size or ~1.5GB*
  (whichever is smaller), set the page cache, and be done with it.
  It sounds like you've already found most of the other PRAGMAs that
  are going to get you something. You might be able to tune the size
  of your INSERT transactions, but if you're around 1000 or so,
  going higher isn't likely to buy you too much.
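  To make that concrete, here's a minimal C sketch of the setup
  being suggested, under stated assumptions rather than a definitive
  recipe: the database name ("test.db"), the table, the blob size,
  the 1K page size, and the ~1.5GB budget are all example values,
  and it reuses the page-plus-overhead guess from above. Note that
  it never calls sqlite3_soft_heap_limit(), leaving that at its
  unlimited default so the allocator and the page cache can't fight.

    #include <stdio.h>
    #include <sqlite3.h>

    #define RAM_BUDGET (1536LL * 1024 * 1024) /* ~1.5 GB cap */
    #define PAGE_SIZE  1024LL                 /* default page size */
    #define BATCH      1000                   /* rows per transaction */

    int main(void)
    {
        sqlite3 *db;
        sqlite3_stmt *ins;
        char sql[64];
        unsigned char blob[4096] = {0};       /* dummy payload */
        int i;

        if (sqlite3_open("test.db", &db) != SQLITE_OK) return 1;

        /* pages = budget / (page + ~50% assumed overhead) */
        long long pages = RAM_BUDGET / (PAGE_SIZE + PAGE_SIZE / 2);
        snprintf(sql, sizeof sql, "PRAGMA cache_size = %lld;", pages);
        sqlite3_exec(db, sql, 0, 0, 0);

        sqlite3_exec(db, "CREATE TABLE IF NOT EXISTS t(b BLOB);",
                     0, 0, 0);
        sqlite3_prepare_v2(db, "INSERT INTO t(b) VALUES(?);",
                           -1, &ins, 0);

        /* group inserts into transactions of ~1000 rows */
        sqlite3_exec(db, "BEGIN;", 0, 0, 0);
        for (i = 0; i < 100000; i++) {
            sqlite3_bind_blob(ins, 1, blob, sizeof blob,
                              SQLITE_STATIC);
            sqlite3_step(ins);
            sqlite3_reset(ins);
            if ((i + 1) % BATCH == 0)
                sqlite3_exec(db, "COMMIT; BEGIN;", 0, 0, 0);
        }
        sqlite3_exec(db, "COMMIT;", 0, 0, 0);

        sqlite3_finalize(ins);
        sqlite3_close(db);
        return 0;
    }

  The COMMIT-then-BEGIN every ~1000 rows is the point: it amortizes
  the per-transaction sync over the whole batch instead of paying it
  on every row, which is where most of the insert-rate difference
  comes from.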
  * This upper limit is OS-dependent. I'd use 1.5GB on Windows and
    older Linux systems, 2.5GB for some of the more modern Linux
    systems, and 3.5GB for Mac OS X.

  Like all things in performance tuning, unless you have a really
  good idea of how the system is working, you're just shooting in
  the dark.

   -j

--
Jay A. Kreibich < J A Y @ K R E I B I.C H >

"'People who live in bamboo houses should not throw pandas.'
Jesus said that." - "The Ninja", www.AskANinja.com,
"Special Delivery 10: Pop!Tech 2006"