On Mon, Apr 14, 2008 at 09:51:40AM +0200, Ralf Junker scratched on
the wall:

> >> * OPTION 1: PRAGMA cache_size = 10000000;
> >>
> >> Advantage: SQLite will use ample memory, but no more than that.
> >>
> >> Disadvantage: Difficulty establishing exact memory requirements
> >> in advance. The help states that "Each page uses about 1.5K of
> >> memory.", but I found this to be wrong. Memory usage obviously
> >> depends on the page size, and my measurements show that there is
> >> an additional small overhead of undocumented size. Is there a
> >> formula to calculate the required memory for a cache_size of x?
>
> I'd be curious if you know an answer to this, too?

  Given that the default page size is 1K, it seems like the statement
  "Each page uses about 1.5K of memory" would account for most of the
  overhead.
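  In other words, a back-of-the-envelope formula is something like
  required_memory ~= cache_size * (page_size + overhead), with the
  overhead around half a K for the default 1K page. Here's a tiny
  sketch of that estimate. The caveat: the page_size/2 overhead term
  is an assumption read off the documented 1K-page figure, and
  whether it actually scales with larger pages is exactly the
  undocumented part, so measure your own build if you need exact
  numbers.

    /* Rough page-cache memory estimate.  The page_size/2 overhead
     * is an assumption taken from the "about 1.5K per 1K page"
     * figure in the docs, not a documented constant. */
    #include <stdio.h>

    static long long cache_bytes(long long cache_pages,
                                 long long page_size)
    {
        long long overhead = page_size / 2;   /* assumed */
        return cache_pages * (page_size + overhead);
    }

    int main(void)
    {
        /* default cache: 2000 pages of 1K => ~3 MB */
        printf("%lld\n", cache_bytes(2000, 1024));

        /* the quoted cache_size = 10000000 at 1K pages => ~15 GB */
        printf("%lld\n", cache_bytes(10000000, 1024));
        return 0;
    }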
> >> * OPTION 2: sqlite3_soft_heap_limit(max_mem_bytes);
> >>
> >> Disadvantage: My tests indicate that SQLite slows down
> >> drastically when it hits the memory limit. Inserts drop from a
> >> few hundred per second to just one or two per second.
> >
> > That is an odd result. How did you test it?
>
> I set up a high cache_size and a lower sqlite3_soft_heap_limit()
> and started inserting blobs.

  Be aware that the default setting for this is "unlimited", so any
  value you set is only going to reduce the amount of available
  memory. I understand you might try to do this to keep the whole
  memory image in RAM, but as long as the page cache is something
  realistic, I'm not sure any additional limits are advisable.

  Also, I'm not sure I would consider this test valid. If the system
  runs up against the soft heap limit, it is going to recover memory
  from the page cache. Setting up a large page cache and a low soft
  heap limit is going to cause the memory allocator and the page
  cache to fight with each other.

  Things may really go south in this specific case, as the first
  pages the recovery system is going to go after are those that
  don't require a sync, i.e. those pages that have been read but not
  written to. That would include most of the internal B-Tree index
  pages that are providing your performance gain for the indexed
  inserts.

===================================================

On Mon, Apr 14, 2008 at 09:38:50AM +0200, Ralf Junker scratched on
the wall:

> Roger Binns wrote:
>
> > Are you using a 32 bit or 64 bit process?
>
> 32, but it does not matter to the problem.

  When you give examples like "PRAGMA cache_size = 10000000;", or
  10M pages, which would take something on the order of 15 GB of
  memory to service with the default 1K page size, a few of us start
  to wonder. Not only is that impossible on a 32-bit system, but if
  your database is only 6 GB, that cache setting is much larger than
  the database itself, which only has about six million pages.

> > Also, is there a requirement to create the database in the
> > filesystem?
>
> Yes.
>
> > If not, you could ensure your swap is sufficiently large (I use
> > a minimum of 16GB on my machines :-) and create in a tmpfs
> > filesystem, and then copy the database to persistent storage
> > when you are done.
>
> The aim is to avoid slow swap memory but use fast RAM only.

  Yes, but that's impossible with a 6 GB database and a 32-bit
  system. Some of it is going to sit on disk. The big advantage of
  using tmpfs is that it is linked directly to the operating system,
  so it will use as much RAM as possible (but never more) and
  require very little tuning.

> > I am curious why you think memory is the bottleneck anyway!
>
> It has often been pointed out on this list that inserts into
> indexed tables (regular or FTS) run faster with a high page cache.
> My own tests second this. A few hundred MB more or less can make a
> difference of more than 100%.

  Given that the default page cache is 2000 pages, or on the order
  of 3 MB, it seems that you're hitting some serious limits. If
  hundreds of megabytes (!) are giving you a return on the order of
  2x, then there is no magic bullet: you aren't going to find a
  setting that suddenly gives you a 10x speedup. You're hitting
  diminishing returns in a serious kind of way.

  Personally, I'd pick a number, like half your RAM size or ~1.5GB*
  (whichever is smaller), set the page cache, and be done with it.
  It sounds like you've already found most of the other PRAGMAs that
  are going to get you something. You might be able to tune the size
  of your INSERT transactions, but if you're around 1000 or so,
  going higher isn't likely to buy you too much.
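  To make that concrete, here's a minimal C sketch of the setup
  being suggested, under stated assumptions rather than a definitive
  recipe: the database name ("test.db"), the table, the blob size,
  the 1K page size, and the ~1.5GB budget are all example values,
  and it reuses the page-plus-overhead guess from above. Note that
  it never calls sqlite3_soft_heap_limit(), leaving that at its
  unlimited default so the allocator and the page cache can't fight.

    #include <stdio.h>
    #include <sqlite3.h>

    #define RAM_BUDGET (1536LL * 1024 * 1024) /* ~1.5 GB cap */
    #define PAGE_SIZE  1024LL                 /* default page size */
    #define BATCH      1000                   /* rows per transaction */

    int main(void)
    {
        sqlite3 *db;
        sqlite3_stmt *ins;
        char sql[64];
        unsigned char blob[4096] = {0};       /* dummy payload */
        int i;

        if (sqlite3_open("test.db", &db) != SQLITE_OK) return 1;

        /* pages = budget / (page + ~50% assumed overhead) */
        long long pages = RAM_BUDGET / (PAGE_SIZE + PAGE_SIZE / 2);
        snprintf(sql, sizeof sql, "PRAGMA cache_size = %lld;", pages);
        sqlite3_exec(db, sql, 0, 0, 0);

        sqlite3_exec(db, "CREATE TABLE IF NOT EXISTS t(b BLOB);",
                     0, 0, 0);
        sqlite3_prepare_v2(db, "INSERT INTO t(b) VALUES(?);",
                           -1, &ins, 0);

        /* group inserts into transactions of ~1000 rows */
        sqlite3_exec(db, "BEGIN;", 0, 0, 0);
        for (i = 0; i < 100000; i++) {
            sqlite3_bind_blob(ins, 1, blob, sizeof blob,
                              SQLITE_STATIC);
            sqlite3_step(ins);
            sqlite3_reset(ins);
            if ((i + 1) % BATCH == 0)
                sqlite3_exec(db, "COMMIT; BEGIN;", 0, 0, 0);
        }
        sqlite3_exec(db, "COMMIT;", 0, 0, 0);

        sqlite3_finalize(ins);
        sqlite3_close(db);
        return 0;
    }

  The COMMIT-then-BEGIN every ~1000 rows is the point: it amortizes
  the per-transaction sync over the whole batch instead of paying it
  on every row, which is where most of the insert-rate difference
  comes from.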
  * This upper limit is OS-dependent. I'd use 1.5GB on Windows and
    older Linux systems, 2.5GB for some of the more modern Linux
    systems, and 3.5GB for Mac OS X.

  Like all things in performance tuning, unless you have a really
  good idea of how the system is working, you're just shooting in
  the dark.

   -j

--
Jay A. Kreibich < J A Y @ K R E I B I.C H >

"'People who live in bamboo houses should not throw pandas.'
Jesus said that." - "The Ninja", www.AskANinja.com,
"Special Delivery 10: Pop!Tech 2006"