On Thu, 9 Mar 2006 [EMAIL PROTECTED] wrote:

>----- Original Message ----
>> From: Christian Smith <[EMAIL PROTECTED]>
>
>> Is this a likely usage scenario? Will your application regularly
>> umount/mount the filesystem between transactions? While sounding
>> facetious, I'm not trying to. Your otherwise excellent example is let down
>> by a probably unreasonable usage pattern.
>
>My example was just that, a test case trying to explain why my actual app
>goes to a crowl.
>The umount/mount is there to insure that I am not benchmarking the OS
>caching abilities.


But SQLite depends on the OS caching abilities for much of it's
performance. Removing it is like saying SQLite is rubbish on Intel
processors after testing on a i486.


>What this does is that it gives a better idea on the performance of the
>caching of sqlite.


The SQLite cache is very restricted. It only caches data over a single
transaction, discarding it on COMMIT. This is because another process may
update the file between transactions.

The file format has a file generation number, which could be used to delay
the discarding of the SQLite cache between transactions, but I believe it
is as yet unimplemented.


>
>> The pages containing nodes closer to the root of the affected btrees will
>> be cached pretty quickly, but the random access pattern and constant
>> random updates will make for very non-localised updates to the DB file,
>> which would require lots of seeks on sync.
>
>> You might be well advised to look at how long it takes to do the updates,
>> then the COMMIT seperately. I'd hazard a guess that you'll be spending the
>> majority of the time in the COMMIT commands. Something like:
>
>All right, so I added the intermediate timing to see where the time is spend.
>
>My results show that on the typical 7 seconds operation (after the
>numbers stabilise) 2 seconds are spend between the BEGIN / END and 5
>seconds to perform the COMMIT. The first transactions show things like:
>
>
>When I saw this, what I tried later on was to play around with
>PRAGMA synchronous =OFF and PRAGMA synchronous = NORMAL
>
>While synchronous = NORMAL didn't show any real improvement on the
>timings compared to FULL full async mode was dramatically faster:
>the random test that was taking 600+ seconds to execute now runs in 172
>seconds (and running the global sync after the test only took a couple
>seconds, meaning most data was already written on disk). I looked at
>timings, and now the time is basically spend between the BEGIN/END, the
>COMMIT is very fast.
>
>I was very excited about this result, so I tried async access on my
>actual app (that holds 8M+ records), but I was disapointed to see that
>async actually even slow things down?!


By much? This surprises me. Perhaps there is another bottleneck in the app
that isn't covered by this test case?

Async access isn't recommended anyway, as it leaves data vulnerable to OS
crashes or power failure.


>
>I also tried to mount the filesystem (reiser) in "journal" mode
>(rw,noatime,data=journal), I had it in "writeback", and things get slower
>with this setting.
>
>So what I tried next was to simply run my app after doing a
>cat generaldb.db > /dev/null
>(this takes about 15 seconds on my system, it's a 600Megs file)
>
>This has the effect of preloading the db file in memory in the OS caches.
>
>After I do this, the app becomes very fast (as in 4 times faster).
>
>note: I noticed that even if I specify some ridiculous number like
>PRAGMA cache_size = 450000 (that should be enough to cache the whole db by the 
>way)
>for the cache, the memory footprint of my app stays pretty low, as if it was 
>not caching much!?
>
>So at this point, I think the caching/polling subsytem seems to be the problem.


See above. The cache lives as long as the transaction.


>
>Oh also, a way to improve performance:
>reads can be reordered, so if sqlite was doing async reads of several
>places in the db file at the same time, the OS/drive would reorder things
>the same way they are reordered when writing for a good boost in
>performance.


This would add complication to SQLite. We depend on the OS cache to shield
us from having to do async IO. It's difficult to share SIGIO, so an app
that also did async IO would have problems with SQLite. In general, read
performance is not a problem in SQLite. The biggest bottleneck is
transaction COMMIT, due to the synchronous nature of the operation.


General observations/questions:
- What did you expect from SQLite? Can you give indication of performance
  expectations?
- What did you use previously? How does SQLite compare on performance?
- Have you tested other databases? What sort of performance did you get
  from those?
- If you're not batching, or can't batch updates, then SQLite may very
  well not be performant enough due to inherent restrictions of it's
  architecture.
- An idea of what your app does in general would be of assistance in
  helping you find your bottleneck. A specific test case is good if this
  is indeed the issue, but it doesn't sound like it is.

>
>Nicolas
>

Christian


-- 
    /"\
    \ /    ASCII RIBBON CAMPAIGN - AGAINST HTML MAIL
     X                           - AGAINST MS ATTACHMENTS
    / \

Reply via email to