[EMAIL PROTECTED] uttered:

I have 300K records in my database and the blob text file (invoices) in a separate database. I query the database over a network or USB stick and the performance is good. However when I moved my database to CD media, the performance dropped drastically. It takes 2 minutes sometime to retreive the result. The CPU is not busy during the time, and the memory usage increases slowly.


Is the cdrom noisy at this point? Making intermittent whirring noises? Some cdroms are noisier than others.



When the user selects a row, the system queries the blob database, and here I am able to retreive the text, convert it into a tiff file, and load it into my program within 1.5sec or less from the CD. Copying the database to the hard drive and querying that resumes with normal performance. Querying the CD version (immeadiately) and the performance lags again. The LED on the CD rom blinking only occasionally, not constantly like when I copy a file.


If the LED only lights when data is being transferred, then what you're seeing is the latency of individual reads. This is not surprising, as cdroms have horrendously slow seek times. When copying regular files, the file itself is likely to be contiguous on the CD, therefore no seeks are required. A SQLite database, on the other hand, has tree structure scattered around the file. Seeks are common and slow. Vacuuming, as DRH suggests, will reduce the seeks as tables will be more contiguous.

The OS should shield you from this. One way to possibly increase performance is to the prime the OS cache by reading in the CD file in it's entirety (just read the raw file). Hopefully, the file should fit in the OS's memory cache, and subsequent SQLite reads can be satisfied from the OS cache, though that may be unfeasable with a database of your size.

You might also want to increase the database page size when creating the database in the first place, as you'll have less actual pages and hence less seeks for a given database size.



Both databases have 300k records, the first(invoice no indexed) has 8 retrieval fields, while the second has just the ID and blob. The average row size is 60-80 bytes, while the blob has on average 500bytes. I am looking to improve the performance, and reasons for the failings. It does sound like cacheing mentioned in the thread ...



Only in that the required data is not in the cache, and has to be bought in from high latency media.


Christian


--
    /"\
    \ /    ASCII RIBBON CAMPAIGN - AGAINST HTML MAIL
     X                           - AGAINST MS ATTACHMENTS
    / \

-----------------------------------------------------------------------------
To unsubscribe, send email to [EMAIL PROTECTED]
-----------------------------------------------------------------------------

Reply via email to