On Thu, 16 Mar 2006, [EMAIL PROTECTED] wrote:

>Sorry it took me some time to get back to this thread.
No problem. I missed your reply anyway :)

>
>----- Original Message ----
>From: Christian Smith <[EMAIL PROTECTED]>
>
>> When your database does not fit in memory, yes, you're right, the OS
>> may well get caching wrong, and in the worst way possible. Two
>> things though:
>> - SQLite does have known scaling issues when using multi-gigabyte
>>   databases.
>> - Memory is cheap. If it doesn't fit, spend the few hundred dollars
>>   that a few days of your time are worth and buy another few gigs
>>   of RAM.
>
>The machine has 2 GB of RAM and the table that seems to cause the
>problems is less than 700 megs (proven by the fact that if I precache
>this database, things get zippy). To me it seems like the problem is
>related to the way the reads are done, but I could be wrong: it seems
>to me that the caches never really get "hot" (and with nothing else
>running on the machine, the OS is pretty much caching all reads done
>by sqlite).

With the whole database in OS cache, reads should not be a problem
unless locked out by a writer (see below)...
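As an aside, if precaching makes things zippy, you could warm the OS
cache explicitly at startup instead of relying on the first minutes
of random queries to do it. A minimal sketch of the idea (plain C,
untested; "users.db" is just a stand-in for your database file):

    #include <stdio.h>

    /* Read the database file sequentially so the OS page cache is
     * already hot before SQLite starts issuing random reads. The
     * data is discarded; only the side effect matters. */
    static int warm_cache(const char *path)
    {
        char buf[64 * 1024];
        FILE *f = fopen(path, "rb");
        if (f == NULL)
            return -1;
        while (fread(buf, 1, sizeof(buf), f) > 0)
            ;
        fclose(f);
        return 0;
    }

    /* e.g. warm_cache("users.db"); before sqlite3_open(). */

A sequential read of 700 megs should take tens of seconds at most,
rather than the minutes of seek-bound warm-up you're describing.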
>
>> >Right now, sqlite shows performance that is on par with a simple
>> >filesystem structure (except it's much faster to backup, because
>> >traversing a multimillion file structure takes several hours). I
>> >was expecting a few things by moving to sqlite:
>> >* getting a better ramp up (cold cache) performance than a dumb
>> >  filesystem structure.
>
>> Nothing will speed up the physical disks.
>
>I agree that disk I/O is the bottleneck, but what can be worked
>around are the latencies needed to seek in the file and the way
>operations are sent out to the disk, to help the OS cache more useful
>information and/or have a better caching mechanism.

As SQLite is designed to be simple and embeddable, advanced IO
scheduling is unlikely to be a feature.

>
>> >* having a [much] higher throughput (so that I can combine
>> >  multiple servers into one), as the records are small and there
>> >  is no fragmentation of the folders here.
>> >* having consistent latencies (filesystems tend to vary a lot).
>> >
>>
>> >> - Have you tested other databases? What sort of performance did
>> >>   you get from those?
>> >I am in the process of setting up mysql with our app, I will keep
>> >you posted on the result.
>>
>> Prepare to be disappointed, IMO. The most important thing when
>> dealing with big fast databases is disk IO, and anything (MySQL,
>> PGSQL, Oracle, SQLite) will find disks a bottleneck.
>
>The reason it took me a while to reply to this thread was that I set
>up a MySQL server (5.0.19, run on the same machine) and adapted my
>application to run with it.
>
>The results on the exact same operations that I did before:
>* importing of the data is faster with MySQL, and performance doesn't
>  degrade exponentially with the number of rows (insertions at the
>  beginning and at the end of the import operation are of similar
>  speed). sqlite and MySQL started at around the same speed, but
>  after a few million inserts, sqlite becomes slow.
>* caches cleared, on my typical test run (read & write combination),
>  MySQL ran 2 minutes faster than sqlite (6 minutes vs 8 minutes),
>  getting nice latencies after about 3 minutes (and that's where the
>  difference is, I think). I think that after 3 minutes, MySQL
>  manages to have the most critical data cached in RAM.
>* with precached dbs (file sizes are very similar for sqlite and
>  MySQL, with the difference that MySQL separates the data and index
>  into 2 files), MySQL is faster too (1m30 vs 2 minutes).
>* latencies stay low (as in, rarely blocks for > 4 seconds; sqlite
>  would block for > 10 seconds, especially for the early queries when
>  there is no cache).
>
>Maybe the reason MySQL performs better is simply because they use 2
>files instead of 1.

This could simply be because SQLite is designed for small(er)
datasets.

>
>It does make sense: if the index/offsets are all together on disk,
>then the OS can very efficiently cache this information for us. Was
>splitting the file into 2 ever tried for sqlite? I mean, there is
>already the -journal file, so why not have a -index file as well?

One philosophy behind SQLite is zero configuration and a single-file
database. Having multiple files would make little difference to
latencies unless the files were on separate media and IO was
scheduled in parallel.

>
>> > To accommodate a lot of users, I have one main DB that holds all
>> > users -> unique ID. The other dbs are a partitioned db really, so
>> > that later threads conflict only when accessing users within the
>> > same range and the dbs don't get out of proportion (because blobs
>> > are used, I thought a lot of records could become a performance
>> > bottleneck). If I precache the big table name -> uid, the import
>> > of 650,000 records takes about 2 minutes.
>> >If I don't precache it, it takes about 8 minutes. I start/stop
>> >transactions every X seconds (right now, I have it set at 10
>> >seconds).
>>
>> Are these name->uid lookups used in a read-only fashion after
>> import? If so, then multiple threads can indeed read the table in
>> parallel, and you should have good performance.
>
>A lot of the time this table is accessed read only, yes. I tried to
>have several threads access the table at the same time before, and
>the performance was a bit lower (maybe because transactions occur at
>the same time, and block the readers?), but also, and this was why I
>rolled back this change, the busy handler that does a simple "usleep"
>causes starvation and thus potentially high latencies.

A writer will block readers. A writer can do a lot of its work in
parallel with readers, only locking the database once it needs to
flush the cache (either at COMMIT or on a cache spill). A writer
locking the database will block subsequent readers until the lock is
released. Again, batching updates can potentially minimise the time
spent with the database locked (see the first sketch below).

In a threaded environment, performance is likely to be lower for each
individual thread because SQLite is largely CPU bound (especially
when the database is in the OS cache), and the threads will compete
with each other for CPU time. As the number of CPUs increases, so
should aggregate read performance.

When you say you have threads, is all the database access (read and
write) done from the same process? If so, you can use thread
synchronization within the busy callback to get fine-grained control
over polling. For example, signal a condition variable to indicate
when a transaction is completed (use pthread_cond_broadcast), and in
the busy handler, do a pthread_cond_timedwait on that condition
variable. Thus, your busy handler will be awakened whenever another
thread finishes its transaction, but will neither sleep unbounded nor
poll so frequently as to starve the locker of CPU time (see the
second sketch below).
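To make the batching point concrete, here is a minimal sketch. The
users(name, uid) table is hypothetical (substitute your own schema),
and error handling is omitted for brevity:

    #include <sqlite3.h>

    /* Insert a whole batch of records inside one transaction. The
     * database is only locked against readers around the COMMIT (or
     * on a cache spill), not once per row. */
    static void import_batch(sqlite3 *db, const char **names,
                             const int *uids, int n)
    {
        sqlite3_stmt *stmt;
        int i;

        sqlite3_exec(db, "BEGIN", 0, 0, 0);
        sqlite3_prepare(db,
            "INSERT INTO users(name, uid) VALUES(?, ?)",
            -1, &stmt, 0);
        for (i = 0; i < n; i++) {
            sqlite3_bind_text(stmt, 1, names[i], -1, SQLITE_STATIC);
            sqlite3_bind_int(stmt, 2, uids[i]);
            sqlite3_step(stmt);
            sqlite3_reset(stmt);
        }
        sqlite3_finalize(stmt);
        sqlite3_exec(db, "COMMIT", 0, 0, 0);
    }

The larger the batch, the less often readers are locked out, at the
cost of buffering more rows in the page cache per transaction.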
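And the condition variable scheme could look something like the
following sketch. The 100ms cap, the retry limit and the helper names
are arbitrary choices to tune; all database access must be within one
process for this to work:

    #include <pthread.h>
    #include <sqlite3.h>
    #include <sys/time.h>

    /* Shared by all threads in the process. */
    static pthread_mutex_t txn_mutex = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  txn_done  = PTHREAD_COND_INITIALIZER;

    /* Busy handler: wait until another thread signals that it has
     * finished a transaction, or until 100ms passes, whichever comes
     * first. Returning 1 tells SQLite to retry; 0 gives up with
     * SQLITE_BUSY. Install with:
     *   sqlite3_busy_handler(db, busy_wait, 0); */
    static int busy_wait(void *unused, int n_calls)
    {
        struct timeval now;
        struct timespec deadline;

        (void)unused;
        if (n_calls > 100)      /* give up after ~10 seconds */
            return 0;

        gettimeofday(&now, 0);
        deadline.tv_sec  = now.tv_sec;
        deadline.tv_nsec = (now.tv_usec + 100000) * 1000;
        if (deadline.tv_nsec >= 1000000000) {
            deadline.tv_sec  += 1;
            deadline.tv_nsec -= 1000000000;
        }

        pthread_mutex_lock(&txn_mutex);
        pthread_cond_timedwait(&txn_done, &txn_mutex, &deadline);
        pthread_mutex_unlock(&txn_mutex);
        return 1;
    }

    /* Call after every COMMIT/ROLLBACK to wake the waiters. */
    static void signal_txn_done(void)
    {
        pthread_mutex_lock(&txn_mutex);
        pthread_cond_broadcast(&txn_done);
        pthread_mutex_unlock(&txn_mutex);
    }

Each blocked thread then wakes as soon as a writer commits, and in the
worst case polls every 100ms, so you get low latency without the
starvation you saw with a fixed usleep.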
>
>> For now, I'd say:
>> - If the import/update case is predominant, look at another DB. I'd
>>   suggest PostgreSQL, as its MVCC lockless row protocol allows
>>   concurrent updates to disjoint sets of rows in the same table.
>> - If read-only access is the predominant use, SQLite may well fit
>>   the bill, and take the performance hit when importing data.
>
>What has been confusing me since the beginning is that sqlite
>performs well for databases that are rarely updated. I think the
>performance page should underline this fact.
>
>Also, I think the performance page should be updated to reflect the
>fact that sqlite might not be the solution for bigger databases (and
>maybe add a section of tests that last more than 5 seconds :)). I'll
>see if I can take the test scripts from the performance page and
>change the parameters to try to run a 10-minute test or so (with
>clearing of caches, and probably reformatting of the partition used
>for the data, between tests, so that all engines are tested equally).
>
>I am a bit disappointed that I can't use sqlite at this point: I
>really liked the fact that there is no setup for the end user, and I
>thought that because I was not doing complex sql queries, I could get
>decent performance (like I said earlier, low latency is really what I
>need here). Of course, if anyone comes up with an idea, I am more
>than willing to try it and help improve this great library.

Handling busy waiting using usleep appears to be a problem for
latency. You don't say exactly how long you sleep for, but if you
sleep for too short a period, you can starve the locker of CPU time.
Conversely, sleeping too long means you can miss the unlocked period
between write cycles. Perhaps you could make it a configurable
parameter? Or, if you have a single process, use condition variables
as described above.

But, if it ain't working for you, it ain't working for you. Without
application specifics (like code, schema, example data) it is
difficult to give specific recommendations on tuning.

>
> Nicolas
>

Christian
--
/"\
\ / ASCII RIBBON CAMPAIGN - AGAINST HTML MAIL
 X                        - AGAINST MS ATTACHMENTS
/ \