> I understand that I'm experiencing thrashing, but I'm curious as to why I'm > experiencing it. My understanding is that the row_id PRIMARY KEY column has > an index on it by default. Inserting into this index doesn't cause > thrashing, but inserting into the secondary index does. I hoped that > enabling the shared cache mode and periodically doing the select operation > on it would keep it in SQLITE's cache. Based on what I saw from I/O stat, > top, and my timing measurements, that doesn't seem to be the case. In fact, > my cache "priming" experiment showed that it didn't help at all, despite a > cache size of 500 MB and the index being on a column with an INTEGER pragma. > The transaction immediately following the cache priming would be just as > slow as the transactions far away from it. Why would a transaction inserting > 1024 rows of 0.5 KB each result in the eviction of that index, despite the > abundance of cache?
The OS disk cache is not effective because the index fragments are currently scattered across all pages of the table it is indexing. Even VACUUM (as it is currently written in sqlite/src/vacuum.c) won't help such indexes be contiguous in the file. The following patch to VACUUM allows pages of non-inline indexes explicitly made with CREATE INDEX to be contiguous in the file, which can improve VACUUM times and non-inline index locality of reference somewhat: http://www.sqlite.org/cvstrac/tktview?tn=2075 It works by moving the index creation after the table row inserts (as one might do manually) so the index pages are not intermixed with table pages. Unfortunately this patch will not aid inline indexes made within the CREATE TABLE statement (i.e., UNIQUE). Nor will this patch help the situation when rows are being first inserted into a table (prior to VACUUM). So it won't help table population speed at all. > The solution you propose doesn't work when your table has multiple indices > of this type. One can imagine altering the schema as to store each indexed > column in a separate table, with a column containing the rowid to the main > table, but that would result in a significant amount of redundant overhead. Don't discount this workaround without first doing timings and database size experiments. If SQLite could be persuaded to reserve contiguous blocks of pages in advance for exclusive use for each index, insert times could improve for table with more than one index. The database file would be larger while the tables are being populated, but it could be made to shrink to a smaller size with an explicit VACUUM. I think this time/space trade-off would be acceptable to most users. This could be controlled via a PRAGMA if you want the classic paging behavior. ____________________________________________________________________________________ Do you Yahoo!? Everyone is raving about the all-new Yahoo! Mail beta. http://new.mail.yahoo.com ----------------------------------------------------------------------------- To unsubscribe, send email to [EMAIL PROTECTED] -----------------------------------------------------------------------------