<[EMAIL PROTECTED]> wrote: > The problem (I suspect) is that you have an index on Table3. > As you insert to records into Table3, those record go at the > end, which is very efficient. But the index entries have to > be inserted in index order, which means they get scattered > out all through the index. This works fine as long as the > index is small enough to fit in cache (either SQLite's private > cache, or failing that your operating systems disk cache.) > Once the index gets too big to fit in cache, you tend to > start thrashing. > > The problem is a lack of locality of reference in the index. > Each index insertion requires O(logN) disk reads and writes. > This is no big deal as long as a "disk read and write" is > serviced from cache but gets to be a very big deal when it > actually involves real disk I/O. You start to get real disk > I/O when the index loses locality of reference and exceeds > the size of your cache. I do not know how to fix this and > still make the indices useful.
If you remember I had this problem with my app, the work around I found is to read the whole DB on startup (well, the one used for indexing my users) to help the OS cache it better (for 10M users it's about 600 megs, so it's still reasonable). Now as a separate file for indexes/primary keys is not an option (even though, I still think it would be the easiest solution, that's how mysql does it, I don't know about postgresql), an alternative would be to change the layout in the file so that indexes/primary keys are stored together in the file and the rest of the rows somewhere else in the file (the file could grow by having an alternation of index/key data and other data). At first this might seem slower, but in practice it won't (well, it might just be worth a try): the area of the file that contains indexes/primary keys will be accessed all the time and thus cached by the OS while the rest of the data will be less cached and discarded from memory. So even though it looks like 2 seeks and 2 reads (or writes) would be needed for every row, in practice, the operations on the index/key part will be from cache and will be faster (and when writing using transactions, the added seeks should not really increase the time by much). Nicolas