"Paul Harris" <[EMAIL PROTECTED]> wrote: > > > A thought along the same lines, can sqlite create a unique index that > is hash-based? this would provide the UNIQUE support, but it wouldn't > provide a sorted index. > > That should resolve the massive-insert-too-slow problem, and > afterwards he can create a sorted index on the column if he needs > ordered lookups. >
SQLite *could* create a hash index. But I seriously doubt that would do anything to solve this problem. In fact, it would probably make the problem worse. The problem arises from the lose of cache locality. The size of the database has grown to the point where the disk cache can no longer hold the entire database file. Each insert must check and update random pages within the database file. Because the cache cannot hold the whole database, accessing the pages that must be checked and updated involves real disk I/O rather than just a copy of a page out of cache memory. Doing real disk I/O takes much, much longer than copying memory out of cache. The way to fix this problem is to improve the locality of reference and thus get the cache working better. This can be done using B-Trees. I just haven't found the time to do it yet. But because hashing is pseudo-random, hashing actually makes locality or reference worse and makes the problem much harder to solve. -- D. Richard Hipp <[EMAIL PROTECTED]> ----------------------------------------------------------------------------- To unsubscribe, send email to [EMAIL PROTECTED] -----------------------------------------------------------------------------