Re: [sqlite] Difference in these indices?

Ken Wed, 28 Mar 2007 06:58:00 -0800

Joe, 
   
  In general the storage concept would be nice to extend this to a more general 
case.
   
  Extend the sql language to allow storage to a specified file. This would 
allow the user to easily seperate the data vs indices. 
  ie Create table  ... ( column.... ) storage-clause
       
  storage-clause := store as (fileName, pageSize, initialAlocation)
   
   
  Ken

Joe Wilson <[EMAIL PROTECTED]> wrote:
  --- [EMAIL PROTECTED] wrote:
> If you insert records in order of ascending integer primary
> key, then the inserts will be very fast. If you insert records
> where the integer primary key is randomized, inserts will be
> reasonably fast until the size of your table exceeds the size
> of your disk cache. Then each insert will need to do multiple
> reads and writes to disk as it tries to figure out where in
> your massive table is the right place to put the new record,
> and then make space for that new record. All this disk I/O
> will slow things down dramatically.

SQLite is extremely fast in most area of SQL functionality except for 
bulk inserts of non-ordered data into multi-indexes tables, where it 
is very slow. This pre-sorting-before-insert trick only works for single 
indexed tables, and is not useful in the general case. It also defeats
the purpose of using a database - why should the user have to pre-sort
data to achieve acceptable performance?

In the current SQLite insert algorithm there's way too much shuffling 
around of disk pages in order to minimize database footprint at 
all costs. There ought to be a PRAGMA to prefer maximizing insert 
speed when inserting, and if it doubles or quadruples the database 
size - so be it. Most embedded devices may opt for minimum space; but 
most PC-based apps would likely opt for maximum insert speed.

I believe 2 measures when applied together would significantly 
improve insert speed of indexed rows:

1. Allot each index/btree a contiguous region of the database file in 
which to grow without conflicting with the other indexes' pages and pages 
of the underlying table itself. (Putting indexes in separate files if 
only temporarily for bulk insert would be better, but hey, it's SQLite).

2. Adjust the btree algorithm to create larger gaps in the data when you
make space for new keys in the btree. Such page-sized gaps could minimize 
much of the page shuffling currently done.

When the inserts are done, the user need only issue a VACUUM to bring 
the database file back to its normal "packed" sized without such gaps 
if they should so choose.

____________________________________________________________________________________
TV dinner still cooling? 
Check out "Tonight's Picks" on Yahoo! TV.
http://tv.yahoo.com/

-----------------------------------------------------------------------------
To unsubscribe, send email to [EMAIL PROTECTED]
-----------------------------------------------------------------------------

Re: [sqlite] Difference in these indices?

Reply via email to