On Sat, Feb 13, 2010 at 11:21:25AM -0800, Roger Binns scratched on the wall:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Jérôme Magnin wrote:
> > SQLite provides decent indexing times for such tables with up to 1M 
> > rows, 
> 
> As another data point, my data set is 10M records and the 6 indices I need
> are created in under two minutes.  (I also analyze.)  The database is 2GB
> but the least amount of memory I have is 4GB.  I also use a 32kb page size,
> larger cache etc.

  A larger cache makes a huge difference with index creation.  As the
  website indicates, there are some known issues with the index sorting
  algorithm that, to my knowledge, have never bubbled to the top of the
  development team's TODO list.  The existing insert algorithm does its
  job, but tends to need access to a large number of pages which can cause
  major cache thrashing if the index pages overflow the cache size.

  If you haven't already try increasing the default cache size by 100x or
  higher.  Just be sure you stay in physical memory.

> > I therefore had to develop an alternate datasource type (based on flat 
> > files) in order for my system to be able to efficiently handle big 
> > files. Which is a pity since SQLite provides great features I still 
> > would like to be able to rely upon when dealing with large files.
> 
> You can also make virtual tables which will let you have the appearance of
> everything being in SQLite.

  Yes... Virtual Tables will end up doing flat table-scans in most cases
  (unless you get very complex with the code) but is often extremely fast,
  especially if you're able to mmap the file.  I've used them with
  great success for many "read-only" type tables where the typical
  access pattern is a scan and aggregation.  Log files, for example.

  Although, if you're talking about creating multiple indexes, I assume
  that isn't going to work in this situation.

  Virtual Tables that link to external data sources are really cool
  and hugely useful, however.  I'm surprised they aren't used more.
  If nothing else, they make great data importers.

   -j

-- 
Jay A. Kreibich < J A Y  @  K R E I B I.C H >

"Our opponent is an alien starship packed with atomic bombs.  We have
 a protractor."   "I'll go home and see if I can scrounge up a ruler
 and a piece of string."  --from Anathem by Neal Stephenson
_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to