On Sat, Feb 13, 2010 at 11:21:25AM -0800, Roger Binns scratched on the wall: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Jérôme Magnin wrote: > > SQLite provides decent indexing times for such tables with up to 1M > > rows, > > As another data point, my data set is 10M records and the 6 indices I need > are created in under two minutes. (I also analyze.) The database is 2GB > but the least amount of memory I have is 4GB. I also use a 32kb page size, > larger cache etc.
A larger cache makes a huge difference with index creation. As the website indicates, there are some known issues with the index sorting algorithm that, to my knowledge, have never bubbled to the top of the development team's TODO list. The existing insert algorithm does its job, but tends to need access to a large number of pages which can cause major cache thrashing if the index pages overflow the cache size. If you haven't already try increasing the default cache size by 100x or higher. Just be sure you stay in physical memory. > > I therefore had to develop an alternate datasource type (based on flat > > files) in order for my system to be able to efficiently handle big > > files. Which is a pity since SQLite provides great features I still > > would like to be able to rely upon when dealing with large files. > > You can also make virtual tables which will let you have the appearance of > everything being in SQLite. Yes... Virtual Tables will end up doing flat table-scans in most cases (unless you get very complex with the code) but is often extremely fast, especially if you're able to mmap the file. I've used them with great success for many "read-only" type tables where the typical access pattern is a scan and aggregation. Log files, for example. Although, if you're talking about creating multiple indexes, I assume that isn't going to work in this situation. Virtual Tables that link to external data sources are really cool and hugely useful, however. I'm surprised they aren't used more. If nothing else, they make great data importers. -j -- Jay A. Kreibich < J A Y @ K R E I B I.C H > "Our opponent is an alien starship packed with atomic bombs. We have a protractor." "I'll go home and see if I can scrounge up a ruler and a piece of string." --from Anathem by Neal Stephenson _______________________________________________ sqlite-users mailing list [email protected] http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

