Hi all, I have a very simple schema. I need to assign a unique identifier to a large collection of strings, each at most 80-bytes, although typically shorter.
The problem is I have 112 million of them. My schema looks as follows: CREATE TABLE rawfen ( fen VARCHAR(80) ); CREATE INDEX rawfen_idx_fen ON rawfen(fen); Unforuntately, data loading this table takes virtually forever. After 24 hours, its not finished, and that is inserting rows in transactions of 100,000 rows per transaction. I tried dropping the index, and building it after row insertion. That has two problems. First, since I have no index, I can't guarantee string uniqueness (and I'd like to). Second, it still doesn't solve my speed problem. Insertion without the index takes a little over an hour, but the index creation never finishes. Well, I gave it 6 hours and it was unclear if it was making any progress. I've read elsewhere that this is a data locality issue, which certainly makes sense. And in those threads, a suggestion has been made to insert in sorted order. But it's unclear to me exactly what the sorting function would need to be - it's likely my sorting function (say strcmp()) wouldn't match what sqlite expects. It's also a little unclear if I can even sort this many keys externally with any ease. After all, that's why I'm trying to use a database. Last, is this something that is likely to affect me if I port over to another database? Do others, say mySQL for instance, handle this better? Does anyone have any suggestions? Upgrading my server isn't an option - it already has 4Gig of memory and very fast disks in a RAID-5 configuration (with a 512MB cache on the raid controller). Thanks! -- View this message in context: http://www.nabble.com/Index-creation-on-huge-table-will-never-finish.-tf3444218.html#a9604705 Sent from the SQLite mailing list archive at Nabble.com. ----------------------------------------------------------------------------- To unsubscribe, send email to [EMAIL PROTECTED] -----------------------------------------------------------------------------

