Message: 4 Date: Thu, 26 Jun 2008 12:15:00 -0700 From: "Cory Nelson" <[EMAIL PROTECTED]> Subject: Re: [sqlite] sqlite DB creation performance To: "General Discussion of SQLite Database" <sqlite-users@sqlite.org> Message-ID: <[EMAIL PROTECTED]> Content-Type: text/plain; charset=UTF-8
>On Thu, Jun 26, 2008 at 11:58 AM, Brooks, Phil <[EMAIL PROTECTED]> wrote: >> Can anyone suggest additional things I might try to: >> >> 1) reduce db creation time >> - or - >> 2) reduce the size of the file? >> >Import speed is easily fixed - don't index until after the data is >inserted. It will be *much* faster. This actually makes remarkably little difference. When you think about it, this makes sense - in either case, you are simply adding 40 million strings to a tree structure. It doesn't matter much if you are doing it before or after the original records have been created. >Unfortunately this is a worst-case scenario for when it comes to file >size. Indexes store a duplicate of the data they are indexing, so >each of those indexes are copying a large portion of data. One way to >improve size might be to have a separate table just mapping strings to >integers, and use integers in your main table. But again, I need to index on strings? Jay Kreibich suggested creating a hash number and making that a key for the match table, so I would create another column in the table that would contain a hashed integer, create an index on that, then when I need to do a query, I could select on the hashed index and then compare each record returned against my query string. This is actually similar to the solution I am using now in my ondisk file format. I did manage to speed things up a bit more by adding: PRAGMA synchronous = OFF; PRAGMA temp_store = MEMORY; Adding to the CACHE, as Jay Kreibich also suggested, also helped: PRAGMA cache_size=16384; So now I am down close to parity with my original file format - about 30-40 minutes. Again I haven't broken it down to figure out which one saved how much time. I will try creating a hashed string number to put into my .csv now. _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users