Message: 4
Date: Thu, 26 Jun 2008 12:15:00 -0700
From: "Cory Nelson" <[EMAIL PROTECTED]>
Subject: Re: [sqlite] sqlite DB creation performance
To: "General Discussion of SQLite Database" <sqlite-users@sqlite.org>
Message-ID:
        <[EMAIL PROTECTED]>
Content-Type: text/plain; charset=UTF-8

>On Thu, Jun 26, 2008 at 11:58 AM, Brooks, Phil <[EMAIL PROTECTED]>
wrote:
>> Can anyone suggest additional things I might try to:
>>
>>    1) reduce db creation time
>> - or -
>>    2) reduce the size of the file?
>>

>Import speed is easily fixed - don't index until after the data is
>inserted.  It will be *much* faster.

This actually makes remarkably little difference.  When you think about
it, this makes sense - in either case, you are simply adding 40 million 
strings to a tree structure.  It doesn't matter much if you are doing 
it before or after the original records have been created.

>Unfortunately this is a worst-case scenario for when it comes to file
>size.  Indexes store a duplicate of the data they are indexing, so
>each of those indexes are copying a large portion of data.  One way to
>improve size might be to have a separate table just mapping strings to
>integers, and use integers in your main table.

But again, I need to index on strings?  Jay Kreibich suggested creating
a hash number and making that a key for the match table, so I would
create
another column in the table that would contain a hashed integer, create
an
index on that, then when I need to do a query, I could select on the 
hashed index and then compare each record returned against my query
string.  This is actually similar to the solution I am using now in my
ondisk file format.

I did manage to speed things up a bit more by adding:

PRAGMA synchronous = OFF;
PRAGMA temp_store = MEMORY;

Adding to the CACHE, as Jay Kreibich also suggested, also helped:

PRAGMA cache_size=16384;

So now I am down close to parity with my original file format - about
30-40
minutes.  Again I haven't broken it down to figure out which one saved
how 
much time.

I will try creating a hashed string number to put into my .csv now.

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to