A fast technique for achieving your objective is what I believe is called a "monkey puzzle" sort. The data itself is not moved; instead, an array of descriptors, one per element, is sorted. The output is realized by scanning the sorted list of descriptors and picking up each associated record from the input list.

On a modest machine your application should run in under ten minutes with that method. One way we use it is as the first stage of building a B-Tree index rapidly.
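
For what it's worth, here is a minimal sketch of the idea in Python. The record layout and key function are made up; only the shape of the technique matters: build small descriptors, sort those, then scan them to emit the records.

    def monkey_puzzle_sort(records, key):
        # Build one small descriptor per record: (sort key, index into the input).
        # The records themselves never move.
        descriptors = [(key(rec), i) for i, rec in enumerate(records)]

        # Sort only the descriptors.
        descriptors.sort()

        # Realize the output by scanning the sorted descriptors and picking up
        # each associated record from the input list.
        return [records[i] for _, i in descriptors]

    # Example with a made-up record format.
    rows = [("carol", 3), ("alice", 1), ("bob", 2)]
    print(monkey_puzzle_sort(rows, key=lambda r: r[0]))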

Chris Jones wrote:
Thanks everyone for your feedback.

I ended up doing a presort on the data and then adding the data in order. At first I was a little concerned about how I was going to implement an external sort on a data set that huge, but then I realized that the unix "sort" command can handle large files, and in fact does so pretty efficiently.

So, I did a "sort -u -S 1800M fenout.txt > fenoutsort.txt"

The sort took about 45 minutes, which is acceptable for me (it took much longer without the -S option telling it to use more memory), and then loading the table was very efficient: inserting all the rows into my table in sorted order took only 18 minutes.
So, all in all, I can now load the table in just about an hour, which is
great news for me.
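
In rough outline the whole flow looks like the sketch below. This is only an illustration: the table name and single-column layout are placeholders, and SQLite stands in for the actual database I'm loading into.

    import sqlite3
    import subprocess

    # Step 1: external presort with the unix "sort" command (same flags as above).
    with open("fenoutsort.txt", "w") as out:
        subprocess.run(["sort", "-u", "-S", "1800M", "fenout.txt"],
                       stdout=out, check=True)

    # Step 2: insert the rows in sorted order.
    conn = sqlite3.connect("example.db")
    conn.execute("CREATE TABLE IF NOT EXISTS fen (line TEXT)")
    with open("fenoutsort.txt") as f:
        conn.executemany("INSERT INTO fen (line) VALUES (?)",
                         ((line.rstrip("\n"),) for line in f))
    conn.commit()
    conn.close()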

Thanks!
Chris


