Re: [sqlite] Index creation on huge table will never finish.

Eduardo Morras Thu, 22 Mar 2007 06:17:51 -0800

At 04:47 22/03/2007, you wrote:

I don't think that solves my problem.  Sure, it guarantees that the IDs are
unique, but not the strings.


My whole goal is to be able to create a unique identifier for each string,
in such a way that I don't have the same string listed twice, with different
identifiers.

In your solution, there is no way to look up a string to see if it already
exists, since there is no index on the string.

Thanks,
Chris

So you have a file with data: a large collection of strings, 112 million of them, each at most 80 bytes, although typically shorter.

How do you want to handle repeated data? Replace the earlier entry? Keep the first one in? Modify the string to make it unique?

You want to put them in a sqlite3 database, but each string must appear only once. The question I see here is whether your data file contains repeated strings or not. I think grep or a Perl script could help you a lot by cleaning your data first. Then the import into the database will be fast.
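As a rough illustration of the "clean first, then import" idea, here is a minimal sketch in Python rather than Perl. It assumes a "first in" policy for repeats, skips empty lines, and holds the set of seen strings in memory. The table name `strings`, the column names, and the function names are all hypothetical. For 112 million strings an in-memory set may not fit, so an external tool such as `sort -u` could be the more practical way to do the cleaning step.

```python
import sqlite3


def unique_first_seen(lines):
    """Yield each distinct non-empty string once, keeping the first occurrence."""
    seen = set()  # assumption: the distinct strings fit in memory
    for line in lines:
        s = line.rstrip("\n")
        if s and s not in seen:
            seen.add(s)
            yield s


def load_strings(conn, strings):
    """Insert already-deduplicated strings; the integer rowid is the identifier."""
    # Hypothetical schema: no index on the string column during the bulk load.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS strings (id INTEGER PRIMARY KEY, value TEXT)"
    )
    with conn:  # one transaction for the whole import
        conn.executemany(
            "INSERT INTO strings (value) VALUES (?)",
            ((s,) for s in strings),
        )


if __name__ == "__main__":
    sample = ["alpha\n", "beta\n", "alpha\n", "gamma\n", "beta\n"]
    conn = sqlite3.connect(":memory:")
    load_strings(conn, unique_first_seen(sample))
    print(conn.execute("SELECT id, value FROM strings ORDER BY id").fetchall())
    # prints [(1, 'alpha'), (2, 'beta'), (3, 'gamma')]
```

With real data you would pass an open file object instead of `sample`. Looking a string up later still needs an index on `value`, but building it once after the import is a separate step from the cleaning shown here.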


HTH




--------------------------------------------------------------------------
"We have met the enemy and he is us"


